📊 Measures of Central Tendency

Understanding data begins with a single question: Where is the data centered?
That’s where averages or measures of central tendency come in.

📘 What Are Averages?

According to Prof. Bowley, averages are:

“Statistical constants which enable us to comprehend in a single effort the significance of the whole.”

In simpler words, an average is a single value that represents an entire data distribution.

🎯 Why Are Averages Important?

Averages help:

Summarize large data sets
Identify trends
Make comparisons
Serve as a foundation for further statistical analysis

📌 Common Measures of Central Tendency

There are five widely used averages:

Arithmetic Mean (Simply called "Mean")
Median
Mode
Geometric Mean
Harmonic Mean

Let’s understand each with examples and Python code.

✅ Requisites of an Ideal Measure of Central Tendency

According to Prof. Yule, a good average must:

Be rigidly defined
Be easy to understand and compute
Use all observations
Be suitable for mathematical treatment
Be minimally affected by sampling fluctuations

Additionally, a good measure should not be overly influenced by extreme values (outliers).

🔢 1. Arithmetic Mean (AM)

The most commonly used average.

Formula:

\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}

Python Example:


import statistics

data = [10, 20, 30, 40, 50]
mean = statistics.mean(data)
print("Mean:", mean)

Output:


Mean: 30

For frequency data:

\bar{x} = \frac{\sum f_i x_i}{\sum f_i}

🔢 2. Median

The middle value when the data is sorted.

If n is odd: Median = middle value
If n is even: Median = average of two middle values

Python Example:


import statistics

data = [10, 20, 30, 40, 50]
median = statistics.median(data)
print("Median:", median)

Output:


Median: 30

🔢 3. Mode

The value that occurs most frequently in the data.

Python Example:


import statistics

data = [10, 20, 20, 30, 40]
mode = statistics.mode(data)
print("Mode:", mode)

Output:


Mode: 20

🔢 4. Geometric Mean (GM)

Used for multiplicative processes (e.g., growth rates, financial data).

Formula:

GM = \sqrt[n]{x_1 x_2 \cdots x_n}

Python Example:


from statistics import geometric_mean

data = [2, 4, 8]
gm = geometric_mean(data)
# The result is a float computed via logarithms, so round it for display
print("Geometric Mean:", round(gm, 4))

Output:


Geometric Mean: 4.0

🔢 5. Harmonic Mean (HM)

Useful when dealing with rates, like speed or density.

Formula:

HM = \frac{n}{\sum \frac{1}{x_i}}

Python Example:


from statistics import harmonic_mean

data = [2, 4, 4]
hm = harmonic_mean(data)
print("Harmonic Mean:", hm)

Output:


Harmonic Mean: 3.0

📊 Summary Table

Measure	Best Used For	Sensitive to Outliers
Arithmetic Mean	General numeric data	✅ Yes
Median	Skewed distributions	❌ No
Mode	Categorical / repeated values	❌ No
Geometric Mean	Percentages, ratios, growth	✅ Yes
Harmonic Mean	Rates (e.g., speed, price/unit)	✅ Yes
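The "Sensitive to Outliers" column can be illustrated with a small experiment. The sketch below uses made-up numbers (the value 1000 is a hypothetical outlier, not data from this article) and compares how far the mean and the median move when that one extreme value is added.

```python
import statistics

# Hypothetical data: five typical values, then the same values plus one outlier
typical = [10, 20, 30, 40, 50]
with_outlier = typical + [1000]

mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(typical)
median_after = statistics.median(with_outlier)

# The mean jumps from 30 to about 191.67; the median only moves from 30 to 35
print("Mean:  ", mean_before, "->", round(mean_after, 2))
print("Median:", median_before, "->", median_after)
```

A single extreme value pulls the mean far away from the bulk of the data, while the median barely shifts.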

🧠 Final Thoughts

Understanding these five measures gives you the power to:

Interpret datasets meaningfully
Compare distributions
Perform deeper statistical analyses

Start with the mean, consider the median for skewed data, and apply mode, GM, and HM when the context calls for it.

📌 Example 2.1(a) – Ungrouped Frequency Distribution

We are given:

$x$	1	2	3	4	5	6	7
$f$	5	9	12	17	14	10	6

🧮 Formula for Arithmetic Mean:

$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$

✍️ Step-by-Step Calculation:

Let’s calculate $f_i x_i$ for each:

$x_i$	$f_i$	$f_i x_i$
1	5	5
2	9	18
3	12	36
4	17	68
5	14	70
6	10	60
7	6	42
Total	73	299

$\bar{x} = \frac{299}{73} \approx 4.096$

✅ Final Answer:

Mean ≈ 4.096

🐍 Python Code:


x = [1, 2, 3, 4, 5, 6, 7]
f = [5, 9, 12, 17, 14, 10, 6]

# Sum of f_i * x_i and total frequency
total_fx = sum(fi * xi for fi, xi in zip(f, x))
total_f = sum(f)

mean = total_fx / total_f
print("Mean:", round(mean, 3))

📌 Example 2.1(b) – Grouped Frequency Distribution

We are given:

Marks	0-10	10-20	20-30	30-40	40-50	50-60
Students	12	18	27	20	17	6

Step 1: Find class midpoints ( $x_i$ )

$x_i = \frac{\text{Lower Limit} + \text{Upper Limit}}{2}$

Class	Frequency ( $f_i$ )	Midpoint ( $x_i$ )	$f_i x_i$
0–10	12	5	60
10–20	18	15	270
20–30	27	25	675
30–40	20	35	700
40–50	17	45	765
50–60	6	55	330
Total	100		2800

🧮 Arithmetic Mean:

$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{2800}{100} = 28.0$

✅ Final Answer:

Mean = 28.0

🐍 Python Code:


class_intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [12, 18, 27, 20, 17, 6]

# Calculate midpoints: (lower limit + upper limit) / 2
midpoints = [(low + high) / 2 for low, high in class_intervals]

# Compute mean as sum(f_i * x_i) / sum(f_i)
total_fx = sum(fi * xi for fi, xi in zip(frequencies, midpoints))
total_f = sum(frequencies)

mean = total_fx / total_f
print("Mean:", mean)

Assumed Mean Method

Calculating the arithmetic mean directly using

\bar{x} = \frac{\sum f_i x_i}{\sum f_i}

may involve heavy multiplication when the values $x_i$ and $f_i$ are large.

To simplify the arithmetic, we use deviations from an assumed mean $A$ :

✅ Assumed Mean Method Formula

Let:

$d_i = x_i - A$

Then,

$\bar{x} = A + \frac{\sum f_i d_i}{\sum f_i}$

Where:

$A$ = assumed mean (a value close to most $x_i$ )
$d_i = x_i - A$
$f_i$ = frequency

🧮 Derivation:

Given:

$d_i = x_i - A \Rightarrow x_i = d_i + A$

Then:

$\sum f_i x_i = \sum f_i (d_i + A) = \sum f_i d_i + A \sum f_i$

So,

$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{\sum f_i d_i + A \sum f_i}{\sum f_i} = \frac{\sum f_i d_i}{\sum f_i} + A = A + \frac{\sum f_i d_i}{\sum f_i}$

📌 Example:

Let’s take the same data from Example 2.1(a):

$x_i$	1	2	3	4	5	6	7
$f_i$	5	9	12	17	14	10	6

Let’s take assumed mean $A = 4$ (the middle value).

Then compute $d_i = x_i - A$ and $f_i d_i$ :

$x_i$	$f_i$	$d_i = x_i - 4$	$f_i d_i$
1	5	-3	-15
2	9	-2	-18
3	12	-1	-12
4	17	0	0
5	14	+1	+14
6	10	+2	+20
7	6	+3	+18
Total	73		+7

Now apply:

$\bar{x} = A + \frac{\sum f_i d_i}{\sum f_i} = 4 + \frac{7}{73} \approx 4.096$

✅ Same result, simpler multiplication.
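As a quick check, the assumed mean calculation can be reproduced in Python and compared with the direct formula. This is a minimal sketch using the data from the table above; the variable names are illustrative only.

```python
x = [1, 2, 3, 4, 5, 6, 7]
f = [5, 9, 12, 17, 14, 10, 6]
A = 4  # assumed mean

# Deviations d_i = x_i - A and the weighted sum of f_i * d_i
deviations = [xi - A for xi in x]
sum_fd = sum(fi * di for fi, di in zip(f, deviations))
total_f = sum(f)

# Assumed mean method versus the direct formula
mean_assumed = A + sum_fd / total_f
mean_direct = sum(fi * xi for fi, xi in zip(f, x)) / total_f

print("Sum of f*d:", sum_fd, "| Total frequency:", total_f)
print("Assumed mean method:", round(mean_assumed, 3))
print("Direct method:      ", round(mean_direct, 3))
```

Any choice of $A$ gives the same mean; a value near the center simply keeps the deviations small.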

The step-deviation method is an efficient shortcut for calculating the arithmetic mean of a grouped (or continuous) frequency distribution when the class intervals are equal.

🔹 Step-Deviation Method (for Grouped Data)

When:

The data is in class intervals (e.g., 0–10, 10–20, etc.)
Each class has a uniform width $h$

We use the step-deviation method to simplify calculations further than the assumed mean method.

✅ Step-by-step Formula:

Let:

$A$ = assumed mean (choose a class near the center of the distribution)
$x_i$ = mid-point of each class
$d_i = \frac{x_i - A}{h}$
$f_i$ = frequency of each class
$h$ = common class width
$N = \sum f_i$ = total frequency

Then the arithmetic mean is:

$\bar{x} = A + h \cdot \frac{\sum f_i d_i}{\sum f_i}$

📌 Example

Let’s use the data from Example 2.1(b):

Marks (Class Interval)	$f_i$
0–10	12
10–20	18
20–30	27
30–40	20
40–50	17
50–60	6

Find midpoints $x_i$ of each class:

Class	$f_i$	$x_i$
0–10	12	5
10–20	18	15
20–30	27	25
30–40	20	35
40–50	17	45
50–60	6	55

Choose assumed mean: Let $A = 25$ , and $h = 10$ (since all intervals are of width 10)
Compute step-deviations $d_i = \frac{x_i - A}{h}$

$x_i$	$f_i$	$d_i$	$f_i d_i$
5	12	-2	-24
15	18	-1	-18
25	27	0	0
35	20	+1	+20
45	17	+2	+34
55	6	+3	+18
	N=100		+30

Apply the formula:

$\bar{x} = A + h \cdot \frac{\sum f_i d_i}{\sum f_i} = 25 + 10 \cdot \frac{30}{100} = 25 + 3 = \boxed{28}$
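The same steps can be wrapped in a small helper. The function name `step_deviation_mean` and its parameters are hypothetical, chosen only for this sketch; it assumes equal class widths, as the method requires.

```python
def step_deviation_mean(midpoints, frequencies, assumed_mean, width):
    """Mean of grouped data via A + h * sum(f*d) / sum(f), with d = (x - A) / h."""
    steps = [(x - assumed_mean) / width for x in midpoints]
    sum_fd = sum(f * d for f, d in zip(frequencies, steps))
    return assumed_mean + width * sum_fd / sum(frequencies)


midpoints = [5, 15, 25, 35, 45, 55]
frequencies = [12, 18, 27, 20, 17, 6]

mean = step_deviation_mean(midpoints, frequencies, assumed_mean=25, width=10)
print("Mean:", mean)

# A different assumed mean should give the same answer
mean_alt = step_deviation_mean(midpoints, frequencies, assumed_mean=35, width=10)
print("Mean with A = 35:", mean_alt)
```

Both calls should report 28.0, which shows that the choice of $A$ affects only the size of the intermediate numbers, not the result.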

✅ Summary

Advantages of the Step-Deviation Method:

Greatly reduces computation
Especially helpful in exams and when working through large datasets by hand

Limitation:

Applies only when the class width $h$ is uniform

📌 Example 2.2: Find the mean of the following frequency distribution using the step-deviation method.

📊 Given Data:

Class Interval	Mid-value $x$	Frequency $f$	$d = \frac{x - A}{h}$	$f \cdot d$
0–8	4	8	-3	-24
8–16	12	7	-2	-14
16–24	20	16	-1	-16
24–32	28	24	0	0
32–40	36	15	1	15
40–48	44	7	2	14
		$N = 77$		$\sum fd = -25$

Constants:

Assumed mean $A = 28$
Class width $h = 8$

✅ Step-Deviation Mean Formula:

\bar{x} = A + h \cdot \frac{\sum f d}{\sum f}

\bar{x} = 28 + 8 \cdot \left(\frac{-25}{77}\right)

\bar{x} \approx 28 - 2.597 = 25.403

As a check, the direct formula gives $\sum f x = 1956$ , so $\bar{x} = \frac{1956}{77} \approx 25.403$ .

✅ Final Answer:

\boxed{\bar{x} \approx 25.403}

📘 Properties of Arithmetic Mean

The arithmetic mean (often simply called the average) is one of the most widely used measures of central tendency. Apart from being easy to compute, it has several mathematical properties that make it useful in statistical analysis.

✅ Property 1: Sum of Deviations from the Mean is Zero

If we take a list of numbers $x_1, x_2, ..., x_n$ with arithmetic mean $\bar{x}$ , then:

\sum_{i=1}^{n}(x_i - \bar{x}) = 0

👉 This means that the total distance of all values above the mean is exactly balanced by the total distance of all values below the mean.

🔎 Example:

If the values are: 3, 5, 7

Mean $\bar{x} = \frac{3 + 5 + 7}{3} = 5$
Deviations: $(3 - 5) + (5 - 5) + (7 - 5) = -2 + 0 + 2 = 0$

✅ Property 2: Minimum Sum of Squared Deviations

Among all possible values from which deviations could be measured, the mean gives the minimum sum of squared deviations.

Mathematically, for any constant $a$ :

\sum_{i=1}^{n}(x_i - \bar{x})^2 \leq \sum_{i=1}^{n}(x_i - a)^2

This property is important in least squares estimation, where we try to minimize the squared error — hence, the mean is preferred.
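Properties 1 and 2 can be checked numerically on the values 3, 5, 7 from the example above. In this sketch the helper `sum_of_squares` and the list of candidate values for $a$ are illustrative choices, not part of any standard library.

```python
data = [3, 5, 7]
mean = sum(data) / len(data)

# Property 1: deviations from the mean sum to zero
sum_of_deviations = sum(x - mean for x in data)
print("Sum of deviations from the mean:", sum_of_deviations)


def sum_of_squares(a):
    """Sum of squared deviations of the data from an arbitrary point a."""
    return sum((x - a) ** 2 for x in data)


# Property 2: no other point gives a smaller sum of squared deviations
candidates = [3, 4, 4.5, 5, 5.5, 6, 7]
for a in candidates:
    print(f"a = {a}: sum of squared deviations = {sum_of_squares(a)}")
```

The smallest value in the printed list occurs at $a = 5$, the mean.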

✅ Property 3: Mean of a Composite Series

If you have multiple groups of data, each with its own mean and number of values, you can find the mean of the combined data (composite mean) using:

Let:

$\bar{x}_1, \bar{x}_2, ..., \bar{x}_k$ be the means of $k$ groups
$n_1, n_2, ..., n_k$ be the sizes of those groups

Then the mean of the combined (composite) data is:

\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k}

🔎 Example:

Group A: 10 students, average marks = 60
Group B: 20 students, average marks = 70

Composite mean:

\bar{x} = \frac{10 \times 60 + 20 \times 70}{10 + 20} = \frac{600 + 1400}{30} = \frac{2000}{30} \approx 66.67

So, the combined average is approximately 66.67.
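The composite mean formula translates directly into code. The function name `combined_mean` is a hypothetical helper for this sketch.

```python
def combined_mean(sizes, means):
    """Mean of the pooled data, given each group's size and mean."""
    total_size = sum(sizes)
    return sum(n * m for n, m in zip(sizes, means)) / total_size


# Group A: 10 students with mean 60; Group B: 20 students with mean 70
result = combined_mean([10, 20], [60, 70])
print("Combined mean:", round(result, 2))
```

Note that the result is closer to 70 than to 60 because the larger group carries more weight.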

Example: The average salary of male employees in a firm was Rs. 520, and that of female employees was Rs. 420. The mean salary of all the employees was Rs. 500. Find the percentage of male and female employees in the firm.

Given:

Average salary of males, $x_{1} = 520$
Average salary of females, $x_2 = 420$
Overall average salary, $\bar{x} = 500$

Let:

$n_1$ = number of male employees
$n_2$ = number of female employees

We use the composite mean formula:

$\bar{x} = \frac{n_1 x_1 + n_2 x_2}{n_1 + n_2}$

Substituting the known values:

$500 = \frac{520n_1 + 420n_2}{n_1 + n_2}$

Multiply both sides by $n_1 + n_2$ :

$500(n_1 + n_2) = 520n_1 + 420n_2$

Expand both sides:

$500n_1 + 500n_2 = 520n_1 + 420n_2$

Rearrange the terms:

$520n_1 - 500n_1 = 500n_2 - 420n_2 \Rightarrow 20n_1 = 80n_2$

Divide both sides by 20:

$n_1 = 4n_2 \Rightarrow \frac{n_1}{n_2} = \frac{4}{1}$

So, the ratio of males to females = 4 : 1

✅ Percentage Calculation

Total parts = 4 (males) + 1 (females) = 5

Percentage of male employees:

$\frac{4}{5} \times 100 = 80\%$

Percentage of female employees:

$\frac{1}{5} \times 100 = 20\%$

🎯 Final Answer:

Male employees = 80%
Female employees = 20%
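The same answer can be obtained in one step. Writing $p$ for the fraction of male employees, the composite mean formula becomes $\bar{x} = p x_1 + (1 - p) x_2$, which rearranges to $p = \frac{\bar{x} - x_2}{x_1 - x_2}$. The sketch below applies this rearrangement and then substitutes back as a check; the variable names are illustrative.

```python
x1 = 520       # mean salary of male employees
x2 = 420       # mean salary of female employees
overall = 500  # mean salary of all employees

# Fraction of males from overall = p*x1 + (1 - p)*x2
share_male = (overall - x2) / (x1 - x2)
share_female = 1 - share_male

# Substitute back into the composite mean formula as a check
check = share_male * x1 + share_female * x2

print("Male employees:  ", round(share_male * 100, 1), "%")
print("Female employees:", round(share_female * 100, 1), "%")
print("Recomputed overall mean:", round(check, 2))
```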

Merits and Demerits of Arithmetic Mean

✅ Merits of Arithmetic Mean

Rigorously Defined:
It is mathematically well-defined and has a precise meaning.
Simple to Understand and Compute:
Arithmetic mean is easy to grasp and quick to calculate, either manually or with software.
Based on All Observations:
Every value in the dataset contributes to the computation, making it comprehensive.
Algebraically Manipulable:
It allows algebraic treatment. For example, the mean of a composite series can be calculated using:
$\bar{x} = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i}$
where $\bar{x}_i$ are the means and $n_i$ the sizes of $k$ component series.
Least Affected by Sampling Fluctuations:
Among all averages, the arithmetic mean is the most stable and consistent across samples.
Close to an Ideal Average (as per Prof. Yule):
It satisfies most of the requisites of an ideal average listed earlier; the main exception is its sensitivity to extreme values.

❌ Demerits of Arithmetic Mean

Cannot Be Found by Inspection or Graphically:
Unlike the mode or median, the mean cannot be located visually.
Not Suitable for Qualitative Data:
It cannot be used for non-quantitative characteristics like honesty, beauty, or intelligence.
Sensitive to Missing or Illegible Values:
A single missing or invalid value can prevent the computation unless omitted.
Affected by Extreme Values (Outliers):
A few extremely high or low values can distort the mean, making it non-representative.
Can Lead to Misleading Conclusions Without Context:
Example:
- Student A scores: 50%, 60%, 70%
- Student B scores: 70%, 60%, 50%
  Both have an average of 60%, but A shows improvement while B deteriorates.
Not Suitable for Open-End Class Intervals:
If the data has open classes (e.g., "above 90"), the mean can't be accurately computed.
Unsuitable for Highly Skewed Distributions:
In heavily asymmetric data, the mean may not reflect the central tendency properly—median is preferred.

Weighted Mean

In the calculation of the arithmetic mean, we usually assume that all items carry equal importance. However, in real-world situations, some items are more significant than others, and their relative importance should be factored into the calculation. This is where the weighted mean becomes essential.

❓ Why Use a Weighted Mean?

The simple mean treats all items equally.
But in many practical situations (e.g., cost of living, exam marks), different items have different significance or "weights".
Example: While calculating the change in cost of living, essential items like rice or wheat must be given more weight compared to non-essentials like cigarettes or confectionery.

🧮 Formula for Weighted Mean

Let:

$X_i$ be the values of the items (e.g., prices, scores),
$W_i$ be the weights (importance) assigned to each item.

Then the Weighted Mean is:

\bar{X}_w = \frac{\sum W_i X_i}{\sum W_i}

This is similar to the formula for the simple mean, with weights $W_i$ replacing frequencies $f_i$ .

📌 Key Observations

If all weights are equal, the weighted mean = simple mean.
If larger weights are given to larger values, the weighted mean > simple mean.
If smaller weights are given to larger values, the weighted mean < simple mean.

✅ Use Cases of Weighted Mean

Calculating average grades (where different subjects have different credit weights).
Measuring cost of living index (where items like rent, food, transport have different importance).
Financial portfolio returns (where each asset has a different investment weight).
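A weighted mean for the grades use case might look like the sketch below. The grades and credit weights are made-up values, and `weighted_mean` is a hypothetical helper written for this illustration.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(W_i * X_i) / sum(W_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical grades and course credits
grades = [80, 90, 70]
credits = [4, 3, 2]

weighted = weighted_mean(grades, credits)
simple = weighted_mean(grades, [1, 1, 1])  # equal weights give the simple mean

print("Weighted mean:", round(weighted, 2))
print("Simple mean:  ", simple)
```

Here the lowest grade carries the smallest weight, so the weighted mean comes out above the simple mean.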

Find the simple and weighted arithmetic mean of the first $n$ natural numbers, the weights being the corresponding numbers.


Solution:

Let the first $n$ natural numbers be:

1, 2, 3, \dots, n

🔹 Simple Arithmetic Mean (A.M.):

The formula for the sum of the first $n$ natural numbers is:

\sum X = 1 + 2 + 3 + \dots + n = \frac{n(n+1)}{2}

So, the simple arithmetic mean is:

\bar{X} = \frac{\sum X}{n} = \frac{1 + 2 + 3 + \dots + n}{n} = \frac{\frac{n(n+1)}{2}}{n} = \frac{n+1}{2}

🔹 Weighted Arithmetic Mean:

Here, weights $W_i$ are equal to the values $X_i$ themselves.

So:

$W_i = X_i$
$W_i X_i = X_i^2$

We need:

\bar{X}_w = \frac{\sum W_i X_i}{\sum W_i} = \frac{\sum X_i^2}{\sum X_i}

We use the formulas:

$\sum X_i = 1 + 2 + \dots + n = \frac{n(n+1)}{2}$
$\sum X_i^2 = 1^2 + 2^2 + \dots + n^2 = \frac{n(n+1)(2n+1)}{6}$

Substituting:

\bar{X}_w = \frac{\frac{n(n+1)(2n+1)}{6}}{\frac{n(n+1)}{2}} = \frac{(2n+1)}{3}

✅ Final Answer:

Simple Arithmetic Mean = $\frac{n+1}{2}$

Weighted Arithmetic Mean = $\frac{2 n + 1}{3}$
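Both closed forms can be verified by brute force for a few values of $n$. This is only a numerical sanity check, not a proof; the function name is illustrative.

```python
def simple_and_weighted_mean(n):
    """Simple and self-weighted means of 1..n, computed by direct summation."""
    numbers = range(1, n + 1)
    simple = sum(numbers) / n
    weighted = sum(k * k for k in numbers) / sum(numbers)
    return simple, weighted


for n in (1, 5, 10, 100):
    simple, weighted = simple_and_weighted_mean(n)
    print(
        f"n = {n}: simple = {simple} vs (n+1)/2 = {(n + 1) / 2}; "
        f"weighted = {weighted:.4f} vs (2n+1)/3 = {(2 * n + 1) / 3:.4f}"
    )
```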

Median

The median of a distribution is the value that divides it into two equal parts. That is:

Half the observations lie below the median.
Half lie above the median.

Hence, median is a positional average (not affected much by extreme values).

🔹 1. Ungrouped Data (Raw Data)

Odd number of observations:
Median = the middle value after sorting the data.
Even number of observations:
Median = average of the two middle values.

📌 Example:

Data: 25, 20, 15, 35, 18

Sorted: 15, 18, 20, 25, 35 → Median = 20

Data: 8, 20, 50, 25, 15, 30

Sorted: 8, 15, 20, 25, 30, 50
Median =

\frac{20 + 25}{2} = 22.5

📝 Remark: For even-numbered datasets, any value between the two middle values can technically be used as the median, but by convention, we use their average.
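Python's `statistics` module exposes this choice directly: `median` averages the two middle values, while `median_low` and `median_high` return the lower and upper middle value respectively. A short sketch with the even-sized dataset above:

```python
import statistics

data = [8, 20, 50, 25, 15, 30]

middle_average = statistics.median(data)    # average of the two middle values
lower_middle = statistics.median_low(data)  # smaller of the two middle values
upper_middle = statistics.median_high(data) # larger of the two middle values

print("median:     ", middle_average)
print("median_low: ", lower_middle)
print("median_high:", upper_middle)
```

`median_low` and `median_high` always return an actual data point, which can be useful when an interpolated value would not make sense.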

✅ Median Formula (for Grouped/Continuous Frequency Data):

$\text{Median} = l + \left( \frac{\frac{N}{2} - F}{f} \right) \cdot h$

Where:

Symbol	Meaning
$l$	Lower boundary of the median class
$N$	Total frequency
$F$	Cumulative frequency before the median class
$f$	Frequency of the median class
$h$	Width (class size) of the median class

✍️ Interpretation:

$\frac{N}{2}$ tells you where the median lies in the cumulative frequency table.
Find the class where this value falls → that’s the median class.
Plug the values into the formula to get the median.

🧮 Example:

Find the median wage of the following distribution:

Wages (in Rs.)	20–30	30–40	40–50	50–60	60–70
No. of labourers	3	5	20	10	5

🧮 Given:

Wages (in Rs.)	Frequency (f)
20–30	3
30–40	5
40–50	20
50–60	10
60–70	5

➕ Step 1: Find cumulative frequencies (cf)

Wages (in Rs.)	Frequency (f)	Cumulative Frequency (cf)
20–30	3	3
30–40	5	8
40–50	20	28
50–60	10	38
60–70	5	43

🔍 Step 2: Identify median class

Total number of labourers: $N = 43$
$\frac{N}{2} = \frac{43}{2} = 21.5$

Find the class whose cumulative frequency just exceeds 21.5 → it is 40–50, with cf = 28.

So, the median class is: 40–50

🔢 Step 3: Apply the Median formula

\text{Median} = l + \left( \frac{\frac{N}{2} - F}{f} \right) \cdot h

Where:

$l = 40$ (lower limit of the median class)
$N = 43$
$F = 8$ (cumulative frequency before the median class)
$f = 20$ (frequency of the median class)
$h = 10$ (class width)

\text{Median} = 40 + \left( \frac{21.5 - 8}{20} \right) \cdot 10 = 40 + \left( \frac{13.5}{20} \right) \cdot 10

\text{Median} = 40 + 6.75 = \boxed{46.75}

✅ Final Answer:

Median wage = Rs. 46.75
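The three steps above can be sketched as a small function. The name `grouped_median` and its parameters are hypothetical; the sketch assumes contiguous classes of equal width and follows the "cumulative frequency just exceeds N/2" rule used in this example.

```python
def grouped_median(lower_bounds, width, frequencies):
    """Median of grouped data: l + ((N/2 - F) / f) * h."""
    half = sum(frequencies) / 2
    cumulative = 0  # cumulative frequency before the current class (F)
    for lower, freq in zip(lower_bounds, frequencies):
        if cumulative + freq > half:  # this is the median class
            return lower + (half - cumulative) / freq * width
        cumulative += freq
    raise ValueError("frequencies must contain at least one positive value")


lower_bounds = [20, 30, 40, 50, 60]
frequencies = [3, 5, 20, 10, 5]

median_wage = grouped_median(lower_bounds, width=10, frequencies=frequencies)
print("Median wage: Rs.", round(median_wage, 2))
```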

✅ Merits of Median

Rigorously Defined:
Median is clearly and unambiguously defined. It has a specific position in the dataset.
Easy to Understand and Calculate:
Especially with sorted data, the median can often be found simply by inspection.
Unaffected by Extreme Values (Outliers):
Unlike the mean, the median is not influenced by unusually high or low values.
Applicable to Open-Ended Distributions:
Median can be computed even when the distribution has open-ended intervals like "below 10" or "above 100".

❌ Demerits of Median

Not Exact for Even Number of Observations:
For even-sized datasets, the median is estimated as the average of the two middle values, which may not reflect an actual data point.
Ignores Most Data Points:
Median only considers the middle position(s); values far from the center do not affect it. For example:
- Median of {10, 25, 50, 60, 65} is 50.
- Even if 10 and 25 are changed to 1 and 20 or 60 and 65 are changed to 70 and 80, the median remains 50.
Not Suitable for Algebraic Treatment:
Median does not lend itself to further statistical operations like mean does (e.g., finding combined medians is not straightforward).
Affected by Sampling Fluctuations:
Compared with the mean, the median tends to vary more from sample to sample.

📘 Uses of Median

For Qualitative Data:
Useful when data is ranked but not measurable (e.g., intelligence levels, honesty ratings).
In Income and Wealth Distribution:
Commonly used to represent central tendency when dealing with wages or wealth, where data is often skewed.

Mode

Mode is the value in a dataset that occurs most frequently. It represents the most typical or common value around which other values tend to cluster.

🔍 Examples of Mode in Real Life

The average height of an Indian male is 5'-6"
→ This refers to the most common height, i.e., mode.
The average shoe size sold in a shop is 7
→ Shoe size 7 is sold most frequently → Mode = 7
An average student spends Rs. 150 per month in a hostel
→ Rs. 150 is the most commonly occurring monthly expenditure → Mode = Rs. 150

📊 Example: Discrete Frequency Distribution

x (Value)	1	2	3	4	5	6	7	8
f (Freq.)	4	9	16	25	22	15	7	3

Here, the maximum frequency is 25, which corresponds to x = 4.
So, Mode = 4

⚠️ Special Cases Where Mode is Not Easily Identified

Repeated Maximum Frequencies
- If more than one value has the same highest frequency, the distribution is bimodal or multimodal.
Maximum Frequency at the Beginning or End
- If the highest frequency is in the first or last class, mode may not give a good central value.
Irregular Frequency Distribution
- If the data fluctuates significantly or has no clear peak, mode may be misleading or undefined.
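For raw data, the first special case can be detected with `statistics.multimode` (available in Python 3.8 and later), which returns every value that shares the highest frequency. The datasets below are made up for illustration.

```python
import statistics

# Hypothetical data with two equally frequent values
bimodal = [1, 2, 2, 3, 4, 4, 5]
modes = statistics.multimode(bimodal)
print("Modes:", modes)

# Hypothetical data with no repeated value: every value ties for the maximum
no_peak = [10, 20, 30]
all_tied = statistics.multimode(no_peak)
print("Modes when nothing repeats:", all_tied)
```

A result with more than one entry signals that a single mode does not describe the data well.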

📌 Mode Formula for Continuous Frequency Distribution

\text{Mode} = l + \left( \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \right) \times h

🧩 Where:

$l$ = lower boundary of the modal class
$h$ = class width (class interval size)
$f_1$ = frequency of the modal class
$f_0$ = frequency of the class before modal class
$f_2$ = frequency of the class after modal class

✅ Steps to Find the Mode

Identify the modal class (class with the highest frequency).
Plug values into the formula:
$\text{Mode} = l + \left( \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \right) \times h$

📊 Example:

Class Interval	Frequency
10 - 20	5
20 - 30	8
30 - 40	12
40 - 50	20 ← Modal Class (highest frequency)
50 - 60	10
60 - 70	5

Here:

Modal class = 40–50
$l = 40$
$h = 10$
$f_1 = 20$
$f_0 = 12$
$f_2 = 10$

🧮 Substitute in formula:

\text{Mode} = 40 + \left( \frac{20 - 12}{2 \times 20 - 12 - 10} \right) \times 10 = 40 + \left( \frac{8}{18} \right) \times 10 \approx 40 + 4.44 = 44.44

🎯 Final Answer:

Mode = 44.44
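The calculation can be sketched as a function. The name `grouped_mode` is hypothetical, and two choices in it are assumptions of this sketch rather than part of the formula: when several classes tie for the highest frequency it uses the first one, and a missing neighbouring class (modal class at either end) is treated as having frequency 0.

```python
def grouped_mode(lower_bounds, width, frequencies):
    """Mode of grouped data: l + ((f1 - f0) / (2*f1 - f0 - f2)) * h."""
    i = frequencies.index(max(frequencies))  # first class with the top frequency
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0                     # assumption: 0 if absent
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0  # assumption: 0 if absent
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined: modal class does not stand out")
    return lower_bounds[i] + (f1 - f0) / denominator * width


lower_bounds = [10, 20, 30, 40, 50, 60]
frequencies = [5, 8, 12, 20, 10, 5]

mode = grouped_mode(lower_bounds, width=10, frequencies=frequencies)
print("Mode:", round(mode, 2))
```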

📊 Example:

\text{Mode} = l + \left( \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \right) \times h

Class Interval	Frequency (f)
0 – 10	5
10 – 20	8
20 – 30	7
30 – 40	12
40 – 50	28 ← Modal class (highest frequency)
50 – 60	20
60 – 70	10
70 – 80	10
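The table above is not followed by a worked solution, so here is a sketch of the substitution, assuming the task is to find the mode with the formula quoted before the table. The modal class is 40–50, which gives $l = 40$, $h = 10$, $f_1 = 28$, $f_0 = 12$ and $f_2 = 20$:

```latex
\text{Mode} = 40 + \left( \frac{28 - 12}{2 \times 28 - 12 - 20} \right) \times 10
            = 40 + \left( \frac{16}{24} \right) \times 10
            \approx 40 + 6.67 = 46.67
```

The mode lies in the upper part of the modal class because the class after it (frequency 20) is heavier than the class before it (frequency 12).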

Value (xᵢ)	Frequency (fᵢ)
2	3
4	5
5	2

Mode of Transport	Distance (km)	Speed (km/h)
Train	900	60
Boat	3000	25
Plane	400	350
Taxi	15	25

Segment	Distance $S_i$	Speed $V_i$	Time $\frac{S_i}{V_i}$
Train	900	60	15.00 hrs
Boat	3000	25	120.00 hrs
Plane	400	350	1.14 hrs (approx)
Taxi	15	25	0.60 hrs
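The two transport tables above appear without their problem statement. Assuming the question is the average speed over the whole journey, the time column suggests the usual approach: divide total distance by total time, which is the distance-weighted harmonic mean of the speeds. The sketch below follows that assumption using only the figures from the tables.

```python
# Distances (km) and speeds (km/h) from the tables above
distances = {"Train": 900, "Boat": 3000, "Plane": 400, "Taxi": 15}
speeds = {"Train": 60, "Boat": 25, "Plane": 350, "Taxi": 25}

# Time per segment = distance / speed
times = {leg: distances[leg] / speeds[leg] for leg in distances}

total_distance = sum(distances.values())
total_time = sum(times.values())

# Average speed = total distance / total time (weighted harmonic mean of speeds)
average_speed = total_distance / total_time

print("Total distance:", total_distance, "km")
print("Total time:", round(total_time, 2), "hrs")
print("Average speed:", round(average_speed, 2), "km/h")
```

Under this reading, the simple arithmetic mean of the four speeds (115 km/h) would greatly overstate the result, because most of the travel time is spent on the slow boat segment.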

Partition Values

Partition values are the values that divide a series (or dataset) into equal parts.

Types of Partition Values

Quartiles

Quartiles divide the data into four equal parts:
- Q₁ (First Quartile): 25% of observations lie below it, and 75% lie above.
- Q₂ (Second Quartile): It is the Median; 50% of observations lie below and 50% above.
- Q₃ (Third Quartile): 75% of observations lie below it, and 25% lie above.

Deciles

Deciles divide the data into ten equal parts:
- Notation: D₁, D₂, ..., D₉
- Example: D₇ (Seventh Decile) means 70% of the observations lie below it, and 30% above.

Percentiles

Percentiles divide the data into 100 equal parts:
- Notation: P₁, P₂, ..., P₉₉
- Example: P₄₇ (47th Percentile) is the value below which 47% of the observations lie.

Note on Calculation

The methods used to calculate quartiles, deciles, and percentiles are similar to that used for calculating the median, whether the distribution is:

Discrete (list of values with frequencies), or
Continuous (grouped frequency distribution).

Example

Eight coins were tossed together, and the number of heads resulting from each toss was recorded. This experiment was repeated 256 times. The following frequency distribution table shows how many times each possible number of heads (from 0 to 8) occurred:

Number of Heads (x)	0	1	2	3	4	5	6	7	8
Frequency (f)	1	9	26	59	72	52	29	7	1

Tasks:

Calculate the following statistical measures based on the data provided:

Median
First Quartile (Q₁)
Third Quartile (Q₃)
Fourth Decile (D₄)
27th Percentile (P₂₇)

Given Data:

Number of Heads (x)	0	1	2	3	4	5	6	7	8
Frequency (f)	1	9	26	59	72	52	29	7	1
Cumulative Frequency (cf)	1	10	36	95	167	219	248	255	256

Total number of observations (N) = 256

1. Median

$\frac{N}{2} = \frac{256}{2} = 128$
The cumulative frequency just greater than 128 is 167, which corresponds to x = 4.

Median = 4

2. First Quartile (Q₁)

$\frac{N}{4} = \frac{256}{4} = 64$
The cumulative frequency just greater than 64 is 95, which corresponds to x = 3.

Q₁ = 3

3. Third Quartile (Q₃)

$\frac{3N}{4} = \frac{3 \times 256}{4} = 192$
The cumulative frequency just greater than 192 is 219, which corresponds to x = 5.

Q₃ = 5

4. 4th Decile (D₄)

$\frac{4N}{10} = \frac{4 \times 256}{10} = 102.4$
The cumulative frequency just greater than 102.4 is 167, which corresponds to x = 4.

D₄ = 4

5. 27th Percentile (P₂₇)

$\frac{27N}{100} = \frac{27 \times 256}{100} = 69.12$
The cumulative frequency just greater than 69.12 is 95, which corresponds to x = 3.

P₂₇ = 3
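All five answers follow the same rule, so they can be reproduced with one helper. The function `partition_value` and its parameters `k` and `parts` are hypothetical names for this sketch; it implements the "cumulative frequency just greater than $kN/\text{parts}$" rule used above, and other tie-handling conventions exist.

```python
from itertools import accumulate

x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
f = [1, 9, 26, 59, 72, 52, 29, 7, 1]


def partition_value(values, frequencies, k, parts):
    """Value whose cumulative frequency first exceeds k * N / parts."""
    position = k * sum(frequencies) / parts
    for value, cumulative in zip(values, accumulate(frequencies)):
        if cumulative > position:
            return value
    return values[-1]  # reached only if position is not below N


median = partition_value(x, f, 1, 2)
q1 = partition_value(x, f, 1, 4)
q3 = partition_value(x, f, 3, 4)
d4 = partition_value(x, f, 4, 10)
p27 = partition_value(x, f, 27, 100)

print("Median:", median)
print("Q1:", q1, "| Q3:", q3)
print("D4:", d4, "| P27:", p27)
```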