flowchart TB
CT[Central Tendency] --> MA[Mathematical]
CT --> PO[Positional]
MA --> AM[Arithmetic Mean]
MA --> GM[Geometric Mean]
MA --> HM[Harmonic Mean]
PO --> M[Median]
PO --> MO[Mode]
PO --> Q[Quartiles, Deciles, Percentiles]
classDef default fill:#003366,color:#ffffff,stroke:#ffcc00,stroke-width:3px,rx:10px,ry:10px;
38 Measures of central tendency
38.1 Concept of Central Tendency
A measure of central tendency is “a single value that attempts to describe a set of data by identifying the central position within that data” (Croxton & Cowden). It is the typical or representative value around which observations cluster. Three classical measures are mathematical — Arithmetic Mean, Geometric Mean, Harmonic Mean — and two are positional — Median and Mode. Each measure has its specific assumptions, properties, and use-cases. A good measure should be: easy to understand, rigidly defined, based on all observations, capable of further algebraic treatment, not unduly affected by extreme values, and possible to determine even for open-end classes.
38.2 Arithmetic Mean (AM)
The AM is the most commonly used measure: sum of observations divided by the number of observations.
| Type of data | Formula |
|---|---|
| Individual series | \(\bar{X} = \frac{\sum X_i}{N}\) |
| Discrete (frequency) | \(\bar{X} = \frac{\sum f_i X_i}{\sum f_i}\) |
| Continuous (class intervals) | \(\bar{X} = \frac{\sum f_i m_i}{\sum f_i}\) where \(m_i\) = mid-point |
| Weighted | \(\bar{X}_w = \frac{\sum w_i X_i}{\sum w_i}\) |
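The four formulas in the table share one shape: a weighted sum divided by the total weight. The sketch below illustrates this in Python; the function names and the sample frequencies and classes are illustrative assumptions, not data from the text.

```python
def arithmetic_mean(values):
    """AM of an individual series: sum of observations / number of observations."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Sum(w * x) / Sum(w); covers frequency data when the weights are frequencies."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Individual series
individual = [5, 7, 9, 11, 13]
am_individual = arithmetic_mean(individual)

# Discrete series: values with their frequencies (illustrative numbers)
marks = [10, 20, 30]
freq = [2, 5, 3]
am_discrete = weighted_mean(marks, freq)

# Continuous series: each class is represented by its mid-point m_i
classes = [(0, 10), (10, 20), (20, 30)]
midpoints = [(lo + hi) / 2 for lo, hi in classes]
am_continuous = weighted_mean(midpoints, freq)

print(am_individual, am_discrete, am_continuous)  # 9.0 21.0 16.0
```

Treating frequencies as weights means one helper can serve the discrete, continuous, and weighted cases.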
38.2.1 Properties of AM
- Sum of deviations from the mean is zero: \(\sum (X - \bar{X}) = 0\).
- Sum of squared deviations is smallest when taken about the mean: \(\sum (X - \bar{X})^2 \le \sum (X - a)^2\) for any value \(a\).
- Affected by every observation — including extreme values.
- Algebraic combination: the AM of pooled groups can be computed from the group means and sizes, \(\bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2}\).
- Rigidly defined; widely used in further statistical analysis.
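The first two properties and the pooled-mean rule can be checked numerically. This is a minimal sketch; `sum_sq_dev` and `combined_mean` are hypothetical helper names introduced here for illustration.

```python
data = [5, 7, 9, 11, 13]
mean = sum(data) / len(data)

# Property: deviations from the mean cancel out
deviation_sum = sum(x - mean for x in data)


def sum_sq_dev(centre):
    """Sum of squared deviations of the data about an arbitrary centre."""
    return sum((x - centre) ** 2 for x in data)


def combined_mean(means, sizes):
    """Pooled AM computed from the group means and group sizes only."""
    return sum(m * n for m, n in zip(means, sizes)) / sum(sizes)


print(deviation_sum)                                    # 0.0
print(sum_sq_dev(mean), sum_sq_dev(8), sum_sq_dev(10))  # 40.0 45 45
print(combined_mean([30, 50], [40, 60]))                # 42.0
```

Moving the centre away from the mean in either direction increases the sum of squared deviations, which is what the minimum property states.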
38.2.2 Limitations of AM
- Severely affected by extreme values (outliers).
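- Cannot be computed for open-end class intervals unless the missing class limits are assumed, because the mid-points of the open classes are unknown.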
- Not suitable for qualitative data.
- Can be misleading — “average Indian family has 1.8 children”.
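The sensitivity to extreme values is easy to see on a small example. The numbers below are made up for illustration; the median is computed with the standard-library `statistics.median`.

```python
import statistics

typical = [20, 22, 24, 26, 28]
with_outlier = [20, 22, 24, 26, 280]  # last value replaced by an extreme one

mean_typical = sum(typical) / len(typical)
mean_outlier = sum(with_outlier) / len(with_outlier)

median_typical = statistics.median(typical)
median_outlier = statistics.median(with_outlier)

print(mean_typical, median_typical)  # 24.0 24
print(mean_outlier, median_outlier)  # 74.4 24
```

A single extreme observation pulls the mean from 24 to 74.4, while the middle value stays at 24.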
38.3 Median
The median is the middle value when observations are arranged in ascending or descending order. It divides the data into two equal halves.
| Data | Formula |
|---|---|
| Ungrouped, odd N | Middle value, i.e., \((N+1)/2\)-th |
| Ungrouped, even N | Average of \(N/2\)-th and \((N/2 + 1)\)-th |
| Continuous | \(\text{Median} = L + \frac{N/2 - cf}{f} \times h\) where L = lower limit of the median class, N = total frequency, cf = cumulative frequency of the class preceding the median class, f = frequency of the median class, h = class width |
- Positional measure — depends on rank, not magnitude.
- Not affected by extreme values — robust.
- Can be computed for open-end intervals.
- Suitable for qualitative ordinal data (rankings).
- Limitation: not amenable to further algebraic treatment.
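The interpolation formula for a continuous series can be sketched as follows. The function name and the frequency table are illustrative assumptions; equal class widths are assumed.

```python
def grouped_median(lower_limits, freqs, width):
    """Median = L + (N/2 - cf) / f * h for equal-width class intervals."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("no observations")
    half = n / 2
    cf = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cf + f >= half:  # first class whose cumulative frequency reaches N/2
            return lower + (half - cf) / f * width
        cf += f
    raise ValueError("median class not found")


# Classes 0-10, 10-20, 20-30, 30-40, 40-50 (illustrative data)
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]

median = grouped_median(lower_limits, freqs, width=10)
print(round(median, 2))  # 25.83
```

Here N = 40, so N/2 = 20 falls in the class 20-30 (cf = 13, f = 12), giving 20 + (7/12) × 10.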
38.4 Quartiles, Deciles, Percentiles
These are positional measures that generalise the median:
- Quartiles (Q₁, Q₂, Q₃) divide data into 4 equal parts; Q₂ = Median.
- Deciles (D₁ … D₉) divide into 10 equal parts.
- Percentiles (P₁ … P₉₉) divide into 100 equal parts.
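For ungrouped data, Python's standard-library `statistics.quantiles` returns these cut points. The sketch below uses its default `"exclusive"` method, which is one of several interpolation conventions; other textbooks or packages may give slightly different values for the same data.

```python
import statistics

data = [3, 5, 7, 8, 11, 15, 18]

# n=4 gives the three quartiles; n=10 gives the nine deciles
q1, q2, q3 = statistics.quantiles(data, n=4)
deciles = statistics.quantiles(data, n=10)

print(q1, q2, q3)                     # 5.0 8.0 15.0
print(q2 == statistics.median(data))  # True
print(len(deciles))                   # 9
```

With seven observations, the quartiles land on the 2nd, 4th, and 6th ordered values, and Q₂ coincides with the median.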
38.5 Mode
The mode is the value that occurs most frequently — derived from French à la mode (“fashionable”).
| Data | Working |
|---|---|
| Ungrouped | Inspect; the most frequent value |
| Continuous | \(\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h\) where L = lower limit of the modal class, f₁ = frequency of the modal class, f₀ = frequency of the preceding class, f₂ = frequency of the succeeding class, h = class width |
- Not affected by extreme values.
- Can be determined for qualitative (categorical, nominal) data.
- May be non-existent or non-unique (bimodal, multimodal).
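Both rows of the table can be sketched in Python. For ungrouped data the standard-library `statistics.multimode` returns every most-frequent value; `grouped_mode` is a hypothetical helper that applies the interpolation formula, assuming equal class widths and treating a missing neighbouring class as frequency 0.

```python
import statistics


def grouped_mode(lower_limits, freqs, width):
    """Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * h for equal-width classes."""
    i = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # preceding class
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0  # succeeding class
    denom = 2 * f1 - f0 - f2
    if denom == 0:
        raise ValueError("mode is not well defined for this table")
    return lower_limits[i] + (f1 - f0) / denom * width


# Ungrouped: inspect the most frequent value(s)
print(statistics.multimode([2, 3, 3, 5, 5, 5, 6, 8]))  # [5]
print(statistics.multimode([1, 1, 2, 2, 3]))           # [1, 2]

# Continuous: classes 0-10, ..., 40-50 (illustrative data)
mode = grouped_mode([0, 10, 20, 30, 40], [5, 8, 12, 10, 5], width=10)
print(round(mode, 2))  # 26.67
```

A list with two entries from `multimode` signals a bimodal data set.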
38.6 Empirical Relationship (Karl Pearson)
For moderately skewed distributions:
\[Mode = 3 \times Median - 2 \times Mean\]
This is Karl Pearson’s empirical relationship.
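As a quick illustration, the relation can be used to estimate the mode when only the mean and median are known. The function name and the input values are assumptions chosen for the example, and the result is only an approximation for moderately skewed data.

```python
def empirical_mode(mean, median):
    """Pearson's empirical estimate: Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


# Illustrative right-skewed case: mean above median
mode_estimate = empirical_mode(mean=52, median=50)
print(mode_estimate)  # 46
```

The estimate 46 lies below the median, consistent with the ordering Mean > Median > Mode for a positively skewed distribution.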
38.7 Geometric Mean (GM)
The GM is the n-th root of the product of n positive observations:
\[GM = \sqrt[n]{X_1 \cdot X_2 \cdot \ldots \cdot X_n} = \left(\prod_{i=1}^{n} X_i\right)^{1/n}\]
Logarithmic form for computation: \[\log GM = \frac{\sum \log X_i}{n}\]
Use cases: averaging ratios, rates, index numbers, compound growth rates (CAGR).
38.7.1 CAGR
\[CAGR = \left(\frac{\text{End Value}}{\text{Begin Value}}\right)^{1/n} - 1\]
where n is the number of years over which the value grows.
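The logarithmic form and the CAGR formula can be sketched as below. The function names are illustrative; the GM helper assumes strictly positive observations, and floating-point results are compared approximately.

```python
import math


def geometric_mean(values):
    """GM via the logarithmic form: exp(mean of log X_i). Requires X_i > 0."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs positive observations")
    return math.exp(sum(math.log(x) for x in values) / len(values))


def cagr(begin_value, end_value, years):
    """Compound annual growth rate over the given number of years."""
    return (end_value / begin_value) ** (1 / years) - 1


gm = geometric_mean([2, 8])        # square root of 16
growth = cagr(100, 121, years=2)   # growth from 100 to 121 in 2 years

print(round(gm, 6))      # 4.0
print(round(growth, 4))  # 0.1
```

A CAGR of 0.1 means 10% per year: 100 grows to 110 after one year and to 121 after two.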
38.8 Harmonic Mean (HM)
The HM is the reciprocal of the AM of reciprocals:
\[HM = \frac{N}{\sum (1/X_i)}\]
Use cases: averaging rates (speeds for equal distances), prices for equal expenditures.
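The equal-distance speed case shows why the HM is the appropriate average here. This sketch uses a hypothetical `harmonic_mean` helper and an assumed leg length of 120 km to cross-check the result against total distance divided by total time.

```python
import math


def harmonic_mean(values):
    """HM = N / sum(1 / X_i); requires non-zero observations."""
    return len(values) / sum(1 / x for x in values)


speeds = [40, 60]  # kmph over two equal distances
hm_speed = harmonic_mean(speeds)

# Cross-check: assume each leg is 120 km
leg = 120
total_time = sum(leg / s for s in speeds)  # 3 h + 2 h
true_average = (leg * len(speeds)) / total_time

print(round(hm_speed, 2), round(true_average, 2))  # 48.0 48.0
```

The AM of the two speeds would give 50 kmph, which overstates the true average because more time is spent travelling at the slower speed.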
38.9 Relationship — AM ≥ GM ≥ HM
For any set of positive observations: \[AM \geq GM \geq HM\]
with equality only when all observations are equal.
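Also, for exactly two positive observations: \(GM^2 = AM \times HM\). The identity does not hold in general for three or more observations.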
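The inequality and the product identity can be checked with the standard-library `statistics` module (Python 3.8+). The data sets are illustrative; the identity GM² = AM × HM is exact for two observations and generally fails for three or more.

```python
import math
import statistics


def three_means(values):
    """Return (AM, GM, HM) for a list of positive observations."""
    return (
        statistics.mean(values),
        statistics.geometric_mean(values),
        statistics.harmonic_mean(values),
    )


am2, gm2, hm2 = three_means([4, 9])     # two observations
am3, gm3, hm3 = three_means([1, 2, 6])  # three observations

print(am2 >= gm2 >= hm2, am3 >= gm3 >= hm3)  # True True
print(math.isclose(gm2 ** 2, am2 * hm2))     # True  (two values)
print(math.isclose(gm3 ** 2, am3 * hm3))     # False (three values)
```

For 4 and 9, AM = 6.5, GM = 6, and HM = 72/13, so AM × HM = 36 = GM².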
38.10 Choice of Measure
| Situation | Best measure |
|---|---|
| Symmetric, no outliers, further computation needed | AM |
| Skewed distribution or outliers | Median |
| Categorical or modal-value question | Mode |
| Averaging growth rates / ratios | GM |
| Averaging rates over equal distances | HM |
| Open-end class intervals | Median or Mode |
| Quick eyeball estimate | Mode |
PYQ trap: Empirical relation is Mode = 3 Median − 2 Mean (Karl Pearson). Don’t reverse the multipliers.
38.11 Practice Questions
The arithmetic mean of 5, 7, 9, 11, 13 is:
The median of 3, 5, 7, 8, 11, 15, 18 is:
The mode of 2, 3, 3, 5, 5, 5, 6, 8 is:
Karl Pearson's empirical relation is:
The sum of deviations of observations from the AM is:
Which measure is **least affected** by extreme values?
Compound annual growth rate of an asset growing from ₹100 to ₹121 in 2 years is:
A traveller covers equal distances at 40 kmph and 60 kmph. The average speed is best measured by:
For any set of positive observations:
The relationship between GM, AM, and HM is:
Which is a **positional** measure of central tendency?
For a data set with two values of equal highest frequency, the distribution is:
Q₂ (the second quartile) is the same as:
For a perfectly symmetric distribution:
For a *positively* (right) skewed distribution:
For averaging *index numbers* and *growth rates*, the best measure is:
Two groups have AMs 30 and 50 with 40 and 60 items. Combined AM:
For open-end class intervals, the most appropriate measure is:
Best measure for **qualitative / categorical** data (e.g., favourite colour):
Match each measure with its best use-case:
| | Measure | | Use-case |
|---|---|---|---|
| (i) | AM | (a) | Averaging growth rates / index numbers |
| (ii) | GM | (b) | Skewed distribution; outliers present |
| (iii) | Median | (c) | Symmetric distribution; further computation |
| (iv) | HM | (d) | Speeds over equal distances |
38.12 Quick Recall
- Central tendency = single representative value (Croxton & Cowden).
- Mathematical: AM, GM, HM; Positional: Median, Mode (+ quartiles, deciles, percentiles).
- AM = Σ X/N; properties — Σ(X − X̄) = 0; minimises Σ(X − X̄)²; sensitive to outliers.
- Median — middle value; positional; not affected by outliers; usable for open-end / ordinal data.
- Mode — most frequent; suitable for categorical/nominal.
- Karl Pearson’s empirical relation: Mode = 3 Median − 2 Mean (moderately skewed).
- GM — for ratios, growth rates, CAGR; log form for computation.
- HM — for speeds / rates over equal distance/expenditure.
- Inequality: AM ≥ GM ≥ HM, equality when all equal. For two observations, GM² = AM × HM.
- Skew: Right (positive) → Mean > Median > Mode; Left → reverse; Symmetric → all equal.