Figure: Symmetrical (Mean = Median = Mode); positively skewed (Mean > Median > Mode, tail on the right); negatively skewed (Mean < Median < Mode, tail on the left).
39 Measures of Skewness
39.1 What is Skewness?
Skewness is the degree of asymmetry of a frequency distribution about its central value. The mean tells us the centre; dispersion tells us the spread; skewness tells us whether the distribution leans to one side. Kurtosis, treated together with skewness in this topic, captures the peakedness of the distribution (gupta2021?; elhance2020?).
Three working ideas:
- A symmetrical distribution has the same shape on both sides of the central value.
- A skewed distribution has a longer tail on one side.
- The direction of skew is named after the direction of the longer tail.
39.2 Symmetric vs Skewed Distributions
| Case | Tail | Position of Mean, Median, Mode |
|---|---|---|
| Symmetrical | Equal on both sides | Mean = Median = Mode |
| Positively skewed (right-skewed) | Longer tail on the right (high values) | Mean > Median > Mode |
| Negatively skewed (left-skewed) | Longer tail on the left (low values) | Mean < Median < Mode |
A useful mnemonic: the mean is pulled toward the longer tail. In a positively skewed distribution, the few large values pull the mean up above the median and mode; in a negatively skewed distribution, the few small values pull the mean down below the median and mode.
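The pull of the mean toward the longer tail can be checked numerically. The sketch below uses Python's standard `statistics` module on a small made-up sample (the values are illustrative and not taken from the text), then mirrors the sample to obtain the negatively skewed case.

```python
from statistics import mean, median, mode

# Hypothetical sample: most values are small, two large values form a right tail.
right_skewed = [2, 3, 3, 3, 4, 4, 5, 6, 9, 15]

# Mirroring every value about 8 (x -> 16 - x) moves the long tail to the left.
left_skewed = [16 - x for x in right_skewed]


def centre_measures(data):
    """Return (mean, median, mode) of a sample."""
    return mean(data), median(data), mode(data)


r_mean, r_median, r_mode = centre_measures(right_skewed)
l_mean, l_median, l_mode = centre_measures(left_skewed)

print(r_mean, r_median, r_mode)  # mean > median > mode
print(l_mean, l_median, l_mode)  # mean < median < mode
```

In the first sample the two large values lift the mean above the median and mode; in the mirrored sample the ordering reverses.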
39.3 Tests of Skewness
A distribution is skewed if any of the following hold:
- The mean, median and mode do not coincide.
- The two halves on either side of the median are unequal in shape.
- The quartiles are not equidistant from the median: \(Q_3 - M_d \neq M_d - Q_1\).
- The frequency curve drawn from the data is not symmetrical about the central vertical.
39.4 Measures of Skewness
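Skewness is measured in absolute form (a raw difference such as \(\bar X - M_o\), expressed in the units of the data) or relative form (a coefficient). The relative form is preferred because it is unit-free, so distributions measured in different units can be compared, and most coefficients lie in a known interval.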
| Coefficient | Formula | Range |
|---|---|---|
| Karl Pearson’s first | \(Sk_p = \dfrac{\bar X - M_o}{\sigma}\) | \(-1\) to \(+1\) generally; theoretically \(-3\) to \(+3\) |
| Karl Pearson’s second (when mode is ill-defined) | \(Sk_p = \dfrac{3(\bar X - M_d)}{\sigma}\) | \(-3\) to \(+3\) |
| Bowley’s (quartile-based) | \(Sk_b = \dfrac{Q_3 + Q_1 - 2 M_d}{Q_3 - Q_1}\) | \(-1\) to \(+1\) |
| Kelly’s (percentile-based) | \(Sk_k = \dfrac{P_{90} + P_{10} - 2 M_d}{P_{90} - P_{10}}\) | \(-1\) to \(+1\) |
A fifth approach uses the moments:
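\[ \beta_1 = \dfrac{\mu_3^2}{\mu_2^3}, \qquad \gamma_1 = \dfrac{\mu_3}{\mu_2^{3/2}} = \dfrac{\mu_3}{\sigma^3} \]
where \(\mu_2 = \sigma^2\) and \(\mu_3\) are the second and third central moments. Because \(\beta_1\) depends on \(\mu_3^2\), it is never negative and cannot show the direction of skew. \(\gamma_1\) has magnitude \(\sqrt{\beta_1}\) but takes the sign of \(\mu_3\), and is the standard moment-based coefficient of skewness.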
39.5 Karl Pearson’s Coefficient of Skewness
The first form uses mode; the second uses median when mode is hard to identify. In a moderately skewed distribution, the empirical relation \(\text{Mode} \approx 3 M_d - 2 \bar X\) is used to derive the second form from the first.
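The substitution can be written out in one line. Starting from the first form and replacing the mode by the empirical relation:

\[ Sk_p = \dfrac{\bar X - M_o}{\sigma} \approx \dfrac{\bar X - (3 M_d - 2 \bar X)}{\sigma} = \dfrac{3 \bar X - 3 M_d}{\sigma} = \dfrac{3(\bar X - M_d)}{\sigma} \]

The two forms therefore agree only to the extent that the empirical relation holds, which is why the second form is stated for moderately skewed distributions.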
39.5.1 Worked example
A distribution has mean 50, median 45 and standard deviation 10. Karl Pearson’s second coefficient:
\[ Sk_p = \dfrac{3(50 - 45)}{10} = \dfrac{15}{10} = 1.5 \]
Since the value is positive, the distribution is positively skewed; the mean exceeds the median.
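The arithmetic above can be reproduced with a short Python sketch. The function names are illustrative, and the mode of 35 is not given in the example; it is the value implied by the empirical relation \(\text{Mode} \approx 3 M_d - 2 \bar X\), used here only to show that both forms then give the same coefficient.

```python
def pearson_first(mean, mode, sd):
    """Karl Pearson's first coefficient: (mean - mode) / sd."""
    return (mean - mode) / sd


def pearson_second(mean, median, sd):
    """Karl Pearson's second coefficient: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd


mean_, median_, sd_ = 50, 45, 10

sk_second = pearson_second(mean_, median_, sd_)

# Assumption: mode taken from the empirical relation, 3 * 45 - 2 * 50 = 35.
implied_mode = 3 * median_ - 2 * mean_
sk_first = pearson_first(mean_, implied_mode, sd_)

print(sk_second)  # 1.5
print(sk_first)   # 1.5
```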
39.6 Bowley’s Coefficient of Skewness
Bowley’s measure is based on quartiles only and does not require mean, mode or standard deviation. It is therefore suitable for open-ended distributions where the extremes are not clearly defined.
The numerator compares the distance from the median to \(Q_3\) against the distance from the median to \(Q_1\):
- If \((Q_3 - M_d) = (M_d - Q_1)\) → numerator zero → symmetric.
- If \((Q_3 - M_d) > (M_d - Q_1)\) → positive skew.
- If \((Q_3 - M_d) < (M_d - Q_1)\) → negative skew.
The denominator normalises by the inter-quartile range, yielding a coefficient that always lies between \(-1\) and \(+1\).
39.6.1 Worked example
If \(Q_1 = 40\), \(M_d = 50\), \(Q_3 = 80\):
\[ Sk_b = \dfrac{80 + 40 - 2 \times 50}{80 - 40} = \dfrac{20}{40} = 0.5 \]
Positive skew of moderate magnitude.
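A minimal Python sketch of the same calculation follows. The function name `bowley_skewness` is illustrative; the two extra calls show the bounds, which are reached when the median coincides with one of the quartiles.

```python
def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2 * Md) / (Q3 - Q1)."""
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("Q3 and Q1 coincide; the coefficient is undefined")
    return (q3 + q1 - 2 * median) / iqr


sk_b = bowley_skewness(40, 50, 80)
print(sk_b)  # 0.5

# The median always lies between Q1 and Q3, so the extremes are +1 and -1.
upper = bowley_skewness(40, 40, 80)  # median at Q1
lower = bowley_skewness(40, 80, 80)  # median at Q3
print(upper, lower)  # 1.0 -1.0
```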
39.7 Kelly’s Coefficient
Kelly extended Bowley’s idea to wider percentiles — using \(P_{10}\) and \(P_{90}\) — to capture asymmetry in the tails rather than only in the central 50 per cent. The interpretation is identical: positive value → positive skew; negative → negative skew; zero → symmetric.
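Kelly's formula from the table of coefficients can be sketched in Python as below. The helper `kelly_from_data` is hypothetical and relies on `statistics.quantiles` with its default interpolation rule; other percentile conventions may give slightly different values on small samples. Both samples are made up for illustration.

```python
from statistics import quantiles


def kelly_skewness(p10, median, p90):
    """Kelly's coefficient: (P90 + P10 - 2 * Md) / (P90 - P10)."""
    spread = p90 - p10
    if spread == 0:
        raise ValueError("P90 and P10 coincide; the coefficient is undefined")
    return (p90 + p10 - 2 * median) / spread


def kelly_from_data(data):
    """Estimate P10, P50 and P90 from raw data, then apply Kelly's formula."""
    # quantiles(..., n=10) returns the nine cut points P10, P20, ..., P90.
    deciles = quantiles(data, n=10)
    return kelly_skewness(deciles[0], deciles[4], deciles[8])


symmetric = list(range(1, 22))                   # 1, 2, ..., 21
right_skewed = [2, 3, 3, 3, 4, 4, 5, 6, 9, 15]   # long right tail

print(kelly_from_data(symmetric))     # approximately zero
print(kelly_from_data(right_skewed))  # positive
```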
39.8 Moment-Based Coefficients
The \(\gamma_1\) coefficient is the standard in modern statistics. For a normal distribution, and for any other symmetric distribution, \(\gamma_1 = 0\); positive skew gives \(\gamma_1 > 0\); negative skew gives \(\gamma_1 < 0\). Many statistical software packages report \(\gamma_1\) along with \(\gamma_2\) (excess kurtosis).
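As a sketch of how \(\gamma_1\) follows from the central moments, the Python below computes \(\mu_2\) and \(\mu_3\) by dividing by \(n\) (the population form). This is an assumption about convention: software packages often apply small-sample corrections, so their output can differ slightly. The function names and the data are illustrative.

```python
def central_moment(data, k):
    """k-th central moment, dividing by n (population form)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def gamma1(data):
    """Moment coefficient of skewness: mu_3 / mu_2 ** 1.5."""
    mu2 = central_moment(data, 2)
    mu3 = central_moment(data, 3)
    if mu2 == 0:
        raise ValueError("all values are equal; skewness is undefined")
    return mu3 / mu2 ** 1.5


def beta1(data):
    """beta_1 = mu_3 ** 2 / mu_2 ** 3; always non-negative."""
    return central_moment(data, 3) ** 2 / central_moment(data, 2) ** 3


symmetric = [1, 2, 3, 4, 5]
right_skewed = [2, 3, 3, 3, 4, 4, 5, 6, 9, 15]
left_skewed = [16 - x for x in right_skewed]  # mirror image of the sample above

print(gamma1(symmetric))     # 0.0
print(gamma1(right_skewed))  # positive
print(gamma1(left_skewed))   # negative, same magnitude
```

Mirroring the data flips the sign of \(\gamma_1\) but leaves \(\beta_1\) unchanged, which is why \(\gamma_1\) is the coefficient used to report direction.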
39.9 Kurtosis
Kurtosis is traditionally described as the peakedness of a distribution; more precisely, it measures how much of the variability comes from rare, extreme deviations, that is, how heavy the tails are. It complements skewness: skewness is about asymmetry; kurtosis is about tail heaviness and peakedness.
| Case | Description | \(\beta_2\) | \(\gamma_2\) |
|---|---|---|---|
| Mesokurtic | Same as normal distribution | \(\beta_2 = 3\) | \(\gamma_2 = 0\) |
| Leptokurtic | More peaked, heavier tails than normal | \(\beta_2 > 3\) | \(\gamma_2 > 0\) |
| Platykurtic | Flatter, lighter tails than normal | \(\beta_2 < 3\) | \(\gamma_2 < 0\) |
The moment formulas are:
\[ \beta_2 = \dfrac{\mu_4}{\mu_2^2}, \qquad \gamma_2 = \beta_2 - 3 \]
The convention \(\gamma_2 = \beta_2 - 3\) centres the normal benchmark at zero. Software output often reports excess kurtosis (i.e. \(\gamma_2\)) rather than \(\beta_2\), so it is worth checking which of the two a package returns.
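The two formulas and the three-way classification can be sketched in Python as follows. Moments are computed by dividing by \(n\), the samples are made up, and the tolerance used to call a sample mesokurtic is an arbitrary illustrative choice, not a standard threshold.

```python
def central_moment(data, k):
    """k-th central moment, dividing by n (population form)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def beta2(data):
    """Kurtosis: mu_4 / mu_2 ** 2."""
    mu2 = central_moment(data, 2)
    if mu2 == 0:
        raise ValueError("all values are equal; kurtosis is undefined")
    return central_moment(data, 4) / mu2 ** 2


def gamma2(data):
    """Excess kurtosis: beta_2 - 3."""
    return beta2(data) - 3


def classify(data, tol=0.1):
    """Label a sample by the sign of gamma_2 (tol is an assumed cut-off)."""
    g2 = gamma2(data)
    if g2 > tol:
        return "leptokurtic"
    if g2 < -tol:
        return "platykurtic"
    return "mesokurtic"


flat = [1, 2, 3, 4, 5]                         # evenly spread, no tails
heavy_tails = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]  # mass at centre, two outliers

print(beta2(flat), classify(flat))                # beta_2 = 1.7
print(beta2(heavy_tails), classify(heavy_tails))  # beta_2 = 5.0
```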
Figure: Leptokurtic (β₂ > 3, peaked, heavy tails); mesokurtic (β₂ = 3, normal benchmark); platykurtic (β₂ < 3, flat, light tails).
39.10 Why Skewness and Kurtosis Matter
| Use | Working content |
|---|---|
| Test the appropriateness of mean | If skewed, the median is more representative |
| Test the normality assumption | Many parametric tests assume normality (\(\gamma_1 = 0\), \(\gamma_2 = 0\)) |
| Identify risk in finance | Heavy tails (high \(\gamma_2\)) flag fat-tail risk; negative skew flags downside risk |
| Compare shapes of distributions | Beyond mean and SD |
| Diagnose quality control | Process drift produces skew or excess kurtosis |
39.11 Comparison — Skewness Coefficients
| Measure | Inputs needed | Best for | Range |
|---|---|---|---|
| Karl Pearson 1st | Mean, Mode, SD | Unimodal data with clear mode | \(-3\) to \(+3\) |
| Karl Pearson 2nd | Mean, Median, SD | When mode is ill-defined | \(-3\) to \(+3\) |
| Bowley | \(Q_1, Q_2, Q_3\) | Open-ended classes | \(-1\) to \(+1\) |
| Kelly | \(P_{10}, P_{50}, P_{90}\) | When tails matter | \(-1\) to \(+1\) |
| Moment (\(\gamma_1\)) | \(\mu_2, \mu_3\) (second and third central moments) | Software computation | Unbounded |
39.12 Exam-Pattern MCQs
| | Measure | | Inputs |
|---|---|---|---|
| (i) | Karl Pearson 1st | (a) | \(Q_1, Q_2, Q_3\) |
| (ii) | Karl Pearson 2nd | (b) | Mean, Median, SD |
| (iii) | Bowley | (c) | Mean, Mode, SD |
| (iv) | Kelly | (d) | \(P_{10}, P_{50}, P_{90}\) |

| | Case | | Description |
|---|---|---|---|
| (i) | Mesokurtic | (a) | More peaked, heavier tails than normal |
| (ii) | Leptokurtic | (b) | Same as normal: \(\beta_2 = 3\) |
| (iii) | Platykurtic | (c) | Flatter, lighter tails than normal |

| | Statistic | | Captures |
|---|---|---|---|
| (i) | Mean | (a) | Asymmetry |
| (ii) | Standard deviation | (b) | Centre |
| (iii) | Skewness | (c) | Peakedness and tail-heaviness |
| (iv) | Kurtosis | (d) | Spread |
- Skewness = degree of asymmetry about the central value. Kurtosis = peakedness / tail-heaviness.
- Symmetric: Mean = Median = Mode. Positive skew (right-skewed): Mean > Median > Mode. Negative skew (left-skewed): Mean < Median < Mode.
- The mean is pulled toward the longer tail.
- Karl Pearson 1st: \(Sk_p = (\bar X - M_o) / \sigma\). Karl Pearson 2nd: \(Sk_p = 3(\bar X - M_d)/\sigma\). Range: \(\pm 3\).
- Bowley’s: \(Sk_b = (Q_3 + Q_1 - 2 M_d)/(Q_3 - Q_1)\). Range: \(\pm 1\). Suitable for open-ended classes.
- Kelly’s: uses \(P_{10}, P_{50}, P_{90}\).
- Moment: \(\gamma_1 = \mu_3 / \sigma^3\).
- Kurtosis: mesokurtic (\(\beta_2 = 3\), normal), leptokurtic (\(\beta_2 > 3\), heavy tails), platykurtic (\(\beta_2 < 3\), light tails). Excess kurtosis \(\gamma_2 = \beta_2 - 3\).
- Mean → centre; SD → spread; Skewness → asymmetry; Kurtosis → tails / peakedness.