46 Hypothesis Testing
46.1 What is a Hypothesis Test?
A hypothesis test is a formal statistical procedure for using sample data to decide between two competing claims about a population parameter (gupta2021?; kothari2019?). The framework was put on its modern footing by Jerzy Neyman and Egon Pearson (1933) and Ronald A. Fisher.
46.2 The Two Hypotheses
| Hypothesis | Symbol | What it states |
|---|---|---|
| Null hypothesis | \(H_0\) | The status quo; no effect, no difference |
| Alternative hypothesis | \(H_1\) or \(H_a\) | What the researcher hopes to establish |
The hypotheses must be mutually exclusive and exhaustive. The test looks for evidence to reject \(H_0\) in favour of \(H_1\); failing to reject \(H_0\) means the evidence was insufficient, not that \(H_0\) has been proved.
46.3 Type I and Type II Errors
A test never gives certainty; two kinds of error are possible:
| Decision \ Truth | \(H_0\) true | \(H_0\) false |
|---|---|---|
| Reject \(H_0\) | Type I error (\(\alpha\)) | Correct (\(1 - \beta\), power) |
| Do not reject \(H_0\) | Correct (\(1 - \alpha\)) | Type II error (\(\beta\)) |
- \(\alpha\): significance level; the probability of rejecting a true \(H_0\). Conventional values: 0.05, 0.01, 0.10.
- \(\beta\): the probability of failing to reject a false \(H_0\).
- Power = \(1 - \beta\): the probability of correctly rejecting a false \(H_0\). Other things being equal, a larger sample size raises power.
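The link between sample size and power can be made concrete with a small calculation. The sketch below assumes a two-tailed one-sample Z-test with known \(\sigma\); the function name `z_test_power` and the numbers (claimed mean 1000, true mean 990, \(\sigma = 50\)) are illustrative assumptions, not values prescribed by the theory.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a two-tailed one-sample Z-test when the true mean is mu1."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Under the true mean mu1 the Z statistic is N(shift, 1) rather than N(0, 1).
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # P(reject H0) = P(Z < -z_crit) + P(Z > z_crit) under that shifted distribution.
    return STD_NORMAL.cdf(-z_crit - shift) + (1 - STD_NORMAL.cdf(z_crit - shift))

# Hypothetical scenario: H0 says mu = 1000, but the true mean is 990.
for n in (25, 100, 400):
    power = z_test_power(mu0=1000, mu1=990, sigma=50, n=n)
    print(f"n={n:3d}  power={power:.3f}  beta={1 - power:.3f}")
```

When `mu1` equals `mu0` the function returns \(\alpha\) itself, which is the Type I error rate: the chance of rejecting a true \(H_0\).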
46.4 One-Tailed vs Two-Tailed Tests
| Test | Form of \(H_1\) | Critical region |
|---|---|---|
| Two-tailed | \(\mu \neq \mu_0\) | Both tails (\(\alpha/2\) in each tail) |
| Right-tailed | \(\mu > \mu_0\) | Right tail only |
| Left-tailed | \(\mu < \mu_0\) | Left tail only |
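The critical regions in the table can be computed rather than looked up. A minimal sketch for a Z-test, using the standard normal quantile function from Python's standard library; the helper name `z_critical_region` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical_region(alpha, tail):
    """Return (lower, upper) cut-offs; reject H0 if z < lower or z > upper."""
    z = NormalDist()  # standard normal
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-c, c)
    if tail == "right":
        return (float("-inf"), z.inv_cdf(1 - alpha))  # all of alpha in the right tail
    if tail == "left":
        return (z.inv_cdf(alpha), float("inf"))  # all of alpha in the left tail
    raise ValueError("tail must be 'two', 'right' or 'left'")

for tail in ("two", "right", "left"):
    lower, upper = z_critical_region(0.05, tail)
    print(f"{tail:>5}-tailed: reject H0 if z < {lower:.3f} or z > {upper:.3f}")
```

At the same \(\alpha\), a one-tailed cut-off is less extreme than the two-tailed one because the whole of \(\alpha\) sits in a single tail.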
46.5 Procedure of Hypothesis Testing
| Step | Action |
|---|---|
| 1 | Formulate \(H_0\) and \(H_1\) |
| 2 | Choose the level of significance \(\alpha\) |
| 3 | Identify the appropriate test statistic and its sampling distribution |
| 4 | Compute the test statistic from the sample |
| 5 | Compare with the critical value (or compute p-value) |
| 6 | Take a decision and state the conclusion |
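The six steps can be traced in code. The sketch below assumes a right-tailed Z-test for a proportion (statistic as listed in Section 46.7) and uses hypothetical data: 116 of 200 respondents favour a product, tested against \(P_0 = 0.5\). The function name and return format are illustrative choices only.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0, alpha=0.05):
    """Right-tailed Z-test of H0: P = p0 against H1: P > p0."""
    # Step 1: hypotheses are fixed by the docstring (H0: P = p0, H1: P > p0).
    # Step 2: the significance level alpha is supplied by the caller.
    # Step 3: the statistic is Z, approximately standard normal under H0 for large n.
    std_normal = NormalDist()
    # Step 4: compute the test statistic from the sample.
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # Step 5: compare with the critical value and compute the p-value.
    z_crit = std_normal.inv_cdf(1 - alpha)
    p_value = 1 - std_normal.cdf(z)
    # Step 6: take the decision.
    return {"z": z, "z_crit": z_crit, "p_value": p_value, "reject_H0": z > z_crit}

# Hypothetical sample: 116 successes out of 200.
result = proportion_z_test(successes=116, n=200, p0=0.5)
print(result)
```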
46.6 Critical Value vs p-Value Approach
Two equivalent decision rules:
- Critical-value approach — reject \(H_0\) if the test statistic falls in the critical region.
- p-value approach — reject \(H_0\) if \(p < \alpha\).
The p-value is the probability of observing a test statistic as extreme as, or more extreme than, the one observed, assuming \(H_0\) is true.
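To see that the two rules agree, the p-value of a Z statistic can be computed directly and compared with the critical-value decision. This is a minimal sketch for Z-tests only; the function name `z_p_value` and the sample \(z\) values are arbitrary illustrations.

```python
from statistics import NormalDist

def z_p_value(z, tail="two"):
    """p-value of an observed Z statistic, assuming H0 is true."""
    cdf = NormalDist().cdf  # standard normal CDF
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))  # both tails, at least as extreme as |z|
    if tail == "right":
        return 1 - cdf(z)
    if tail == "left":
        return cdf(z)
    raise ValueError("tail must be 'two', 'right' or 'left'")

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
for z in (-2.5, -1.0, 0.5, 1.97):
    by_p_value = z_p_value(z) < alpha
    by_critical_value = abs(z) > z_crit
    print(f"z={z:5.2f}  p={z_p_value(z):.4f}  "
          f"reject by p-value: {by_p_value}  by critical value: {by_critical_value}")
```

Both columns give the same decision for every \(z\), because \(p < \alpha\) holds exactly when the statistic lies beyond the critical value.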
46.7 Major Tests — A Compact Map
| Situation | Test | Statistic |
|---|---|---|
| Test of mean, \(\sigma\) known, large \(n\) | Z-test | \(z = \dfrac{\bar X - \mu_0}{\sigma/\sqrt n}\) |
| Test of mean, \(\sigma\) unknown, small \(n\) | t-test | \(t = \dfrac{\bar X - \mu_0}{s/\sqrt n}\) |
| Test of two means | Independent / paired t-test | \(t = \dfrac{\bar X_1 - \bar X_2}{\text{SE}}\) |
| Test of proportion | Z-test | \(z = \dfrac{p - P_0}{\sqrt{P_0(1-P_0)/n}}\) |
| Test of a single variance | \(\chi^2\) test | \(\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2}\), df = \(n-1\) |
| Goodness of fit | \(\chi^2\) test | \(\chi^2 = \sum \dfrac{(O - E)^2}{E}\) |
| Equality of two variances, or ANOVA | F-test | \(F = \dfrac{s_1^2}{s_2^2}\) or MSB / MSW |
| Independence in contingency table | \(\chi^2\) test | \(\chi^2 = \sum \dfrac{(O - E)^2}{E}\), df = \((r-1)(c-1)\) |
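Two of the statistics in the map are sketched below from first principles: the one-sample \(t\) and the \(\chi^2\) statistic for independence, where each expected count is (row total × column total) / grand total. The data and function names are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for H0: mu = mu0 (sigma unknown)."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))  # stdev uses n - 1
    return t, n - 1

def chi_square_independence(table):
    """Chi-square statistic and df for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical data for illustration only.
t, t_df = one_sample_t([12, 15, 11, 14, 13], mu0=12)
chi2, chi2_df = chi_square_independence([[30, 20], [20, 30]])
print(f"t = {t:.3f} with {t_df} df")
print(f"chi-square = {chi2:.3f} with {chi2_df} df")
```

Each statistic is then compared with the tabulated critical value for its degrees of freedom: at \(\alpha = 0.05\), 2.776 for a two-tailed \(t\) with 4 df and 3.841 for \(\chi^2\) with 1 df.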
46.8 Parametric vs Non-Parametric Tests
| Family | Assumes population distribution? | Examples |
|---|---|---|
| Parametric | Yes (usually normal) | t, Z, F, ANOVA, Pearson correlation |
| Non-parametric | No (distribution-free) | Chi-square, Sign test, Wilcoxon, Mann-Whitney, Kruskal-Wallis, Spearman rank, Runs test |
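As an illustration of what "distribution-free" means, the sign test uses only the signs of paired differences. Under \(H_0\) (median difference zero) the number of positive signs follows a Binomial(\(n\), 0.5) distribution, whatever the shape of the population. The sketch below is a two-sided exact version; the differences are hypothetical and the function name is illustrative.

```python
from math import comb

def sign_test(differences):
    """Two-sided exact sign test for H0: median difference = 0."""
    nonzero = [d for d in differences if d != 0]  # ties (zeros) are dropped
    n = len(nonzero)
    k = sum(1 for d in nonzero if d > 0)  # number of positive signs

    def binom_cdf(m):
        # P(X <= m) for X ~ Binomial(n, 0.5)
        return sum(comb(n, i) for i in range(m + 1)) / 2 ** n

    lower_tail = binom_cdf(k)          # P(X <= k)
    upper_tail = 1 - binom_cdf(k - 1)  # P(X >= k)
    p_value = min(1.0, 2 * min(lower_tail, upper_tail))
    return k, n, p_value

# Hypothetical paired differences (after minus before).
k, n, p = sign_test([3, 5, -1, 4, 2, 0, 6, 1, 2, -2, 3, 4])
print(f"{k} positive signs out of {n}; two-sided p = {p:.4f}")
```

No mean, variance, or normality assumption enters the calculation, only counts of signs.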
46.9 ANOVA — Analysis of Variance
R.A. Fisher’s ANOVA tests whether the means of three or more groups are equal. One-way ANOVA compares groups by a single factor; two-way ANOVA by two factors.
The F-statistic:
\[ F = \dfrac{\text{Mean Square Between (MSB)}}{\text{Mean Square Within (MSW)}} \]
A large \(F\) indicates that variation between groups dominates variation within groups, leading to rejection of the null hypothesis that all group means are equal.
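The ratio can be computed by hand for a one-way layout. A minimal sketch, assuming the usual definitions MSB = SSB / (\(k - 1\)) and MSW = SSW / (\(N - k\)) for \(k\) groups and \(N\) observations in total; the three groups are hypothetical data.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic and (df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means around the grand mean.
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    return msb / msw, (k - 1, n_total - k)

# Hypothetical scores for three groups.
f_stat, df = one_way_anova_f([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
print(f"F = {f_stat:.2f} with df = {df}")
```

Here SSB = 24 and SSW = 6, so MSB = 12, MSW = 1 and \(F = 12\), which exceeds the tabulated 5% critical value of 5.14 for (2, 6) degrees of freedom.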
46.10 Worked Example — Z-test for Mean
A factory claims its bulbs last 1,000 hours on average (\(\sigma = 50\)). A sample of \(n = 100\) bulbs averages 990 hours. Test at \(\alpha = 0.05\).
- \(H_0: \mu = 1000\); \(H_1: \mu \neq 1000\) (two-tailed).
- \(z = (990 - 1000) / (50 / \sqrt{100}) = -10 / 5 = -2\).
- Critical value at \(\alpha = 0.05\) (two-tailed) = \(\pm 1.96\).
- \(|z| = 2 > 1.96\) → reject \(H_0\) at the 5% level; the sample does not support the claimed mean of 1,000 hours.
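The same decision follows from the p-value approach. As a quick check, writing \(\Phi\) for the standard normal CDF (notation assumed here for convenience):

\[ p = 2\,\Phi(-|z|) = 2\,\Phi(-2) \approx 2 \times 0.02275 = 0.0455 \]

Since \(p \approx 0.0455 < \alpha = 0.05\), \(H_0\) is rejected, though only narrowly; at \(\alpha = 0.01\) the same data would not lead to rejection.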
46.11 Exam-Pattern MCQs
| | Test | | Situation |
|---|---|---|---|
| (i) | Z-test | (a) | Goodness-of-fit |
| (ii) | t-test | (b) | Mean, \(\sigma\) unknown, small \(n\) |
| (iii) | \(\chi^2\) test | (c) | Mean, \(\sigma\) known, large \(n\) |
| (iv) | F-test | (d) | Compare three or more group means |
| | Family | | Assumption |
|---|---|---|---|
| (i) | Parametric | (a) | No specific distributional assumption |
| (ii) | Non-parametric | (b) | Population distribution (usually normal) |
| | Non-parametric | | Analogue |
|---|---|---|---|
| (i) | Wilcoxon signed-rank | (a) | Pearson correlation |
| (ii) | Mann-Whitney | (b) | Paired t-test |
| (iii) | Kruskal-Wallis | (c) | Independent t-test |
| (iv) | Spearman rank | (d) | One-way ANOVA |
- Hypothesis test — compares \(H_0\) (status quo) and \(H_1\) (researcher’s claim) using sample data.
- Type I error (reject true \(H_0\), \(\alpha\)); Type II error (fail to reject false \(H_0\), \(\beta\)). Power = 1 − \(\beta\).
- 6 steps: Formulate → Choose \(\alpha\) → Identify test → Compute → Compare → Decide.
- Tests: Z (mean, \(\sigma\) known, large \(n\)); t (mean, \(\sigma\) unknown, small \(n\)); \(\chi^2\) (goodness-of-fit, independence); F / ANOVA (variances, three+ means); Z for proportion.
- Two-tailed critical Z by confidence level: 90% (\(\alpha = 0.10\)) = ±1.645; 95% (\(\alpha = 0.05\)) = ±1.96; 99% (\(\alpha = 0.01\)) = ±2.58.
- \(\chi^2\): \(\sum (O - E)^2 / E\), df = \((r-1)(c-1)\) for contingency.
- ANOVA F = MSB / MSW.
- Parametric assumes distribution (usually normal); non-parametric does not.
- Non-parametric ↔︎ parametric: Wilcoxon ↔︎ paired t; Mann-Whitney ↔︎ independent t; Kruskal-Wallis ↔︎ one-way ANOVA; Spearman ↔︎ Pearson.
- p-value = probability of test statistic at least as extreme, given \(H_0\). Reject if p < \(\alpha\).