42 Probability Distributions
42.1 Random Variables and Distributions
A random variable is a function that assigns a numerical value to each outcome of a random experiment. A probability distribution describes how probability is assigned across the possible values of the random variable: a list of value–probability pairs in the discrete case, or a density in the continuous case (Gupta 2021; Ross 2020).
| Type | Definition | Example | Tool |
|---|---|---|---|
| Discrete | Takes countable values | Number of defective items | Probability Mass Function (PMF) |
| Continuous | Takes values on an interval | Height, time, temperature | Probability Density Function (PDF) |
For a discrete RV: \(\sum P(X = x) = 1\). For a continuous RV: \(\int f(x) dx = 1\).
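Both normalisation conditions are easy to verify numerically. A minimal sketch, using a fair six-sided die for the discrete case and the Uniform(0, 1) density \(f(x) = 1\) for the continuous case (both hypothetical examples; the integral is approximated by a crude Riemann sum):

```python
# Discrete case: P(X = x) for a fair six-sided die must sum to 1.
pmf = {x: 1 / 6 for x in range(1, 7)}
discrete_total = sum(pmf.values())

# Continuous case: f(x) = 1 on [0, 1] (Uniform density) must
# integrate to 1; approximate the integral with a Riemann sum.
steps = 1000
dx = 1 / steps
continuous_total = sum(1.0 * dx for _ in range(steps))

print(round(discrete_total, 6), round(continuous_total, 6))  # both near 1
```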
42.2 Expectation, Variance and Standard Deviation
| Moment | Discrete | Continuous |
|---|---|---|
| Mean / Expectation \(\mu = E(X)\) | \(\sum x P(x)\) | \(\int x f(x) dx\) |
| Variance \(\sigma^2 = E[(X - \mu)^2]\) | \(\sum (x - \mu)^2 P(x)\) | \(\int (x - \mu)^2 f(x) dx\) |
| SD | \(\sigma = \sqrt{\sigma^2}\) | same |
A useful identity: \(\sigma^2 = E(X^2) - [E(X)]^2\).
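The identity can be checked on any small PMF; a sketch with a made-up three-point distribution:

```python
# Hypothetical PMF: P(X=0)=0.2, P(X=1)=0.5, P(X=2)=0.3.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mean = sum(x * p for x, p in pmf.items())                    # E(X)
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())   # E[(X - mu)^2]
var_id = sum(x * x * p for x, p in pmf.items()) - mean ** 2  # E(X^2) - mu^2

print(mean, var_def, var_id)  # the two variance computations agree
```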
42.3 Major Discrete Distributions
42.3.1 Binomial Distribution
The binomial describes the number of successes in \(n\) independent Bernoulli trials, each with success probability \(p\):
\[ P(X = r) = \binom{n}{r} p^r q^{n-r}, \quad q = 1 - p \]
| Property | Value |
|---|---|
| Parameters | \(n\) (number of trials), \(p\) (success probability) |
| Mean | \(np\) |
| Variance | \(npq\) |
| Mode | \(\lfloor (n+1)p \rfloor\); both \((n+1)p\) and \((n+1)p - 1\) when \((n+1)p\) is an integer |
| Skewness | \((q - p) / \sqrt{npq}\) |
| Kurtosis | \(3 + (1 - 6pq)/(npq)\) |
| Symmetric when | \(p = 0.5\) |
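The PMF and moment formulas above can be sketched with the standard library (the parameters \(n = 10\), \(p = 0.3\) are illustrative):

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for a Binomial(n, p) random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.3           # illustrative parameters
mean = n * p             # np
var = n * p * (1 - p)    # npq

# The probabilities over r = 0..n must sum to 1:
total = sum(binom_pmf(r, n, p) for r in range(n + 1))
print(mean, var, round(total, 10))
```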
42.3.2 Poisson Distribution
The Poisson describes the number of rare events occurring in a fixed interval, with mean rate \(\lambda\):
\[ P(X = r) = \dfrac{e^{-\lambda} \lambda^r}{r!}, \quad r = 0, 1, 2, \dots \]
| Property | Value |
|---|---|
| Parameter | \(\lambda\) (mean rate) |
| Mean | \(\lambda\) |
| Variance | \(\lambda\) |
| Mean = Variance | Hallmark of the Poisson |
| Skewness | \(1 / \sqrt{\lambda}\) |
The Poisson is the limiting case of the binomial when \(n \to \infty\) and \(p \to 0\) such that \(np = \lambda\) stays constant. It applies to rare events — accidents per day, calls per minute, defects per unit area.
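The limit can be seen numerically by holding \(np = \lambda\) fixed while \(n\) grows; a sketch with \(\lambda = 2\) and \(r = 0\):

```python
from math import comb, exp, factorial

def binom_pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, lam):
    return exp(-lam) * lam**r / factorial(r)

lam = 2
for n in (10, 100, 1000):
    p = lam / n  # keep np = lam constant as n grows
    print(n, round(binom_pmf(0, n, p), 5))  # approaches the Poisson value
print("Poisson:", round(poisson_pmf(0, lam), 5))  # e^{-2} ≈ 0.13534
```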
42.4 Major Continuous Distributions
42.4.1 Normal Distribution
The normal (or Gaussian) distribution, due to Carl Friedrich Gauss and Pierre-Simon Laplace, has the bell-shaped PDF:
\[ f(x) = \dfrac{1}{\sigma \sqrt{2\pi}} \exp\left(-\dfrac{(x - \mu)^2}{2 \sigma^2}\right) \]
| Property | Value |
|---|---|
| Parameters | \(\mu\) (mean), \(\sigma\) (SD) |
| Mean = Median = Mode | \(\mu\) |
| Symmetry | About the mean |
| Skewness | 0 |
| Kurtosis | 3 (mesokurtic) |
| Empirical rule | 68 % within \(\mu \pm \sigma\); 95 % within \(\mu \pm 2\sigma\); 99.7 % within \(\mu \pm 3\sigma\) |
| Standard normal | \(Z = (X - \mu)/\sigma\) has \(\mu = 0\), \(\sigma = 1\) |
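The standard normal CDF can be written in terms of the error function, which makes the empirical rule easy to verify with only the standard library:

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Phi(z): P(Z <= z) for the standard normal."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Mass within mu +/- k*sigma equals Phi(k) - Phi(-k):
for k in (1, 2, 3):
    mass = std_normal_cdf(k) - std_normal_cdf(-k)
    print(k, round(mass, 4))  # ≈ 0.68, 0.95, 0.997 for k = 1, 2, 3
```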
42.4.2 Other continuous distributions
| Distribution | Use |
|---|---|
| Uniform | Equal probability over an interval |
| Exponential | Time between Poisson events; reliability analysis |
| Gamma | Generalisation of exponential |
| Beta | Bounded random variables; Bayesian priors |
| Chi-square (\(\chi^2\)) | Sum of squared standard normals; goodness of fit |
| Student’s t | Small-sample inference; Gosset 1908 |
| F | Ratio of two chi-squares; ANOVA |
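The exponential entry in the table connects back to the Poisson: if inter-event times are Exponential(\(\lambda\)), the number of events in a unit interval is Poisson(\(\lambda\)). A simulation sketch (the rate \(\lambda = 4\) and the sample size are arbitrary choices):

```python
import random

random.seed(0)
lam = 4.0      # event rate (illustrative)
reps = 20000
counts = []
for _ in range(reps):
    t, n_events = 0.0, 0
    while True:
        t += random.expovariate(lam)  # exponential inter-arrival time
        if t > 1.0:                   # stop once we pass the unit interval
            break
        n_events += 1
    counts.append(n_events)

m_hat = sum(counts) / reps
v_hat = sum((c - m_hat) ** 2 for c in counts) / reps
print(round(m_hat, 2), round(v_hat, 2))  # both near lam = 4 (mean = variance)
```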
42.5 Central Limit Theorem (CLT)
The Central Limit Theorem — perhaps the single most important result in statistics — says that the sum (or mean) of a large number of independent, identically distributed random variables with finite variance, regardless of their underlying distribution, tends to a normal distribution. Formally:
\[ \bar X \approx N\left(\mu, \dfrac{\sigma^2}{n}\right) \quad \text{for large } n \]
The CLT explains why the normal distribution shows up so often in nature and in inferential statistics — sample means, regression residuals, measurement errors all approach normality.
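A quick simulation makes this concrete: means of \(n\) Uniform(0, 1) draws (so \(\mu = 0.5\), \(\sigma^2 = 1/12\)) cluster around \(\mu\) with variance \(\sigma^2/n\). A sketch with arbitrary sample sizes:

```python
import random
import statistics

random.seed(1)
n, reps = 30, 5000
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(reps)]

m_hat = statistics.fmean(means)
v_hat = statistics.variance(means)
print(round(m_hat, 3))   # near mu = 0.5
print(round(v_hat, 5))   # near sigma^2 / n = (1/12)/30 ≈ 0.00278
```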
42.6 Worked Numerical
A factory produces items with 4 % defective rate. In a sample of 50 items:
- Mean number of defectives = \(np = 50 \times 0.04 = 2\).
- SD = \(\sqrt{npq} = \sqrt{50 \times 0.04 \times 0.96} = \sqrt{1.92} \approx 1.386\).
- Probability of exactly 0 defective = \(\binom{50}{0} (0.04)^0 (0.96)^{50} = 0.96^{50} \approx 0.130\).
- Approximate by Poisson with \(\lambda = 2\): \(P(X=0) = e^{-2} \approx 0.135\) — close.
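The worked numbers above can be reproduced in a few lines:

```python
from math import comb, exp

n, p = 50, 0.04
mean = n * p                                # np = 2
sd = (n * p * (1 - p)) ** 0.5               # sqrt(npq) = sqrt(1.92)
p0_binom = comb(n, 0) * p**0 * (1 - p)**n   # 0.96**50
p0_poisson = exp(-n * p)                    # Poisson approximation, e^{-2}

print(mean, round(sd, 3), round(p0_binom, 3), round(p0_poisson, 3))
```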
42.7 Exam-Pattern MCQs
Q1. Which of the following is not a property of the normal distribution?
A. Mean = Median = Mode B. Symmetric about the mean C. Skewness = 0 D. Kurtosis = 0
Answer: D. The normal distribution has kurtosis = 3 (mesokurtic); excess kurtosis is 0.
Q2. Match each distribution with its application:
| | Distribution | | Application |
|---|---|---|---|
| (i) | Binomial | (a) | Time between rare events |
| (ii) | Poisson | (b) | Number of successes in \(n\) Bernoulli trials |
| (iii) | Exponential | (c) | Number of rare events in a fixed interval |
| (iv) | Normal | (d) | Continuous bell-shaped distribution |
A. (i)-(b), (ii)-(c), (iii)-(a), (iv)-(d) B. (i)-(a), (ii)-(b), (iii)-(c), (iv)-(d) C. (i)-(c), (ii)-(d), (iii)-(b), (iv)-(a) D. (i)-(d), (ii)-(a), (iii)-(c), (iv)-(b)
Answer: A.
Q3. A binomial distribution has \(n = 10\) and \(p = 0.3\). Its mean and SD are:
A. 3 and 1.45 B. 3 and 2.1 C. 30 and 1.45 D. 1.45 and 3
Answer: A. Mean = \(np = 3\); SD = \(\sqrt{npq} = \sqrt{10 \times 0.3 \times 0.7} = \sqrt{2.1} \approx 1.45\).
Q4. In a Poisson distribution with \(\lambda = 4\), the mean equals:
A. 4 B. 16 C. 2 D. 0
Answer: A. For Poisson, Mean = Variance = \(\lambda\) = 4.
Q5. In a normal distribution, approximately what percentage of observations lie within one standard deviation of the mean?
A. 50 % B. 68 % C. 95 % D. 99.7 %
Answer: B. The empirical rule: ≈ 68 % within \(\mu \pm \sigma\).
Q6. Which of the following is the defining identity of the Poisson distribution?
A. Mean = Variance B. Mean = Mode = Median C. Variance = Mean\(^2\) D. Skewness = 0
Answer: A. The Poisson is the only common distribution in which mean and variance are equal — both equal \(\lambda\).
Q7. The Central Limit Theorem says that for a sample of size \(n\) drawn from a population with mean \(\mu\) and variance \(\sigma^2\):
A. The sample mean has a Poisson distribution B. The sample mean is approximately normal with mean \(\mu\) and variance \(\sigma^2 / n\) C. The sample mean has the same distribution as the population D. The sample mean has variance \(\sigma^2 \cdot n\)
Answer: B. CLT: \(\bar X\) is approximately \(N(\mu, \sigma^2/n)\) for large \(n\).
Q8. Match each distribution with the year / contributor:
| | Distribution | | Contributor / Year |
|---|---|---|---|
| (i) | Normal | (a) | Siméon Denis Poisson, 1837 |
| (ii) | Poisson | (b) | William Sealy Gosset (Student), 1908 |
| (iii) | Student’s t | (c) | Gauss & Laplace, late 18th–early 19th c. |
| (iv) | Binomial | (d) | Jacob Bernoulli, 1713 |
A. (i)-(c), (ii)-(a), (iii)-(b), (iv)-(d) B. (i)-(a), (ii)-(b), (iii)-(c), (iv)-(d) C. (i)-(b), (ii)-(d), (iii)-(c), (iv)-(a) D. (i)-(d), (ii)-(c), (iii)-(a), (iv)-(b)
Answer: A.
42.8 Key Takeaways
- Random variable: function from sample space to real numbers. Discrete (PMF), Continuous (PDF).
- \(\mu = E(X)\), \(\sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2\).
- Binomial: \(P(X = r) = \binom{n}{r} p^r q^{n-r}\). Mean = \(np\), Variance = \(npq\).
- Poisson: \(P(X = r) = e^{-\lambda} \lambda^r / r!\). Mean = Variance = \(\lambda\).
- Normal: bell-shaped, parameters \((\mu, \sigma)\); skewness 0, kurtosis 3. 68-95-99.7 rule.
- Standard normal: \(Z = (X - \mu)/\sigma\).
- Other continuous: Uniform, Exponential, Gamma, Beta, Chi-square, t (Gosset 1908), F.
- Central Limit Theorem: sample mean is approximately normal with mean \(\mu\) and variance \(\sigma^2/n\).
- Binomial → Poisson when \(n \to \infty\), \(p \to 0\), \(np = \lambda\).
- Binomial → Normal when \(np\) and \(nq\) both ≥ 10 (de Moivre-Laplace).