42 Probability Distributions
42.1 Random Variables and Distributions
A random variable is a function that assigns a numerical value to each outcome of a random experiment. A probability distribution describes how probability is assigned across the possible values of the random variable: a list of value–probability pairs in the discrete case, or a density in the continuous case (Gupta 2021; Ross 2020).
| Type | Definition | Example | Tool |
|---|---|---|---|
| Discrete | Takes countable values | Number of defective items | Probability Mass Function (PMF) |
| Continuous | Takes values on an interval | Height, time, temperature | Probability Density Function (PDF) |
For a discrete RV: \(\sum P(X = x) = 1\). For a continuous RV: \(\int f(x) dx = 1\).
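Both normalisation conditions are easy to verify numerically. A minimal sketch, using a fair six-sided die for the discrete case and the Uniform(0, 1) density \(f(x) = 1\) for the continuous case (both hypothetical examples; the integral is approximated by a crude Riemann sum):

```python
# Discrete case: P(X = x) for a fair six-sided die must sum to 1.
pmf = {x: 1 / 6 for x in range(1, 7)}
discrete_total = sum(pmf.values())

# Continuous case: f(x) = 1 on [0, 1] (Uniform density) must
# integrate to 1; approximate the integral with a Riemann sum.
steps = 1000
dx = 1 / steps
continuous_total = sum(1.0 * dx for _ in range(steps))

print(round(discrete_total, 6), round(continuous_total, 6))  # both near 1
```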
42.2 Expectation, Variance and Standard Deviation
| Moment | Discrete | Continuous |
|---|---|---|
| Mean / Expectation \(\mu = E(X)\) | \(\sum x P(x)\) | \(\int x f(x) dx\) |
| Variance \(\sigma^2 = E[(X - \mu)^2]\) | \(\sum (x - \mu)^2 P(x)\) | \(\int (x - \mu)^2 f(x) dx\) |
| SD | \(\sigma = \sqrt{\sigma^2}\) | same |
A useful identity: \(\sigma^2 = E(X^2) - [E(X)]^2\).
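The identity can be checked on any small PMF; a sketch with a made-up three-point distribution:

```python
# Hypothetical PMF: P(X=0)=0.2, P(X=1)=0.5, P(X=2)=0.3.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mean = sum(x * p for x, p in pmf.items())                    # E(X)
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())   # E[(X - mu)^2]
var_id = sum(x * x * p for x, p in pmf.items()) - mean ** 2  # E(X^2) - mu^2

print(mean, var_def, var_id)  # the two variance computations agree
```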
42.3 Major Discrete Distributions
42.3.1 Binomial Distribution
The binomial describes the number of successes in \(n\) independent Bernoulli trials, each with success probability \(p\):
\[ P(X = r) = \binom{n}{r} p^r q^{n-r}, \quad q = 1 - p \]
| Property | Value |
|---|---|
| Parameters | \(n\) (number of trials), \(p\) (success probability) |
| Mean | \(np\) |
| Variance | \(npq\) |
| Mode | \(\lfloor (n+1)p \rfloor\); both \((n+1)p\) and \((n+1)p - 1\) when \((n+1)p\) is an integer |
| Skewness | \((q - p) / \sqrt{npq}\) |
| Kurtosis | \(3 + (1 - 6pq)/(npq)\) |
| Symmetric when | \(p = 0.5\) |
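The PMF and moment formulas above can be sketched with the standard library (the parameters \(n = 10\), \(p = 0.3\) are illustrative):

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for a Binomial(n, p) random variable."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.3           # illustrative parameters
mean = n * p             # np
var = n * p * (1 - p)    # npq

# The probabilities over r = 0..n must sum to 1:
total = sum(binom_pmf(r, n, p) for r in range(n + 1))
print(mean, var, round(total, 10))
```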
42.3.2 Poisson Distribution
The Poisson describes the number of rare events occurring in a fixed interval, with mean rate \(\lambda\):
\[ P(X = r) = \dfrac{e^{-\lambda} \lambda^r}{r!}, \quad r = 0, 1, 2, \dots \]
| Property | Value |
|---|---|
| Parameter | \(\lambda\) (mean rate) |
| Mean | \(\lambda\) |
| Variance | \(\lambda\) |
| Mean = Variance | Hallmark of the Poisson |
| Skewness | \(1 / \sqrt{\lambda}\) |
The Poisson is the limiting case of the binomial when \(n \to \infty\) and \(p \to 0\) such that \(np = \lambda\) stays constant. It applies to rare events — accidents per day, calls per minute, defects per unit area.
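The limit can be seen numerically by holding \(np = \lambda\) fixed while \(n\) grows; a sketch with \(\lambda = 2\) and \(r = 0\):

```python
from math import comb, exp, factorial

def binom_pmf(r, n, p):
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, lam):
    return exp(-lam) * lam**r / factorial(r)

lam = 2
for n in (10, 100, 1000):
    p = lam / n  # keep np = lam constant as n grows
    print(n, round(binom_pmf(0, n, p), 5))  # approaches the Poisson value
print("Poisson:", round(poisson_pmf(0, lam), 5))  # e^{-2} ≈ 0.13534
```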
42.4 Major Continuous Distributions
42.4.1 Normal Distribution
The normal (or Gaussian) distribution, due to Carl Friedrich Gauss and Pierre-Simon Laplace, has the bell-shaped PDF:
\[ f(x) = \dfrac{1}{\sigma \sqrt{2\pi}} \exp\left(-\dfrac{(x - \mu)^2}{2 \sigma^2}\right) \]
| Property | Value |
|---|---|
| Parameters | \(\mu\) (mean), \(\sigma\) (SD) |
| Mean = Median = Mode | \(\mu\) |
| Symmetry | About the mean |
| Skewness | 0 |
| Kurtosis | 3 (mesokurtic) |
| Empirical rule | 68 % within \(\mu \pm \sigma\); 95 % within \(\mu \pm 2\sigma\); 99.7 % within \(\mu \pm 3\sigma\) |
| Standard normal | \(Z = (X - \mu)/\sigma\) has \(\mu = 0\), \(\sigma = 1\) |
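The standard normal CDF can be written in terms of the error function, which makes the empirical rule easy to verify with only the standard library:

```python
from math import erf, sqrt

def std_normal_cdf(z):
    """Phi(z): P(Z <= z) for the standard normal."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Mass within mu +/- k*sigma equals Phi(k) - Phi(-k):
for k in (1, 2, 3):
    mass = std_normal_cdf(k) - std_normal_cdf(-k)
    print(k, round(mass, 4))  # ≈ 0.68, 0.95, 0.997 for k = 1, 2, 3
```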
42.4.2 Other continuous distributions
| Distribution | Use |
|---|---|
| Uniform | Equal probability over an interval |
| Exponential | Time between Poisson events; reliability analysis |
| Gamma | Generalisation of exponential |
| Beta | Bounded random variables; Bayesian priors |
| Chi-square (\(\chi^2\)) | Sum of squared standard normals; goodness of fit |
| Student’s t | Small-sample inference; Gosset 1908 |
| F | Ratio of two chi-squares; ANOVA |
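The exponential entry in the table connects back to the Poisson: if inter-event times are Exponential(\(\lambda\)), the number of events in a unit interval is Poisson(\(\lambda\)). A simulation sketch (the rate \(\lambda = 4\) and the sample size are arbitrary choices):

```python
import random

random.seed(0)
lam = 4.0      # event rate (illustrative)
reps = 20000
counts = []
for _ in range(reps):
    t, n_events = 0.0, 0
    while True:
        t += random.expovariate(lam)  # exponential inter-arrival time
        if t > 1.0:                   # stop once we pass the unit interval
            break
        n_events += 1
    counts.append(n_events)

m_hat = sum(counts) / reps
v_hat = sum((c - m_hat) ** 2 for c in counts) / reps
print(round(m_hat, 2), round(v_hat, 2))  # both near lam = 4 (mean = variance)
```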
42.5 Central Limit Theorem (CLT)
The Central Limit Theorem — perhaps the single most important result in statistics — says that the sum (or mean) of a large number of independent, identically distributed random variables with finite variance, regardless of their underlying distribution, tends to a normal distribution. Formally:
\[ \bar X \approx N\left(\mu, \dfrac{\sigma^2}{n}\right) \quad \text{for large } n \]
The CLT explains why the normal distribution shows up so often in nature and in inferential statistics — sample means, regression residuals, measurement errors all approach normality.
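A quick simulation makes this concrete: means of \(n\) Uniform(0, 1) draws (so \(\mu = 0.5\), \(\sigma^2 = 1/12\)) cluster around \(\mu\) with variance \(\sigma^2/n\). A sketch with arbitrary sample sizes:

```python
import random
import statistics

random.seed(1)
n, reps = 30, 5000
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(reps)]

m_hat = statistics.fmean(means)
v_hat = statistics.variance(means)
print(round(m_hat, 3))   # near mu = 0.5
print(round(v_hat, 5))   # near sigma^2 / n = (1/12)/30 ≈ 0.00278
```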
42.6 Worked Numerical
A factory produces items with 4 % defective rate. In a sample of 50 items:
- Mean number of defectives = \(np = 50 \times 0.04 = 2\).
- SD = \(\sqrt{npq} = \sqrt{50 \times 0.04 \times 0.96} = \sqrt{1.92} \approx 1.386\).
- Probability of exactly 0 defective = \(\binom{50}{0} (0.04)^0 (0.96)^{50} = 0.96^{50} \approx 0.130\).
- Approximate by Poisson with \(\lambda = 2\): \(P(X=0) = e^{-2} \approx 0.135\) — close.
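The worked numbers above can be reproduced in a few lines:

```python
from math import comb, exp

n, p = 50, 0.04
mean = n * p                                # np = 2
sd = (n * p * (1 - p)) ** 0.5               # sqrt(npq) = sqrt(1.92)
p0_binom = comb(n, 0) * p**0 * (1 - p)**n   # 0.96**50
p0_poisson = exp(-n * p)                    # Poisson approximation, e^{-2}

print(mean, round(sd, 3), round(p0_binom, 3), round(p0_poisson, 3))
```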
42.7 Exam-Pattern MCQs
Q1. Which of the following is not a property of the normal distribution?
A. Mean = Median = Mode B. Symmetric about the mean C. Skewness = 0 D. Kurtosis = 0
Answer: D. The normal distribution has kurtosis = 3 (mesokurtic); excess kurtosis is 0.
Q2. Match each distribution with its application:
| | Distribution | | Application |
|---|---|---|---|
| (i) | Binomial | (a) | Time between rare events |
| (ii) | Poisson | (b) | Number of successes in \(n\) Bernoulli trials |
| (iii) | Exponential | (c) | Number of rare events in a fixed interval |
| (iv) | Normal | (d) | Continuous bell-shaped distribution |
A. (i)-(b), (ii)-(c), (iii)-(a), (iv)-(d) B. (i)-(a), (ii)-(b), (iii)-(c), (iv)-(d) C. (i)-(c), (ii)-(d), (iii)-(b), (iv)-(a) D. (i)-(d), (ii)-(a), (iii)-(c), (iv)-(b)
Answer: A.
Q3. A binomial distribution has \(n = 10\) and \(p = 0.3\). Its mean and SD are:
A. 3 and 1.45 B. 3 and 2.1 C. 30 and 1.45 D. 1.45 and 3
Answer: A. Mean = \(np = 3\); SD = \(\sqrt{npq} = \sqrt{10 \times 0.3 \times 0.7} = \sqrt{2.1} \approx 1.45\).
Q4. In a Poisson distribution with \(\lambda = 4\), the mean equals:
A. 4 B. 16 C. 2 D. 0
Answer: A. For Poisson, Mean = Variance = \(\lambda\) = 4.
Q5. In a normal distribution, approximately what percentage of observations lie within one standard deviation of the mean?
A. 50 % B. 68 % C. 95 % D. 99.7 %
Answer: B. The empirical rule: ≈ 68 % within \(\mu \pm \sigma\).
Q6. Which of the following is the defining identity of the Poisson distribution?
A. Mean = Variance B. Mean = Mode = Median C. Variance = Mean\(^2\) D. Skewness = 0
Answer: A. The Poisson is the only common distribution in which mean and variance are equal — both equal \(\lambda\).
Q7. The Central Limit Theorem says that for a sample of size \(n\) drawn from a population with mean \(\mu\) and variance \(\sigma^2\):
A. The sample mean has a Poisson distribution B. The sample mean is approximately normal with mean \(\mu\) and variance \(\sigma^2 / n\) C. The sample mean has the same distribution as the population D. The sample mean has variance \(\sigma^2 \cdot n\)
Answer: B. CLT: \(\bar X\) is approximately \(N(\mu, \sigma^2/n)\) for large \(n\).
Q8. Match each distribution with the year / contributor:
| | Distribution | | Contributor / Year |
|---|---|---|---|
| (i) | Normal | (a) | Siméon Denis Poisson, 1837 |
| (ii) | Poisson | (b) | William Sealy Gosset (Student), 1908 |
| (iii) | Student’s t | (c) | Gauss & Laplace, late 18th–early 19th c. |
| (iv) | Binomial | (d) | Jacob Bernoulli, 1713 |
A. (i)-(c), (ii)-(a), (iii)-(b), (iv)-(d) B. (i)-(a), (ii)-(b), (iii)-(c), (iv)-(d) C. (i)-(b), (ii)-(d), (iii)-(c), (iv)-(a) D. (i)-(d), (ii)-(c), (iii)-(a), (iv)-(b)
Answer: A.
42.8 Key Takeaways
- Random variable: function from sample space to real numbers. Discrete (PMF), Continuous (PDF).
- \(\mu = E(X)\), \(\sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2\).
- Binomial: \(P(X = r) = \binom{n}{r} p^r q^{n-r}\). Mean = \(np\), Variance = \(npq\).
- Poisson: \(P(X = r) = e^{-\lambda} \lambda^r / r!\). Mean = Variance = \(\lambda\).
- Normal: bell-shaped, parameters \((\mu, \sigma)\); skewness 0, kurtosis 3. 68-95-99.7 rule.
- Standard normal: \(Z = (X - \mu)/\sigma\).
- Other continuous: Uniform, Exponential, Gamma, Beta, Chi-square, t (Gosset 1908), F.
- Central Limit Theorem: sample mean is approximately normal with mean \(\mu\) and variance \(\sigma^2/n\).
- Binomial → Poisson when \(n \to \infty\), \(p \to 0\), \(np = \lambda\).
- Binomial → Normal when \(np\) and \(nq\) both ≥ 10 (de Moivre-Laplace).