42  Probability Distributions

42.1 Random Variables and Distributions

A random variable is a function that assigns a numerical value to each outcome of a random experiment. A probability distribution lists the possible values of the random variable along with their probabilities (Gupta 2021; Ross 2020).

Tip: Discrete vs Continuous Random Variables

| Type | Definition | Example | Tool |
|---|---|---|---|
| Discrete | Takes countable values | Number of defective items | Probability Mass Function (PMF) |
| Continuous | Takes values on an interval | Height, time, temperature | Probability Density Function (PDF) |

For a discrete RV: \(\sum_x P(X = x) = 1\). For a continuous RV: \(\int_{-\infty}^{\infty} f(x)\,dx = 1\).

42.2 Expectation, Variance and Standard Deviation

Tip: Moments of a Random Variable

| Moment | Definition | Discrete | Continuous |
|---|---|---|---|
| Mean / Expectation | \(\mu = E(X)\) | \(\sum x P(x)\) | \(\int x f(x)\,dx\) |
| Variance | \(\sigma^2 = E[(X - \mu)^2]\) | \(\sum (x - \mu)^2 P(x)\) | \(\int (x - \mu)^2 f(x)\,dx\) |
| SD | \(\sigma = \sqrt{\sigma^2}\) | same | same |

A useful identity: \(\sigma^2 = E(X^2) - [E(X)]^2\).
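The identity can be verified numerically. A minimal sketch, using a small made-up discrete distribution (the PMF values here are purely illustrative):

```python
# A hypothetical discrete distribution: value -> probability
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

mean = sum(x * p for x, p in pmf.items())                    # E(X)
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())   # E[(X - mu)^2]
e_x2 = sum(x ** 2 * p for x, p in pmf.items())               # E(X^2)
var_identity = e_x2 - mean ** 2                              # E(X^2) - [E(X)]^2

print(mean, var_def, var_identity)  # the two variance formulas agree
```

Both routes give the same variance, which is why the identity is handy: computing \(E(X^2)\) is often easier than summing squared deviations.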

42.3 Major Discrete Distributions

42.3.1 Binomial Distribution

The binomial describes the number of successes in \(n\) independent Bernoulli trials, each with success probability \(p\):

\[ P(X = r) = \binom{n}{r} p^r q^{n-r}, \quad q = 1 - p \]

Tip: Binomial — Key Properties

| Property | Value |
|---|---|
| Parameters | \(n\) (number of trials), \(p\) (success probability) |
| Mean | \(np\) |
| Variance | \(npq\) |
| Mode | Around \((n+1)p\) |
| Skewness | \((q - p)/\sqrt{npq}\) |
| Kurtosis | \(3 + (1 - 6pq)/(npq)\) |
| Symmetric | When \(p = 0.5\) |
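The PMF and the mean/variance formulas above can be checked directly with the standard library. A sketch with illustrative values \(n = 10\), \(p = 0.3\):

```python
import math

def binom_pmf(r, n, p):
    """P(X = r) = C(n, r) * p^r * (1 - p)^(n - r)."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.3  # illustrative values
q = 1 - p

probs = [binom_pmf(r, n, p) for r in range(n + 1)]
mean = sum(r * pr for r, pr in enumerate(probs))
var = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))

print(sum(probs))      # probabilities sum to 1
print(mean, n * p)     # both give the mean np = 3
print(var, n * p * q)  # both give the variance npq = 2.1
```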

42.3.2 Poisson Distribution

The Poisson describes the number of rare events occurring in a fixed interval, with mean rate \(\lambda\):

\[ P(X = r) = \dfrac{e^{-\lambda} \lambda^r}{r!}, \quad r = 0, 1, 2, \dots \]

Tip: Poisson — Key Properties

| Property | Value |
|---|---|
| Parameter | \(\lambda\) (mean rate) |
| Mean | \(\lambda\) |
| Variance | \(\lambda\) |
| Mean = Variance | Hallmark of the Poisson |
| Skewness | \(1/\sqrt{\lambda}\) |

The Poisson is the limiting case of the binomial when \(n \to \infty\) and \(p \to 0\) such that \(np = \lambda\) stays constant. It applies to rare events — accidents per day, calls per minute, defects per unit area.
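This limiting behaviour is easy to see numerically: hold \(np = \lambda\) fixed while \(n\) grows, and the binomial probabilities converge to the Poisson ones. A sketch with the illustrative choice \(\lambda = 2\):

```python
import math

def binom_pmf(r, n, p):
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, lam):
    return math.exp(-lam) * lam**r / math.factorial(r)

lam = 2.0
for n in (10, 100, 1000):
    p = lam / n  # p shrinks so that np = lambda stays constant
    gap = max(abs(binom_pmf(r, n, p) - poisson_pmf(r, lam)) for r in range(10))
    print(n, gap)  # the largest PMF discrepancy shrinks as n grows
```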

42.4 Major Continuous Distributions

42.4.1 Normal Distribution

The normal (or Gaussian) distribution, due to Carl Friedrich Gauss and Pierre-Simon Laplace, has the bell-shaped PDF:

\[ f(x) = \dfrac{1}{\sigma \sqrt{2\pi}} \exp\left(-\dfrac{(x - \mu)^2}{2 \sigma^2}\right) \]

Tip: Normal Distribution — Key Properties

| Property | Value |
|---|---|
| Parameters | \(\mu\) (mean), \(\sigma\) (SD) |
| Mean = Median = Mode | \(\mu\) |
| Symmetry | About the mean |
| Skewness | 0 |
| Kurtosis | 3 (mesokurtic) |
| Empirical rule | 68 % within \(\mu \pm \sigma\); 95 % within \(\mu \pm 2\sigma\); 99.7 % within \(\mu \pm 3\sigma\) |
| Standard normal | \(Z = (X - \mu)/\sigma\) has \(\mu = 0\), \(\sigma = 1\) |
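The empirical rule follows from the normal CDF: \(P(|Z| \le k) = \operatorname{erf}(k/\sqrt{2})\). A quick check with the standard library:

```python
import math

# P(mu - k*sigma <= X <= mu + k*sigma) for any normal RV equals erf(k / sqrt(2))
for k in (1, 2, 3):
    prob = math.erf(k / math.sqrt(2))
    print(k, round(prob, 4))  # ~0.6827, ~0.9545, ~0.9973
```

These are the exact values behind the rounded 68-95-99.7 rule.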

42.4.2 Other continuous distributions

Tip: Other Important Continuous Distributions

| Distribution | Use |
|---|---|
| Uniform | Equal probability over an interval |
| Exponential | Time between Poisson events; reliability analysis |
| Gamma | Generalisation of the exponential |
| Beta | Bounded random variables; Bayesian priors |
| Chi-square (\(\chi^2\)) | Sum of squared standard normals; goodness of fit |
| Student’s t | Small-sample inference; Gosset 1908 |
| F | Ratio of two chi-squares; ANOVA |

42.5 Central Limit Theorem (CLT)

The Central Limit Theorem — perhaps the single most important result in statistics — says that the sum (or mean) of a large number of independent random variables, regardless of their underlying distribution, tends to a normal distribution. Formally:

\[ \bar X \approx N\left(\mu, \dfrac{\sigma^2}{n}\right) \quad \text{for large } n \]

The CLT explains why the normal distribution shows up so often in nature and in inferential statistics — sample means, regression residuals, measurement errors all approach normality.
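A small simulation makes the theorem concrete. Drawing sample means from a decidedly non-normal population, here Uniform(0, 1) with \(\mu = 0.5\) and \(\sigma^2 = 1/12\) (the sample size and trial count are illustrative):

```python
import random
import statistics

random.seed(0)
n, trials = 50, 5000

# Each entry is the mean of n draws from Uniform(0, 1)
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(trials)]

print(statistics.fmean(means))  # close to mu = 0.5
print(statistics.stdev(means))  # close to sigma / sqrt(n) = sqrt(1/(12*50)) ~ 0.0408
```

The distribution of `means` is approximately \(N(\mu, \sigma^2/n)\), even though each individual draw is uniform.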

42.6 Worked Numerical Example

A factory produces items with 4 % defective rate. In a sample of 50 items:

  • Mean number of defectives = \(np = 50 \times 0.04 = 2\).
  • SD = \(\sqrt{npq} = \sqrt{50 \times 0.04 \times 0.96} = \sqrt{1.92} \approx 1.386\).
  • Probability of exactly 0 defective = \(\binom{50}{0} (0.04)^0 (0.96)^{50} = 0.96^{50} \approx 0.130\).
  • Approximate by Poisson with \(\lambda = 2\): \(P(X=0) = e^{-2} \approx 0.135\) — close.
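The four bullets above can be reproduced in a few lines:

```python
import math

n, p = 50, 0.04
q = 1 - p
lam = n * p                 # Poisson rate for the approximation

mean = n * p                # expected number of defectives
sd = math.sqrt(n * p * q)   # sqrt(1.92)
p0_binom = q ** n           # exact: C(50, 0) * p^0 * q^50 = 0.96^50
p0_poisson = math.exp(-lam) # Poisson approximation: e^(-2)

print(mean, round(sd, 3))                        # 2.0 1.386
print(round(p0_binom, 3), round(p0_poisson, 3))  # 0.13 0.135
```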

42.7 Exam-Pattern MCQs

Note: Eight-question set

Q1. Which of the following is not a property of the normal distribution?

A. Mean = Median = Mode B. Symmetric about the mean C. Skewness = 0 D. Kurtosis = 0

Answer: D. The normal distribution has kurtosis = 3 (mesokurtic); excess kurtosis is 0.


Q2. Match each distribution with its application:

| Distribution | Application |
|---|---|
| (i) Binomial | (a) Time between rare events |
| (ii) Poisson | (b) Number of successes in \(n\) Bernoulli trials |
| (iii) Exponential | (c) Number of rare events in a fixed interval |
| (iv) Normal | (d) Continuous bell-shaped distribution |

A. (i)-(b), (ii)-(c), (iii)-(a), (iv)-(d) B. (i)-(a), (ii)-(b), (iii)-(c), (iv)-(d) C. (i)-(c), (ii)-(d), (iii)-(b), (iv)-(a) D. (i)-(d), (ii)-(a), (iii)-(c), (iv)-(b)

Answer: A.


Q3. A binomial distribution has \(n = 10\) and \(p = 0.3\). Its mean and SD are:

A. 3 and 1.45 B. 3 and 2.1 C. 30 and 1.45 D. 1.45 and 3

Answer: A. Mean = \(np = 3\); SD = \(\sqrt{npq} = \sqrt{10 \times 0.3 \times 0.7} = \sqrt{2.1} \approx 1.45\).


Q4. In a Poisson distribution with \(\lambda = 4\), the mean equals:

A. 4 B. 16 C. 2 D. 0

Answer: A. For Poisson, Mean = Variance = \(\lambda\) = 4.


Q5. In a normal distribution, approximately what percentage of observations lie within one standard deviation of the mean?

A. 50 % B. 68 % C. 95 % D. 99.7 %

Answer: B. The empirical rule: ≈ 68 % within \(\mu \pm \sigma\).


Q6. Which of the following is the defining identity of the Poisson distribution?

A. Mean = Variance B. Mean = Mode = Median C. Variance = Mean\(^2\) D. Skewness = 0

Answer: A. The Poisson is the only common distribution in which mean and variance are equal — both equal \(\lambda\).


Q7. The Central Limit Theorem says that for a sample of size \(n\) drawn from a population with mean \(\mu\) and variance \(\sigma^2\):

A. The sample mean has a Poisson distribution B. The sample mean is approximately normal with mean \(\mu\) and variance \(\sigma^2 / n\) C. The sample mean has the same distribution as the population D. The sample mean has variance \(\sigma^2 \cdot n\)

Answer: B. CLT: \(\bar X\) is approximately \(N(\mu, \sigma^2/n)\) for large \(n\).


Q8. Match each distribution with the year / contributor:

| Distribution | Contributor / Year |
|---|---|
| (i) Normal | (a) Siméon Denis Poisson, 1837 |
| (ii) Poisson | (b) William Sealy Gosset (Student), 1908 |
| (iii) Student’s t | (c) Gauss & Laplace, late 18th–early 19th c. |
| (iv) Binomial | (d) Jacob Bernoulli, 1713 |

A. (i)-(c), (ii)-(a), (iii)-(b), (iv)-(d) B. (i)-(a), (ii)-(b), (iii)-(c), (iv)-(d) C. (i)-(b), (ii)-(d), (iii)-(c), (iv)-(a) D. (i)-(d), (ii)-(c), (iii)-(a), (iv)-(b)

Answer: A.

Important: Quick recall
  • Random variable: function from sample space to real numbers. Discrete (PMF), Continuous (PDF).
  • \(\mu = E(X)\), \(\sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2\).
  • Binomial: \(P(X = r) = \binom{n}{r} p^r q^{n-r}\). Mean = \(np\), Variance = \(npq\).
  • Poisson: \(P(X = r) = e^{-\lambda} \lambda^r / r!\). Mean = Variance = \(\lambda\).
  • Normal: bell-shaped, parameters \((\mu, \sigma)\); skewness 0, kurtosis 3. 68-95-99.7 rule.
  • Standard normal: \(Z = (X - \mu)/\sigma\).
  • Other continuous: Uniform, Exponential, Gamma, Beta, Chi-square, t (Gosset 1908), F.
  • Central Limit Theorem: sample mean is approximately normal with mean \(\mu\) and variance \(\sigma^2/n\).
  • Binomial → Poisson when \(n \to \infty\), \(p \to 0\), \(np = \lambda\).
  • Binomial → Normal when \(np\) and \(nq\) both ≥ 10 (de Moivre-Laplace).
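The last bullet's binomial → normal approximation can be checked numerically. A sketch with the illustrative values \(n = 100\), \(p = 0.4\) (so \(np = 40\) and \(nq = 60\), both ≥ 10), comparing the exact binomial CDF with the normal CDF using the usual 0.5 continuity correction:

```python
import math

def binom_cdf(k, n, p):
    """Exact P(X <= k) for a Binomial(n, p) random variable."""
    return sum(math.comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

n, p = 100, 0.4                       # illustrative: np = 40, nq = 60
mu, sigma = n * p, math.sqrt(n * p * (1 - p))

exact = binom_cdf(45, n, p)
approx = normal_cdf(45 + 0.5, mu, sigma)  # continuity correction
print(round(exact, 4), round(approx, 4))  # the two values agree closely
```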