Figure: Bayes' theorem as an update. The prior P(A) is multiplied by the likelihood P(B|A) and divided by the evidence P(B) to give the posterior P(A|B).
42 Probability: Approaches to probability; Bayes’ theorem
42.1 Concept of Probability
Probability is a numerical measure of uncertainty — a number between 0 and 1 (or 0 % and 100 %) expressing the likelihood of an event. It is the mathematical language of chance and the foundation of all inferential statistics. The roots of probability theory go back to the 17th-century correspondence between Blaise Pascal and Pierre de Fermat on gambling problems (1654). It was axiomatised in modern form by A.N. Kolmogorov in 1933. Four major approaches to probability have emerged — classical, empirical (relative frequency), subjective, and axiomatic. Bayes’ theorem (Thomas Bayes, posthumously published 1763) provides the rule for updating probabilities in the light of new information — the foundation of all modern Bayesian inference.
42.2 Approaches to Probability
| Approach | Working content |
|---|---|
| Classical (a priori) | P(A) = favourable outcomes / total outcomes, assuming equally likely outcomes (Laplace) |
| Empirical (Relative Frequency) | P(A) = number of times A occurs / total trials, as number of trials → ∞ (von Mises) |
| Subjective | P(A) = degree of personal belief; basis of Bayesian school (Ramsey, de Finetti) |
| Axiomatic | Kolmogorov’s three axioms (1933) |
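The contrast between the classical and empirical approaches can be illustrated with a short simulation. This is only a sketch: the function names, the fixed seed, and the choice of a fair die are assumptions made for illustration, not part of the original notes.

```python
import random
from fractions import Fraction

def classical_probability(event, sample_space):
    """Favourable outcomes / total outcomes, assuming equally likely outcomes."""
    outcomes = list(sample_space)
    favourable = sum(1 for outcome in outcomes if outcome in event)
    return Fraction(favourable, len(outcomes))

def empirical_probability(event, sample_space, trials, seed=0):
    """Relative frequency of the event over a number of simulated trials."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    outcomes = list(sample_space)
    hits = sum(1 for _ in range(trials) if rng.choice(outcomes) in event)
    return hits / trials

die = range(1, 7)   # sample space of a fair die
odd = {1, 3, 5}     # event: an odd face shows

print(classical_probability(odd, die))  # 1/2
for n in (100, 10_000, 100_000):
    # The relative frequency should settle near 0.5 as n grows.
    print(n, empirical_probability(odd, die, n))
```

The classical value is exact because it only counts outcomes; the empirical value is an estimate that depends on the number of trials.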
42.3 Kolmogorov’s Axioms
- Non-negativity: P(A) ≥ 0 for every event A.
- Normalisation: P(S) = 1, where S is the sample space.
- Additivity: For pairwise mutually exclusive events A₁, A₂, …, \(P(A_1 \cup A_2 \cup \ldots) = P(A_1) + P(A_2) + \ldots\)
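For a finite sample space, the axioms can be checked directly. The sketch below assumes a probability assignment stored as a dictionary from outcomes to probabilities; the helper names `is_valid_distribution` and `prob` are hypothetical.

```python
from fractions import Fraction

def is_valid_distribution(probabilities):
    """Check non-negativity and normalisation on a finite sample space."""
    non_negative = all(p >= 0 for p in probabilities.values())
    normalised = sum(probabilities.values()) == 1
    return non_negative and normalised

def prob(event, probabilities):
    """P(event) as the sum of its outcome probabilities."""
    return sum(probabilities[outcome] for outcome in event)

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
low, high = {1, 2}, {5, 6}  # mutually exclusive: no face in common

print(is_valid_distribution(fair_die))  # True
# Additivity: P(low or high) equals P(low) + P(high) for disjoint events.
print(prob(low | high, fair_die) == prob(low, fair_die) + prob(high, fair_die))  # True
```

Defining the probability of an event as a sum over its outcomes is what makes additivity hold automatically in the finite case.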
42.4 Basic Terms
| Term | Meaning |
|---|---|
| Random experiment | Process with uncertain outcome |
| Sample space (S) | Set of all possible outcomes |
| Event (A) | Subset of S |
| Mutually exclusive | Events that cannot both occur |
| Exhaustive | Events that together cover S |
| Independent events | Occurrence of one does not affect probability of another |
| Complementary events | A and Aᶜ; P(A) + P(Aᶜ) = 1 |
| Conditional probability | \(P(A \mid B) = P(A \cap B) / P(B)\), for P(B) > 0 |
42.5 Key Theorems
42.5.1 Addition Rule (Union of Events)
- General: \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\).
- Mutually exclusive: \(P(A \cup B) = P(A) + P(B)\).
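As a quick check of the addition rule, the sketch below enumerates a standard 52-card deck and compares the formula with direct counting. The events chosen (face card, red card, king, queen) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))  # 52 equally likely cards

def prob(event):
    """Classical probability of a set of cards."""
    return Fraction(len(event), len(deck))

face = {card for card in deck if card[0] in {"J", "Q", "K"}}
red = {card for card in deck if card[1] in {"hearts", "diamonds"}}

# General rule: subtract the overlap (red face cards) so it is not counted twice.
by_rule = prob(face) + prob(red) - prob(face & red)
by_counting = prob(face | red)
print(by_rule, by_counting)  # 8/13 8/13

# Mutually exclusive case: a card cannot be both a king and a queen.
kings = {card for card in deck if card[0] == "K"}
queens = {card for card in deck if card[0] == "Q"}
print(prob(kings | queens) == prob(kings) + prob(queens))  # True
```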
42.5.2 Multiplication Rule (Intersection)
- General: \(P(A \cap B) = P(A) \cdot P(B \mid A)\).
- Independent: \(P(A \cap B) = P(A) \cdot P(B)\).
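The two forms of the multiplication rule can be compared on a simple card example: drawing two aces without replacement (dependent draws) and with replacement (independent draws). This is a sketch; the encoding of the deck as (rank, suit) number pairs is an assumption for brevity.

```python
from fractions import Fraction
from itertools import permutations, product

# Rank 1 stands for the ace; suits are numbered 0 to 3.
deck = [(rank, suit) for rank in range(1, 14) for suit in range(4)]

def is_ace(card):
    return card[0] == 1

# Without replacement: ordered pairs of two different cards.
pairs = list(permutations(deck, 2))
both_aces = sum(1 for first, second in pairs if is_ace(first) and is_ace(second))
enumerated = Fraction(both_aces, len(pairs))
by_rule = Fraction(4, 52) * Fraction(3, 51)  # P(A) * P(B | A)
print(enumerated, by_rule)  # 1/221 1/221

# With replacement: the second draw does not depend on the first.
pairs_wr = list(product(deck, repeat=2))
both_aces_wr = sum(1 for first, second in pairs_wr if is_ace(first) and is_ace(second))
enumerated_wr = Fraction(both_aces_wr, len(pairs_wr))
by_rule_wr = Fraction(4, 52) * Fraction(4, 52)  # P(A) * P(B)
print(enumerated_wr, by_rule_wr)  # 1/169 1/169
```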
42.6 Bayes’ Theorem (1763)
For events A and B (with P(B) > 0):
\[P(A | B) = \frac{P(B | A) \cdot P(A)}{P(B)}\]
For a partition {A₁, A₂, …, Aₙ} of S:
\[P(A_i | B) = \frac{P(B | A_i) \cdot P(A_i)}{\sum_{j} P(B | A_j) \cdot P(A_j)}\]
42.6.1 Terminology
- Prior P(A_i) — probability before seeing evidence B.
- Likelihood P(B|A_i) — probability of evidence B given A_i.
- Posterior P(A_i|B) — updated probability after seeing B.
- Marginal / Evidence P(B) — total probability of B (denominator).
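The partition form of Bayes' theorem translates almost line by line into code. The following is a minimal sketch: the function name `posteriors` and the three-machine example, including every number in it, are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Return P(A_i | B) for every hypothesis A_i in a partition.

    priors[h] is P(A_h); likelihoods[h] is P(B | A_h).
    """
    if abs(sum(priors.values()) - 1) > 1e-9:
        raise ValueError("priors must sum to 1 over the partition")
    # Numerators: P(B | A_i) * P(A_i) for each hypothesis.
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    # Denominator: P(B), by the law of total probability.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("P(B) must be positive")
    return {h: joint[h] / evidence for h in joint}

# Hypothetical example: share of output (prior) and defect rate (likelihood)
# for three machines; B is the event "a randomly chosen item is defective".
priors = {"M1": Fraction(1, 2), "M2": Fraction(3, 10), "M3": Fraction(1, 5)}
likelihoods = {"M1": Fraction(1, 100), "M2": Fraction(2, 100), "M3": Fraction(3, 100)}

result = posteriors(priors, likelihoods)
for machine, value in result.items():
    print(machine, value)  # M1 5/17, M2 6/17, M3 6/17
```

Although M1 produces half the output, its low defect rate means a defective item is slightly more likely to have come from M2 or M3. The posteriors always sum to 1 because the denominator is the sum of the numerators.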
42.6.2 Classic Example — Diagnostic Test
A disease has a prevalence of 1 %. A test for it has sensitivity 99 % (P(+|D) = 0.99) and specificity 95 % (P(−|H) = 0.95, so the false-positive rate is P(+|H) = 0.05). What is the probability that a person who tests positive actually has the disease?
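P(D|+) = P(+|D) · P(D) / [P(+|D) · P(D) + P(+|H) · P(H)] = (0.99 × 0.01) / [(0.99 × 0.01) + (0.05 × 0.99)] ≈ 0.167 or 16.7 %. Here P(H) = 1 − 0.01 = 0.99 is the probability of being healthy.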
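This counter-intuitive result, in which only about 17 % of positive results are true positives despite 99 % sensitivity and 95 % specificity, is the classic illustration of base-rate neglect: at 1 % prevalence, false positives from the large healthy group outnumber true positives from the small diseased group.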
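The same calculation can be reproduced in a few lines. This is a sketch using the figures from the example; the function name `positive_predictive_value` is an assumption (it is the standard name for P(D|+), but it does not appear in the notes).

```python
from fractions import Fraction

def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(D | +) from Bayes' theorem with the partition {diseased, healthy}."""
    true_positive = sensitivity * prevalence               # P(+ | D) * P(D)
    false_positive = (1 - specificity) * (1 - prevalence)  # P(+ | H) * P(H)
    return true_positive / (true_positive + false_positive)

# Floating-point version with the figures from the example.
ppv = positive_predictive_value(0.01, 0.99, 0.95)
print(round(ppv, 3))  # 0.167

# Exact version: the posterior is exactly 1/6.
exact = positive_predictive_value(Fraction(1, 100), Fraction(99, 100), Fraction(95, 100))
print(exact)  # 1/6
```

In natural frequencies: of 10,000 people, 100 have the disease and 99 of them test positive, while 9,900 are healthy and 495 of them test positive. So 99 of the 594 positives, or 1 in 6, are true positives.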
PYQs often test the difference between mutually exclusive and independent events. Mutually exclusive means the events cannot both happen; independent means the occurrence of one does not change the probability of the other. The two properties are not the same and usually cannot hold together: if A and B both have positive probability, mutual exclusion implies dependence, because P(A ∩ B) = 0 while P(A) · P(B) > 0 (knowing A occurred means B did not).
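The distinction can be verified on a single roll of a fair die. The events below are illustrative assumptions, and the helper names are hypothetical.

```python
from fractions import Fraction

sample_space = set(range(1, 7))  # faces of a fair die

def prob(event):
    return Fraction(len(event), len(sample_space))

def mutually_exclusive(a, b):
    """True when the events share no outcome."""
    return not (a & b)

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

low, high = {1, 2}, {5, 6}           # cannot both occur
even, at_most_two = {2, 4, 6}, {1, 2}  # overlap only on the face 2

print(mutually_exclusive(low, high), independent(low, high))                  # True False
print(mutually_exclusive(even, at_most_two), independent(even, at_most_two))  # False True
```

For the first pair, P(A ∩ B) = 0 but P(A) · P(B) = 1/9, so the events are dependent. For the second pair, P(A ∩ B) = 1/6 = 1/2 × 1/3, so the events are independent even though they can occur together.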
42.7 Practice Questions
Probability of an event lies between:
The axiomatic foundation of probability was provided by:
Match each approach with its proponent:
| | Approach | | Proponent |
|---|---|---|---|
| (i) | Classical | (a) | Bayes / Ramsey / de Finetti |
| (ii) | Relative frequency | (b) | Kolmogorov |
| (iii) | Subjective | (c) | Laplace |
| (iv) | Axiomatic | (d) | von Mises |
For two mutually exclusive events A and B:
For two *independent* events:
Probability of getting two heads in two tosses of a fair coin:
P(getting a 6 on a fair die) is:
From a standard 52-card deck, P(drawing a King OR a Heart) is:
Bayes' theorem allows us to:
Thomas Bayes' "Essay" was published posthumously in:
P(A|B) = P(A∩B)/P(B). This is the formula for:
Two urns: Urn 1 has 3 red, 2 blue; Urn 2 has 1 red, 4 blue. An urn is chosen randomly and a red ball drawn. P(it was Urn 1):
If P(A) = 0.3, then P(Aᶜ) is:
The origins of probability theory are credited to 17th-century correspondence between:
The famous *base-rate fallacy* arises in:
Two events are **exhaustive** if:
P(getting an odd number on a fair die) is:
In Bayes' framework, *prior* means:
In tossing two dice, P(sum = 7) is:
Two events with positive probability that are *mutually exclusive* must be:
42.8 Quick Recall
- Probability ∈ [0, 1]; language of uncertainty. Origin: Pascal-Fermat 1654.
- Four approaches: Classical (Laplace), Empirical (von Mises), Subjective (Bayes/Ramsey/de Finetti), Axiomatic (Kolmogorov 1933).
- Three axioms: non-negativity, normalisation (P(S) = 1), additivity for mutually exclusive events.
- Addition rule: P(A∪B) = P(A) + P(B) − P(A∩B); mutually exclusive ⇒ no overlap term.
- Multiplication rule: P(A∩B) = P(A) × P(B|A); independent ⇒ P(A) × P(B).
- Complement: P(A) + P(Aᶜ) = 1.
- Bayes (1763): P(A|B) = P(B|A) × P(A) / P(B). Prior × Likelihood / Evidence = Posterior.
- Base-rate fallacy — ignoring prior probability when interpreting test results.
- Mutually exclusive ≠ Independent (with positive probabilities, M.E. implies dependence).