The Law of Large Numbers
Ten heads in a row? Not weird. Ten thousand flips? Almost exactly half. The reason data can ever count as evidence sits right here.
Law of Large Numbers — Converging to Truth
10 heads in a row at the start of a run of coin flips? Not that weird.
But flip it 10,000 times and the head-ratio locks onto almost exactly 0.5.
That's the Law of Large Numbers — the more samples you draw, the more observed values get pulled toward the truth.
This is why statistics counts as evidence, not a vague hunch.
- Step 1: Set p=0.5, hit ▶ → the line wobbles wildly at first, then gets pulled toward 0.5.
- Step 2: RESET and run again → the early path is different every time, but it always converges.
- Step 3: Change p to 0.8 and simulate → the red line now converges to 0.8.
- Step 4: Set p to 0.05 (rare event) → it hugs zero early on, but still converges to p. The law holds.
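The four steps above can be reproduced offline with a small simulation. This is a minimal sketch, not the page's own simulator: `running_proportion` and its `seed` parameter are hypothetical names chosen for illustration, and the checkpoints are arbitrary.

```python
import random


def running_proportion(p, n_flips, seed=None):
    """Return the proportion of heads after each of n_flips flips."""
    rng = random.Random(seed)  # a seed makes a path repeatable; omit it for a fresh path
    heads = 0
    path = []
    for i in range(1, n_flips + 1):
        if rng.random() < p:  # heads with probability p
            heads += 1
        path.append(heads / i)
    return path


if __name__ == "__main__":
    checkpoints = [10, 100, 1_000, 10_000]
    for p in (0.5, 0.8, 0.05):
        path = running_proportion(p, 10_000, seed=1)
        print(p, [round(path[n - 1], 3) for n in checkpoints])
```

Changing the seed plays the role of RESET: the early entries differ from run to run, while the last checkpoint should land close to p.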
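// Formula used here
P(|X̄ − μ| ≥ ε) ≤ σ²/(nε²)
Here X̄ is the mean of n samples, μ is the true population mean, σ² is the variance of a single sample, and ε is the tolerance you choose.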
Left side in plain English
• "The probability that the average height of n people deviates from the true population mean by ε cm or more"
What the right side tells us
• σ²/(nε²): an upper bound on that probability. The actual probability may be smaller, but it's at most this value
• n is in the denominator → as n grows, the upper bound shrinks toward zero
• Translation: "collect more samples and the mean can be made as close to the truth as you like" — that's the Law of Large Numbers
Worked example (coin flips)
• p = 0.5 (probability of heads), σ² = p(1 − p) = 0.25, ε = 0.01 (within 1 percentage point)
• n = 100: P ≤ 0.25/(100 × 0.0001) = 25 → the bound exceeds 1, so it says nothing useful
• n = 100,000: P ≤ 0.025 → at most 2.5%
• n = 1,000,000: P ≤ 0.0025 → at most 0.25%. The mean is virtually pinned to the truth
• Chebyshev's inequality is a loose, worst-case bound that uses only the variance; in practice the actual probability is much smaller than the bound
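The arithmetic in the worked example can be checked with a few lines of Python. This is a minimal sketch; `chebyshev_bound` is a hypothetical helper name, and capping the bound at 1 is added here only because a probability can never exceed 1.

```python
def chebyshev_bound(variance, n, eps):
    """Raw Chebyshev upper bound on P(|sample mean - mu| >= eps)."""
    return variance / (n * eps ** 2)


p = 0.5
variance = p * (1 - p)  # 0.25 for a fair coin

for n in (100, 100_000, 1_000_000):
    raw = chebyshev_bound(variance, n, eps=0.01)
    useful = min(1.0, raw)  # a bound above 1 carries no information
    print(f"n={n:>9,}  raw bound={raw:.4g}  capped={useful:.4g}")
```

Because n sits in the denominator, multiplying n by 10 divides the bound by 10, which is the pattern visible between the last two rows.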
// Why this theorem is statistics' "license to operate"
It looks unassuming, but without the LLN, "inferring the whole from a sample" has no justification. The foundation of statistics crumbles.
- A poll asks 1,000 people and reports "48% approval" — but without the LLN, there's no guarantee this is close to the real number
- A drug trial shows a "high improvement rate", but that could be pure chance
- An insurance company estimates accident rates from historical data to set premiums — that falls apart too
The LLN guarantees that "as n grows, the sample mean approaches the population mean." That's what makes all of the above legitimate. A humble theorem, but everything breaks without it.
// Watch out for the gambler's fallacy
Five heads in a row doesn't make tails more likely on the sixth flip — it's still 0.5. The LLN doesn't say "the next one will balance things out." It says "over thousands of flips, the proportion converges to 0.5." Individual trials remain random.
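One way to check this claim is to simulate a long run of flips and look only at the flips that come right after a streak of heads. The sketch below assumes independent flips from Python's pseudo-random generator; `heads_rate_after_streak` is a hypothetical helper written for this illustration.

```python
import random


def heads_rate_after_streak(n_flips, streak=5, p=0.5, seed=None):
    """Fraction of heads among flips that directly follow `streak` or more heads."""
    rng = random.Random(seed)
    run = 0             # length of the current run of consecutive heads
    followups = 0       # flips made right after a long enough run
    followup_heads = 0  # how many of those flips were heads
    for _ in range(n_flips):
        is_head = rng.random() < p
        if run >= streak:
            followups += 1
            followup_heads += is_head
        run = run + 1 if is_head else 0
    return followup_heads / followups if followups else float("nan")


if __name__ == "__main__":
    print(heads_rate_after_streak(200_000, streak=5, seed=1))
```

If streaks made tails "due", the printed rate would sit clearly below 0.5. With independent flips it should stay near 0.5, up to sampling noise.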
// LLN vs. Central Limit Theorem — easy to mix these up
| | LLN (Law of Large Numbers) | Central Limit Theorem |
|---|---|---|
| About what? | The mean's value approaches the truth | The mean's distribution shape approaches normal |
| One-liner | "Where it's heading" | "How it spreads along the way" |
| Requirement | The mean must exist | Both mean and variance must exist |
In short: the LLN tells you the destination; the Central Limit Theorem tells you the shape of the road.
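Both rows of the comparison can be seen in one simulation. The sketch below is an illustration under assumed settings (fair coin, 1,000 repetitions, the usual 1.96 standard-error band); `sample_means` and `summarize` are hypothetical names.

```python
import random
import statistics


def sample_means(n, reps, p=0.5, seed=0):
    """Draw `reps` independent sample means, each from n coin flips."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]


def summarize(n, reps=1000, p=0.5, seed=0):
    means = sample_means(n, reps, p, seed)
    se = (p * (1 - p) / n) ** 0.5      # theoretical standard error of the mean
    spread = statistics.pstdev(means)  # LLN: this shrinks as n grows
    # CLT: if the means are roughly normal, about 95% fall within 1.96 SE of p.
    within = sum(abs(m - p) <= 1.96 * se for m in means) / reps
    return spread, within


if __name__ == "__main__":
    for n in (25, 100, 400):
        spread, within = summarize(n)
        print(f"n={n:>4}  spread of means={spread:.4f}  within 1.96 SE={within:.3f}")
```

The shrinking spread is the LLN's "destination"; the roughly constant share inside the 1.96 SE band is the CLT's "shape".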
// Common misconceptions
The LLN says the proportion approaches 0.5, not that the head count approaches n/2. The absolute gap |heads − n/2| can actually grow. 100 flips: 48 heads (48%, gap 2). 10,000 flips: 4,950 heads (49.5%, gap 50). The proportion got closer, but the gap widened.
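The proportion-versus-count distinction shows up on average, not just in a hand-picked example. The sketch below averages both gaps over many simulated runs of a fair coin; `average_gaps`, the run count, and the seed are assumptions made for illustration.

```python
import random


def average_gaps(n_flips, runs=200, seed=0):
    """Average |heads - n/2| and |heads/n - 0.5| over many fair-coin runs."""
    rng = random.Random(seed)
    count_gap = 0.0
    prop_gap = 0.0
    for _ in range(runs):
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        count_gap += abs(heads - n_flips / 2)
        prop_gap += abs(heads / n_flips - 0.5)
    return count_gap / runs, prop_gap / runs


if __name__ == "__main__":
    for n in (100, 10_000):
        count_gap, prop_gap = average_gaps(n)
        print(f"n={n:>6}  avg count gap={count_gap:.1f}  avg proportion gap={prop_gap:.4f}")
```

The typical count gap grows roughly like the square root of n, while the proportion gap shrinks roughly like one over the square root of n, so both statements hold at once.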
The LLN only says "the mean approaches the truth." It can't answer "how precise is it?" or "what's the shape of the error?" — that's the Central Limit Theorem's job. You need both.
It requires the mean to exist. For the Cauchy distribution, the mean is undefined, and the sample mean doesn't converge no matter how much data you collect.
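The Cauchy case can be contrasted with a well-behaved distribution by measuring how widely sample means scatter at different n. This is a sketch under stated assumptions: the standard Cauchy is drawn by inverse-CDF sampling, the interquartile range is used as the spread measure because the Cauchy has no variance, and all function names are hypothetical.

```python
import math
import random
import statistics


def draw_normal(rng):
    return rng.gauss(0.0, 1.0)


def draw_cauchy(rng):
    # Inverse-CDF sampling for the standard Cauchy distribution.
    return math.tan(math.pi * (rng.random() - 0.5))


def sample_mean_iqr(draw, n, reps, rng):
    """Interquartile range of `reps` independent sample means of size n."""
    means = [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 1000):
        normal_iqr = sample_mean_iqr(draw_normal, n, 200, rng)
        cauchy_iqr = sample_mean_iqr(draw_cauchy, n, 200, rng)
        print(f"n={n:>5}  normal IQR={normal_iqr:.3f}  Cauchy IQR={cauchy_iqr:.3f}")
```

For the normal draws the scatter of the sample mean should tighten as n grows. For the Cauchy draws it should stay about the same, because the mean of n standard Cauchy samples is itself standard Cauchy.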
// Shapes you'll meet again
Around the Law of Large Numbers, the same upper-bound formula and the same comparison with its "twin" theorem keep reappearing.
- Chebyshev's upper bound shape: with σ²=4 and n=100, P(|X̄−μ|≥1) ≤ 4/(100×1) = 0.04. "Variance ÷ (n × ε²)" is the structure, and n in the denominator drains the bound as n grows
- The question the LLN answers: "Does the sample mean approach the population mean?" The LLN is the theorem that says yes. Its shape is "guaranteeing the destination"
- How LLN and the Central Limit Theorem split the work: "the value of the mean approaches the truth" belongs to LLN; "the shape of the mean's distribution becomes normal" belongs to the Central Limit Theorem. The two roles appear together as a paired shape