The Law of Large Numbers

Ten heads in a row? Not weird. Ten thousand flips? Almost exactly half. The reason data can ever count as evidence sits right here.

I.02 / LAW OF LARGE NUMBERS

Law of Large Numbers — Converging to Truth

The Central Limit Theorem showed that averages become approximately normal. But does the sample mean actually converge to the true mean as n grows? That guarantee is the Law of Large Numbers. The Central Limit Theorem describes the shape; the LLN says the center won't run away.

10 heads in a row at the start of a coin-flip? Not that weird. But flip it 10,000 times and the head-ratio locks onto almost exactly 0.5.
That's the Law of Large Numbers: the more samples you draw, the closer the running average of your observations gets pulled toward the true value. This is why statistics counts as evidence, not a vague hunch.

Experiment Guide — try these in order
  1. Set p=0.5, hit ▶ → the red line wobbles wildly at first, then gets pulled toward 0.5.
  2. RESET and run again → the early path is different every time, but it always converges.
  3. Change p to 0.8 and simulate → the red line now converges to 0.8.
  4. Set p to 0.05 (rare event) → it hugs zero early on, but still converges to p. The law holds.
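
For readers without the interactive widget, the same experiment can be approximated offline. This is a minimal sketch, not the page's own simulation code: the function name `running_means` and the fixed seed are assumptions made for illustration. It draws Bernoulli(p) trials and records the running proportion of successes after each one.

```python
import random

def running_means(p, n_trials, seed=None):
    """Running proportion of successes after each of n_trials Bernoulli(p) draws."""
    rng = random.Random(seed)
    successes = 0
    means = []
    for i in range(1, n_trials + 1):
        if rng.random() < p:      # one trial: success with probability p
            successes += 1
        means.append(successes / i)
    return means

if __name__ == "__main__":
    # Same settings as the experiment guide: fair coin, biased coin, rare event
    for p in (0.5, 0.8, 0.05):
        path = running_means(p, 100_000, seed=1)
        print(f"p={p}: after 10 trials {path[9]:.3f}, after 100,000 trials {path[-1]:.4f}")
```

Changing the seed changes the early part of the path, while the final value should stay close to p.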

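// Formula used here

P(|X̄ − μ| ≥ ε) ≤ σ²/(nε²)

Here X̄ is the mean of n observations, μ the true mean, and σ² the variance of a single observation.
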
Left side in plain English
• "The probability that the average height of n people deviates from the true population mean by ε cm or more"

What the right side tells us
• σ²/(nε²): an upper bound on that probability. The actual probability may be smaller, but it's at most this value
• n is in the denominator → as n grows, the upper bound shrinks toward zero
• Translation: "collect more samples and the mean can be made as close to the truth as you like" — that's the Law of Large Numbers
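
Where the right-hand side comes from can be sketched in two steps. The sketch assumes the observations X_1, ..., X_n are independent with a common mean μ and a finite variance σ²; the text above does not spell out these conditions, so treat them as assumptions. The first line computes the variance of the sample mean, and the second applies Chebyshev's inequality to it.

```latex
\begin{align*}
\operatorname{Var}(\bar{X})
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{\sigma^{2}}{n}, \\
P\!\left(|\bar{X}-\mu| \ge \varepsilon\right)
  &\le \frac{\operatorname{Var}(\bar{X})}{\varepsilon^{2}}
   = \frac{\sigma^{2}}{n\varepsilon^{2}}.
\end{align*}
```

Independence is what lets the variances add in the first line; the factor 1/n² against n terms is what leaves n in the denominator.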

Worked example (coin flips)
• p = 0.5 (probability of heads), σ² = p(1 − p) = 0.25, ε = 0.01 (head ratio within 1 percentage point of p)
• n = 100: P ≤ 0.25/(100 × 0.0001) = 25 → the bound exceeds 1, so it says nothing useful
• n = 100,000: P ≤ 0.025 → at most 2.5%
• n = 1,000,000: P ≤ 0.0025 → at most 0.25%. The mean is virtually pinned to the truth
• Chebyshev's inequality gives a loose bound; in practice the actual probability is usually far smaller than the bound
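
The arithmetic in the worked example can be reproduced with a few lines, and a small simulation gives a feel for how loose the bound is. This is a sketch under assumptions: the helper names `chebyshev_bound` and `empirical_miss_rate` are made up for illustration, and the simulation uses a smaller n and a wider ε than the example (n = 1,000, ε = 0.05) purely so it runs quickly.

```python
import random

def chebyshev_bound(variance, n, eps):
    """Upper bound sigma^2 / (n * eps^2) on P(|sample mean - mu| >= eps). Can exceed 1."""
    return variance / (n * eps ** 2)

def empirical_miss_rate(n, eps, repetitions, seed=0):
    """Fraction of repeated fair-coin runs whose head ratio misses 0.5 by eps or more."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(repetitions):
        # n fair flips at once: count the 1-bits of an n-bit random integer
        heads = bin(rng.getrandbits(n)).count("1")
        if abs(heads / n - 0.5) >= eps:
            misses += 1
    return misses / repetitions

if __name__ == "__main__":
    # The three cases from the worked example (sigma^2 = 0.25, eps = 0.01)
    for n in (100, 100_000, 1_000_000):
        print(f"n={n}: bound = {chebyshev_bound(0.25, n, 0.01):.4g}")
    # Quick comparison of bound and simulation at n = 1,000, eps = 0.05
    print("bound:    ", chebyshev_bound(0.25, 1_000, 0.05))
    print("simulated:", empirical_miss_rate(1_000, 0.05, repetitions=2_000))
```

The simulated miss rate should come out well under the bound, which is the sense in which Chebyshev is loose.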

// Why this theorem is statistics' "license to operate"

It looks unassuming, but without the LLN, "inferring the whole from a sample" has no justification. The foundation of statistics crumbles.

  • A poll asks 1,000 people and reports "48% approval" — but without the LLN, there's no guarantee this is close to the real number
  • A drug trial shows "high improvement rate" — could be pure chance
  • An insurance company estimates accident rates from historical data to set premiums — that falls apart too

The LLN guarantees that "as n grows, the sample mean approaches the population mean." That's what makes all of the above legitimate. A humble theorem, but everything breaks without it.

// Watch out for the gambler's fallacy

Five heads in a row doesn't make tails more likely on the sixth flip — it's still 0.5. The LLN doesn't say "the next one will balance things out." It says "over thousands of flips, the proportion converges to 0.5." Individual trials remain random.
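
The claim is easy to check empirically. The sketch below (the function name `heads_rate_after_streak` is hypothetical) flips a simulated fair coin many times, looks only at flips that directly follow at least five heads in a row, and reports how often those flips come up heads.

```python
import random

def heads_rate_after_streak(n_flips, streak=5, seed=0):
    """Among flips that directly follow `streak` or more heads in a row,
    return (fraction that are heads, number of such flips)."""
    rng = random.Random(seed)
    run = 0             # current number of consecutive heads
    followers = 0       # flips that came right after a long enough streak
    follower_heads = 0  # how many of those flips were heads
    for _ in range(n_flips):
        is_head = rng.random() < 0.5
        if run >= streak:
            followers += 1
            follower_heads += is_head
        run = run + 1 if is_head else 0
    # Assumes n_flips is large enough that at least one streak occurred
    return follower_heads / followers, followers

if __name__ == "__main__":
    rate, count = heads_rate_after_streak(200_000)
    print(f"{count} flips followed a streak of 5 heads; {rate:.3f} of them were heads")
```

If streaks made tails "due", the reported fraction would sit clearly below 0.5. In a fair simulation it should hover around 0.5.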

// LLN vs. Central Limit Theorem — easy to mix these up

|  | LLN (Law of Large Numbers) | Central Limit Theorem |
|---|---|---|
| About what? | The mean's value approaches the truth | The mean's distribution shape approaches normal |
| One-liner | "Where it's heading" | "How it spreads along the way" |
| Requirement | The mean must exist | Both mean and variance must exist |

In short: the LLN tells you the destination; the Central Limit Theorem tells you the shape of the road.

// Common misconceptions

❌ "100 flips should give exactly 50 heads"

The LLN says the proportion approaches 0.5, not that the count lands on exactly half. The absolute gap |heads − n/2| can actually grow. 100 flips: 48 heads (48%, gap 2). 10,000 flips: 4,950 heads (49.5%, gap 50). The proportion got closer, but the gap widened.
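
The same pattern shows up on average, not just in that one illustrative pair of numbers. A rough sketch, with a hypothetical helper `average_gaps`, repeats the experiment many times and averages both kinds of gap for a given n.

```python
import random

def average_gaps(n, repetitions=500, seed=0):
    """Average |heads - n/2| and average |heads/n - 0.5| over repeated runs of n fair flips."""
    rng = random.Random(seed)
    count_gap = 0.0
    for _ in range(repetitions):
        # n fair flips at once: count the 1-bits of an n-bit random integer
        heads = bin(rng.getrandbits(n)).count("1")
        count_gap += abs(heads - n / 2)
    count_gap /= repetitions
    # The proportion gap is the count gap divided by n
    return count_gap, count_gap / n

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        count_gap, proportion_gap = average_gaps(n)
        print(f"n={n}: average count gap {count_gap:.1f}, average proportion gap {proportion_gap:.5f}")
```

The count gap should grow roughly like √n while the proportion gap shrinks roughly like 1/√n, which is how both statements in the paragraph above can hold at once.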

❌ "If you have the LLN, you don't need the Central Limit Theorem"

The LLN only says "the mean approaches the truth." It can't answer "how precise is it?" or "what's the shape of the error?" — that's the Central Limit Theorem's job. You need both.
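
To make "how precise is it?" concrete, the sketch below repeats a 400-flip fair-coin experiment many times and looks at how the resulting sample means scatter. It assumes the standard result that the sample mean has standard deviation σ/√n and uses the normal approximation's ±1.96 standard-deviation range; the helper name `sample_means` is made up for illustration.

```python
import random
import statistics

def sample_means(n, repetitions, seed=0):
    """Head ratios from `repetitions` independent runs of n fair coin flips."""
    rng = random.Random(seed)
    # n fair flips at once: count the 1-bits of an n-bit random integer
    return [bin(rng.getrandbits(n)).count("1") / n for _ in range(repetitions)]

if __name__ == "__main__":
    n = 400
    means = sample_means(n, 2_000)
    standard_error = 0.5 / n ** 0.5  # sigma / sqrt(n), with sigma = 0.5 for a fair coin
    inside = sum(abs(m - 0.5) <= 1.96 * standard_error for m in means) / len(means)
    print(f"theoretical spread of the mean: {standard_error:.4f}")
    print(f"observed spread of the mean:    {statistics.pstdev(means):.4f}")
    print(f"share of runs within 1.96 standard errors: {inside:.3f}")
```

The LLN alone only says the means cluster around 0.5 as n grows. The size of the scatter and the "about 95% within 1.96 standard errors" pattern are the extra information the Central Limit Theorem supplies.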

❌ "The LLN works for any distribution"

It requires the mean to exist. For the Cauchy distribution, the mean is undefined, and the sample mean doesn't converge no matter how much data you collect.
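
One way to see the failure is to compare how much the sample mean varies from one sample to the next. The sketch below uses hypothetical helper names and generates standard Cauchy draws with the inverse-CDF trick tan(π(U − 0.5)). It measures the interquartile range of the sample mean across repeated samples, for both a Cauchy and a normal distribution.

```python
import math
import random
import statistics

def draw_cauchy(rng):
    # Standard Cauchy via the inverse CDF: tan(pi * (U - 0.5)), U uniform on [0, 1)
    return math.tan(math.pi * (rng.random() - 0.5))

def draw_normal(rng):
    return rng.gauss(0.0, 1.0)

def iqr_of_sample_means(draw, n, repetitions=400, seed=0):
    """Interquartile range of the sample mean across repeated samples of size n."""
    rng = random.Random(seed)
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(repetitions)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

if __name__ == "__main__":
    for n in (10, 100, 1_000):
        print(f"n={n}: normal IQR {iqr_of_sample_means(draw_normal, n):.3f}, "
              f"Cauchy IQR {iqr_of_sample_means(draw_cauchy, n):.3f}")
```

For the normal draws the spread of the sample mean should shrink as n grows. For the Cauchy draws it should stay at roughly the same width, because the mean of n standard Cauchy values is itself standard Cauchy.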

// Shapes you'll meet again

Around the Law of Large Numbers, the same upper-bound formula and the same comparison with its "twin" theorem keep reappearing.

  • Chebyshev's upper bound shape: with σ²=4 and n=100, P(|X̄−μ|≥1) ≤ 4/(100×1) = 0.04. "Variance ÷ (n × ε²)" is the structure, and n in the denominator drains the bound as n grows
  • The question LLN answers: "Does the sample mean approach the population mean?" The LLN says yes, provided the mean exists. Its role is guaranteeing the destination
  • How LLN and the Central Limit Theorem split the work: "the value of the mean approaches the truth" belongs to LLN; "the shape of the mean's distribution becomes normal" belongs to the Central Limit Theorem. Expect to see the two presented as a pair