See the Central Limit Theorem in Action
However twisted the population, its sample mean still bends back into a bell. This is the hidden engine that keeps reappearing behind the rest of inferential statistics.
Central Limit Theorem — Why Normal Is King
A fact worth pausing on — no matter how skewed the base distribution is,
if you take n samples and average, then repeat, the distribution of those averages
converges on its own to a bell (normal).
The lab below shows left = the raw skewed source side-by-side with right = the sample-mean distribution,
so you can watch the bell emerge. Crank n up and the bell tightens (SE = σ/√n).
- Step 1: Choose "Exponential" and set n=1, hit ▶ → still heavily skewed. Not a bell at all.
- Step 2: Set n to 5 and run → starting to look bell-ish, but still skewed.
- Step 3: Set n to 30 and run → nearly normal. "n≥30" is a practical rule of thumb, not a theorem — heavily skewed distributions may need more.
- Step 4: Switch to "Bimodal" and repeat → even a two-peaked distribution morphs into a bell. Worth watching twice.
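The steps above can also be reproduced outside the lab. The following is a minimal sketch using only the Python standard library, not the lab's own code: it assumes an exponential source with rate 1, and the helper names (sample_means, skewness) and the repetition count are illustrative choices. Skewness near 0 means the histogram of sample means is close to symmetric.

```python
import random
import statistics

def sample_means(draw, n, reps=20_000, seed=0):
    """Return `reps` sample means, each the average of n values from draw(rng)."""
    rng = random.Random(seed)
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]

def skewness(xs):
    """Mean cubed z-score: 0 for a symmetric bell, positive for a right skew."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

def exponential(rng):
    # Assumed source: rate 1, so mean 1 and sd 1, heavily right-skewed.
    return rng.expovariate(1.0)

# Same n values as Steps 1 to 3 above.
skew_by_n = {n: skewness(sample_means(exponential, n)) for n in (1, 5, 30)}
for n, sk in skew_by_n.items():
    print(f"n={n:>2}  skewness of the sample means: {sk:.2f}")
```

For an exponential source, the skewness of the mean of n draws is 2/√n in theory, so the printed values should shrink steadily but stay above zero at n = 30.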
// Formula used here
(X̄ₙ − μ) / (σ/√n) →d N(0,1)  as n → ∞
What this says in plain English
• "Take the average of n test scores, subtract the true mean, and divide by the standard error. Whatever the original scores' distribution looks like (as long as its variance is finite), the distribution of this ratio approaches the standard normal as n grows."
Each part
• X̄ₙ − μ: the gap between the sample mean and the true mean — raw units make this hard to interpret
• ÷ σ/√n: standardize the gap by the standard error (SE) → converts to "how many SEs off"
• →d N(0,1): as n → ∞, this quantity's distribution converges to the standard normal
Why this unlocks confidence intervals and hypothesis tests
• Thanks to the Central Limit Theorem, we can say "X̄ is approximately N(μ, σ²/n)"
• The confidence interval X̄ ± 1.96·σ/√n works because, under the Central Limit Theorem, X̄ falls within ±1.96 standard errors (1.96·σ/√n, not 1.96σ) of μ about 95% of the time
• The test statistic z = (X̄−μ₀)/(σ/√n) can be judged against the N(0,1) table — again, thanks to the Central Limit Theorem
• In short, the Central Limit Theorem is "the reason normal-distribution tools work on everything"
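The 95% claim can be checked by simulation. This is a rough sketch under stated assumptions: the data come from an exponential distribution with rate 1 (so μ = 1 and σ = 1 are known), n = 30, and the seed and repetition count are arbitrary. It counts how often the interval X̄ ± 1.96·σ/√n contains the true mean.

```python
import math
import random
import statistics

MU, SIGMA = 1.0, 1.0       # true mean and sd of an exponential with rate 1 (assumed source)
N, REPS = 30, 10_000
rng = random.Random(1)

half_width = 1.96 * SIGMA / math.sqrt(N)   # 1.96 standard errors
hits = 0
for _ in range(REPS):
    xbar = statistics.fmean([rng.expovariate(1.0) for _ in range(N)])
    if abs(xbar - MU) <= half_width:       # the interval xbar ± half_width contains MU
        hits += 1

coverage = hits / REPS
print(f"coverage of the nominal 95% interval: {coverage:.3f}")
```

Even though the source is strongly skewed, the observed coverage should land close to 0.95. With a much smaller n the match would be expected to be looser.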
// Why does averaging produce a bell curve?
Intuitively: when you average n values, individual outliers tend to cancel out. High values offset low values, making extreme averages rare. As n grows, the pattern of "how rare" matches the bell curve more and more closely.
Try it in the simulator above:
• Pick the exponential distribution (heavily right-skewed), set n=5 → the mean distribution is still skewed
• Set n=30 → already quite bell-shaped. No matter how skewed the original, averaging brings it toward normal
// Is "n ≥ 30" really enough?
Textbooks often say "n ≥ 30 is sufficient for the Central Limit Theorem." This isn't a mathematical threshold — it's a rule of thumb.
- Uniform distribution (symmetric): n = 12 already looks very normal
- Exponential (strong skew): often needs n around 40–50 before the skew fades
- Bimodal: n = 30 may still show bumps
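One way to put numbers on "looks normal" is to measure the largest gap between the empirical CDF of the standardized sample means and the standard normal CDF. The sketch below does that for three sources. The bimodal source (an equal mix of N(−2, 0.5) and N(2, 0.5)) is a made-up stand-in, not necessarily the lab's definition, and the helper name distance_from_normal is illustrative.

```python
import random
from statistics import NormalDist, fmean, pstdev

SOURCES = {
    "uniform": lambda rng: rng.random(),
    "exponential": lambda rng: rng.expovariate(1.0),
    # Hypothetical bimodal source: equal mix of N(-2, 0.5) and N(2, 0.5).
    "bimodal": lambda rng: rng.gauss(rng.choice((-2.0, 2.0)), 0.5),
}

def distance_from_normal(draw, n, reps=5_000, seed=3):
    """Largest gap between the empirical CDF of standardized sample means and Φ."""
    rng = random.Random(seed)
    means = [fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]
    m, s = fmean(means), pstdev(means)
    zs = sorted((x - m) / s for x in means)
    phi = NormalDist().cdf
    # Check the gap just before and just after each jump of the empirical CDF.
    return max(
        max(abs((i + 1) / reps - phi(z)), abs(i / reps - phi(z)))
        for i, z in enumerate(zs)
    )

distance = {
    (name, n): distance_from_normal(draw, n)
    for name, draw in SOURCES.items()
    for n in (5, 12, 30, 50)
}
for (name, n), d in distance.items():
    print(f"{name:<12} n={n:>2}  max CDF gap = {d:.3f}")
```

Smaller gaps mean a closer match to the bell. With 5,000 repetitions, gaps of roughly 0.01 are about what sampling noise alone produces, so values at that level should be read as "indistinguishable from normal at this resolution" rather than ranked against each other.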
// σ vs. SE — easy to confuse
- σ (standard deviation): spread of individual data points. A property of the data itself
- SE = σ/√n (standard error): spread of the sample mean. Depends on sample size
- Quadruple n → SE halves (√4 = 2). Doubling precision requires 4× the data — that's the true "cost of research"
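The "quadruple n, halve SE" relationship is easy to confirm numerically. A small sketch, assuming an exponential source with σ = 1 and an arbitrary seed: it compares the standard deviation of simulated sample means with σ/√n at n = 25, 100, and 400.

```python
import math
import random
import statistics

SIGMA = 1.0   # sd of an exponential with rate 1 (assumed source)

def empirical_se(n, reps=4_000, seed=4):
    """Standard deviation of `reps` simulated sample means of size n."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(means)

se = {n: empirical_se(n) for n in (25, 100, 400)}
for n, value in se.items():
    print(f"n={n:>3}  simulated SE = {value:.4f}   sigma/sqrt(n) = {SIGMA / math.sqrt(n):.4f}")
```

Each step multiplies n by 4, and each simulated SE should come out at about half the previous one (0.2, 0.1, 0.05 in theory), while σ itself never changes.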
// Common misconceptions
The original data stays exactly the same shape. What becomes normal is the distribution of the sample mean. Data drawn from an exponential distribution is still right-skewed — only the average of many such draws approaches a bell.
Whether n = 30 is enough depends on the parent distribution, as noted above. The simulator lets you test different distributions and sample sizes to see for yourself.
The Central Limit Theorem requires finite variance (σ² < ∞). For the Cauchy distribution, whose mean and variance are both undefined, the sample mean doesn't converge to normal no matter how large n gets: the average of n Cauchy draws has the same distribution as a single draw.
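The Cauchy exception can be seen directly. The sketch below is an illustration with arbitrary seed and repetition count: it draws standard Cauchy values by the inverse-CDF method and tracks the interquartile range (IQR) of the sample means, since a standard deviation is not meaningful here. An exponential source is run alongside for contrast.

```python
import math
import random
import statistics

def cauchy(rng):
    """Standard Cauchy draw via the inverse-CDF method."""
    return math.tan(math.pi * (rng.random() - 0.5))

def exponential(rng):
    return rng.expovariate(1.0)

def iqr_of_means(draw, n, reps=5_000, seed=5):
    """Interquartile range of `reps` simulated sample means of size n."""
    rng = random.Random(seed)
    means = [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

cauchy_iqr = {n: iqr_of_means(cauchy, n) for n in (1, 10, 100)}
expo_iqr = {n: iqr_of_means(exponential, n) for n in (1, 10, 100)}
for n in (1, 10, 100):
    print(f"n={n:>3}  IQR of means: Cauchy = {cauchy_iqr[n]:.2f}   exponential = {expo_iqr[n]:.2f}")
```

The exponential column should shrink roughly like 1/√n, while the Cauchy column should stay near 2 (the IQR of a single standard Cauchy draw) at every n.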
// Shapes you'll meet again
The Central Limit Theorem isn't a topic that ends here — it's the silent machinery underneath several other topics. The same √n structure keeps reappearing, just dressed in different clothes.
- Reappears in the width of a confidence interval: the CI for a mean is X̄ ± 1.96·σ/√n. The width scales with σ/√n precisely because, by the Central Limit Theorem, X̄ is approximately distributed as N(μ, σ²/n)
- Reappears in proportion tests: the SE of the one-sample z-test, √(p₀(1−p₀)/n), is the same "spread divided by √n" pattern in disguise. The proportion-test condition np₀ ≥ 5 and n(1−p₀) ≥ 5 is the "large enough n" rule of thumb translated into the world of proportions
- Reappears in the SE of a regression slope: SE(β̂₁) divides the residual standard deviation by "spread of x times √n." The reason adding observations tightens the slope estimate is, again, the √n in the denominator
- "Quadruple n to double the precision": with √n sitting in every denominator, precision can only grow as the square root of sample size. That's the shared "cost of measurement" across every inference shape
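The proportion case in the list above shows why the pattern is the same: a sample proportion is just the mean of 0/1 values, whose standard deviation is √(p₀(1−p₀)). A brief sketch, assuming a made-up p₀ = 0.3 and arbitrary sample sizes, compares the simulated spread of sample proportions with the formula.

```python
import math
import random
import statistics

P0 = 0.3   # hypothetical null proportion, chosen only for illustration

def simulated_se(n, reps=4_000, seed=6):
    """Standard deviation of `reps` simulated sample proportions of size n."""
    rng = random.Random(seed)
    props = [sum(rng.random() < P0 for _ in range(n)) / n for _ in range(reps)]
    return statistics.stdev(props)

# For each n: (simulated SE, formula SE = sqrt(p0 * (1 - p0) / n)).
results = {n: (simulated_se(n), math.sqrt(P0 * (1 - P0) / n)) for n in (50, 200)}
for n, (sim, formula) in results.items():
    print(f"n={n:>3}  simulated SE = {sim:.4f}   sqrt(p0(1-p0)/n) = {formula:.4f}")
```

Going from n = 50 to n = 200 quadruples the sample and should halve both columns, which is the same cost-of-precision rule as for means.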
The interval width that rests on the CLT is covered in Confidence Intervals.