Confidence Intervals — What 95% Really Means
"There's a 95% chance the true value is inside" — actually wrong. Stack a hundred intervals side by side and the real definition becomes visible.
The 95% confidence interval is famously misunderstood.
It does NOT mean "the true value is inside with 95% probability". The correct reading:
"repeat this sampling many times, and ~95% of the resulting intervals will capture the true value".
The lab below brute-forces that intuition. Thin pink = the unlucky intervals that missed.
Once the pink share settles around ~5%, you've got it.
- Step 1: At 95% confidence, n=30, hit ▶ → pink (missed) intervals should be ~5% of the total.
- Step 2: Drop confidence to 80% and regenerate → more pink. Narrower net = more misses.
- Step 3: Back to 95%, set n to 200 → intervals get much tighter. The power of large samples.
- Step 4: Set n to 5 → intervals are huge. With only a few observations per sample, you need a wide net to catch the truth.
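The four steps can also be approximated offline. The sketch below is not the lab's own code: it assumes a normal population with known σ, and the names `miss_rate`, `trials`, and `seed` are made up for illustration. It draws many samples, builds a z-interval around each sample mean, and reports the share that missed the true mean.

```python
import random
from statistics import NormalDist, fmean

def miss_rate(confidence, n, trials=4_000, mu=0.0, sigma=1.0, seed=0):
    """Fraction of known-sigma z-intervals that fail to contain mu.

    Hypothetical helper: the population is assumed normal(mu, sigma).
    """
    rng = random.Random(seed)
    # Two-sided multiplier, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5
    misses = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        # The interval xbar ± half_width misses mu when xbar is too far away
        if abs(fmean(sample) - mu) > half_width:
            misses += 1
    return misses / trials

# Same settings as Steps 1 to 4
for confidence, n in [(0.95, 30), (0.80, 30), (0.95, 200), (0.95, 5)]:
    rate = miss_rate(confidence, n)
    print(f"confidence={confidence:.0%}  n={n:<3}  missed={rate:.1%}")
```

The miss share should land near 5% for every 95% row regardless of n, and near 20% for the 80% row. Changing n changes the interval width, not the long-run miss rate.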
// Formula used here
x̄ ± zα/2 × σ/√n
Each part
• x̄ (sample mean): center of the interval — our best single guess
• zα/2: width multiplier determined by confidence level (95% → 1.96, 99% → 2.576)
• σ/√n (standard error): the estimation wobble. Larger σ widens it; larger n shrinks it
How it fits together
• "center ± multiplier × wobble" casts a net around the truth
• Quadruple n → √n doubles → interval width halves. Precision costs data
• Raising confidence increases the multiplier: wider net, but more likely to catch the true value
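A minimal sketch of "center ± multiplier × wobble", assuming σ is known. The function names `z_interval` and `width` and the input values are illustrative only. It checks the two scaling claims: quadrupling n halves the width, and raising the confidence level widens the interval.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Known-sigma interval: center ± multiplier × standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # multiplier
    se = sigma / sqrt(n)                                # wobble
    return xbar - z * se, xbar + z * se

def width(interval):
    lo, hi = interval
    return hi - lo

w_n25 = width(z_interval(0.0, 1.0, 25))
w_n100 = width(z_interval(0.0, 1.0, 100))            # n quadrupled
w_n25_99 = width(z_interval(0.0, 1.0, 25, 0.99))     # confidence raised

print(f"width ratio, n=25 vs n=100: {w_n25 / w_n100:.3f}")
print(f"99% interval wider than 95%: {w_n25_99 > w_n25}")
```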
What if σ is unknown?
The formula above assumes the population standard deviation σ is known. In practice, σ is almost never known — you replace it with the sample standard deviation s and swap z for the t-distribution critical value. As n grows, the t-distribution approaches the standard normal, so the z-interval becomes a good approximation. Details are in the t-distribution section.
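To see why the swap matters at small n, here is a rough simulation under assumed conditions: a normal population, n = 5, and a hypothetical helper `coverage_with_s`. It replaces σ with s and compares two multipliers: the z value 1.96, and 2.776, the two-sided 95% t critical value for 4 degrees of freedom.

```python
import random
from statistics import fmean, stdev

def coverage_with_s(multiplier, n=5, trials=10_000, mu=0.0, sigma=1.0, seed=1):
    """Share of intervals xbar ± multiplier × s/√n that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        # s is the sample standard deviation (n - 1 in the denominator)
        half_width = multiplier * stdev(sample) / n ** 0.5
        if abs(fmean(sample) - mu) <= half_width:
            hits += 1
    return hits / trials

z_cov = coverage_with_s(1.96)    # z multiplier paired with s
t_cov = coverage_with_s(2.776)   # t critical value, df = n - 1 = 4

print(f"z multiplier with s: {z_cov:.1%} coverage")
print(f"t multiplier with s: {t_cov:.1%} coverage")
```

With the z multiplier the intervals should capture the mean noticeably less often than 95%, while the t multiplier should bring coverage back to roughly the nominal level.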
// Common misconceptions
The true value μ is a fixed constant — it's either inside the interval or it isn't. "95%" means "if we repeat this procedure 100 times, about 95 intervals will contain the truth." It's a statement about the method's long-run performance, not about one particular interval. In the simulation above, generate 300 intervals and watch the pink ones (misses) hover near 5%.
Narrow means "precise," not "right." Lower the confidence level and the interval shrinks — but the chance of missing the true value grows. Narrowness reflects precision, not accuracy.
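Confidence intervals are not limited to means: you can build them for variances, proportions, regression coefficients, and differences. For many estimators the structure is the familiar "estimate ± multiplier × standard error." Not every interval is symmetric, though: the standard interval for a variance is built from chi-square quantiles and sits lopsided around the estimate.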
// Shapes you'll meet again
Confidence intervals keep rearranging the same set of parts.
- The "center ± multiplier × standard error" shape: with x̄=50, σ=10, n=25, the 95% CI is 50 ± 1.96×(10/5) = [46.08, 53.92]. The numbers vary, the three-layer structure doesn't
- How width and n relate: width shrinks as 1/√n. "Halve the width → quadruple n" is this proportional shape in disguise
- The CI–test twin relationship: when 95% CI = [2.1, 4.3], μ₀ = 2 sits outside, and the two-sided test at α=0.05 lands on "reject." The two views are flip sides of the same decision
- Confidence level mirrors the multiplier: switching 95% → 99% shifts the multiplier 1.96 → 2.576, and the wider interval follows directly
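The worked numbers and the CI–test pairing in the list above can be checked with a short script. This is a sketch assuming known σ and a two-sided z-test; `z_interval` and `rejects` are hypothetical helper names, and the candidate μ₀ values are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Known-sigma interval: estimate ± multiplier × standard error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sigma / sqrt(n)
    return xbar - z * se, xbar + z * se

def rejects(mu0, xbar, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: mu = mu0, via the p-value."""
    z_stat = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return p_value < alpha

# The example from the list: x̄ = 50, σ = 10, n = 25
lo, hi = z_interval(50, 10, 25)
print(f"95% CI: [{lo:.2f}, {hi:.2f}]")  # 95% CI: [46.08, 53.92]

lo99, hi99 = z_interval(50, 10, 25, confidence=0.99)
print(f"99% CI is wider: {hi99 - lo99 > hi - lo}")

# The test rejects mu0 exactly when mu0 falls outside the 95% interval
for mu0 in (45, 46, 47, 50, 53, 54):
    outside = mu0 < lo or mu0 > hi
    print(mu0, "outside CI:", outside, "| test rejects:", rejects(mu0, 50, 10, 25))
```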
See where z = 1.96 and 2.576 come from in Interactive Distribution Tables (two-sided mode)