Confidence Intervals — What 95% Really Means

"There's a 95% chance the true value is inside" — actually wrong. Stack a hundred intervals side by side and the real definition becomes visible.

I.03 / CONFIDENCE INTERVAL

The Law of Large Numbers (LLN) says "at infinity, you're right." But in practice we always have a finite sample. So instead of reporting a single point, drape a net around it: that's a confidence interval. A wider net catches the true value more often; a narrower net is more precise. Watch the trade-off play out.

The 95% confidence interval is famously misunderstood.
It does NOT mean "the true value is inside with 95% probability". The correct reading: "repeat this sampling many times, and ~95% of the resulting intervals will capture the true value".
The lab below brute-forces that intuition. Thin pink = the unlucky intervals that missed. Once the pink share settles around ~5%, you've got it.

Experiment Guide — try these in order
  1. At 95% confidence, n=30, hit ▶ → pink (missed) intervals should be ~5% of the total.
  2. Drop confidence to 80% and regenerate → more pink. Narrower net = more misses.
  3. Back to 95%, set n to 200 → intervals get much tighter. The power of large samples.
  4. Set n to 5 → intervals are huge. With few samples, you need a wide net to catch the truth.
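
For readers without the interactive lab, the first two steps can be approximated offline. The following is a minimal sketch, not the page's own simulation: it assumes normally distributed data with a known σ, and the function name `coverage`, the seed, and the trial count are illustrative choices.

```python
import random
from statistics import NormalDist, fmean

def coverage(confidence=0.95, n=30, trials=2000, mu=0.0, sigma=1.0, seed=0):
    """Fraction of z-intervals (known sigma) that contain the true mean mu."""
    rng = random.Random(seed)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample of size n and build one interval around its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if abs(xbar - mu) <= half_width:
            hits += 1  # this interval caught the true value
    return hits / trials

for level in (0.95, 0.80):
    print(f"{level:.0%} nominal -> {coverage(confidence=level):.1%} observed")
```

The observed share should sit close to the nominal level, so the share of misses (the pink intervals) should be near 5% at 95% confidence and near 20% at 80%.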

// Formula used here

x̄ ± zα/2 · σ/√n

Each part
• x̄ (sample mean): center of the interval — our best single guess
• zα/2: width multiplier determined by confidence level (95% → 1.96, 99% → 2.576)
• σ/√n (standard error): the estimation wobble. Larger σ widens it; larger n shrinks it
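
The three parts translate almost directly into code. This is a small sketch under the same known-σ assumption; `z_interval` is a hypothetical helper name, and the inputs x̄=50, σ=10, n=25 are borrowed from the worked example further down this page.

```python
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Return (low, high) for a z-interval with known population sigma."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # width multiplier
    se = sigma / n ** 0.5                            # standard error
    return xbar - z * se, xbar + z * se              # center ± multiplier × wobble

low, high = z_interval(xbar=50, sigma=10, n=25)
print(f"[{low:.2f}, {high:.2f}]")  # prints [46.08, 53.92]
```

`NormalDist().inv_cdf` comes from Python's standard library, so the multiplier is computed rather than looked up in a table.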

How it fits together
• "center ± multiplier × wobble" casts a net around the truth
• Quadruple n → √n doubles → interval width halves. Precision costs data
• Raising confidence increases the multiplier: wider net, but more likely to catch the true value
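
To see both effects numerically, the sketch below tabulates the full interval width for a few sample sizes and confidence levels. It assumes a known σ of 10 purely for illustration; `ci_width` is a hypothetical helper.

```python
from statistics import NormalDist

def ci_width(sigma, n, confidence=0.95):
    """Full width (high minus low) of a known-sigma z-interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / n ** 0.5

# Each step quadruples n, so each width should be half the previous one.
for n in (25, 100, 400):
    print(f"n={n:>3}  width={ci_width(10, n):.2f}")

# Holding n fixed, a higher confidence level means a larger multiplier.
for level in (0.80, 0.95, 0.99):
    print(f"confidence={level:.0%}  width={ci_width(10, 25, level):.2f}")
```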

What if σ is unknown?
The formula above assumes the population standard deviation σ is known. In practice, σ is almost never known, so you replace it with the sample standard deviation s and swap z for the critical value of the t-distribution with n − 1 degrees of freedom. As n grows, the t-distribution approaches the standard normal, so the z-interval becomes a good approximation. Details are in the t-distribution section.
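
A quick simulation suggests why the swap matters for small samples. This is a sketch, assuming normal data and n=5; the t critical value 2.776 (4 degrees of freedom, two-sided 95%) is hard-coded from a standard t-table because Python's standard library has no t-distribution, and `coverage_with_s` is a hypothetical name.

```python
import random
from statistics import fmean, stdev

Z_95 = 1.96        # standard normal critical value
T_95_DF4 = 2.776   # t critical value for 4 degrees of freedom (n = 5)

def coverage_with_s(multiplier, n=5, trials=20000, mu=0.0, sigma=1.0, seed=1):
    """Coverage of 'xbar ± multiplier × s/√n' when sigma is estimated by s."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        se = stdev(sample) / n ** 0.5  # sample standard deviation, not sigma
        if abs(fmean(sample) - mu) <= multiplier * se:
            hits += 1
    return hits / trials

print(f"z multiplier: {coverage_with_s(Z_95):.1%}")
print(f"t multiplier: {coverage_with_s(T_95_DF4):.1%}")
```

With the z multiplier the intervals are too narrow and coverage falls noticeably short of 95%; with the t multiplier it should land close to the nominal level.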

// Common misconceptions

❌ "95% CI = 95% probability the true value is inside" — the most common trap

The true value μ is a fixed constant — it's either inside the interval or it isn't. "95%" means "if we repeat this procedure 100 times, about 95 intervals will contain the truth." It's a statement about the method's long-run performance, not about one particular interval. In the simulation above, generate 300 intervals and watch the pink ones (misses) hover near 5%.

❌ "A narrow interval means a correct answer"

Narrow means "precise," not "right." Lower the confidence level and the interval shrinks — but the chance of missing the true value grows. Narrowness reflects precision, not accuracy.

❌ "Confidence intervals only work for means"

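You can build confidence intervals for variances, proportions, regression coefficients, differences, and many other estimators. For estimators that are approximately normal, the structure is the same: "estimate ± multiplier × standard error." Some intervals break that symmetric shape, such as the interval for a variance, which is built from the chi-square distribution.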
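
As one illustration beyond means, here is a sketch of a large-sample interval for a proportion. It uses the Wald approximation, which is one common choice and not necessarily the one this course uses later; the counts (40 successes out of 100) are made up for the example.

```python
from statistics import NormalDist

def wald_proportion_interval(successes, n, confidence=0.95):
    """Approximate CI for a proportion: p_hat ± z × sqrt(p_hat(1 - p_hat)/n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n                      # estimate
    se = (p_hat * (1 - p_hat) / n) ** 0.5      # standard error of p_hat
    return p_hat - z * se, p_hat + z * se

low, high = wald_proportion_interval(successes=40, n=100)
print(f"[{low:.3f}, {high:.3f}]")
```

The approximation is known to behave poorly when n is small or the proportion is near 0 or 1, so treat it as a demonstration of the shared structure rather than a recommendation.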

// Shapes you'll meet again

Confidence intervals keep rearranging the same set of parts.

  • The "center ± multiplier × standard error" shape: with x̄=50, σ=10, n=25, the 95% CI is 50 ± 1.96×(10/5) = [46.08, 53.92]. The numbers vary, the three-layer structure doesn't
  • How width and n relate: width shrinks as 1/√n. "Halve the width → quadruple n" is this proportional shape in disguise
  • The CI–test twin relationship: when 95% CI = [2.1, 4.3], μ₀ = 2 sits outside, and the two-sided test at α=0.05 lands on "reject." The two views are flip sides of the same decision
  • Confidence level mirrors the multiplier: switching 95% → 99% shifts the multiplier 1.96 → 2.576, and the wider interval follows directly
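
The CI–test twin relationship can be checked mechanically. The sketch below assumes the interval [2.1, 4.3] is a symmetric z-interval, recovers the standard error from its half-width, and compares the two decisions; both function names are hypothetical.

```python
from statistics import NormalDist

def reject_by_ci(low, high, mu0):
    """Reject H0: mu = mu0 when mu0 falls outside the interval."""
    return not (low <= mu0 <= high)

def reject_by_z_test(low, high, mu0, confidence=0.95):
    """Two-sided z-test at alpha = 1 - confidence, rebuilt from the interval."""
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    center = (low + high) / 2
    se = (high - low) / (2 * z_crit)   # half-width = z_crit × se
    z_stat = (center - mu0) / se
    return abs(z_stat) > z_crit

for mu0 in (2.0, 3.0):
    print(mu0, reject_by_ci(2.1, 4.3, mu0), reject_by_z_test(2.1, 4.3, mu0))
```

For μ₀ = 2 both views say "reject," and for μ₀ = 3 (inside the interval) both say "do not reject."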

See where z = 1.96 and 2.576 come from in Interactive Distribution Tables (two-sided mode)

Up next: I4 Hypothesis Testing (from width to yes/no)