Proportion Testing & Estimation — Visualize the z-test for Proportions
Estimation uses p̂; testing uses p₀. A single character swaps the standard-error formula — and the whole proportion test pivots on that one letter.
Proportion Test & Estimation — from sample proportion to the truth
A world where every data point is just "success" or "failure." The sample proportion p̂ = x/n is the starting point, and when n is large enough the normal approximation kicks in (thanks to the Central Limit Theorem).
We'll work through ① interval estimation → ② one-sample test → ③ two-sample test. Compare each step with what you learned for means — spotting the similarities makes the differences easy to absorb.
- Step 1: n=15, p̂=0.50, 95% → the interval is really wide. Just 15 people isn't enough precision.
- Step 2: Push n to 100 → the interval tightens up fast. Feel the power of sample size.
- Step 3: Set p̂ to 0.90 → variance p(1−p) shrinks, so the CI narrows. p̂=0.50 gives the widest interval.
- Step 4: Keep p̂=0.90, drop n to 10 → if ⚠ appears, the normal approximation conditions aren't met.
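The width changes in the steps above can be reproduced with a few lines of Python. A minimal sketch (the helper name `wald_ci` is ours, not part of the panel; it uses the fixed 95% multiplier 1.96):

```python
import math

def wald_ci(p_hat, n, z=1.96):
    """95% Wald confidence interval for a proportion: p_hat +/- z * SE."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# Steps 1-2: n = 15 gives a wide interval; n = 100 tightens it fast.
for n in (15, 100):
    lo, hi = wald_ci(0.50, n)
    print(f"n={n:3d}, p_hat=0.50: ({lo:.3f}, {hi:.3f}), width {hi - lo:.3f}")

# Step 3: p_hat = 0.90 shrinks the variance p(1-p), narrowing the CI further.
lo, hi = wald_ci(0.90, 100)
print(f"n=100, p_hat=0.90: ({lo:.3f}, {hi:.3f}), width {hi - lo:.3f}")
```

Note that p̂ = 0.50 maximizes p(1−p) = 0.25, which is why it gives the widest interval at any fixed n.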
▶ ① Confidence interval for a proportion
"What does a 95% CI actually mean?" — a common stumbling point. The answer: "if you repeated the same survey many times, about 95% of the intervals would capture the true proportion."
Run it 200 times and see whether that 95% claim holds up.
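The 200-run check can also be done outside the panel. A small simulation sketch in Python (the function name, seed, and parameter values are our own choices):

```python
import random

def covers(p_true, n, z=1.96):
    """Draw one sample of size n, build the 95% Wald CI, and report
    whether it captures the true proportion p_true."""
    x = sum(random.random() < p_true for _ in range(n))
    p_hat = x / n
    se = (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - z * se <= p_true <= p_hat + z * se

random.seed(0)  # fixed seed so the run is reproducible
runs = 200
hits = sum(covers(0.60, 100) for _ in range(runs))
print(f"{hits}/{runs} intervals captured the true proportion")
```

With n = 100 the hit rate typically lands near (often slightly below) 95% — the Wald interval is known to undercover a little at moderate n.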
▶ ①-b CI simulation
- Step 1: n=100, p̂=0.60, p₀=0.50, α=0.05, two-sided → "Is 60% significantly different from 50%?"
α (significance level) = the threshold for "too extreme to be coincidence." It determines the size of the rejection region.
- Step 2: Change p₀ to 0.55 → z shrinks and you can no longer reject. Small differences are hard to detect.
- Step 3: Increase n to 400 → same p̂=0.60, p₀=0.55 but now you reject. The power of sample size.
- Step 4: Switch test type to "Right" → one-sided test asking only "greater than 50%?" The p-value halves.
▶ ② One-sample z-test for a proportion
Same flow as testing a population mean. Set up H₀: p = p₀, compute z from the sample, and check whether it lands in the rejection region.
The only twist is that the standard error becomes √(p₀(1−p₀)/n). Nail that, and the rest is familiar.
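As a sketch of that computation in Python (the function name is illustrative; `statistics.NormalDist` is from the standard library):

```python
import math
from statistics import NormalDist

def one_sample_prop_test(p_hat, n, p0, alternative="two-sided"):
    """z-test for H0: p = p0. Note the SE uses p0 (the H0 world), not p_hat."""
    se = math.sqrt(p0 * (1 - p0) / n)      # sqrt(p0(1-p0)/n)
    z = (p_hat - p0) / se
    norm = NormalDist()
    if alternative == "two-sided":
        p = 2 * (1 - norm.cdf(abs(z)))
    elif alternative == "greater":         # right-tailed: "greater than p0?"
        p = 1 - norm.cdf(z)
    else:                                  # "less": left-tailed
        p = norm.cdf(z)
    return z, p

# Panel step 1: n=100, p_hat=0.60 vs p0=0.50
z, p = one_sample_prop_test(0.60, 100, 0.50)
print(f"z = {z:.2f}, p-value = {p:.4f}")   # z = 2.00, p ~ 0.0455: reject at 0.05
```

Switching `alternative` to `"greater"` halves the p-value here, matching Step 4 of the panel walkthrough.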
- Step 1: n₁=n₂=100, p̂₁=0.60, p̂₂=0.45 → is the difference significant?
- Step 2: Move p̂₂ toward 0.55 → z shrinks, rejection gets harder.
- Step 3: Increase both to n₁=n₂=400 → same gap, more power.
- Step 4: Try n₁=50, n₂=200 (asymmetric) → the smaller n is the bottleneck.
▶ ③ Two-proportion z-test
"Drug A vs. Drug B — which works better?" "Ad A vs. Ad B — is the click-through rate really different?" This test is for comparing two groups.
The key idea: under H₀: p₁ = p₂, we pool both samples into a single pooled proportion to build a shared SE. Get that, and the rest mirrors the one-sample test.
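The pooled-SE computation can be sketched as follows, using the Step 1 numbers from the list above (60/100 vs. 45/100; the helper name is our own):

```python
import math
from statistics import NormalDist

def two_prop_test(x1, n1, x2, n2):
    """Two-sided z-test for H0: p1 = p2, with the pooled proportion in the SE."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)         # both samples combined, under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Step 1: p_hat1 = 0.60 (60/100) vs p_hat2 = 0.45 (45/100)
z, p = two_prop_test(60, 100, 45, 100)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

Here the pooled proportion is 105/200 = 0.525, and the resulting z clears the 1.96 cutoff, so the difference is significant at α = 0.05.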
// Formulas used here
★ Estimation and testing put different content inside SE
Estimation uses SE = √(p̂(1−p̂)/n); testing uses SE = √(p₀(1−p₀)/n). The label "SE of a proportion" is shared, but estimation grounds the SE in the sample world (your data is the reference) while testing grounds it in the H₀ world (the hypothesized proportion is the reference). Compare what sits under the square root in the first two formulas below — that's the only piece that changes.
Confidence interval for a proportion (Wald interval)
p̂ ± zα/2 · √(p̂(1−p̂)/n)
· p̂ = x/n: sample proportion (x successes out of n trials)
· zα/2: upper α/2 quantile of the standard normal. For a 95% CI, α = 0.05 and z0.025 ≈ 1.96
· √(p̂(1−p̂)/n): standard error of the sample proportion. Similar to SE = σ/√n for means, but the variance is p(1−p) — that's what makes proportions special
One-sample z-test for a proportion
z = (p̂ − p₀) / √(p₀(1−p₀)/n)
· p₀: the proportion assumed under the null hypothesis (e.g., p₀ = 0.5 for "50%")
· The denominator uses p₀, not p̂ — because in a test we work inside the H₀ world
· Under H₀ this z approximately follows a standard normal, which lets us compute the p-value
Two-proportion z-test
z = (p̂₁ − p̂₂) / √(p̂(1−p̂)(1/n₁ + 1/n₂)),  where p̂ = (x₁ + x₂)/(n₁ + n₂)
· p̂: the pooled proportion — both groups combined. Under H₀: p₁ = p₂ the true rates are the same, so pooling gives a better estimate
· The denominator is the SE of the difference. The 1/n₁ + 1/n₂ term means the smaller sample is the bottleneck
// Normal approximation conditions — when can you use this method?
The z-test and CI for proportions rely on the sample proportion following a normal distribution.
The rule of thumb for this approximation to hold:
- np ≥ 5 and n(1−p) ≥ 5 (use p₀ for tests)
- Intuitively: "both successes and failures should occur at least 5 times"
- When p is close to 0 or 1, the distribution becomes skewed and the normal approximation breaks down
If the conditions aren't met, use an exact binomial test or go back and collect more data.
In panel ① above, try shrinking n or pushing p̂ toward 0.01 — the ⚠ warning appears for exactly this reason.
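The rule-of-thumb check is a one-liner. An illustrative Python version (the function name is ours):

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule of thumb: n*p >= 5 and n*(1-p) >= 5 (use p0 when testing)."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approx_ok(100, 0.50))    # True: 50 successes, 50 failures expected
print(normal_approx_ok(10, 0.90))     # False: n*(1-p) = 1, too few failures
print(normal_approx_ok(1000, 0.001))  # False: n*p = 1, even with n = 1000
```

The last line is the case from the misconceptions section below: a large n alone doesn't rescue an extreme p.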
// Estimation vs. testing — when to use which
- Interval estimation: "Roughly where is the true proportion?" → present a confidence interval
- Hypothesis test: "Is the true proportion different from p₀? Yes or no?" → decide via the p-value
- They are two sides of the same coin: p₀ outside the 95% CI ⟺ reject H₀ at α = 0.05. For proportions the match is approximate rather than exact, because the CI's SE uses p̂ while the test's SE uses p₀, so the two can disagree right at the boundary.
Flip between panels ① and ② with the same n and p̂ to see this relationship in action. If p₀ sits clearly outside the CI in ①, panel ② will reject.
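The correspondence can be checked numerically. A small sketch (helper name is ours) that builds both the CI decision and the test decision side by side:

```python
import math

def ci_and_test(p_hat, n, p0, z_crit=1.96):
    """Compare the 95% Wald CI with the two-sided z-test at alpha = 0.05.
    Note the two decisions use different SEs (p_hat world vs H0 world)."""
    se_ci = math.sqrt(p_hat * (1 - p_hat) / n)   # estimation SE
    se_h0 = math.sqrt(p0 * (1 - p0) / n)         # testing SE
    ci = (p_hat - z_crit * se_ci, p_hat + z_crit * se_ci)
    reject = abs(p_hat - p0) / se_h0 > z_crit
    p0_outside = not (ci[0] <= p0 <= ci[1])
    return ci, reject, p0_outside

ci, reject, outside = ci_and_test(0.60, 100, 0.50)
print(f"CI = ({ci[0]:.3f}, {ci[1]:.3f}), reject = {reject}, p0 outside CI = {outside}")
```

For these numbers the two decisions agree; because the SEs differ, borderline cases can occasionally split.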
// Common misconceptions
For estimation, SE = √(p̂(1−p̂)/n) uses the sample proportion; for testing, SE = √(p₀(1−p₀)/n) uses the null value. The difference comes down to "which world are we basing our calculation on?"
If p is as extreme as 0.001, even n = 1,000 gives np = 1, which fails the condition. The "rule of 5" checks both p and n together.
Under H₀: p₁ = p₂, the true proportions are the same, so we pool them into a single estimate to build a shared SE. You compute separate SEs only when constructing a confidence interval for the difference.
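That last distinction (pooled SE for the test, separate SEs for a difference CI) can be illustrated with a short sketch; the helper name is our own:

```python
import math

def diff_ci(x1, n1, x2, n2, z=1.96):
    """95% CI for p1 - p2 using unpooled SEs, unlike the pooled SE in the test."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return d - z * se, d + z * se

# Same scenario as the two-sample panel: 60/100 vs 45/100
lo, hi = diff_ci(60, 100, 45, 100)
print(f"95% CI for p1 - p2: ({lo:.3f}, {hi:.3f})")
```

Here the interval excludes 0, which lines up with the pooled test rejecting H₀: p₁ = p₂ for the same data.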
// Shapes you'll meet again
Across proportion tests and intervals, the swap between the estimation SE and the testing SE — plus the normal-approximation check — keeps reappearing.
- The CI assembly shape: with 120 of 200 in favor, p̂ = 0.6, SE = √(0.6 × 0.4 / 200), and the 95% CI lands at 0.6 ± 1.96 × SE. The "center ± multiplier × standard error" three-layer structure carries straight over to proportions
- The test-statistic shape: with 35 defects in 500, p̂ = 0.07, p₀ = 0.05, z = (0.07 − 0.05) / √(0.05 × 0.95 / 500). The "H₀-world SE in the denominator" is the distinguishing detail against the estimation case
- The normal-approximation condition: np ≥ 5 and n(1−p) ≥ 5. With n = 20 and p₀ = 0.1, np₀ = 2 falls below the threshold — that's the shape this checkpoint takes
- The two-sample difference shape: scenarios like "Group A: 60/100 improved, Group B: 45/100 improved" carry a distinctive structure — pooled proportion sitting inside the SE in the denominator
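A quick arithmetic check of the worked numbers in these shapes, in Python:

```python
import math

# CI assembly shape: 120 of 200 in favor
p_hat = 120 / 200
se = math.sqrt(p_hat * (1 - p_hat) / 200)
print(f"95% CI: {p_hat - 1.96 * se:.3f} to {p_hat + 1.96 * se:.3f}")

# Test-statistic shape: 35 defects in 500, H0: p = 0.05
z = (35 / 500 - 0.05) / math.sqrt(0.05 * 0.95 / 500)
print(f"z = {z:.2f}")                        # z ~ 2.05, just past 1.96

# Condition shape: n = 20, p0 = 0.1
print(f"n*p0 = {20 * 0.1:.0f} (< 5, so the approximation is suspect)")
```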