The Three Test Distributions — t, χ², and F
t for means, χ² for variances, F for variance ratios — three siblings born from the standard normal. Slide the degrees of freedom and the family resemblance comes into focus.
t, χ², F are all derived from the normal. Think of them as "the standard normal, scaled to reflect that we only ever see a sample".
Use them for: t — testing a mean when the population variance is unknown (i.e. nearly every real test of a mean);
χ² — testing a variance, independence, goodness-of-fit for categorical data;
F — ratios of variances (ANOVA, the overall F in regression).
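Slide df: t converges to N(0,1) as df→∞ because the variance estimate in its denominator stabilizes, and χ² and F become more symmetric as df grows. For χ², the Central Limit Theorem is quietly doing the work under the hood: it is a sum of df independent squared normals.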
- Step 1: n=3 → df=2, extremely heavy tails (close to Cauchy, which is the df=1 case).
- Step 2: n=10 → heavier tails than normal, but getting closer.
- Step 3: n=31 → nearly indistinguishable from N(0,1). The CI bars nearly overlap.
- Step A: n=5, 95% confidence → the t bar is much wider (normal is overconfident).
- Step B: Set n=50 → the bars nearly overlap at every confidence level.
- Step C: Switch to 99% → the gap widens further (especially for small n).
▶ t distribution
Use for: testing means with unknown variance, regression t-values.
Flavor: heavier tails than N(0,1); matches N(0,1) as df→∞.
With small samples, a normal-based CI is too narrow — overconfident. The t-distribution honestly reflects that extra uncertainty. As n grows, t converges to normal — that's what the two bars show.
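The overconfidence of a normal-based interval can be checked with a small simulation. This is a minimal sketch using only the Python standard library; the function name `z_interval_coverage`, the seed, and the trial count are illustrative choices, and the data are assumed to be draws from N(0,1).

```python
import random
from statistics import NormalDist


def z_interval_coverage(n, trials=10000, conf=0.95, seed=0):
    """Share of normal-based intervals (sample SD plugged in) that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + conf / 2)  # about 1.96 for 95%
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true mean is 0
        mean = sum(xs) / n
        sd = (sum((x - mean) ** 2 for x in xs) / (n - 1)) ** 0.5
        if abs(mean) <= z * sd / n ** 0.5:  # does the interval cover 0?
            hits += 1
    return hits / trials


coverage_n5 = z_interval_coverage(5)
coverage_n50 = z_interval_coverage(50)
print(f"n=5:  {coverage_n5:.3f}")
print(f"n=50: {coverage_n50:.3f}")
```

Under these assumptions the n=5 coverage should land clearly below the nominal 95%, while n=50 comes close to it. That shortfall is the gap the wider t bar closes.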
- Step 1: Keep "Fair die" selected and press Roll a few times → even a fair die varies each time. Watch the bar chart and χ² statistic change.
- Step 2: Switch to "Loaded" and roll → face 1 jumps out, χ² enters the rejection region.
- Step 3: Increase the rolls slider → more samples detect smaller biases (higher power).
- Step A: Set df to 1 → sharply right-skewed.
- Step B: df=30 → approaches a bell shape. Confirm mean = df.
- Step C: Move df and watch how the normal approximation (purple dashes) lines up.
▶ χ² distribution
Use for: variance tests, chi-square tests of independence / goodness-of-fit.
Flavor: non-negative, right-skewed. Mean = k, variance = 2k. Goes bell-shaped with large df.
The chi-squared distribution measures "how big is the gap between observed and expected." It tells you whether that gap is just random noise or a real bias.
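The "gap between observed and expected" can be written out for the die example. This is a minimal sketch: the roll counts are made up for illustration, `chi_square_gof` is a hypothetical helper name, and 11.07 is the standard table value for the upper 5% point of χ²(5).

```python
def chi_square_gof(observed, expected):
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_5PCT_DF5 = 11.07  # table value: upper 5% point of chi-square with 6 - 1 = 5 df

expected = [10] * 6                  # 60 rolls of a fair die
fair_rolls = [9, 11, 10, 12, 8, 10]  # illustrative counts, ordinary noise
loaded_rolls = [25, 7, 7, 7, 7, 7]   # illustrative counts, face 1 favored

stat_fair = chi_square_gof(fair_rolls, expected)      # about 1.0
stat_loaded = chi_square_gof(loaded_rolls, expected)  # about 27.0

print(stat_fair, stat_fair > CRITICAL_5PCT_DF5)
print(stat_loaded, stat_loaded > CRITICAL_5PCT_DF5)
```

The fair-looking counts stay far below the critical value, while the loaded counts exceed it and would be rejected at the 5% level.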
- Step 1: Set both SDs equal (e.g. 10 and 10) → F≈1, stays outside rejection zone.
- Step 2: Increase only B's SD (e.g. A=5, B=15) → F grows and enters the rejection zone.
- Step 3: Increase n → the same SD gap gives a lower p-value (more power).
- Step A: n=6 → heavy tails, unstable shape.
- Step B: n=50 → approaches a bell. Check the F=1 line.
- Step C: Swap A and B's SD values to feel what "variance ratio" means.
▶ F distribution
Use for: ANOVA, overall F-test in regression.
Flavor: non-negative, right-skewed. Shape depends on both df.
The F-distribution evaluates the ratio of two groups' spread. ANOVA also uses this F-statistic to test whether group means differ.
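As a small illustration of "ratio of two groups' spread", the F statistic for two samples is the ratio of their sample variances. The data below are invented for the example, and the variable names are illustrative.

```python
import statistics

group_a = [48, 52, 50, 47, 53]  # tightly clustered around 50
group_b = [35, 65, 50, 40, 60]  # same mean, much wider spread

var_a = statistics.variance(group_a)  # sample variance (divides by n - 1)
var_b = statistics.variance(group_b)

f_b_over_a = var_b / var_a  # compare against F(4, 4): df = n - 1 for each group
f_a_over_b = var_a / var_b  # swapping the groups gives the reciprocal

print(var_a, var_b, f_b_over_a, f_a_over_b)
```

Which group goes in the numerator changes the statistic (here 25 versus 0.04), so the df order used for the critical value has to match.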
// Formula used here
All three are children of N(0,1)
• Z, Z₁, Z₂, … are independent standard normal variables. These three distributions are built entirely from them
χ² distribution: square Z's and add them up → "total spread"
• Z₁² + Z₂² + … + Zₖ² follows χ²(k). Since we're squaring, the values are never negative and the shape is right-skewed
• As k grows the shape becomes more symmetric (a consequence of the Central Limit Theorem!)
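The "square Z's and add them up" recipe can be tried directly. The sketch below simulates χ²(k) from standard normal draws and compares the sample mean and variance with the theoretical values k and 2k; the helper name `chi2_draw`, the seed, and the sample size are illustrative choices.

```python
import random


def chi2_draw(k, rng):
    """One chi-square(k) draw: the sum of k squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(1)
k = 5
draws = [chi2_draw(k, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
sample_var = sum((x - sample_mean) ** 2 for x in draws) / (len(draws) - 1)

print(f"mean     {sample_mean:.2f}  (theory: {k})")
print(f"variance {sample_var:.2f}  (theory: {2 * k})")
print(f"minimum  {min(draws):.3f}  (never negative)")
```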
t distribution: Z divided by an estimated σ → "normal when σ is unknown"
• t = Z / √(χ²/k), where Z and the χ²(k) variable are independent. Think of χ²/k in the denominator as the estimated variance measured against the true σ²
• Small k (little data) → unstable σ estimate → fatter tails
• k → ∞: the denominator settles at 1, so t converges to Z (the standard normal)
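The effect of k on the tails can be seen by building t draws from the formula above. This is a simulation sketch using only the standard library; `t_draw`, `tail_prob`, the cutoff of 2, and the seed are illustrative choices.

```python
import math
import random


def t_draw(k, rng):
    """One t(k) draw: Z divided by sqrt(chi-square(k) / k), with independent parts."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
    return z / math.sqrt(chi2 / k)


def tail_prob(k, cutoff=2.0, trials=10000, seed=2):
    """Estimated probability that |t| exceeds the cutoff."""
    rng = random.Random(seed)
    return sum(abs(t_draw(k, rng)) > cutoff for _ in range(trials)) / trials


tail_k2 = tail_prob(2)
tail_k60 = tail_prob(60)
print(f"P(|t| > 2), k=2:  {tail_k2:.3f}")
print(f"P(|t| > 2), k=60: {tail_k60:.3f}")
```

For comparison, the standard normal puts about 0.046 of its mass beyond ±2. The k=2 estimate should be several times larger, and the k=60 estimate only slightly larger.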
F distribution: ratio of two χ²'s, each divided by its own df → "ratio of two variances"
• F = (χ²₁/d₁) / (χ²₂/d₂), with the two χ² variables independent, follows F(d₁, d₂)
• Used to test whether group A and group B have the same variance
• Numerator and denominator each have their own df. Swapping them gives a different distribution (careful!)
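To see why the df order matters, the sketch below simulates the ratio of two χ² variables, each divided by its df, and estimates the upper 5% point for F(2, 20) and for F(20, 2). The helper names, seed, and trial count are illustrative, and the estimates are simulation-based rather than exact table values.

```python
import random


def f_draw(d1, d2, rng):
    """One F(d1, d2) draw: (chi-square(d1) / d1) over (chi-square(d2) / d2)."""
    num = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d1)) / d1
    den = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d2)) / d2
    return num / den


def upper_5pct(d1, d2, trials=10000, seed=3):
    """Empirical 95th percentile of simulated F(d1, d2) draws."""
    rng = random.Random(seed)
    draws = sorted(f_draw(d1, d2, rng) for _ in range(trials))
    return draws[trials * 95 // 100]


crit_2_20 = upper_5pct(2, 20)
crit_20_2 = upper_5pct(20, 2)
print(f"F(2, 20) upper 5% point: about {crit_2_20:.2f}")
print(f"F(20, 2) upper 5% point: about {crit_20_2:.2f}")
```

The two estimates should differ by a wide margin, which is why reading the table with the df swapped can change the conclusion of a test.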
// What is degrees of freedom (df), really?
The name itself rarely conveys the meaning. One way to read it: "the number of data values free to vary."
Example: three test scores with a known mean of 60.
• Student 1: 70 (free)
• Student 2: 55 (free)
• Student 3: ? → 60×3 − 70 − 55 = 55 (forced)
→ 3 students but only 2 free → df = n−1 = 2
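The same arithmetic as a few lines of Python, as a sketch with illustrative variable names: once the mean is fixed, the last score is determined by the others.

```python
known_mean = 60
free_scores = [70, 55]  # students 1 and 2 can score anything
n = 3

# The total must equal known_mean * n, so the last score is forced.
forced_score = known_mean * n - sum(free_scores)
df = n - 1  # number of scores that were free to vary

all_scores = free_scores + [forced_score]
print(forced_score, df, sum(all_scores) / n)
```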
What happens when df is small:
• Less information → less stable estimates → heavier tails (extreme values more likely)
• t distribution at df = 1 has very fat tails; at df = 30 it's nearly indistinguishable from normal
• Move the df slider in the simulator above to see this in action
// When to use which? Quick reference
| Goal | σ known? | Distribution |
|---|---|---|
| Test a mean | Yes (rare) | z (standard normal) |
| Test a mean | No (usual case) | t distribution |
| Test a single variance | — | χ² distribution |
| Compare two variances | — | F distribution |
| Test regression coefficient | — | t distribution |
| Test overall regression model | — | F distribution |
| Goodness of fit / independence | — | χ² distribution |
Once you know which distribution to use, look up critical values in the Interactive Distribution Tables.
// Common misconceptions
The t distribution is "the normal when you have to estimate σ." As df → ∞ it converges to N(0,1). The only real difference is fatter tails.
Usually n−1 for a one-sample test, but not always. Each parameter you estimate costs one degree of freedom. Goodness of fit with k categories: k−1; independence test on an r×c table: (r−1)(c−1).
The F distribution is not symmetric in its two df. Mix up numerator and denominator df and you get a different critical value, and potentially a different conclusion.
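The df rules above can be collected into small helpers. This is a sketch with hypothetical function names; the `estimated_params` argument reflects the standard adjustment of subtracting one further df for each parameter estimated from the data in a goodness-of-fit test.

```python
def df_one_sample(n):
    """One-sample t-test or variance test: one df is spent on the mean."""
    return n - 1


def df_goodness_of_fit(k, estimated_params=0):
    """k categories, minus 1, minus one per parameter estimated from the data."""
    return k - 1 - estimated_params


def df_independence(r, c):
    """Chi-square test of independence on an r x c contingency table."""
    return (r - 1) * (c - 1)


print(df_one_sample(20))      # sample of 20 observations
print(df_goodness_of_fit(6))  # six faces of a die
print(df_independence(3, 4))  # 3 x 4 table
```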
// Shapes you'll meet again
The t, χ², and F families show up in linked, recurring shapes.
- How df slides into place: a one-sample t-test with n = 20 gives df = n−1 = 19. This is the pattern "estimate one mean, lose one degree of freedom"
- t distribution's limit shape: push df up and the t curve melts into N(0,1). Moving the slider above makes that overlap visible
- How χ²(n−1) enters: for a normal sample, the form (n−1)S²/σ² ~ χ²(n−1) keeps reappearing in variance tests and intervals
- The link across families: if T follows t(k), then T² follows F(1, k). The three distributions aren't separate creatures. They're a family descended from the same N(0,1)
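The link between t and F follows in one line from the definitions given earlier. Here it is written out in LaTeX, taking Z to be independent of the χ² variable in the denominator and using the fact that Z² follows χ²(1):

```latex
T = \frac{Z}{\sqrt{\chi^2_k / k}}
\quad\Longrightarrow\quad
T^2 = \frac{Z^2}{\chi^2_k / k}
    = \frac{\chi^2_1 / 1}{\chi^2_k / k}
    \sim F(1, k)
```

This is why the t-test on a single regression coefficient and the F-test with one numerator df give the same p-value: the F statistic is the square of the t statistic.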