Multiple Regression — Control Confounders to Find True Effects

"More study time means lower scores" — a result that flips the moment you add one more variable. That moment is what confounding looks like.


Multiple Regression — Predict with multiple variables

Simple regression predicted scores from study hours alone. But what if heavy studiers sleep less — and lost sleep drags scores down? The true effect of studying gets masked by a hidden variable. Multiple regression controls for other factors to isolate each variable's real contribution.

One variable gives a line; two give a plane in 3D. But the point isn't geometry — it's removing confounding to isolate each variable's true effect. Start with the side-by-side comparison to see the moment β₁ shifts.

Experiment Guide — Feel Confounding

Step 1: At the default (corr = −0.5), compare the simple and multiple β₁ readouts. The simple-regression β₁ is smaller: the study effect is underestimated.
Step 2: Set correlation to 0 → both β₁ values nearly match. "No confounding, no bias."
Step 3: Set correlation to +0.5 → now simple regression β₁ is too large. Confounding can bias in either direction.
Step 4: Check R² too. On the same data, multiple regression's R² is always ≥ simple regression's: adding a variable can never lower in-sample R².
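The steps above can also be tried offline. The sketch below is a minimal simulation, not the panel's actual code: the true effects (+3 pts per study hour, +2 pts per sleep hour), the spreads `SD_STUDY` and `SD_SLEEP`, the noise level, and the function names `simulate` and `fit` are all assumptions chosen for illustration. With these assumed values the expected simple-regression slope at corr = −0.5 is 3 + 2 × (−0.5) × (1.0 / 1.25) = 2.2.

```python
import random

# Hypothetical data-generating values (not taken from the interactive panel).
TRUE_B1, TRUE_B2 = 3.0, 2.0      # points per study hour / per sleep hour
SD_STUDY, SD_SLEEP = 1.25, 1.0   # spread of each predictor, in hours

def simulate(corr, n=50, seed=0):
    """Draw n (study, sleep, score) rows whose study/sleep correlation is about `corr`."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        study = 5 + SD_STUDY * z1
        # Mixing z1 into sleep gives the two predictors the requested correlation.
        sleep = 7 + SD_SLEEP * (corr * z1 + (1 - corr ** 2) ** 0.5 * z2)
        score = 36 + TRUE_B1 * study + TRUE_B2 * sleep + rng.gauss(0, 3)
        rows.append((study, sleep, score))
    return rows

def fit(rows):
    """Return (simple b1, multiple b1, multiple b2) from centered sums of products."""
    n = len(rows)
    m1, m2, my = (sum(r[i] for r in rows) / n for i in range(3))
    s11 = sum((r[0] - m1) ** 2 for r in rows)
    s22 = sum((r[1] - m2) ** 2 for r in rows)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows)
    s1y = sum((r[0] - m1) * (r[2] - my) for r in rows)
    s2y = sum((r[1] - m2) * (r[2] - my) for r in rows)
    det = s11 * s22 - s12 ** 2
    simple_b1 = s1y / s11
    multi_b1 = (s22 * s1y - s12 * s2y) / det
    multi_b2 = (s11 * s2y - s12 * s1y) / det
    return simple_b1, multi_b1, multi_b2

# A large n keeps sampling noise small; n=50 mirrors the panel but is noisier.
for corr in (-0.5, 0.0, 0.5):
    simple_b1, multi_b1, _ = fit(simulate(corr, n=5000))
    print(f"corr={corr:+.1f}  simple b1={simple_b1:.2f}  multiple b1={multi_b1:.2f}")
```

The multiple-regression β₁ should stay near the assumed +3 in all three rows, while the simple-regression β₁ should sit below it, near it, and above it in turn.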

▶ Simple vs. Multiple Regression — Watch β₁ Shift

Same data, two models: "study hours only" vs. "study + sleep." Adjust the correlation slider to change confounding strength.

[Interactive panel: sliders for the study↔sleep correlation (shown at −0.50) and sample size (shown at n = 50); readouts for simple β₁, multiple β₁, the gap between them (confounding bias), simple R², and multiple R².]

Experiment Guide — Feel the Regression Plane in 3D

Step 1: Drag the 3D plot to rotate. The translucent surface is the regression plane, and the data points scatter around it.
Step 2: Move the study slider → the prediction dot slides along the x₁ direction. The tilt = β₁.
Step 3: Move sleep too → it moves along x₂. The tilt = β₂. Each variable's contribution is visible.
Step 4: Hit Resample a few times → β₁, β₂, R² shift slightly each time. Estimates have variability too.
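Step 4 can be imitated by refitting the model on many fresh samples. The sketch below is a rough stand-in for the Resample button, not the page's own code: the data-generating values (true effects +3 and +2, correlation −0.5, noise SD 3) and the helper name `beta_hats` are assumptions.

```python
import random
import statistics

def beta_hats(seed, n=50, corr=-0.5):
    """Fit score ~ study + sleep on one fresh hypothetical sample; return (b1, b2)."""
    rng = random.Random(seed)
    x1, x2, y = [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        study = 5 + 1.25 * z1
        sleep = 7 + corr * z1 + (1 - corr ** 2) ** 0.5 * z2
        x1.append(study)
        x2.append(sleep)
        y.append(36 + 3 * study + 2 * sleep + rng.gauss(0, 3))
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# 200 "Resample" clicks: same true effects, different random draws each time.
draws = [beta_hats(seed) for seed in range(200)]
b1_values = [b1 for b1, _ in draws]
print(f"mean of b1 estimates: {statistics.fmean(b1_values):.2f}")
print(f"SD of b1 estimates:   {statistics.stdev(b1_values):.2f}")
```

The mean should land near the assumed true effect, while the SD gives a feel for how far a single n = 50 estimate can stray from it.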

[Interactive 3D plot: sliders for study hours x₁ (shown at 5.0 h) and sleep hours x₂ (shown at 7.0 h); readouts for the predicted score, β̂₁ (per study hour), β̂₂ (per sleep hour), and R².]


// What's happening with confounding?

In the comparison panel above, simple and multiple regression gave different β₁ values for the same data.
Why does adding one variable change the slope?

Here's what's happening behind the scenes:

More study → less sleep → lower scores (indirect negative effect)

Simple regression β₁ mixes "direct effect + indirect effect."
The true direct effect is +3 pts/h, but lost sleep drags scores down, so simple regression underestimates it at about +2.2 pts/h.

Multiple regression β₁ holds sleep constant and extracts only studying's direct effect. That is what a partial regression coefficient means.

Set correlation to 0 in the panel above and the indirect path vanishes — β₁ values converge. That's the "no confounding" state.
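The "direct + indirect" split is an exact in-sample identity for least squares: simple β₁ = multiple β₁ + multiple β₂ × (slope of sleep regressed on study). The sketch below checks it on six invented students; the data and variable names are made up for illustration and are not the panel's data.

```python
# Six invented students: (study hours, sleep hours, score). Illustrative only.
rows = [(2, 8.0, 57), (3, 6.5, 59), (5, 8.0, 68), (6, 6.0, 65), (8, 7.5, 76), (9, 6.0, 74)]

n = len(rows)
m1, m2, my = (sum(r[i] for r in rows) / n for i in range(3))
s11 = sum((x1 - m1) ** 2 for x1, _, _ in rows)
s22 = sum((x2 - m2) ** 2 for _, x2, _ in rows)
s12 = sum((x1 - m1) * (x2 - m2) for x1, x2, _ in rows)
s1y = sum((x1 - m1) * (y - my) for x1, _, y in rows)
s2y = sum((x2 - m2) * (y - my) for _, x2, y in rows)

simple_b1 = s1y / s11                    # slope of score ~ study
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det       # slopes of score ~ study + sleep
b2 = (s11 * s2y - s12 * s1y) / det
sleep_per_study_hour = s12 / s11         # slope of sleep ~ study

indirect = b2 * sleep_per_study_hour
print(f"direct effect (multiple b1):  {b1:+.3f}")
print(f"indirect effect via sleep:    {indirect:+.3f}")
print(f"direct + indirect:            {b1 + indirect:+.3f}")
print(f"simple-regression b1:         {simple_b1:+.3f}")
```

When study and sleep are uncorrelated in the sample, `sleep_per_study_hour` is zero, the indirect term drops out, and the two β₁ values coincide.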

// Formula used here

ŷ = β₀ + β₁x₁ + β₂x₂

Each part
• β₀ (intercept): baseline prediction when all variables = 0
• β₁x₁: "holding x₂ fixed, how much does y change per unit of x₁" × the value of x₁
• β₂x₂: same idea — the isolated contribution of x₂

How it differs from simple regression
• Simple regression β₁ = direct effect of x₁ + indirect effect via x₂ (mixed together)
• Multiple regression β₁ = direct effect of x₁ only (x₂ is statistically "held constant")
• Set correlation to 0 in the panel above and the indirect path vanishes — that's why both β₁ values converge

Geometric picture
• With 2 predictors, the fit is a plane in 3D. Least squares picks the height and tilt that minimize the total squared vertical distance (measured along the y axis) from the points to the plane
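To make the "minimize squared distance" idea concrete, the sketch below fits a plane to a few invented points using the closed-form solution for two predictors, then checks that shifting or tilting the plane in any direction makes the sum of squared errors larger. The data and the helper name `sse` are hypothetical.

```python
# Invented data: (study hours, sleep hours, score). Illustrative only.
rows = [(2, 8.0, 57), (3, 6.5, 59), (5, 8.0, 68), (6, 6.0, 65), (8, 7.5, 76), (9, 6.0, 74)]

def sse(b0, b1, b2):
    """Sum of squared vertical gaps between each score and the plane b0 + b1*x1 + b2*x2."""
    return sum((y - (b0 + b1 * x1 + b2 * x2)) ** 2 for x1, x2, y in rows)

# Closed-form least-squares solution for two predictors (centered sums of products).
n = len(rows)
m1, m2, my = (sum(r[i] for r in rows) / n for i in range(3))
s11 = sum((x1 - m1) ** 2 for x1, _, _ in rows)
s22 = sum((x2 - m2) ** 2 for _, x2, _ in rows)
s12 = sum((x1 - m1) * (x2 - m2) for x1, x2, _ in rows)
s1y = sum((x1 - m1) * (y - my) for x1, _, y in rows)
s2y = sum((x2 - m2) * (y - my) for _, x2, y in rows)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = my - b1 * m1 - b2 * m2

best = sse(b0, b1, b2)
# Try 26 nearby planes: every combination of small shifts in height and tilt.
nearby = [sse(b0 + d0, b1 + d1, b2 + d2)
          for d0 in (-0.5, 0, 0.5) for d1 in (-0.2, 0, 0.2) for d2 in (-0.2, 0, 0.2)
          if (d0, d1, d2) != (0, 0, 0)]
print(f"SSE at the fitted plane:           {best:.3f}")
print(f"smallest SSE among nearby planes:  {min(nearby):.3f}")
```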

// Worked example — try it by hand

Predicting test scores for 30 students using study hours and sleep hours.

① Check the averages
　Mean study = 5.0h, mean sleep = 7.0h, mean score = 65 pts
　Correlation between study and sleep: r = −0.45 (heavy studiers sleep less)

② Simple regression y ~ x₁ (study hours only)
　β₁ = +2.4 pts/h, R² = 0.32
　→ Each study hour adds 2.4 pts... but this underestimates the true effect

③ Multiple regression y ~ x₁ + x₂ (add sleep hours)
　β₁ = +3.1 pts/h, β₂ = +2.0 pts/h, R² = 0.57
　→ Controlling for sleep, study effect rises to +3.1 pts/h

④ Why the change?
　In this data, study↑ → sleep↓ (r = −0.45).
　Simple regression blamed studying for the negative impact of lost sleep.
　Multiple regression separated sleep out, revealing studying's true effect.

⑤ Make a prediction
　β₀ = ȳ − β₁x̄₁ − β₂x̄₂ = 65 − 3.1×5 − 2.0×7 = 35.5 (an OLS fit with an intercept always passes through the point of means).
　Student with 6h study, 7h sleep → ŷ = 35.5 + 3.1×6 + 2.0×7 = 35.5 + 18.6 + 14.0 = 68.1 pts
　β₀ is the "score at 0h study, 0h sleep" — a hypothetical baseline with no intuitive meaning (don't extrapolate).
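The arithmetic in steps ②, ③ and ⑤ can be checked in a few lines. The sketch below uses only the rounded figures quoted in the worked example, so its results are approximate; the function name `predict` is a hypothetical helper. The last lines read the gap between the two β₁ values as β₂ times the slope of sleep on study, which is the standard least-squares identity.

```python
# Coefficients and means quoted in the worked example (rounded values).
b1, b2 = 3.1, 2.0
mean_study, mean_sleep, mean_score = 5.0, 7.0, 65.0

# The fitted plane passes through the point of means, which pins down the intercept.
b0 = mean_score - b1 * mean_study - b2 * mean_sleep

def predict(study_hours, sleep_hours):
    """Predicted score from the fitted plane."""
    return b0 + b1 * study_hours + b2 * sleep_hours

print(f"b0 = {b0:.1f}")                                # b0 = 35.5
print(f"6h study, 7h sleep -> {predict(6, 7):.1f}")    # 68.1
print(f"at the means       -> {predict(5, 7):.1f}")    # 65.0

# simple b1 = multiple b1 + b2 * (slope of sleep on study), so the gap implies that slope.
simple_b1 = 2.4
implied_sleep_slope = (simple_b1 - b1) / b2
print(f"implied sleep change per extra study hour: {implied_sleep_slope:.2f} h")  # -0.35
```

On these numbers, each extra study hour goes with roughly 0.35 h less sleep, and that lost sleep accounts for the 0.7 pts/h that simple regression leaves out.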

// Common misconceptions

❌ "The largest partial coefficient is the most important variable"

"Study hours (0–10)" and "sleep hours (4–10)" are on different scales, so comparing raw coefficients is misleading. To compare importance, use standardized coefficients (every variable, predictors and outcome alike, rescaled to SD = 1).

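To see why raw coefficients cannot be compared across scales, the sketch below refits the same invented data with study time measured in minutes instead of hours. The raw β₁ shrinks by a factor of 60, while the standardized coefficient (β × SD of x / SD of y) does not move. The data and the helper names `fit` and `standardized` are hypothetical.

```python
import statistics

def fit(x1, x2, y):
    """Least-squares slopes (b1, b2) for y ~ x1 + x2 via centered sums of products."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def standardized(b, x, y):
    """Rescale a raw slope to 'SDs of y per SD of x'."""
    return b * statistics.stdev(x) / statistics.stdev(y)

# Invented data, for illustration only.
study_h = [2, 3, 5, 6, 8, 9]
sleep_h = [8.0, 6.5, 8.0, 6.0, 7.5, 6.0]
score = [57, 59, 68, 65, 76, 74]

b1, b2 = fit(study_h, sleep_h, score)
study_min = [60 * h for h in study_h]   # same information, different unit
b1_min, _ = fit(study_min, sleep_h, score)

print(f"raw b1 per hour:           {b1:.3f}")
print(f"raw b1 per minute:         {b1_min:.4f}")
print(f"standardized b1 (hours):   {standardized(b1, study_h, score):.3f}")
print(f"standardized b1 (minutes): {standardized(b1_min, study_min, score):.3f}")
print(f"standardized b2:           {standardized(b2, sleep_h, score):.3f}")
```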
❌ "More predictors are always better"

R² never decreases when you add variables, but adjusted R² can. Irrelevant variables add noise and destabilize estimates. In the simulator above, a coefficient near zero hints that the variable may be unnecessary.
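Adjusted R² applies the usual degrees-of-freedom penalty: 1 − (1 − R²)(n − 1)/(n − p − 1), where p is the number of predictors. The sketch below plugs in the R² values from the worked example (n = 30); the third model, with R² = 0.572, is a hypothetical weak extra variable added only to show the adjusted value falling.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (intercept not counted)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

n = 30  # students in the worked example
print(f"study only:           {adjusted_r2(0.32, n, 1):.3f}")   # 0.296
print(f"study + sleep:        {adjusted_r2(0.57, n, 2):.3f}")   # 0.538

# Hypothetical third predictor that nudges R-squared from 0.570 to 0.572.
print(f"plus a weak variable: {adjusted_r2(0.572, n, 3):.3f}")  # 0.523
```

Adding sleep raises both measures, but the hypothetical third variable raises R² slightly while adjusted R² drops from 0.538 to 0.523.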

❌ "Multicollinearity doesn't matter"

When two predictors are highly correlated — like "study hours" and "library hours" — individual coefficients become erratic (signs can even flip). Watch for VIF > 10 as a warning sign.
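With exactly two predictors, each predictor's VIF reduces to 1 / (1 − r²), where r is their correlation. The sketch below tabulates it for a few correlations; the helper name `vif_two_predictors` is hypothetical, and models with more predictors need the general form 1 / (1 − R²ⱼ), with R²ⱼ taken from regressing predictor j on all the others.

```python
import math

def vif_two_predictors(r):
    """VIF when there are exactly two predictors with correlation r: 1 / (1 - r^2)."""
    return 1 / (1 - r ** 2)

for r in (-0.45, 0.8, 0.95, 0.99):
    print(f"r = {r:+.2f}  ->  VIF = {vif_two_predictors(r):.2f}")

# Correlation at which the two-predictor VIF reaches the rule-of-thumb value of 10.
print(f"|r| at VIF = 10: {math.sqrt(1 - 1 / 10):.3f}")   # 0.949
```

At r = −0.45, the correlation in the worked example, the VIF is only about 1.25, so that level of correlation shifts β₁ without making the estimates unstable.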

// Patterns you'll meet again

Around multiple regression, the same patterns for interpreting coefficients and comparing tests keep coming back.

• The partial-coefficient reading: β₁ = 3 means "holding x₂ constant, raising x₁ by 1 lifts y by 3 on average." The clause "holding the others fixed" always travels with this reading
• β₁ shifting between simple and multiple regression: with correlated predictors, simple regression fuses direct and indirect effects into one number. Setting the correlation to 0 in the panel above brings the two estimates back into alignment
• R² and adjusted R² as a pair: adding a variable never lowers R², but adjusted R² can fall. If adjusted R² drops, the new variable wasn't worth its cost
• How F-tests and t-tests divide the labor: the F-test checks "all partial coefficients = 0" at once; the t-test asks "is this one coefficient = 0?" One question is model-wide, the other is per-coefficient
• The multicollinearity signal: VIF > 10 is the rule-of-thumb threshold beyond which coefficients tend to become erratic
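The division of labor between F-tests and t-tests can be illustrated with the R² values from the worked example (n = 30). The sketch below computes only the test statistics from the standard R²-based formulas; the function names are hypothetical, the inputs are the rounded figures quoted earlier, and turning the statistics into p-values would need an F distribution, which the Python standard library does not provide.

```python
def overall_f(r2, n, p):
    """F statistic for H0: all p slope coefficients are zero."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))

def partial_f(r2_full, r2_reduced, n, p_full, q):
    """F statistic for H0: the q predictors added to the reduced model are all zero."""
    return ((r2_full - r2_reduced) / q) / ((1 - r2_full) / (n - p_full - 1))

n = 30
print(f"overall F, study + sleep:   {overall_f(0.57, n, 2):.2f}")        # 17.90

f_sleep = partial_f(0.57, 0.32, n, p_full=2, q=1)
print(f"partial F for adding sleep: {f_sleep:.2f}")                      # 15.70

# With a single added coefficient, the partial F equals that coefficient's squared t statistic.
print(f"implied |t| for the sleep coefficient: {f_sleep ** 0.5:.2f}")    # 3.96
```

The overall F asks the model-wide question, while the partial F for one added variable is the per-coefficient t-test in squared form.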

// Further reading

How Far Can Statistics Predict Your Income? Reading income ranges from age, gender, and prefecture — multiple regression as a story
