Simple Linear Regression — OLS Visualized

Add one extra point and the line jumps. Once least squares becomes geometry, β₁ and R² stop reading like a secret code.

M.01 / SIMPLE REGRESSION

Simple Regression — Drawing the OLS Line

So far we have looked at one variable at a time. Real questions involve relationships: height vs. weight, ad spend vs. sales. Simple regression draws one line through two variables, and the t-tests and CIs you just learned power the inference on its slope estimate β̂₁.

Regression with just one explanatory variable is called simple regression. It assumes a linear relationship: when x increases by 1, y moves by β₁ on average. Ordinary least squares (OLS) picks the line that minimizes the sum of squared vertical residuals. Click the canvas to add points and watch the line snap into place. Green bars are residuals. R² (between 0 and 1) measures the fraction of the variance in y that the line explains.

Experiment Guide — try these in order

Step 1: Hit "Random 20 pts" → a regression line and R² appear. Check the green bars (residuals).
Step 2: Click far from the line to add one outlier → the line jerks toward it. Watch how far a single point can drag the fit.
Step 3: CLEAR and place 5 points nearly in a line → R² ≈ 1.0. A near-perfect linear relationship.
Step 4: CLEAR and arrange points in a circle → R² ≈ 0. A line can't capture this pattern.
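
Steps 2 and 3 can also be reproduced away from the canvas. The sketch below is not the simulator's own code: the `fit` helper and the data points are made up for illustration, and it only assumes the standard OLS formulas for a line with an intercept.

```python
def fit(xs, ys):
    """Return (slope, intercept, r2) for a simple OLS fit."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return slope, intercept, 1 - sse / sst


# Made-up data: five points lying almost on y = 2x (as in Step 3)
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.0, 8.1, 9.9]
slope_clean, _, r2_clean = fit(xs, ys)

# Step 2: add a single point far from the line, at (1, 10)
slope_out, _, r2_out = fit(xs + [1], ys + [10.0])

print(f"clean:        slope={slope_clean:.2f}, R^2={r2_clean:.3f}")
print(f"with outlier: slope={slope_out:.2f}, R^2={r2_out:.3f}")
```

With these particular numbers, the single extra point roughly halves the slope and pulls R² well below 0.5, which is the same jerk you see on the canvas.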

↑ Click the canvas to add points

Readouts: Slope β₁ · Intercept β₀ · R² · Correlation r

// Formula used here

ŷ = β₀ + β₁x (prediction equation)
• β₁ (slope): average change in y per 1-unit increase in x
• β₀ (intercept): predicted y when x = 0 — where the line crosses the y-axis
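
For readers who want to see where the two coefficients come from, here is a short derivation sketch. It assumes the usual least-squares setup (minimize the sum of squared vertical residuals over n points) and writes the fitted values with hats, which the text above omits.

```latex
\begin{align*}
S(\beta_0, \beta_1) &= \sum_{i=1}^{n} \bigl( y_i - \beta_0 - \beta_1 x_i \bigr)^2 \\
\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n} \bigl( y_i - \beta_0 - \beta_1 x_i \bigr) = 0
  &\;\Longrightarrow\; \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} \\
\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i \bigl( y_i - \beta_0 - \beta_1 x_i \bigr) = 0
  &\;\Longrightarrow\; \hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{align*}
```

The first condition says the fitted line always passes through the point (x̄, ȳ); substituting it into the second gives the slope.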

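Breaking down the β₁ formula: β₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
• Numerator Σ(xᵢ − x̄)(yᵢ − ȳ): co-variation, how much x and y move together. Each term is positive when xᵢ and yᵢ sit on the same side of their means (both above or both below) and negative otherwise
• Denominator Σ(xᵢ − x̄)²: total variation in x alone
• The ratio: "co-movement per unit of x-spread" = the slope
• More spread in x → more stable slope estimate: the slope's sampling variance is σ² / Σ(xᵢ − x̄)², so a larger denominator means less noise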
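
The breakdown above translates almost line for line into code. This is a minimal sketch, not the simulator's implementation; `fit_line` and the five data points are hypothetical names and values chosen so the arithmetic is easy to check by hand.

```python
def fit_line(xs, ys):
    """Return (beta0, beta1) for simple OLS."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: co-variation of x and y around their means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: total variation in x alone
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar  # the line passes through (x_bar, y_bar)
    return beta0, beta1


# Made-up data: x_bar = 3, y_bar = 4, numerator = 6, denominator = 10
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
beta0, beta1 = fit_line(xs, ys)
print(f"y_hat = {beta0:.2f} + {beta1:.2f} x")
```

By hand: the numerator is 6 and the denominator is 10, so the slope is 0.6 and the intercept is 4 − 0.6 × 3 = 2.2.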

// Common misconceptions

❌ "High correlation = causation"

Ice cream sales and drowning deaths are strongly correlated, but ice cream doesn't cause drowning. The shared cause is temperature, a confounding variable: hot days raise both. Correlation says "they co-move"; it doesn't say "one causes the other."
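
A small simulation can make the confounding visible. Everything here is invented for illustration: the temperature range, the coefficients, and the noise levels are assumptions, not real sales or drowning data. Temperature drives both series, and neither series affects the other.

```python
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5


def residuals(xs, ys):
    """Residuals of ys after a simple OLS fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    b0 = my - b1 * mx
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
temp = [rng.uniform(10, 35) for _ in range(500)]       # simulated temperature
ice = [2.0 * t + rng.gauss(0, 5) for t in temp]        # depends on temp only
drown = [0.3 * t + rng.gauss(0, 1) for t in temp]      # depends on temp only

r_raw = corr(ice, drown)
# Remove the part of each series that temperature explains, then correlate
r_given_temp = corr(residuals(temp, ice), residuals(temp, drown))

print(f"raw correlation:            {r_raw:.2f}")
print(f"after removing temperature: {r_given_temp:.2f}")
```

The raw correlation comes out strong even though the code never links ice cream to drowning; once temperature is regressed out of both, what remains is close to zero.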

❌ "High R² means the model is correct"

R² never decreases when you add more variables, even irrelevant ones. So R² alone can't judge model quality. When comparing models with different numbers of variables, use the adjusted R², which penalizes each added variable.
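
As a rough illustration of that penalty, the sketch below uses the standard adjusted R² formula, 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of observations and p the number of explanatory variables. The function name and the R² values are hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p explanatory variables."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical comparison on n = 20 observations:
# one variable gives R^2 = 0.64; adding four weak ones nudges it to 0.65.
adj_simple = adjusted_r2(0.64, n=20, p=1)
adj_bigger = adjusted_r2(0.65, n=20, p=5)

print(f"p=1: R^2=0.64, adjusted={adj_simple:.3f}")
print(f"p=5: R^2=0.65, adjusted={adj_bigger:.3f}")
```

In this made-up case raw R² goes up slightly while adjusted R² goes down, which is the signal that the extra variables are not earning their place.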

❌ "Fitting a line means there is a linear relationship"

Least squares always produces a line, even for data scattered in a circle. In the simulator above, try "Random 20 pts" or the circle from Step 4 and check R²: when it is near 0, the line exists but the linear relationship doesn't.
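
The circle case is easy to check numerically. This sketch places 12 evenly spaced points on a unit circle (an arbitrary choice for illustration) and fits a line with the usual formulas. In simple regression R² equals the squared correlation, which is how it is computed here.

```python
import math

n = 12
xs = [math.cos(2 * math.pi * k / n) for k in range(n)]
ys = [math.sin(2 * math.pi * k / n) for k in range(n)]

x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

beta1 = sxy / sxx                 # a slope still comes out
beta0 = y_bar - beta1 * x_bar
r2 = sxy ** 2 / (sxx * syy)       # squared correlation = R^2 here

print(f"slope={beta1:.6f}, intercept={beta0:.6f}, R^2={r2:.6f}")
```

The fit returns a perfectly valid, essentially flat line through the center of the circle, and an R² that is zero up to floating-point error.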

// Patterns you'll meet again

In simple regression, how the slope is built and how R² is read always travel together.

How β₁ is assembled: β₁ = Σ(x−x̄)(y−ȳ) / Σ(x−x̄)² = co-variation / x-variation. More spread in x makes the denominator larger and the slope estimate steadier
How R² is read: R² = 0.64 means "64% of y's variance is explained by x"
How r and R² correspond: R² = 0.64 ⇒ r = ±0.8, and the sign of r matches the sign of β₁
The three residual conditions: homoscedasticity, normality, independence. All three need to hold for the least-squares fit to back up the intervals and tests built on it
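
The link between r, R², and the sign of the slope can be confirmed on any small data set. The sketch below uses made-up, downward-trending points and computes R² from the residuals and r from the co-variation, two separate routes that should agree.

```python
import math

# Made-up data with a downward trend
xs = [1, 2, 3, 4, 5]
ys = [9, 8, 6, 5, 2]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

beta1 = sxy / sxx
beta0 = y_bar - beta1 * x_bar

# Route 1: R^2 from residuals, 1 - SSE / SST
sse = sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))
r2 = 1 - sse / syy

# Route 2: correlation coefficient r
r = sxy / math.sqrt(sxx * syy)

print(f"slope={beta1:.2f}, r={r:.4f}, r^2={r * r:.4f}, R^2={r2:.4f}")
```

Here the slope and r are both negative, and squaring r reproduces R². Going the other way, from R² back to r, you need the slope's sign to pick between the two square roots.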

// Further reading

How Far Can Statistics Predict Your Income? Reading income ranges from age, gender, and prefecture — multiple regression as a story


UP NEXT —controlling for everything else ▸ M2 Multiple Regression