A height of 5'3" and a salary of $50,000 — which is further from "normal"?
Comparing inches to dollars sounds meaningless at first glance.
But statistics has a universal translator that puts any number on the same scale.
That translator is called "standardization."
// Comparing the incomparable
Imagine a friend says:
"I'm 6 feet tall."
"I scored 720 on the SAT Math."
Both sound impressive.
But "which one is rarer?" can't be answered as-is.
Why? Because —
- Height is measured in inches, SAT scores in points
- The average height is about 5'9" (69 in), while the average SAT Math is about 530
- Height has low variability, SAT scores have high variability
Units, averages, and spreads are all different. Apples and oranges.
Turning "incomparable" into "comparable" — that's standardization's job.
// The translator revealed
The formula for standardization is just this:
Z = (X − μ) / σ
It works in just two steps.
Step 1: Subtract the mean
(X − μ) gives "how far from the average."
This shifts the baseline to zero.
Step 2: Divide by the standard deviation
Dividing by σ strips away the units — inches, points, dollars all vanish.
What remains is a dimensionless number: "how many σ."
These two steps translate any data into a common language of "mean 0, standard deviation 1."
The translated value is called a z-score.
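The two steps above can be sketched in a few lines of Python. The list of heights here is made up for illustration; the point is that after standardization the data always lands at mean 0, standard deviation 1:

```python
from statistics import mean, pstdev

def standardize(xs):
    """Convert raw values to z-scores: subtract the mean, divide by sigma."""
    mu = mean(xs)
    sigma = pstdev(xs)  # population standard deviation
    return [(x - mu) / sigma for x in xs]

heights = [64, 67, 69, 71, 74]  # hypothetical heights in inches
z = standardize(heights)
print([round(v, 2) for v in z])        # → [-1.47, -0.59, 0.0, 0.59, 1.47]
print(round(mean(z), 10))              # mean is now 0
print(round(pstdev(z), 10))            # standard deviation is now 1
```

Whatever units go in, the z-scores that come out are unit-free.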
// Let's translate
Let's standardize "6 feet tall" and "SAT Math 720" right now.
▼ Height (adult males, US)
With a mean of about 69 in and a standard deviation of roughly 3 in, 6 feet (72 in) gives z = (72 − 69) / 3 = +1.0.
A height of 6 feet is 1.0σ above the mean.
▼ SAT Math
With a mean of about 530 and a standard deviation of roughly 110 points, 720 gives z = (720 − 530) / 110 ≈ +1.73.
An SAT Math 720 is 1.73σ above the mean.
→ Both are impressive, but statistically the SAT score is rarer: 1.73σ beats 1.0σ.
Inches and points are gone. All that's left is "how many σ."
That's the power of standardization — apples and oranges, on the same scale.
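The same comparison takes only a few lines of code. The means and standard deviations below are rough assumed figures (σ ≈ 3 in for height, σ ≈ 110 points for SAT Math), not official values:

```python
def z_score(x, mu, sigma):
    """How many standard deviations x sits above the mean."""
    return (x - mu) / sigma

# Rough, assumed population figures for illustration:
z_height = z_score(72, mu=69, sigma=3)    # 6 ft vs. US adult male mean of 5'9"
z_sat = z_score(720, mu=530, sigma=110)   # SAT Math 720 vs. mean of ~530

print(round(z_height, 2))  # → 1.0
print(round(z_sat, 2))     # → 1.73
```

Two different units, one shared scale: the SAT score sits further from its mean.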
// Why this is actually deep
Standardization is more than a calculation trick.
Think about it for a moment.
What is "normal"?
Standardization implicitly defines "the mean is normal, and distance from it is rarity."
In other words, standardization gives us a mathematical definition of "normal."
Units vanish
When you divide inches by σ (also in inches), the inches cancel out.
What remains is a pure ratio — a "dimensionless quantity" in physics terms.
Standardization is a distillery that extracts only the meaning from data.
The common language of statistics
z-tests, t-tests, confidence intervals, effect sizes —
most of the important tools in statistics use standardization under the hood.
Once you understand standardization, you'll see they're all doing the same thing.
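As a small illustration of one of those tools, here is a one-sample z-test sketched with made-up numbers (a hypothetical class of 100 students against an assumed national SAT Math mean of 530, σ = 110). The test statistic is nothing but a standardized distance:

```python
from statistics import NormalDist

def one_sample_z_test(x_bar, mu0, sigma, n):
    """z statistic: the sample mean's distance from mu0, in standard errors."""
    z = (x_bar - mu0) / (sigma / n ** 0.5)
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
    return z, p

# Hypothetical data: did a class of 100 beat the national mean?
z, p = one_sample_z_test(x_bar=550, mu0=530, sigma=110, n=100)
print(round(z, 2), round(p, 3))
```

The formula inside `one_sample_z_test` is (X − μ)/σ again, with the standard error playing the role of σ.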
// Z-scores and percentiles
In everyday language we often say "top 5%" or "84th percentile."
These are just z-scores with a different coat of paint.
If data follows a normal distribution, every z-score maps to a percentile:
| Location | Z-score | Percentile (approx.) |
|---|---|---|
| Mean | 0.0 | 50th |
| +1σ | +1.0 | 84th |
| +2σ | +2.0 | 98th |
| -1σ | -1.0 | 16th |
| -2σ | -2.0 | 2nd |
A z-score of +2 means you're in the top 2%. A z-score of -1 means about 84% of people scored higher.
The 68-95-99.7 rule is just this table in disguise.
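The table above can be reproduced with Python's standard library, where `statistics.NormalDist` provides the standard normal CDF:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1
for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    pct = std_normal.cdf(z) * 100  # share of the population below this z
    print(f"z = {z:+.1f}: percentile {pct:.0f}")
```

Running it prints the 2nd, 16th, 50th, 84th, and 98th percentiles, matching the table row by row.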
// Quick check — 3 questions
Q1. After standardization, what happens to the original units (inches, points, dollars)?
Q2. Z = -0.5 means "above average" or "below average"?
Q3. If a value is 2 standard deviations above the mean, what is its z-score?
// KEY TAKEAWAY
- Standardization Z=(X−μ)/σ translates any data into "how many σ from the mean"
- Subtract the mean to center at zero, divide by σ to erase units — just two steps
- Z-scores map directly to percentiles under a normal distribution. Same idea, different label
- z-tests, t-tests, confidence intervals — the core tools of statistics all run on standardization under the hood
// Frequently asked questions
Q: What is the formula for standardization?
Z = (X − μ) ÷ σ. Subtract the mean and divide by the standard deviation. It converts data into "how many standard deviations from the mean" — a common, unit-free scale.
Q: What is the difference between a z-score and a percentile?
A z-score tells you how many standard deviations a value is from the mean (e.g., z = +1.5 means 1.5σ above average). A percentile tells you what share of the population falls below that value. Under a normal distribution, z = +1.0 ≈ 84th percentile.
Q: Why does standardization make different units comparable?
Standardization strips away original units (inches, dollars, points) and converts everything into a dimensionless ratio: "σ units from the mean." This lets you compare inherently different measurements — like height and test scores — on a single scale of "how unusual is this?"
Q: Where is standardization used?
Hypothesis testing (z-tests, t-tests), machine learning preprocessing (feature scaling), effect sizes (Cohen's d), and comparing different metrics. It is fundamental to nearly all of inferential statistics.
Q: Does standardization change the shape of a distribution?
No. Standardization only shifts (subtract μ) and rescales (divide by σ), so the shape of the distribution is perfectly preserved. A skewed distribution stays skewed after standardization. A normal distribution becomes the standard normal N(0,1).
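A quick sketch that checks this shape-preservation claim on toy data: skewness (the third standardized moment) is unchanged by subtracting μ and dividing by σ:

```python
from statistics import mean, pstdev

def standardize(xs):
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]

def skewness(xs):
    """Third standardized moment: positive for a right-skewed sample."""
    mu, sigma = mean(xs), pstdev(xs)
    return mean(((x - mu) / sigma) ** 3 for x in xs)

raw = [1, 1, 2, 2, 3, 10]  # right-skewed toy data
print(round(skewness(raw), 6))               # clearly positive (skewed)
print(round(skewness(standardize(raw)), 6))  # same value after standardizing
```

Both prints show the same number: the distribution's shape survives the translation intact.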
Standardization is the Rosetta Stone of statistics.
Now take this common language and feel how probability calculations and hypothesis tests actually work.
StatPlay's interactive tools let you touch what lives beyond the formulas.