StatPlay: a visualization lab for building intuition for the Statistics Certification (統計検定) Grade 2 exam


P.01 / STANDARD NORMAL

Standard Normal: The Origin of Everything

Let's start with the curve that runs through all of statistics: the standard normal. The CLT, hypothesis tests, confidence intervals, everything circles back here. Start by playing with the bell curve.

New here?

No prerequisites. This is the starting point of statistics. If you've heard the phrase "bell curve," you're ready.

Honestly, without this single curve, none of what follows (tests, confidence intervals, the t-distribution, regression) would work.
The standard normal N(0, 1) is a bell curve with mean 0 and standard deviation 1. The one-line trick z = (x − μ) / σ maps any normal distribution onto this same curve, and that is what let statisticians a hundred years ago compute probabilities for any normal quantity from a single printed table.
In other words, it's not the final boss of statistics; it's the origin. Once you own this, the remaining pages read straight through as applications of the standard normal.

z = (x − μ) / σ , φ(z) = (1/√(2π)) · exp( −z²/2 )

z is "how many standard deviations x sits from the mean." φ(z) is the height of the curve at that point (the probability density). exp is the function that produces the bell shape; no need to memorize it.

▶ "68 - 95 - 99.7": no memorization, just see it

Slide the width k; the blue-filled area is the probability. ±1σ already covers about 68%, ±2σ about 95%, ±3σ about 99.7%.
If z = 1.96 looks familiar, that's the two-sided 5% critical value. Hypothesis tests and confidence intervals both start from 1.96, which shows how central this curve is.

Range ±kσ = 1.0

1.0 means ±1σ around the mean. Set it to 1.96 for the famous two-sided 5% critical value, the starting point for tests and confidence intervals.

Readouts: P(|Z| ≤ k), P(Z ≤ k), outside probability.
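The three readouts of this panel can be reproduced in a few lines. This is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `central_area` is made up for illustration and is not part of the lab.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # the standard normal N(0, 1)

def central_area(k: float) -> float:
    """P(|Z| <= k): the area between -k and +k."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1.0, 1.96, 2.0, 3.0):
    inside = central_area(k)
    print(f"k={k:<4}  P(|Z|<=k)={inside:.4f}  P(Z<=k)={Z.cdf(k):.4f}  outside={1 - inside:.4f}")
```

Running it should show the 68 / 95 / 99.7 pattern, with k = 1.96 leaving about 5% outside.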

▶ Watch every normal collapse onto "that one curve"

Height, IQ, daily stock returns, factory part errors: real-world roughly-normal quantities all have different means and spreads. Apply z = (x − μ) / σ and they all land exactly on that pink curve.
The animation plays automatically as you scroll (press ▶ to replay). This is why a single standard-normal table is enough for every normal distribution.

μ = 2.0

Center of this normal distribution. Standardization moves it to 0.

σ = 1.5

Spread. Larger σ gives a flatter, wider bell. Standardization rescales it to 1.

Progress = 0

0% = original, 100% = fully standardized. You can watch the intermediate stages.

Original distribution: N(2.0, 1.5²)

Readouts: transformed mean, transformed σ.

▼ What comes next
The Central Limit Theorem coming up says that the standardized mean of samples from almost any distribution (finite variance is enough) approaches the standard normal. Confidence intervals and hypothesis tests both use this curve's "±1.96" cutoff. The t, χ², and F distributions are its siblings. Even the estimation error of regression coefficients is approximated with the standard normal.
Short version: nail this one page and everything else becomes an application. Have fun.

UP NEXT: the normal as a tool ▸ P.02 Normal Distribution


P.02 / NORMAL DISTRIBUTION

Normal Distribution & Standardization

The standard normal was fixed at μ=0, σ=1. Real data can have any mean and spread. Move μ and σ to use the normal as a tool. Standardizing back to Z lets you go back and forth between any normal distribution and the standard normal.

New here?

It's easier if you know the standard normal (N(0,1), the bell curve with mean 0 and standard deviation 1).

The general version of the standard normal is the normal distribution N(μ, σ²). μ sets the location (where the center is) and σ sets the spread (how scattered the values are). Move the sliders and the curve glides; the probability of landing in the chosen interval [a, b] (the pink area) updates live.
That pink area is what a "percentage of people" really is. Say adult male heights follow N(170, 36) (mean 170 cm, σ = 6 cm). What share falls between 165 and 175 cm? Set μ=170, σ=6, a=165, b=175 and you get about 59.5%. Deviation scores (偏差値), test marks, IQ, measurement errors: for anything roughly normal, "X% of people fall in this range" comes from exactly this area.
Tip: drag directly on the graph to move whichever of the a/b bounds is closer.

f(x) = (1 / √(2πσ²)) · exp( −(x−μ)² / (2σ²) )

The general version of φ(z) from before. μ is the center, σ the spread. All it says is that the farther x is from μ, the lower the probability density.

μ (mean) = 0

Center of the curve. Set it to 170 for height, 60 for a test score, and so on.

σ (std dev) = 1

Spread. Larger means more variation. With σ=6, about 95% of people fall within ±12 of the mean.

Interval [a, b]: a = -1

Left edge of the range whose probability you want. Together with b, the pink area shows the fraction inside.

b = 1

Right edge of the range. Set it to a value greater than a.

Readouts: P(a ≤ X ≤ b), z-score (a), z-score (b).
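The height example can be checked by standardizing both bounds and reading the area off the standard normal. A minimal sketch, assuming the standard-library `statistics.NormalDist`; `interval_probability` is an illustrative name, not the lab's own code.

```python
from statistics import NormalDist

def interval_probability(mu, sigma, a, b):
    """Return P(a <= X <= b) for X ~ N(mu, sigma^2), plus the z-scores of a and b."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    std = NormalDist()  # N(0, 1)
    return std.cdf(z_b) - std.cdf(z_a), z_a, z_b

# Heights N(170, 36): mean 170 cm, sigma 6 cm, interval 165 to 175 cm.
p, z_a, z_b = interval_probability(mu=170, sigma=6, a=165, b=175)
print(f"z(a)={z_a:.3f}, z(b)={z_b:.3f}, P={p:.4f}")
```

The z-scores come out at about ±0.833 and the area at roughly 0.595; a slider with coarse steps may show a slightly different last digit.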

UP NEXT: probability rules with Venn diagrams ▸ P.03 Probability Rules


P.03 / PROBABILITY RULES

Probability Rules: Build Intuition with Venn Diagrams

You've got the shape of the normal. Now step back to the foundations: the basic probability rules that hold everything up. Grasp the addition rule, the multiplication rule, and conditional probability intuitively as areas.

New here?

No knowledge of the normal distribution needed. This page reads as an extension of high-school counting and probability.

Addition rule, multiplication rule, conditional probability: see them as areas before memorizing formulas.
P(A∪B) = P(A) + P(B) − P(A∩B) is just "add the areas of the two circles, then subtract the overlap so it isn't counted twice." The conditional probability P(A|B) is "the fraction of B's circle occupied by A."
Press the Independence button and P(A∩B) is set automatically to P(A)·P(B). That's what independence means.

Concrete example
Draw one card from a 52-card deck. A = a heart (13/52 = 0.25), B = a face card (12/52 ≈ 0.23).
A∩B = a heart face card (3/52 ≈ 0.06). → P(A∪B) = 0.25 + 0.23 − 0.06 = 0.42 (exactly 22/52).
Independence example: two dice. A = the first die is even, B = the second die is 3 or more. The first roll doesn't affect the second, so they're independent. P(A∩B) = 1/2 × 2/3 = 1/3.

P(A|B) = P(A∩B) / P(B) , P(A∪B) = P(A) + P(B) − P(A∩B)

Left: "in the world where B happened, the fraction where A also happened." Right: "the probability that at least one of A or B happens" = add both, subtract the overlap.
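The card example can be verified by brute-force counting over the deck. A minimal sketch using exact fractions; the deck encoding and the helper `prob` are illustrative choices, not anything from the lab.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely cards

def prob(event):
    """Probability of an event given as a predicate on a card."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_heart(card):
    return card[1] == "hearts"

def is_face(card):
    return card[0] in {"J", "Q", "K"}

p_a = prob(is_heart)
p_b = prob(is_face)
p_both = prob(lambda c: is_heart(c) and is_face(c))
p_union = p_a + p_b - p_both     # addition rule
p_a_given_b = p_both / p_b       # conditional probability P(A|B)

print(p_union, float(p_union))
print(p_a_given_b)
print(p_both == p_a * p_b)       # True exactly when A and B are independent
```

Counting directly also shows that "heart" and "face card" happen to be independent here: a quarter of the face cards are hearts, the same share as in the whole deck.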

▶ Interactive Venn Diagram

P(A) = 0.40

Probability that event A occurs. 0.40 = 40%. Changes the size of the left circle.

P(B) = 0.30

Probability that event B occurs. Changes the size of the right circle.

P(A∩B) = 0.12

Probability that A and B both occur: the overlap of the circles. If they're independent it equals P(A)×P(B).

Readouts: P(A∪B), P(A|B), P(B|A), Independent?

UP NEXT: discrete and exponential distributions ▸ P.04 Discrete & Exponential


P.04 / DISCRETE + EXPONENTIAL

Discrete & Exponential Distributions

The normal is continuous and symmetric, but real data isn't always. Coin flips are discrete, customer arrivals follow a Poisson distribution, waiting times are exponential. Complete the toolbox with the remaining distributions.

New here?

Knowing the basic shape of the normal distribution is enough. Here you'll learn three distributions other than the normal.

The three distributions tested on the Grade 2 exam: binomial, Poisson, and exponential. Number of successes, number of events, waiting time: use the sliders to feel the bridge between discrete and continuous.

Binomial B(n, p)

When to use it → how many heads in 10 coin flips; how many defective items out of 100 inspected

Trials n = 20

How many times you flip the coin. As n grows, the shape approaches the normal distribution.

Success probability p = 0.35

Probability of success on a single trial. 0.5 = a fair coin, 0.1 = a 10% defect rate.

As n → ∞ with np → λ, the binomial approaches the Poisson.

Poisson(λ)

When to use it → number of calls reaching a call center per hour; number of traffic accidents per day

Rate λ = 3

Average number of events per unit time. Example: 3 calls per hour on average → λ=3.

As λ grows, the Poisson approaches the normal distribution.
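The binomial-to-Poisson limit mentioned above can be seen numerically by holding np = λ fixed and letting n grow. A minimal sketch with hand-written pmfs; the function names and the choice to compare over k = 0..10 are assumptions made for illustration.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

def largest_gap(n, lam, k_max=10):
    """Largest pmf difference over k = 0..k_max when p = lam / n."""
    p = lam / n                      # keeps np = lam fixed while n grows
    return max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1))

for n in (10, 100, 1000):
    print(f"n={n:>4}: largest gap = {largest_gap(n, lam=3.0):.5f}")
```

The gap should shrink steadily as n increases, which is the sense in which B(n, λ/n) "approaches" Poisson(λ).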

Exponential Exp(λ): waiting time

When to use it → waiting time until the next phone call; lifetime of a light bulb until it burns out

Rate λ = 1

Average number of events per unit time. Larger λ means shorter waits (the mean wait is 1/λ).

Memoryless: if you've already waited 10 minutes for the bus, your remaining wait has the same distribution as that of someone who just arrived. Past waiting time doesn't affect the future.
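Memorylessness is a one-line identity on the survival function P(T > t) = exp(−λt). A minimal sketch; the rate of 0.1 per minute (one bus per 10 minutes on average) is a hypothetical value chosen for illustration.

```python
from math import exp

def survival(t, lam):
    """P(T > t) for T ~ Exp(lam)."""
    return exp(-lam * t)

lam = 0.1                      # hypothetical rate: one bus per 10 minutes on average
waited, extra = 10.0, 5.0

# P(T > waited + extra | T > waited) = P(T > waited + extra) / P(T > waited)
conditional = survival(waited + extra, lam) / survival(waited, lam)
fresh = survival(extra, lam)   # someone who just arrived
print(conditional, fresh)
```

Both numbers agree up to floating-point rounding because exp(−λ(s + t)) / exp(−λs) = exp(−λt).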

UP NEXT: flipping the conditional ▸ P.05 Bayes' Theorem


P.05 / BAYES THEOREM

Bayes' Theorem

The distribution toolbox is complete. The last step is to flip a conditional probability around: that's Bayes' theorem. You test positive; what's the probability you're actually sick? Intuition often fails here.

New here?

It's easier if you know what the conditional probability P(A|B) means. Even if you don't, playing with the sliders makes it intuitive.

"You tested positive on a test with 99% sensitivity and 95% specificity." So is there a 99% chance you're sick?
Answer: only 16.7% when 1% of people (10 in 1,000, the lab's default) have the disease. It's a famous quiz that more than half of doctors reportedly get wrong.
The catch is forgetting that very few people have the disease to begin with. Watch the "town of 1,000" below, move the three sliders, and see it with your own eyes.

Prevalence (how many of 1,000 are sick?) = 10 / 1000

Lower means a rarer disease. Drag right for a common disease.

Sensitivity (share of sick people the test correctly flags as positive) = 99%

Of 100 sick people, how many does the test flag as positive?

Specificity (share of healthy people the test correctly clears as negative) = 95%

Of 100 healthy people, how many does the test correctly report as negative? The rest are wrongly flagged positive: false positives.

Readouts: chance you're really sick if you tested positive, chance you're really healthy if you tested negative, true positives TP (sick and positive), false positives FP (healthy but positive).
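The "town of 1,000" bookkeeping is easy to write out. A minimal sketch under the slider defaults shown above (prevalence 10 in 1,000, sensitivity 99%, specificity 95%); the function name and the returned dictionary keys are illustrative, not the lab's own.

```python
def screening_outcomes(prevalence, sensitivity, specificity, population=1000):
    sick = population * prevalence
    healthy = population - sick
    tp = sick * sensitivity        # sick and flagged positive
    fn = sick - tp                 # sick but missed
    tn = healthy * specificity     # healthy and cleared
    fp = healthy - tn              # healthy but flagged positive
    ppv = tp / (tp + fp)           # P(sick | positive)
    npv = tn / (tn + fn)           # P(healthy | negative)
    return {"TP": tp, "FP": fp, "PPV": ppv, "NPV": npv}

result = screening_outcomes(prevalence=0.01, sensitivity=0.99, specificity=0.95)
print(result)
```

With these inputs there are about 9.9 true positives against 49.5 false positives, so only about one positive in six is actually sick. Raising the prevalence argument shows how quickly that share climbs.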

UP NEXT: averages become normal ▸ I.01 Central Limit Theorem


I.01 / CENTRAL LIMIT THEOREM

Central Limit Theorem

You've got the basics of the normal distribution. From here on it's statistical inference. What shape does the distribution of an average of many observations take? This is where the Central Limit Theorem kicks in. Dice, Poisson, the starting point barely matters: once you average, the result gets pulled toward that same normal curve.

New here?

If you know what "normal distribution" and "mean" signify, you're fine. Here you'll experience why the mean becomes normal.

A slightly wild fact: however skewed the original distribution is (as long as its variance is finite), if you repeatedly draw n values and average them, the distribution of those averages turns bell-shaped (normal) on its own as n grows.
The lab below puts the original, heavily skewed distribution on the left and the distribution of sample means on the right, so you can watch the bell emerge. The larger n gets, the narrower the bell on the right becomes (SE = σ/√n).

Base distribution (the skewed one)

Sample size n = 30

How many values are drawn per sample. Larger n gives a narrower peak for the sample mean (SE = σ/√n).

Readouts: trials (starts at 0), mean of sample means, SD of sample means, theoretical SE = σ/√n.
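The lab's experiment can be imitated with a short simulation. A minimal sketch, assuming an Exp(1) base distribution (mean 1, standard deviation 1, strongly right-skewed) as a stand-in for "the skewed one"; the seed and trial count are arbitrary.

```python
import random
from statistics import mean, stdev

random.seed(0)
n, trials = 30, 5000
lam = 1.0                      # Exp(1): mean 1, sd 1, strongly right-skewed

# One sample mean per trial, each from n fresh draws.
sample_means = [
    sum(random.expovariate(lam) for _ in range(n)) / n
    for _ in range(trials)
]

theoretical_se = (1 / lam) / n ** 0.5
print(f"mean of sample means = {mean(sample_means):.3f}")
print(f"SD of sample means   = {stdev(sample_means):.3f}")
print(f"theoretical SE       = {theoretical_se:.3f}")
```

The simulated SD of the sample means should land close to σ/√n, and a histogram of `sample_means` would look far more bell-like than the exponential it came from.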

UP NEXT: does the sample mean really approach the true value? ▸ I.02 Law of Large Numbers


I.02 / LAW OF LARGE NUMBERS

Law of Large Numbers

The CLT told us that averages become normal. But does the sample mean actually approach the true mean as n grows? The Law of Large Numbers guarantees it. If the CLT is about the shape, the LLN is about the center staying put.

New here?

It's fine if you haven't seen the CLT. "Flip a coin tens of thousands of times and the share of heads approaches 50%": that's all this page is about.

Ten heads in a row at the very start of a coin-flipping run can happen (about 1 chance in 1,024). But flip 10,000 times and the share of heads settles very close to 0.5.
That's the Law of Large Numbers: the more samples you draw, the more the observed average is pulled toward the true value. This is why statistics counts as evidence rather than a vague hunch.

Probability p = 0.5

Probability of heads. 0.5 is a fair coin. Set it to 0.7 for a biased coin; the Law of Large Numbers still holds.

Readouts: trials (starts at 0), current mean, theoretical value (0.50).
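A running share of heads shows the convergence directly. A minimal sketch with a simulated coin; the seed, the checkpoints, and the total of 100,000 flips are arbitrary choices for illustration.

```python
import random

random.seed(1)
p = 0.5                        # probability of heads; try 0.7 for a biased coin
heads = 0
checkpoints = {10, 100, 1_000, 10_000, 100_000}
running = {}

for i in range(1, 100_001):
    heads += random.random() < p       # True counts as 1
    if i in checkpoints:
        running[i] = heads / i

for i in sorted(running):
    print(f"after {i:>6} flips: share of heads = {running[i]:.4f}")
```

Early checkpoints can be visibly off, while the later ones hug p. The typical error shrinks like 1/√n, so it takes 100 times more flips to gain one more decimal place.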

UP NEXT: how to express uncertainty with a finite sample ▸ I.03 Confidence Interval


I.03 / CONFIDENCE INTERVAL

Confidence Interval

The LLN says "at infinity, you're right." But we only ever have a finite sample. So cast a net around the point estimate to catch the true value: that's a confidence interval. A wider net catches more often; a narrower one is more precise. See the trade-off with your own eyes.

New here?

If you know what "sample mean" and "standard deviation" mean, you're ready. Here you'll learn to express estimation uncertainty as a width.

The 95% confidence interval is a famously misunderstood concept.
It does not mean "the true value is inside this interval with 95% probability." The correct reading is "repeat the same sampling hundreds of times, and about 95% of the resulting intervals capture the true value."
The lab below demonstrates that by brute force. The thin pink lines are the unlucky intervals that missed. Once you've seen the pink share settle at around 5% overall, you've essentially understood confidence intervals.

x̄ ± z_α/2 · σ/√n

In words: sample mean ± a multiplier set by the confidence level × the standard error (spread ÷ √sample size). A "net" of this width catches the true value.

n = 30

Size of each sample. Larger n gives a narrower interval (higher precision).

Confidence = 95%

The percentage of intervals that catch the true value. At 95%, about 5 out of 100 intervals turn pink (misses).

Readouts: intervals built (starts at 0), coverage, expected coverage (95%).
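The repeated-sampling reading of "95%" can be checked by building many intervals and counting how many contain the true mean. A minimal sketch, assuming a normal population with known σ so that the formula x̄ ± z·σ/√n applies; the population values μ = 50 and σ = 10 are hypothetical.

```python
import random
from statistics import NormalDist

random.seed(2)
mu, sigma = 50.0, 10.0         # hypothetical population, sigma treated as known
n, confidence, repeats = 30, 0.95, 2000

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
half_width = z * sigma / n ** 0.5

hits = 0
for _ in range(repeats):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    if xbar - half_width <= mu <= xbar + half_width:
        hits += 1

coverage = hits / repeats
print(f"z = {z:.3f}, coverage = {coverage:.3f}")
```

The coverage should hover near 0.95. Any single interval either contains μ or it doesn't; the 95% describes the procedure across repetitions.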

UP NEXT: from a width to YES/NO ▸ I.04 Hypothesis Testing


I.04 / HYPOTHESIS TESTING

Hypothesis Testing

If a confidence interval expresses uncertainty as a width, a hypothesis test uses it as a YES/NO. In the world of the null hypothesis, could this data plausibly happen? If it's too unlikely, reject. Same distribution, same σ/√n; only the question differs.

New here?

It's easier if you understand the idea of confidence intervals and the basics of the normal distribution. Having merely heard of the "p-value" is enough.

Testing is very easy to follow if you think of it as a trial.
Provisionally assume "H₀ (the null hypothesis): this drug has no effect" (= innocent). If the test statistic z computed from the data falls in the rejection region chosen in advance, you convict, that is, you reject H₀.
Two panels below: ① the geometry of z and the rejection region (two-sided, right-sided, left-sided), and ② the trade-off between false convictions (α) and misses (β).

▶ ① Basics: z-statistic & rejection region

Observed z = 1.96

The test statistic computed from the data. If it falls in the rejection region (the red area), reject H₀.

α = 0.05

Significance level: how much risk of a false conviction you accept. 0.05 (5%) is the usual choice.

Test type (two-sided, right-sided, left-sided)

Readouts: test statistic z, critical value, p-value, decision.
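The panel's readouts (critical value, p-value, decision) follow from the standard normal alone. A minimal sketch, assuming the standard-library `statistics.NormalDist`; the function `z_test` and its `kind` labels are illustrative names, not the lab's API.

```python
from statistics import NormalDist

STD = NormalDist()

def z_test(z, alpha=0.05, kind="two-sided"):
    """Return (critical value, p-value, reject?) for an observed z."""
    if kind == "two-sided":
        crit = STD.inv_cdf(1 - alpha / 2)
        p = 2 * (1 - STD.cdf(abs(z)))
        reject = abs(z) >= crit
    elif kind == "right":
        crit = STD.inv_cdf(1 - alpha)
        p = 1 - STD.cdf(z)
        reject = z >= crit
    elif kind == "left":
        crit = STD.inv_cdf(alpha)
        p = STD.cdf(z)
        reject = z <= crit
    else:
        raise ValueError("kind must be 'two-sided', 'right', or 'left'")
    return crit, p, reject

for kind in ("two-sided", "right", "left"):
    crit, p, reject = z_test(1.96, alpha=0.05, kind=kind)
    print(f"{kind:>9}: critical={crit:+.3f}  p={p:.4f}  reject H0: {reject}")
```

Note that "z falls in the rejection region" and "p-value ≤ α" are the same decision stated two ways.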

▶ ② Two kinds of errors: α, β, power

A test can go wrong in two ways.
Type I error α: rejecting H₀ when it is actually true (a false conviction).
Type II error β: failing to reject H₀ when H₁ is actually true (letting the real culprit go).
And 1 − β is the power. Move the effect size δ (the size of the true difference) or α and the blue (H₀) and purple (H₁) curves push against each other, making the trade-off visible: fewer false convictions means more misses.
Tip: drag left and right on the chart to move the critical value (the α boundary) directly.

Effect size δ = 2.0

Effect size = the magnitude of the true difference, a measure of how much the drug actually works. Larger is easier to detect.

α = 0.050

Making α stricter (smaller) reduces false convictions but increases misses (β). Watch the trade-off.

Readouts: α (Type I error), β (Type II error), power 1 − β.
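The α versus β trade-off can be computed rather than eyeballed. A minimal sketch under explicit assumptions that the page does not state: a right-tailed z-test, z ~ N(0, 1) under H₀, and z ~ N(δ, 1) under H₁ (δ measured in standard-error units). The function name is illustrative.

```python
from statistics import NormalDist

STD = NormalDist()

def error_rates(delta, alpha):
    """Right-tailed z-test: H0 puts z ~ N(0, 1), H1 puts z ~ N(delta, 1)."""
    crit = STD.inv_cdf(1 - alpha)          # boundary of the rejection region
    beta = NormalDist(delta, 1).cdf(crit)  # H1 is true but z lands below the boundary
    return crit, beta, 1 - beta

for alpha in (0.10, 0.05, 0.01):
    crit, beta, power = error_rates(delta=2.0, alpha=alpha)
    print(f"alpha={alpha:.2f}  critical={crit:.3f}  beta={beta:.3f}  power={power:.3f}")
```

Under these assumptions, tightening α pushes the boundary to the right, so β grows and power drops; a larger δ pulls the H₁ curve away from the boundary and restores power.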

UP NEXT: into the world where σ is unknown ▸ I.05 t, χ², F


I.05 / t · χ² · F DISTRIBUTIONS

The Three Test Distributions

Until now, tests of a mean assumed σ was known. In reality σ has to be estimated too, and the moment you do, Z turns into a t distribution. To test a variance directly, use χ²; to compare two variances, use F. All are descendants of N(0,1); the name changes with what you don't know.

New here?

It's easier if you know the basics of the standard normal and hypothesis testing. The theme is "what do you do when the population variance is unknown?"

t, χ², and F are all derived distributions built from the normal. Think of them as "the standard normal at heart, rescaled to reflect the reality that all our information comes from a sample."
Rough guide to which to use: t for testing a mean without knowing the population variance (nearly every real-world test of a mean); χ² for testing a variance itself, and for independence and goodness-of-fit tests on categorical data; F for testing a ratio of variances (ANOVA, the overall F-test in regression).
Move the degrees of freedom df: t converges to N(0,1) as df→∞, and χ² and F get closer to a symmetric bell as df grows. Behind the scenes, that is the central limit theorem at work.

▶ t distribution

Built from: t = Z / √(χ²ₖ/k), where Z ~ N(0,1) is independent of χ²ₖ.
Use for: testing a mean when the population variance is unknown; t-values of regression coefficients.
Example: checking whether the average score of a class of 30 differs from the national average.
Character: heavier tails than the normal (more tolerant of outliers). Matches N(0,1) as df→∞.

In short: because σ is estimated from the sample rather than known, the extra uncertainty makes this a normal curve with fatter tails. With more data (df→∞) it returns to the normal.

df = 3

Degrees of freedom ≈ sample size − 1. Small df means heavy tails (allowing for extreme values); large df matches the normal distribution.

↔ Drag the graph horizontally to change df

▶ χ² distribution

Built from: χ²ₖ = Z₁² + Z₂² + ... + Zₖ² (the sum of squares of k independent standard normals).
Use for: tests of a variance; chi-squared tests of independence and goodness-of-fit.
Example: checking whether a die's faces come up evenly, or whether yes/no survey answers are lopsided.
Character: non-negative and right-skewed. Mean = k, variance = 2k. Becomes a normal-like bell for large df.

df (k) = 3

Number of independent standard normal variables being squared and summed. Mean = df, variance = 2×df. The larger it gets, the closer to symmetric.

↔ Drag the graph horizontally to change df
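The construction "sum of k squared standard normals" can be simulated directly to check the stated mean k and variance 2k. A minimal sketch; the seed and the number of draws are arbitrary.

```python
import random
from statistics import mean, variance

random.seed(3)
k, draws = 3, 20000

# Each draw is Z1^2 + ... + Zk^2 with independent standard normals.
chi2_draws = [sum(random.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(draws)]

print(f"sample mean     = {mean(chi2_draws):.3f}  (theory: {k})")
print(f"sample variance = {variance(chi2_draws):.3f}  (theory: {2 * k})")
```

Increasing k makes a histogram of `chi2_draws` less skewed, which is the "bell for large df" behavior described above.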

▶ F distribution

Built from: F = (χ²ₘ/m) / (χ²ₙ/n) (the ratio of two independent χ² variables, each divided by its df).
Use for: analysis of variance (ANOVA); the overall F-test of a regression model.
Example: checking whether three classes differ in average score (one-way ANOVA).
Character: non-negative and right-skewed. The shape changes with the numerator and denominator df.

df1 = 3

Numerator degrees of freedom. In ANOVA it equals the number of groups − 1.

df2 = 10

Denominator degrees of freedom. In ANOVA it equals total observations − number of groups. Larger means a more stable distribution.

↔ Drag horizontally to change df₁ · df₂ is set with the slider

UP NEXT: quantifying "deviation" and testing it ▸ I.06 Chi-Squared Test


I.06 / CHI-SQUARED TEST

Chi-Squared Test

So far we've tested means. But some data is purely categorical: survey choices, die faces, disease status crossed with smoking. The chi-squared test measures the "deviation" in data like this. The further the counts are from what's expected, the larger the χ² statistic.

Goodness-of-fit test: asks "does the observed category distribution match a theoretical one?" Whether a die is fair is the classic example.
Test of independence: asks "are two categorical variables independent?" For each cell of the cross-tabulation, compute the deviation from the expected count and sum χ² = Σ (O−E)²/E.
Why divide by E? → "Off by 2 against an expected 10" and "off by 2 against an expected 1,000" don't carry the same weight. Dividing by E puts the deviations on a relative scale.
Both use a statistic that (approximately) follows a χ² distribution, and the right-tail area is the p-value. Degrees of freedom: k−1 for goodness-of-fit, (r−1)(c−1) for independence.
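For the fair-die case the statistic is a single sum. A minimal sketch with made-up counts for 60 rolls. To stay within the standard library it compares against the tabulated upper 5% point of χ² with 5 df (11.070) instead of computing a p-value.

```python
def chi_square_gof(observed, expected):
    """Pearson statistic: sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 9, 12, 11, 6, 14]          # hypothetical counts for faces 1..6
n = sum(observed)                          # 60 rolls
expected = [n / 6] * 6                     # a fair die expects 10 per face

stat = chi_square_gof(observed, expected)
df = len(observed) - 1
CRITICAL_5PCT_DF5 = 11.070                 # upper 5% point of chi-squared with 5 df
print(f"chi2 = {stat:.2f}, df = {df}, reject H0 at 5%: {stat > CRITICAL_5PCT_DF5}")
```

Here the statistic works out to 4.2, well below the cutoff, so these counts give no reason to doubt the die.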

▶ ① Goodness-of-Fit: Is the Die Fair?

🎲 Click a bar on the left to add one roll of that face (Shift+click to subtract one)

Significance level α = 0.05

The threshold for rejecting the null hypothesis. 0.05 means "if a deviation at least this large would occur less than 5% of the time under H₀, reject H₀."

Readouts: rolls n (starts at 0), test statistic χ², degrees of freedom df, p-value, decision.

▶ ② Test of Independence: Are Two Variables Independent?

Click a cell on the left to add 1 to its observed count (Shift+click to subtract 1)

Significance level α = 0.05

Significance level for the independence test. Smaller means a more cautious decision.

Readouts: total n (starts at 0), test statistic χ², degrees of freedom df, p-value, decision.
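For the independence test, the expected count in each cell is (row total × column total) / n. A minimal sketch for a general r × c table; the 2 × 2 counts are hypothetical, and 3.841 is the tabulated upper 5% point of χ² with 1 df.

```python
def chi_square_independence(table):
    """Pearson chi-squared statistic and df for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # count expected under independence
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

table = [[30, 20],     # hypothetical 2 x 2 counts
         [20, 30]]
stat, df = chi_square_independence(table)
CRITICAL_5PCT_DF1 = 3.841
print(f"chi2 = {stat:.2f}, df = {df}, reject H0 at 5%: {stat > CRITICAL_5PCT_DF1}")
```

Every expected count is 25 here, so the statistic is 4 × (5²/25) = 4.0, just past the 5% cutoff. A table whose rows are exactly proportional gives a statistic of 0.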

UP NEXT: catching a relationship with a line ▸ M.01 Simple Regression


M.01 / SIMPLE REGRESSION

Simple Regression (OLS)

Up to here it's been about one variable at a time. Real problems ask about relationships, like height and weight or ad spend and sales. Simple regression just draws one straight line through two variables, but behind its slope β̂ the t-tests and confidence intervals you just learned are hard at work.

New here?

If you know what "mean," "standard deviation," and "correlation" mean, you're set. If you can recall y = ax + b from middle school, perfect.

Regression with a single explanatory variable is simple regression. It assumes a linear relationship: when x increases by 1, y moves by β₁. Ordinary least squares (OLS) picks the line that minimizes the sum of the squared vertical distances (residuals) from all the points. Click the canvas to add a point and the regression line jumps to its new position. The green bars are the residuals. R² (between 0 and 1) measures how much of the variation in y the line explains.

ŷ = β₀ + β₁x , β₁ = Σ(xᵢ−x̄)(yᵢ−ȳ) / Σ(xᵢ−x̄)²

The numerator of β₁ is "how much x and y move together"; the denominator is "how much x varies." Dividing gives "how much y moves when x increases by 1."
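The slope formula translates almost word for word into code. A minimal sketch on made-up data; it also uses the standard facts that the fitted line passes through (x̄, ȳ), which gives the intercept β₀ = ȳ − β₁x̄, and that R² equals r² in simple regression.

```python
from statistics import mean

def simple_ols(xs, ys):
    """Least-squares slope, intercept, R^2 and correlation r for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar      # the fitted line passes through (x_bar, y_bar)
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r ** 2, r

xs = [1, 2, 3, 4, 5]                       # hypothetical data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r2, r = simple_ols(xs, ys)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.4f}, r={r:.4f}")
```

For these points the slope comes out at 1.99 and the intercept at 0.05, with R² very close to 1 because the data lie almost on a line.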

↑ Click the canvas to add points

Readouts: n (starts at 0), slope β₁, intercept β₀, R², correlation r.

UP NEXT: removing other influences ▸ M.02 Multiple Regression


M.02 / MULTIPLE REGRESSION

Multiple Regression

Simple regression is one line. But often you want to remove other influences, for example to see the effect of ad spend with day of week and season held fixed. That's multiple regression. The number of axes grows, and each partial regression coefficient is the effect after controlling for the other variables.

New here?

It's easier if you know simple regression (drawing one straight line). Here the line is extended to a plane.

With two or more explanatory variables it's multiple regression. You handle several factors at once, such as predicting y (test score) from x₁ (e.g., study hours) and x₂ (e.g., sleep hours). Instead of a regression line you get a regression plane. β₁ is the effect on y of increasing x₁ by 1 with the other variables held fixed; β₂ is the same for x₂. Set the true parameters, generate data, and compare the estimated coefficients to the true values. Drag to rotate the canvas and see the 3-D structure of the plane and the data points.
Note: only two predictors, x₁ and x₂, can be visualized (human eyes top out at three dimensions). In the equation, though, you can add as many variables as you like: ŷ = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + … + β_kx_k. From x₃ onward you simply can't graph it; the estimation procedure β̂ = (XᵀX)⁻¹Xᵀy works unchanged. In practice, 5 to 50 variables is quite common.

ŷ = β₀ + β₁x₁ + β₂x₂ , β̂ = (XᵀX)⁻¹Xᵀy

It's one line in matrix form, but all it does is "least squares with every variable considered at once." No hand calculation needed; this is the part the computer does.
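To show what "the computer does," here is a minimal sketch that solves the normal equations (XᵀX)β = Xᵀy in plain Python. The true intercept of 1.0, the uniform range for x₁ and x₂, and the helper names `solve` and `ols` are assumptions for illustration. Real code would normally call a linear-algebra library instead of hand-rolled elimination.

```python
import random

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Least squares via the normal equations (X^T X) beta = X^T y."""
    X = [[1.0] + list(r) for r in rows]                # leading 1 for the intercept
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

random.seed(4)
b0, b1, b2, noise_sd, n = 1.0, 0.80, -0.50, 0.50, 200   # b0 is an assumed intercept
rows = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(n)]
y = [b0 + b1 * x1 + b2 * x2 + random.gauss(0, noise_sd) for x1, x2 in rows]
print([round(b, 3) for b in ols(rows, y)])
```

With noise the estimates land near (1.0, 0.80, −0.50) but not exactly on them; setting `noise_sd` to 0 recovers the true values up to rounding.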

Experiment Guide: try these in order

Step 1: Set noise σ to 0 and resample → confirm the estimates match the true values exactly.
Step 2: Raise σ to 0.5 → the estimates start to drift from the true values. Resample several times and watch how much the estimates vary.
Step 3: Set n to 10 and resample → the estimates are unstable. Set n to 200 and they stabilize. This is the law of large numbers at work: the standard error of each coefficient shrinks roughly like 1/√n.
Step 4: Set β₁ to 0 → x₁ has no effect on y. Check that the estimate β̂₁ is close to 0.

True β₁ = 0.80

The "correct" β₁. Compare it with the estimate β̂₁ to see how accurately it is recovered.

True β₂ = -0.50

The "correct" β₂. It can be set independently of β₁. Setting either to 0 means "no effect."

Noise σ = 0.50

Size of the noise. At 0 the fit is perfect. Larger values make the estimates wobble more.

n = 40

Sample size. More data gives more stable estimates (law of large numbers). Compare 10 with 200.

↔↕ Drag to rotate (axes: x₁ · x₂ · y)

Readouts: estimated β̂₀, estimated β̂₁, estimated β̂₂, R².
