
Binomial, Poisson, and Exponential Distributions

Success counts, event counts, waiting times — three distributions that look unrelated until you slide n and p and watch them merge into one road.

P.05 / DISCRETE + EXPONENTIAL

Discrete & Exponential — Counting Probability Models

The normal is continuous and symmetric. But real data isn't always — coin flips are discrete, arrivals follow Poisson, wait times are exponential. Meet the other distributions that round out the toolkit.
Three core distributions every stats learner runs into: binomial, Poisson, and exponential. Slide through success counts, event counts, and waiting times to feel the bridge between discrete and continuous.
Experiment Guide — Binomial
  1. Step 1: n=20, p=0.5 → symmetric bell. Change p to 0.1 → skews right.
  2. Step 2: Set n=50, p=0.06 → np=3. Compare with Poisson(λ=3) below; the two are nearly identical.
  3. Step 3: Keep n large, p small → watch binomial converge toward Poisson.
When? → How many heads in 10 coin flips; how many defects in 100 items
Binomial B(n, p)
As n → ∞ with np → λ, the binomial approaches Poisson.

// Formula used here — Binomial
P(X = k) = ₙCₖ × p^k × (1−p)^(n−k), for k = 0, 1, …, n

Worked example (flip a coin 10 times, probability of exactly 3 heads)
• p=0.5, n=10, k=3
• p³ = 0.5³: probability of "3 heads in a row"
• (1−p)⁷ = 0.5⁷: probability of "remaining 7 are all tails"
• Multiply: probability of one specific sequence like HHHTTTTTTT
• But there are other arrangements with exactly 3 heads (HTHTTT… etc.). Total: ₁₀C₃ = 120 ways
• So P(X=3) = 120 × 0.5³ × 0.5⁷ ≈ 0.117
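The arithmetic in this worked example can be checked in a few lines. This is a minimal sketch using only Python's standard library; the function name `binomial_pmf` is illustrative and not something the simulator exposes.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p): number of arrangements times the probability of one arrangement."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

ways = comb(10, 3)                 # arrangements with exactly 3 heads in 10 flips
one_sequence = 0.5**3 * 0.5**7     # probability of one specific sequence
p_three_heads = binomial_pmf(3, 10, 0.5)

print(ways, one_sequence, round(p_three_heads, 3))
```

The count of arrangements comes out to 120 and the final probability to about 0.117, matching the steps above.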

Key properties
• Mean = np (10 × 0.5 = 5 — matches intuition)
• Variance = np(1−p). Variance is largest when p = 0.5 — outcomes are most uncertain when it's a coin toss
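Both properties can be confirmed by summing over the probability table directly rather than trusting the shortcut formulas. The sketch below assumes n = 10 as in the worked example; the helper name `binomial_mean_var` and the grid of p values are illustrative choices.

```python
from math import comb

def binomial_mean_var(n: int, p: float) -> tuple[float, float]:
    """Mean and variance of B(n, p), summed directly from the pmf."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var

mean, var = binomial_mean_var(10, 0.5)   # shortcut formulas say np = 5, np(1-p) = 2.5

# Variance of B(10, p) on a grid p = 0.1, 0.2, ..., 0.9; the largest should sit at p = 0.5.
variances = {i / 10: binomial_mean_var(10, i / 10)[1] for i in range(1, 10)}
p_at_max = max(variances, key=variances.get)

print(mean, var, p_at_max)
```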

Experiment Guide — Poisson
  1. Step 1: λ=1 → concentrated near 0. λ=5 → approaches a bell shape.
  2. Step 2: Raise λ to 20 → looks just like a normal bell (Central Limit Theorem preview).
  3. Step 3: λ is both the mean and variance. Set λ=3 and check "E[X] = Var[X] = λ = 3.00" at the top-left of the graph.
When? → Phone calls per hour at a call center; traffic accidents per day
Poisson(λ)
As λ grows, Poisson approaches the normal distribution.

// Formula used here — Poisson
P(X = k) = λ^k × e^(−λ) / k!, for k = 0, 1, 2, …

Worked example (average 3 emails per hour — probability of exactly 5?)
• λ=3, k=5
• λ⁵ = 3⁵ = 243: "5 events' worth of power" (larger λ = more likely to see many)
• e⁻³ ≈ 0.050: the probability of seeing zero events — baseline
• 5! = 120: we don't care about the order of the 5 arrivals
• P(X=5) = 243 × 0.050 / 120 ≈ 0.101
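The same calculation without rounding e⁻³ along the way, as a minimal sketch with the standard library. The name `poisson_pmf` is an illustrative helper.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * exp(-lam) / factorial(k)

p_zero = poisson_pmf(0, 3)   # e^-3, the probability of no emails in the hour
p_five = poisson_pmf(5, 3)   # exactly 5 emails when the average is 3

print(round(p_zero, 3), round(p_five, 3))
```

Carrying full precision still lands on roughly 0.050 and 0.101, so the rounding in the worked example does not change the answer at three decimals.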

Connection to binomial — worth remembering
• Make n very large and p very small while keeping np = λ → binomial becomes Poisson
• Try n=50, p=0.06 in the simulator above — it matches Poisson(λ=3) almost perfectly
• The defining property: mean = variance = λ. If your data's sample variance is far from its sample mean, Poisson is likely the wrong model
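For readers without the simulator at hand, the convergence can be measured numerically. The sketch below compares the two probability tables point by point; the helper names and the second setting (n = 5000 with the same np = 3) are assumptions added for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    return lam**k * exp(-lam) / factorial(k)

def max_gap(n: int, lam: float, k_max: int = 20) -> float:
    """Largest pointwise difference between B(n, lam/n) and Poisson(lam) for k = 0..k_max."""
    p = lam / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1))

gap_50 = max_gap(50, 3.0)       # the simulator setting: n = 50, p = 0.06
gap_5000 = max_gap(5000, 3.0)   # same np = 3, much larger n

print(gap_50, gap_5000)
```

The gap at n = 50 is already under one percentage point, and it shrinks further as n grows with np held fixed.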

Experiment Guide — Exponential
  1. Step 1: λ=1 → mean wait = 1. λ=2 → mean wait = 0.5. Larger λ = "happens sooner."
  2. Step 2: Set λ to 0.1 → long tail. Rare events have long waits.
  3. Step 3: Memoryless: having waited 10 min doesn't change how much longer you'll wait.
When? → Time until next phone call; how long until a light bulb burns out
Exponential(λ) — waiting time
Memoryless: past waiting time tells you nothing about the future.

// Formula used here — Exponential
Density: f(t) = λ × e^(−λt), for t ≥ 0. Probability of waiting longer than t: P(X > t) = e^(−λt)

Worked example (buses arrive every 5 min on average. Probability of waiting more than 10 min?)
• λ = 1/5 (0.2 per minute), t = 10
• P(X > 10) = e^(−0.2 × 10) = e⁻² ≈ 0.135 → about 13.5%
• The survival formula P(X > t) = e^(−λt) gives the "probability of waiting longer than t." It decays exponentially toward zero as t grows
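The bus example reduces to a single call to the exponential function. A minimal sketch, assuming the rate is expressed per minute as above; `exponential_survival` is an illustrative name.

```python
from math import exp

def exponential_survival(t: float, rate: float) -> float:
    """P(X > t) for X ~ Exponential(rate)."""
    return exp(-rate * t)

rate = 1 / 5                                  # one bus per 5 minutes on average
p_wait_over_10 = exponential_survival(10, rate)
p_wait_over_20 = exponential_survival(20, rate)

print(round(p_wait_over_10, 3), round(p_wait_over_20, 3))
```

Doubling the wait from 10 to 20 minutes squares the probability rather than halving it, which is the rapid drop described above.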

Mirror relationship with Poisson
• Poisson counts "how many events in a fixed time" (count distribution)
• Exponential measures "how long until the next event" (time distribution)
• Same λ, two perspectives — one for counts, one for waiting times
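One way to see the two perspectives at once is to generate exponential gaps, lay them end to end, and count how many events land in each unit of time. The sketch below is a simulation under assumed settings (rate 3, 20,000 windows, fixed seed); if the mirror relationship holds, the counts should show mean and variance both close to λ.

```python
import random
from statistics import mean, pvariance

def counts_per_window(rate: float, n_windows: int, seed: int = 0) -> list[int]:
    """Lay exponential gaps end to end and count events in each unit-length window."""
    rng = random.Random(seed)
    counts = [0] * n_windows
    t = rng.expovariate(rate)        # time of the first event
    while t < n_windows:
        counts[int(t)] += 1          # window index = whole time units elapsed
        t += rng.expovariate(rate)   # each further gap is Exponential(rate)
    return counts

counts = counts_per_window(rate=3.0, n_windows=20_000)
mean_count = mean(counts)
var_count = pvariance(counts)

print(mean_count, var_count)
```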

Memorylessness — the defining feature of the exponential
• Even after waiting 10 minutes, the probability of "arriving within the next 5 min" is the same as for someone who just showed up
• "I've waited a long time so it must come soon" doesn't apply
• This only holds when the event rate is constant over time (e.g., radioactive decay)
• Real buses have schedules, so real waiting is not memoryless
• The opposite, in fact: with a schedule, the longer you've waited the closer the next bus is — so arrival probability rises with waiting time. That's what being non-memoryless actually looks like
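The contrast between the two kinds of waiting can be put in numbers. The sketch below uses the conditional probability 1 − S(waited + window) / S(waited), where S is the survival function. The exponential uses the rate from the bus example; the scheduled bus is a hypothetical model, assumed here to have a wait that is uniform on 0 to 15 minutes.

```python
from math import exp, isclose

RATE = 0.2  # buses per minute, as in the worked example (mean wait 5 min)

def exp_survival(t: float) -> float:
    """P(wait > t) under the exponential model."""
    return exp(-RATE * t)

def uniform_survival(t: float, cycle: float = 15.0) -> float:
    """P(wait > t) for a hypothetical scheduled bus: wait uniform on [0, cycle]."""
    return max(0.0, 1.0 - t / cycle)

def arrives_within(survival, waited: float, window: float) -> float:
    """P(arrival in the next `window` min | no arrival during the first `waited` min)."""
    return 1.0 - survival(waited + window) / survival(waited)

exp_fresh = arrives_within(exp_survival, 0, 5)
exp_waited = arrives_within(exp_survival, 10, 5)
sched_fresh = arrives_within(uniform_survival, 0, 5)
sched_waited = arrives_within(uniform_survival, 10, 5)

print(exp_fresh, exp_waited, sched_fresh, sched_waited)
```

Under the exponential model the two numbers agree, while under the assumed schedule the chance of an arrival in the next 5 minutes climbs from one third to certainty after a 10-minute wait.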

// How the three distributions connect — learning them separately is a waste

Textbooks place these on different pages, making them look like three unrelated distributions. In reality they're deeply connected. Once the links become visible, the whole picture snaps into focus.

Binomial ──n large, p small──→ Poisson
   │                              │
   │ np, n(1-p) both large        │ "time until next event"
   ↓                              ↓
Normal approximation          Exponential
  • Binomial → Poisson: defect rate 0.5%, inspect 200 items (n=200, p=0.005) → approximate with Poisson λ=1
  • Poisson ↔ Exponential: "λ events per hour" is Poisson; "time until next event" is exponential. Same phenomenon, two angles
  • Binomial → Normal: when np ≥ 5 and n(1−p) ≥ 5, approximate with N(np, np(1−p)) — a consequence of the Central Limit Theorem
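The normal link can be checked against an exact binomial sum. In the sketch below, n = 100, p = 0.5 and the cutoff 55 are hypothetical values chosen so that np and n(1−p) are both well above 5. The half-unit continuity correction is a standard refinement that is not mentioned above, added here as an assumption.

```python
from math import comb, erf, sqrt

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_cdf(x: float, mean: float, var: float) -> float:
    """P(Y <= x) for Y ~ N(mean, var), via the error function."""
    return 0.5 * (1.0 + erf((x - mean) / sqrt(2.0 * var)))

n, p = 100, 0.5                      # hypothetical values: np = n(1-p) = 50
exact = binomial_cdf(55, n, p)
# Approximate with N(np, np(1-p)); the +0.5 is the continuity correction.
approx = normal_cdf(55 + 0.5, n * p, n * p * (1 - p))

print(exact, approx)
```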

// Common misconceptions

❌ "Poisson can always substitute for binomial"

Only when p is small and n is large. Try using Poisson to approximate a coin flip (p=0.5, n=20) — it won't even be close. Use the simulator above to see for yourself.
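The mismatch is visible without the simulator. A minimal sketch, comparing the two models at their shared center k = 10 and comparing their variances; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p = 20, 0.5
lam = n * p                                   # Poisson would have to use lambda = 10

binom_at_10 = comb(n, 10) * p**10 * (1 - p) ** 10
poisson_at_10 = lam**10 * exp(-lam) / factorial(10)

binom_var = n * p * (1 - p)                   # 5
poisson_var = lam                             # Poisson forces variance = mean = 10

print(round(binom_at_10, 3), round(poisson_at_10, 3), binom_var, poisson_var)
```

The binomial puts about 0.176 on exactly 10 heads while Poisson(10) puts about 0.125 there, and the Poisson spread is twice as wide as it should be.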

❌ "I've waited 10 minutes, so the bus must come soon"

Wrong under exponential (memoryless) assumptions. With a real bus schedule, the hazard (the chance of an arrival in the next minute, given that none has come yet) rises the longer you've waited. That rising hazard is what being non-memoryless looks like mathematically (territory handled by Weibull or deterministic arrival models). The exponential applies only when events occur at a constant rate, regardless of how long you have already waited.

※ Weibull is outside StatPlay's scope, but in most textbooks it sits as the very next chapter after the exponential. If the idea pulls you in, that's the page to flip to.

❌ "Binomial variance is np"

np is the mean. The variance is np(1−p). Forgetting the (1−p) factor is a classic slip. Variance is smallest when p is near 0 or 1 (outcomes are nearly certain).

// Shapes you'll meet again

Around discrete distributions, the same rewrites and the same mean/variance pairings keep showing up.

  • The complement shape: "probability of 3 or more successes" reappears as 1 − P(0) − P(1) − P(2). "Subtract the lower tail from the whole" is the same picture each time
  • The Poisson-approximation shape: in regimes where n is large and p is small (defect rate 2%, n = 100), Poisson with λ = np = 2 stands in for the binomial
  • The exponential survival shape: a 1000-hour mean lifetime (λ = 1/1000 per hour) gives P(X > 500) = e⁻⁰·⁵ ≈ 0.607. P(X > t) = e^(−λt) is the same shape that returns whenever lifetimes or waiting times are involved
  • Mean–variance pairs: Poisson carries mean = variance = λ, while binomial carries variance = np(1−p). Each distribution travels with its own paired shape
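Several of these shapes can be combined in one short calculation. The sketch below applies the complement rule to the defect example (n = 100, p = 0.02), once with the exact binomial and once with the Poisson stand-in, then evaluates the lifetime example. Asking for "3 or more defects" in that setting is an illustrative choice, and the helper names are assumptions.

```python
from math import comb, exp, factorial

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    return lam**k * exp(-lam) / factorial(k)

n, p = 100, 0.02
# Complement shape: P(X >= 3) = 1 - P(0) - P(1) - P(2)
exact_tail = 1 - sum(binomial_pmf(k, n, p) for k in range(3))
approx_tail = 1 - sum(poisson_pmf(k, n * p) for k in range(3))

# Survival shape: mean lifetime 1000 h, so rate = 1/1000 per hour and t = 500 h
survival_500 = exp(-500 / 1000)

print(round(exact_tail, 4), round(approx_tail, 4), round(survival_500, 3))
```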
Memoryless Demo — Does "already waited 10 min" matter?
Left: normal-like waiting. The longer you wait, the more likely arrival becomes. Right: exponential. Move t all you want — the curve never changes. That's memorylessness.
[Interactive demo, two panels. Left, "Normal (everyday intuition)": as t grows, the curve shifts left ("should come soon"). Right, "Exponential (memoryless)": the shape stays the same no matter how much t changes.]
One Phenomenon, Three Views — A Call Center Hour
One process — "λ calls per hour" — viewed through three distributions simultaneously. Move λ and all three update. Same phenomenon, different angles.
Binomial B(60, λ/60): 60 one-minute slots. Each minute: call or no call.
Poisson Poi(λ): How many calls total in one hour?
Exponential Exp(λ): How long until next call? (mean 60/λ min)