
Regression to the Mean

2026-03-23

A basketball player shoots 80% from the free throw line over her career. In last night's game, she went 2 for 10. Just 20%. The coach benches her. The announcer says she's "lost her touch." Fans wonder if something is wrong.

Then she goes 9 for 10 the next game. Everyone says the benching "worked" or she "bounced back." But what if nothing actually changed? What if she was the same 80% shooter the whole time, and small samples are simply noisy?

The Core Idea

Extreme results tend to be followed by less extreme results. Not because of any force pulling things back to average, but because extreme outcomes usually require unusual luck, and unusual luck usually doesn't repeat.

An 80% shooter who goes 2 for 10 probably didn't suddenly become a 20% shooter. She just hit the unlucky tail of her normal range. Next game, she'll probably be closer to 80%. She didn't "recover." That's just where her true ability lives, and where most samples will land.
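One way to check this intuition is a quick simulation. The Python sketch below assumes every shot is an independent 80% chance, which is a simplification of real shooting. The function names, the seed, and the cutoff of 6 makes for a "bad game" are illustrative choices, not anything taken from real data.

```python
import random

def play_game(rng, ability=0.8, shots=10):
    """Return the number of makes in one simulated game."""
    return sum(rng.random() < ability for _ in range(shots))

def average_after_bad_game(ability=0.8, shots=10, games=50_000, bad_cutoff=6, seed=1):
    """Average makes in the game right after a 'bad' game (makes <= bad_cutoff)."""
    rng = random.Random(seed)
    results = [play_game(rng, ability, shots) for _ in range(games)]
    # Pair each game with the one that follows it, keeping only pairs
    # where the first game was bad.
    followups = [nxt for prev, nxt in zip(results, results[1:]) if prev <= bad_cutoff]
    return sum(followups) / len(followups)

if __name__ == "__main__":
    print(f"Average makes after a bad game: {average_after_bad_game():.2f} out of 10")
```

If nothing about her ability changes, the average after a bad game should come out close to 8 makes, the same as after any other game.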

See It: One Shooter, Many Games

Below is a simulated player with a true shooting ability you can set. Click "Shoot a game" to see individual game results, and watch how wildly they bounce around with small sample sizes.

[Interactive simulation: a shooter with an adjustable true ability (default 80%). Each "Shoot a game" click adds one game to the game-by-game results and updates the running average.]
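For readers without the interactive version, a rough stand-in for the simulator might look like this in Python. It assumes independent shots at a fixed true ability. The names `shoot_game` and `running_averages`, the seed, and the 10-shot game length are assumptions made for illustration.

```python
import random

def shoot_game(rng, ability=0.8, shots=10):
    """Simulate one game and return the fraction of shots made."""
    makes = sum(rng.random() < ability for _ in range(shots))
    return makes / shots

def running_averages(game_results):
    """Return the running average after each game."""
    averages, total = [], 0.0
    for i, result in enumerate(game_results, start=1):
        total += result
        averages.append(total / i)
    return averages

if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so the run is repeatable
    games = [shoot_game(rng) for _ in range(20)]
    for n, (game, avg) in enumerate(zip(games, running_averages(games)), start=1):
        print(f"Game {n:2d}: {game:4.0%}   running average: {avg:5.1%}")
```

Individual games should jump around quite a bit, while the running average drifts toward the true ability as games accumulate.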

The Danger: Mistaking Noise for Signal

We're wired to find causes for everything we see. When a player has an extreme game, we invent explanations: she's tired, she's distracted, the other team's defense is too good. When she "bounces back," we credit the coach's halftime speech or a change in strategy.

But if her true ability hasn't changed, the bounce-back isn't a comeback. It's the numbers settling back to where they usually live. That's a subtle but important distinction.

This is why regression to the mean fools us so badly:

Sample Size Is Everything

The amount of noise depends directly on how many observations you have: the fewer the shots, the wider the swings. Use the simulation below to see this in action. Generate thousands of 5-shot samples and 50-shot samples, then compare how widely they vary.

[Interactive simulation: for an 80% shooter, shows the lowest, average, and highest shooting percentage across simulated samples of 5 shots and of 50 shots.]
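The same comparison can be sketched in a few lines of Python. This assumes independent shots at a fixed 80% ability; the function names, seed, and sample count are illustrative. The `expected_spread` helper uses the standard binomial result that a sample percentage has standard deviation sqrt(p(1 - p) / n).

```python
import math
import random

def sample_percentages(shots, ability=0.8, samples=10_000, seed=3):
    """Shooting percentage for each of many simulated samples of `shots` attempts."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < ability for _ in range(shots)) / shots
        for _ in range(samples)
    ]

def summarize(percentages):
    """Return (lowest, average, highest) for a list of percentages."""
    return min(percentages), sum(percentages) / len(percentages), max(percentages)

def expected_spread(shots, ability=0.8):
    """Standard deviation of a sample percentage under the binomial model."""
    return math.sqrt(ability * (1 - ability) / shots)

if __name__ == "__main__":
    for shots in (5, 50):
        low, avg, high = summarize(sample_percentages(shots))
        print(f"{shots:2d} shots: lowest {low:.0%}, average {avg:.0%}, "
              f"highest {high:.0%}, typical spread +/-{expected_spread(shots):.1%}")
```

Both sample sizes should average close to 80%, but the 5-shot samples should cover a much wider range. Because the spread shrinks with the square root of the sample size, ten times as many shots cuts the noise by a factor of about three, not ten.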

The Takeaway

Regression to the mean isn't a force. Nothing "causes" results to move toward average. It's the mathematical reality that extreme outcomes require extreme luck, and luck doesn't persist.

A useful habit: before explaining why something went up or down, first ask how big the sample is. If it's small, the most likely explanation for an extreme result might just be randomness. And the most likely next result is a less extreme one. Not because anything changed, but because that's where probability tends to land.
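To put a number on "how big is the sample?", the exact binomial probability can be computed directly. The sketch below assumes independent shots at a true 80% ability; the 60% threshold and the name `prob_at_most` are arbitrary choices for illustration.

```python
from math import comb

def prob_at_most(makes, shots, ability=0.8):
    """Exact binomial probability of making `makes` or fewer out of `shots`."""
    return sum(
        comb(shots, k) * ability**k * (1 - ability) ** (shots - k)
        for k in range(makes + 1)
    )

if __name__ == "__main__":
    # Chance that a true 80% shooter shoots 60% or worse, at three sample sizes.
    for shots in (5, 10, 50):
        cutoff = (shots * 6) // 10  # most makes that still count as 60% or worse
        print(f"{shots:2d} shots: P(60% or worse) = {prob_at_most(cutoff, shots):.4f}")
```

With 5 shots, shooting 60% or worse happens roughly a quarter of the time by luck alone. With 50 shots it becomes rare, so the same percentage is far stronger evidence that something really changed.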

Where This Shows Up

Once you internalize this, you start seeing misattributed regression everywhere:

The transferable skill: before you credit the intervention, ask whether the outcome was likely to moderate on its own. This doesn't mean interventions never work. It means the bar for claiming they do is higher than "things got better afterward." Things that are extreme tend to get less extreme. That's the default, not the exception.
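A small simulation makes the point concrete. In the Python sketch below, every player is given the same true ability, the worst performers from round one are singled out, and then nothing at all is done to them. The setup (1,000 identical players, 10 shots each, the bottom 10%) and the name `regression_demo` are assumptions for illustration. In real settings abilities differ, so regression is usually partial rather than complete.

```python
import random

def regression_demo(players=1000, shots=10, ability=0.8, worst_fraction=0.1, seed=11):
    """Compare the worst round-one performers before and after a non-intervention.

    Every player has the same true ability. We pick the worst performers in
    round one, apply no intervention at all, and look at their round-two results.
    Returns (average round-one percentage, average round-two percentage).
    """
    rng = random.Random(seed)

    def game():
        return sum(rng.random() < ability for _ in range(shots)) / shots

    round_one = [game() for _ in range(players)]
    round_two = [game() for _ in range(players)]

    # Indices of the players with the lowest round-one percentages.
    ranked = sorted(range(players), key=lambda i: round_one[i])
    worst = ranked[: int(players * worst_fraction)]

    before = sum(round_one[i] for i in worst) / len(worst)
    after = sum(round_two[i] for i in worst) / len(worst)
    return before, after

if __name__ == "__main__":
    before, after = regression_demo()
    print(f"Worst 10% in round one: {before:.0%}; same players in round two: {after:.0%}")
```

Anyone who had benched, coached, or lectured that bottom group between rounds would have seen the same "improvement" and been tempted to take credit for it.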