The Prosecutor's Fallacy

2026-03-23

A crime has been committed. DNA found at the scene is tested against a database of 100,000 people. One person's DNA matches. The test is 99.9% accurate, with a false positive rate of 0.1%.

The prosecutor tells the jury: "There is only a 0.1% chance this match is wrong. The probability of the defendant's innocence is one in a thousand."

This sounds compelling. It is also completely wrong, and understanding why is one of the most useful things you can learn about probability.

Two Very Different Questions

The prosecutor is conflating two probabilities that feel similar but are vastly different:

What the prosecutor says: P(match | innocent) = 0.1%
"If you're innocent, there's only a 0.1% chance your DNA matches."

What the jury needs to know: P(innocent | match) = ???
"Given that your DNA matched, what's the chance you're actually innocent?"

These are not the same number. The gap between them can decide between conviction and acquittal.

The first probability is about the test's accuracy. The second is about the defendant's guilt. They are not interchangeable, and the gap between them depends on something the prosecutor didn't mention: how many people were tested.

The Math That Changes Everything

If you search a database of 100,000 people with a test that has a 0.1% false positive rate, you should expect about 100 false matches (100,000 × 0.001), plus the one true match (the actual perpetrator, if they're in the database).

So you have ~101 total matches, and only 1 is the real perpetrator. The test itself cannot tell these matches apart, so if the defendant was found through this database search, the probability they're guilty isn't 99.9%. It's closer to 1 in 101, or about 1%.

The test is extremely accurate. And yet, on the match alone, the defendant is almost certainly innocent.
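The arithmetic above can be sketched in a few lines of Python. This is a minimal sketch rather than a forensic model: the function name `match_breakdown` and its parameters are illustrative, and it adopts the essay's simplifications that exactly one perpetrator is in the database, that they always match, and that every entry counts as a potential false match.

```python
def match_breakdown(database_size, false_positive_rate):
    """Return (expected false matches, total matches, P(guilty | match)).

    Assumptions (the essay's simplifications, not a forensic model):
    - exactly one true perpetrator is in the database and always matches;
    - every entry is treated as a potential false match, as in
      100,000 x 0.001 = 100.
    """
    expected_false = database_size * false_positive_rate
    total_matches = expected_false + 1  # false matches + 1 true perpetrator
    p_guilty = 1 / total_matches        # the test cannot tell the matches apart
    return expected_false, total_matches, p_guilty


expected_false, total_matches, p_guilty = match_breakdown(100_000, 0.001)
print(f"Expected false matches: {expected_false:.0f}")
print(f"Total matches:          {total_matches:.0f}")
print(f"P(guilty | match):      {p_guilty:.1%}")
```

With the essay's numbers this reports 100 expected false matches, 101 total matches, and a probability of guilt of about 1.0%.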

See It

Adjust the database size and test accuracy below, and watch how the probability of guilt changes. Pay attention to what happens when you search larger populations:

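[Interactive calculator, default settings: database size 100,000; test accuracy 99.9%; false positive rate 0.1%. Results: 100 expected false matches; 101 total matches (false matches + 1 true perpetrator); P(guilty | match) = 1.0%, which is what the jury actually needs. Legend: true perpetrator, false match (innocent), no match.]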
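The effect of database size can also be tabulated directly. The sketch below is one hedged way to do it in Python: `p_guilty_given_match` is a hypothetical helper name, the database sizes are arbitrary examples, and the calculation keeps the essay's simplification of one true perpetrator who always matches.

```python
def p_guilty_given_match(database_size, false_positive_rate):
    """P(guilty | match) under the essay's simplification:
    one true match plus database_size * false_positive_rate false ones."""
    expected_false = database_size * false_positive_rate
    return 1 / (expected_false + 1)


# Example sizes are illustrative; the false positive rate stays at 0.1%.
print(f"{'Database size':>13}  P(guilty | match)")
for size in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{size:>13,}  {p_guilty_given_match(size, 0.001):.1%}")
```

At a fixed 0.1% false positive rate, the probability of guilt falls from about 50% for a database of 1,000 people to about 0.1% for a database of 1,000,000.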

Simulate the Database Search

Click below to simulate actual database searches. Each run tests every person in the database and reports how many matches are found. You'll see that most matches are false positives: innocent people who happened to trigger the test.

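[Interactive simulation. Settings: database size 100,000; false positive rate 0.1%. Reported after each run: searches run, average matches per search, average false matches, and how often the search would convict an innocent person.]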
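A rough offline version of this simulation can be written with Python's standard library. This is a sketch under stated assumptions, not the page's actual implementation: the names `simulate_search` and `run_searches` are hypothetical, the perpetrator is assumed to be in the database and to always match, and "would convict innocent" is interpreted here as the chance that a match picked at random from a search belongs to an innocent person.

```python
import random


def simulate_search(database_size, false_positive_rate, rng):
    """Test every innocent person once; return the number of false matches.

    Assumes one true perpetrator who always matches, so only the other
    database_size - 1 people can match falsely.
    """
    return sum(
        rng.random() < false_positive_rate
        for _ in range(database_size - 1)
    )


def run_searches(n_searches, database_size=100_000,
                 false_positive_rate=0.001, seed=0):
    """Run several searches and summarise them.

    Returns (avg matches per search, avg false matches,
    avg chance that a randomly chosen match is innocent).
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    false_counts = [
        simulate_search(database_size, false_positive_rate, rng)
        for _ in range(n_searches)
    ]
    avg_false = sum(false_counts) / n_searches
    avg_matches = avg_false + 1  # add the one true match
    p_innocent = sum(f / (f + 1) for f in false_counts) / n_searches
    return avg_matches, avg_false, p_innocent


avg_matches, avg_false, p_innocent = run_searches(20)
print(f"Avg matches per search:    {avg_matches:.1f}")
print(f"Avg false matches:         {avg_false:.1f}")
print(f"Random match is innocent:  {p_innocent:.1%}")
```

Individual searches vary, but the average number of false matches should hover around 100, and a randomly chosen match should be innocent roughly 99% of the time.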

Why This Happens

The core issue is base rates. In a large database, the vast majority of people are innocent. Even a very accurate test, applied to a mostly-innocent population, will produce more false positives than true positives, because there are so many more innocent people to falsely match.

Think of it this way: if 1 person in 100,000 is guilty and the test is 99.9% accurate, the test correctly identifies the guilty person almost every time. But it also incorrectly flags ~100 innocent people. The true match is buried in a haystack of false ones.

This is Bayes' theorem in action. The probability of guilt given a match depends not just on the test's accuracy, but on how rare guilt is in the tested population. The rarer the thing you're looking for, the more false positives dominate.
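Written out for this example, the calculation looks roughly as follows. Here G means the defendant is the perpetrator, I means the defendant is innocent, and M means the DNA matched. The prior P(G) = 1/100,000 (one perpetrator somewhere in the database) and the assumption P(M | G) = 1 (the true perpetrator always matches) are simplifications in the spirit of the example, not figures from a real case.

```latex
\begin{align*}
P(G \mid M)
  &= \frac{P(M \mid G)\,P(G)}{P(M \mid G)\,P(G) + P(M \mid I)\,P(I)} \\
  &= \frac{1 \cdot \frac{1}{100{,}000}}
          {1 \cdot \frac{1}{100{,}000} + 0.001 \cdot \frac{99{,}999}{100{,}000}} \\
  &= \frac{1}{1 + 99.999} \approx 0.0099
\end{align*}
```

That is about 1%, in line with the 1-in-101 estimate from counting matches. The small difference comes from counting 99,999 innocent people here rather than rounding to 100,000.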

Real Cases

The Transferable Insight

The prosecutor's fallacy isn't just a courtroom problem. It appears whenever a rare condition is tested for in a large population.

The habit worth building: whenever a test result seems definitive, ask how many people were tested and how common the thing being tested for actually is. A match from a test is the beginning of an investigation, not the end of one. The rarer the condition and the larger the population, the more skeptical you should be of any single positive result.