Goodhart's Law
In 2012, YouTube changed its recommendation algorithm to optimize for watch time instead of clicks. The logic was sound: watch time seemed like a better proxy for "did the user actually enjoy this?" than whether they clicked on a thumbnail.
Watch time went up. Way up. But something else happened: the algorithm learned that conspiracy theories, outrage, and "rabbit hole" content kept people watching longer than anything else. Users weren't enjoying the experience more. Many reported feeling worse. But they couldn't stop watching. The metric improved. The thing the metric was supposed to represent got worse.
This pattern has a name: Goodhart's Law. "When a measure becomes a target, it ceases to be a good measure."
The law is named after economist Charles Goodhart, who observed in 1975 that any observed statistical regularity tends to collapse once pressure is placed upon it for control purposes. Anthropologist Marilyn Strathern later generalized it into the punchier version above.
How It Works
Every organization needs to measure things. You can't manage what you can't measure, as the saying goes. So you find a metric that correlates with the outcome you actually care about, and you optimize for it.
The problem: correlation is not identity. The metric is a proxy for the real goal, and there's always a gap between the two. When you push hard enough on the proxy, the system finds ways to improve the metric without improving the thing you care about, or even while making it worse.
The optimizer doesn't need to be malicious. It doesn't even need to be a person. Algorithms, incentive structures, and natural selection all do this. The system mechanically exploits the gap between the proxy and the true goal.
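That mechanical exploitation can be sketched in a few lines. This is a toy model with made-up payoff numbers, not the simulation on this page: an optimizer that simply takes whichever action raises the proxy most, with no intent of any kind.

```python
import random

def optimize(steps=100, gameability=0.7, seed=0):
    """Toy optimizer: each step it takes whichever action raises the
    PROXY most. Honest work raises proxy and goal together; gaming
    raises only the proxy and slightly degrades the goal."""
    rng = random.Random(seed)
    proxy = goal = 0.0
    for _ in range(steps):
        honest = rng.uniform(0.5, 1.0)                # helps both metrics
        gaming = rng.uniform(0.0, 2.0) * gameability  # helps the proxy only
        if gaming > honest:
            proxy += gaming
            goal -= 0.2   # exploiting the gap hurts the real goal
        else:
            proxy += honest
            goal += honest
    return proxy, goal

proxy, goal = optimize()
print(f"proxy: {proxy:.1f}, true goal: {goal:.1f}")
```

Nothing in the loop "decides" to cheat; gaming wins whenever it happens to pay more, and the proxy ends up far above the goal it was supposed to track.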
The Simulation
Pick a scenario below, then watch what happens as a system optimizes for a proxy metric over time. The two lines, the proxy and the true goal, start moving together and then diverge. The harder you optimize, the faster and wider the gap.
Social Media Platform
True goal: user satisfaction
Proxy: engagement (time on site)
Gaming: outrage and anxiety keep people scrolling
Software Team
True goal: code quality
Proxy: lines of code written
Gaming: verbose, redundant code; copy-paste over abstraction
Education
True goal: deep learning
Proxy: standardized test scores
Gaming: teaching to the test; memorization over understanding
Customer Support
True goal: customer satisfaction
Proxy: average handle time
Gaming: rushing calls; transferring instead of solving
Hospital Performance
True goal: patient health outcomes
Proxy: mortality rate
Gaming: refusing high-risk patients to keep numbers low
Policing
True goal: community safety
Proxy: arrest numbers
Gaming: targeting easy, minor offenses; ignoring complex cases
Explore the Dynamics
Try adjusting the three sliders and re-running:
- Optimization pressure: How aggressively the system pushes on the metric. Low pressure produces gradual improvement; high pressure makes the proxy climb faster, but it also makes the divergence happen earlier and grow wider.
- Proxy-goal alignment: How well-chosen the metric is. A 99% aligned proxy resists Goodhart's Law longer. A 50% aligned proxy diverges almost immediately. In practice, most proxy metrics start around 70-85% aligned.
- Gameability: How easy it is for the system to improve the metric without improving the goal. Watch time is highly gameable (outrage hacks exist). Patient survival rates are less gameable (but still: refuse risky patients). Harder-to-game metrics delay the divergence but don't prevent it.
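The three sliders map naturally onto parameters of a toy model. The sketch below is one assumption about how such dynamics could be written down, not this page's actual implementation: each step the system exerts `pressure` units of effort, a fraction `alignment` of that effort genuinely helps the goal, and the gamed share of effort grows over time at a rate set by `gameability`.

```python
import random

def simulate(steps=100, pressure=1.0, alignment=0.8,
             gameability=0.5, seed=0):
    """One run of a toy proxy-vs-goal model; returns a (proxy, goal)
    time series. Gamed effort moves the proxy but does nothing
    for the goal, and the gamed share grows as exploits are found."""
    rng = random.Random(seed)
    proxy = goal = 0.0
    series = []
    for t in range(steps):
        effort = pressure * (1 + rng.gauss(0, 0.1))  # noisy optimization step
        gamed = min(1.0, gameability * t / steps)    # exploit share grows
        proxy += effort
        goal += effort * alignment * (1 - gamed)
        series.append((proxy, goal))
    return series

run = simulate(pressure=2.0, alignment=0.8, gameability=0.9)
proxy_end, goal_end = run[-1]
print(f"final proxy {proxy_end:.1f} vs goal {goal_end:.1f}")
```

In this model, doubling `pressure` scales every step, so the absolute gap between the lines grows proportionally faster and crosses any fixed "something is wrong" threshold earlier, which is the slider behavior described above.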
Run It Many Times
One run might look like a smooth divergence. But in practice, systems are noisy. Run the simulation many times below and see the distribution of outcomes. Even with the same settings, the divergence point and final gap vary, which makes it even harder to detect in the real world.
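One way to see the spread is to repeat a noisy toy run across many random seeds and look at the distribution of final gaps. A minimal sketch, assuming a simple model in which exploits are discovered gradually:

```python
import random
import statistics

def final_gap(seed, steps=100, gameability=0.6):
    """One noisy run of a toy proxy-vs-goal model; returns the
    final gap between the proxy and the true goal."""
    rng = random.Random(seed)
    proxy = goal = 0.0
    for t in range(steps):
        effort = 1.0 + rng.gauss(0, 0.3)           # noisy step size
        gamed = min(1.0, gameability * t / steps)  # exploit share grows
        proxy += effort
        goal += effort * (1 - gamed)
    return proxy - goal

gaps = [final_gap(seed) for seed in range(500)]
print(f"gap across 500 runs: mean {statistics.mean(gaps):.1f}, "
      f"stdev {statistics.stdev(gaps):.1f}")
```

Identical settings, yet every run lands somewhere different. A real organization only ever observes a single draw from that distribution, which is part of why the drift is so hard to spot from the inside.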
Why It's So Hard to Fix
Goodhart's Law is pernicious because the people inside the system often can't see it happening:
- The metric is going up. Dashboards are green. Reports look great. If anyone suggests the proxy isn't working, the numbers disagree.
- The true goal is unmeasured. That's why you picked a proxy in the first place. If you could measure the real thing directly, you wouldn't need the proxy. So there's no dashboard turning red.
- Incentives are aligned to the proxy. People get promoted, funded, and rewarded based on the metric. Anyone who questions the metric is questioning the basis for everyone's success.
- The divergence is gradual. It doesn't happen overnight. It's a slow drift, easily rationalized at each step. "Engagement is up 3% this quarter" feels like good news every single time.
What Can You Do?
Goodhart's Law can't be eliminated. The gap between proxy and goal is inherent in the act of measurement. But it can be managed:
- Use multiple metrics. A single metric is easy to game. A basket of metrics that capture different facets of the goal is harder. If one metric improves while others decline, that's a signal.
- Rotate metrics. Changing which metric you optimize for prevents the system from fully adapting to game any single one.
- Measure the gap. Periodically check whether the proxy still correlates with the true goal. If the correlation is decaying, the proxy is being gamed.
- Keep qualitative checks. Not everything that matters can be quantified. Regularly ask the humans in the system: "Is this actually getting better?" Their answers are noisy but hard to game.
- Be suspicious of metrics that only go up. In a complex system, sustained monotonic improvement in a single metric is often a sign that the metric has decoupled from reality, not that reality is improving.
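Of these, "measure the gap" is the most mechanical to automate, provided you can occasionally sample the true goal directly (say, through periodic user surveys). A sketch on synthetic data, comparing the proxy-goal correlation in an early window against a late one:

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(0)
# Synthetic periodic data: the true goal is sampled directly (e.g. surveys).
goal = [rng.uniform(0.0, 1.0) for _ in range(40)]
# For the first 20 periods the proxy tracks the goal; after that it is
# gamed and just sits high regardless of how the goal actually moves.
proxy = [g + rng.gauss(0, 0.1) if i < 20 else rng.uniform(0.8, 1.0)
         for i, g in enumerate(goal)]
print(f"early r = {pearson(proxy[:20], goal[:20]):.2f}")
print(f"late  r = {pearson(proxy[20:], goal[20:]):.2f}")
```

The decaying correlation is the red flag: the metric still looks healthy on its own dashboard, but it has stopped co-moving with the thing it stands for.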
The Transferable Insight
Goodhart's Law is about the limits of measurement. Every metric is a model of reality, and every model is a simplification. When you push on the simplification hard enough, it breaks. The system finds the gap between the map and the territory.
The habit worth building: whenever you see a metric being used to drive decisions, ask what's not being captured. The most dangerous metric isn't one that's wrong. It's one that's almost right. It earns your trust during the easy phase, then leads you astray once the system learns to game it.
This is one of the deepest problems in AI alignment. A sufficiently powerful AI optimizing for a proxy of human values will find ways to satisfy the proxy that humans never intended. Not out of malice, but out of the same mechanical gap between "what we measured" and "what we meant." Goodhart's Law doesn't care whether the optimizer is a person, an institution, or an algorithm. The gap is the gap.