Law 29 · Evaluation & Measurement

Goodhart's Trap

When your eval becomes the goal, it stops measuring what you cared about.

[Figure: diagram explaining Goodhart's Trap]

The principle

When a measure becomes a target, it stops being a good measure. Optimize hard against any single metric and the agent learns to game that metric's surface features: it pads answers to please a verbosity-biased judge or overfits a fixed eval set, while the underlying capability stalls or even slips. The number goes up. The thing you cared about doesn't.

Why it happens

An eval is a proxy for what you care about. Optimize against one proxy hard enough and the system learns the cheapest way to raise that score: longer answers for a verbosity-biased judge, format mimicry for a rubric, or memorized quirks in a fixed test set. Reward-hacking research shows this is a deep problem with narrow objectives, not a failure of cleverness. A rotating held-out set helps with memorization, but it does not fix a bad proxy. Use fresh cases, diverse signals, and human reality checks before believing the gain.
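A toy simulation can make the mechanism concrete. The sketch below is only an illustration under stated assumptions: the judge, its `length_bias` value, the starting point, and the step sizes are all hypothetical, chosen so that length is cheap to change and true quality is not. It is not a model of any real judge.

```python
import random


def judge_score(quality: float, length: int, length_bias: float = 0.002) -> float:
    """Hypothetical verbosity-biased judge: rewards quality, but also raw length."""
    return quality + length_bias * length


def hill_climb(steps: int, seed: int = 0) -> list[tuple[float, int, float]]:
    """Greedily accept any proposed change that raises the judge score.

    Returns a history of (true_quality, length, judge_score) tuples.
    """
    rng = random.Random(seed)
    quality, length = 0.6, 200  # assumed starting answer
    history = [(quality, length, judge_score(quality, length))]
    for _ in range(steps):
        # Proposals: quality moves in tiny steps, length moves in big ones.
        cand_quality = min(1.0, max(0.0, quality + rng.gauss(0, 0.005)))
        cand_length = max(1, length + rng.randint(-20, 40))
        # The optimizer only ever looks at the proxy.
        if judge_score(cand_quality, cand_length) > judge_score(quality, length):
            quality, length = cand_quality, cand_length
        history.append((quality, length, judge_score(quality, length)))
    return history


if __name__ == "__main__":
    history = hill_climb(steps=200)
    (q0, n0, s0), (q1, n1, s1) = history[0], history[-1]
    print(f"judge score:  {s0:.2f} -> {s1:.2f}")
    print(f"length:       {n0} -> {n1} tokens")
    print(f"true quality: {q0:.2f} -> {q1:.2f}")
```

Because the optimizer sees only the judge score, the cheapest lever (length) does most of the work. Nothing in the loop is malicious; it is simply following the proxy.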

In practice

You optimize a prompt against the same 200-case eval for a sprint, and the score climbs from 82% to 94%. Then users complain the agent feels worse. The system learned the surface of the test: longer answers, cleaner formatting, and patterns your judge rewards, while the underlying capability barely moved. Treat any metric you push on as suspect. Keep fresh held-out cases, compare against different signals, and re-validate on examples the optimizer never saw.
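One way to catch this before users do is to compare the gain on the set you tuned against with the gain on cases the optimizer never saw. The sketch below assumes you have both numbers; `EvalResult`, `looks_overfit`, and the `min_transfer` threshold of 0.5 are hypothetical names and values for illustration, and the held-out figures in the usage example are invented.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    before: float  # accuracy before optimization, in [0, 1]
    after: float   # accuracy after optimization, in [0, 1]

    @property
    def gain(self) -> float:
        return self.after - self.before


def looks_overfit(tuned: EvalResult, held_out: EvalResult,
                  min_transfer: float = 0.5) -> bool:
    """Flag a gain when less than `min_transfer` of it shows up on held-out cases."""
    if tuned.gain <= 0:
        return False  # no gain on the tuned set, so nothing to distrust
    return held_out.gain < min_transfer * tuned.gain


if __name__ == "__main__":
    tuned = EvalResult(before=0.82, after=0.94)     # the 200-case eval from the text
    held_out = EvalResult(before=0.81, after=0.82)  # hypothetical fresh cases
    print("suspect gain:", looks_overfit(tuned, held_out))
```

The threshold is a judgment call, not a standard. The point is to make the comparison routine, so a 12-point jump that transfers as one point gets questioned instead of celebrated.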

Apply it

  1. Keep a rotating, held-out eval the optimization loop never sees, and re-validate gains on it.
  2. Treat any metric you actively optimize as compromised and cross-check against fresh data.
  3. Watch for surface-form gaming such as padding or format-matching, and penalize it explicitly.
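The steps above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: `HeldOutRotation`, `length_penalized`, the token budget, and the penalty rate are all hypothetical. The rotation draws each new held-out set from a pool of fresh cases and marks retired cases as exposed, since reshuffling a fixed pool would hand the optimizer cases it has already seen.

```python
class HeldOutRotation:
    """Serve held-out eval cases from a pool the optimization loop never touches."""

    def __init__(self, fresh_pool: list[str], size: int):
        self._fresh = list(fresh_pool)   # case IDs nobody has optimized against
        self._size = size
        self.current: list[str] = []     # the active held-out set
        self.exposed: set[str] = set()   # retired cases, no longer trusted as fresh

    def rotate(self) -> list[str]:
        """Retire the current held-out set and draw a new one from fresh cases."""
        if len(self._fresh) < self._size:
            raise ValueError("not enough fresh cases; collect more before rotating")
        self.exposed.update(self.current)
        self.current = self._fresh[:self._size]
        self._fresh = self._fresh[self._size:]
        return self.current


def length_penalized(score: float, length: int, budget: int,
                     penalty_per_token: float = 0.001) -> float:
    """Subtract a penalty for tokens beyond `budget`, so padding is not free."""
    overflow = max(0, length - budget)
    return score - penalty_per_token * overflow


if __name__ == "__main__":
    rotation = HeldOutRotation([f"case-{i}" for i in range(6)], size=2)
    print(rotation.rotate())  # first held-out set
    print(rotation.rotate())  # a disjoint second set; the first is now exposed
    print(length_penalized(0.9, length=500, budget=300))
```

A flat per-token penalty is the bluntest option. It will also punish answers that are long for good reasons, so treat it as a tripwire for padding and not as a replacement for checking the answers themselves.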

The takeaway

Treat any metric you actively optimize as suspect. Keep fresh held-out cases, cross-check against different signals, and re-validate your gains on examples the optimizer never saw.
