Law 09 · Reasoning & Planning
Stop Tuning, Start Scaling
Build scaffolding you would gladly delete.

The principle
The Bitter Lesson isn't a ban on structure. It's a warning against hand-coded cleverness that quietly becomes a ceiling. Use code where you need guarantees and thin scaffolds for today's weak spots, but keep asking whether a simpler, more model-driven version now works better.
Why it happens
The Bitter Lesson warns against clever structure that locks in yesterday's assumptions. It does not say to remove all structure. Code should still enforce schemas, permissions, retries, and other guarantees. The risky part is bespoke prompt machinery that exists only to compensate for a temporary model weakness. As models improve, those chains and heuristics can become the bottleneck. Keep the scaffold thin, benchmark it against a simpler baseline, and be willing to delete it when the model can handle the task with less help.
Watch for
- A new model release makes your hand-tuned chain the bottleneck rather than an improvement.
- Most of your effort goes into encoding heuristics the model could plausibly infer itself.
- A plain here are the tools, decide baseline matches or beats your elaborate scaffolding.
In practice
You spend two weeks hand-building a 40-node routing tree to help a weaker model triage tickets. It works for a while, then a newer model with a simpler tool prompt matches it and is easier to maintain. The lesson is not to remove all structure; validation and permissions still belong in code. The lesson is to keep temporary scaffolding thin and deletable. Re-test the simple baseline as models improve, and remove the custom chain when it stops earning its complexity.
Apply it
- Use deterministic code for guarantees, not for hand-encoding every judgment.
- Build the thinnest scaffold that works and that you would happily delete when the model improves.
- Periodically re-test a minimal-scaffold baseline against your tuned pipeline as models advance.
The takeaway
Prefer the thinnest scaffold that works. Keep deterministic boundaries where they protect a guarantee, and delete your bespoke prompt chains or heuristics once the model no longer needs them.