Law 38 · Architecture & Operations

The Multi-Agent Tax

Every extra agent multiplies your token bill, so make sure the task can pay it.

The principle

A multi-agent research system can burn roughly 15 times the tokens of a single chat, and token usage alone can explain most of the difference in performance. Multi-agent therefore makes economic sense only when the task is high value and the work genuinely parallelizes. For most tightly coupled work, the coordination overhead isn't worth it.
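
A rough way to see the economics is to put numbers on it. The sketch below takes the roughly 15x multiplier quoted above as given; the token count, the price per million tokens, and the function name are hypothetical placeholders, not figures from any provider.

```python
def run_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost in USD of a run that consumes `tokens` tokens."""
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical inputs: substitute your own measurements.
SINGLE_AGENT_TOKENS = 20_000    # tokens for one single-agent pass (assumed)
MULTI_AGENT_MULTIPLIER = 15     # rough multiplier cited in the text
USD_PER_MILLION_TOKENS = 10.0   # placeholder blended price (assumed)

single_cost = run_cost(SINGLE_AGENT_TOKENS, USD_PER_MILLION_TOKENS)
multi_cost = run_cost(
    SINGLE_AGENT_TOKENS * MULTI_AGENT_MULTIPLIER, USD_PER_MILLION_TOKENS
)
# The multi-agent run has to deliver at least this much extra value per task.
extra_cost = multi_cost - single_cost

print(f"single agent: ${single_cost:.2f}")
print(f"multi-agent:  ${multi_cost:.2f}")
print(f"extra value needed per task to break even: ${extra_cost:.2f}")
```

If the extra value a task can plausibly return is smaller than that last figure, the split does not pay for itself at these assumed prices.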

Why it happens

Multi-agent systems can work, but they are expensive in tokens, latency, and coordination. They pay off when the task is high-value and naturally parallel: independent research threads, separate files, separable hypotheses. They struggle when the work is tightly coupled and sequential, because agents wait on each other while duplicating context. There is also a reliability cost: each handoff can drop constraints or split state. Use multiple agents when the work fans out cleanly. For a narrow sequential task, one well-scoped agent is often cheaper and safer.
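
One way to reason about "naturally parallel" is Amdahl's law. It is brought in here as an illustration, not something the text cites: if only a fraction p of the work can run concurrently across n agents, the best-case speedup is 1 / ((1 - p) + p / n). The sketch ignores coordination overhead and duplicated context entirely, so real numbers will be worse. The function name and the sample fractions are assumptions.

```python
def amdahl_speedup(parallel_fraction: float, agents: int) -> float:
    """Best-case wall-clock speedup when only part of the work parallelizes."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if agents < 1:
        raise ValueError("agents must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / agents)


# Hypothetical workloads: mostly sequential, half and half, mostly parallel.
for p in (0.1, 0.5, 0.9):
    print(f"parallel fraction {p:.0%}: {amdahl_speedup(p, 5):.2f}x with 5 agents")
```

With five agents, work that is only 10% parallel speeds up by about 1.09x at best, while work that is 90% parallel reaches about 3.57x. The token bill grows in both cases.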

Watch for

The work is sequential or tightly coupled, so sub-agents mostly wait on each other rather than running in parallel.
Token cost has jumped severalfold after splitting into multiple agents with no measurable quality improvement.
Sub-agents make conflicting decisions because each sees only a fragment of the shared context.
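
The second warning sign can be turned into an automated check if you already record token counts and a quality score per run. This is a minimal sketch: `RunStats`, the function name, and both thresholds are hypothetical, and the quality score is assumed to come from your own evaluation set.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    tokens: int      # total tokens consumed by the run
    quality: float   # score from your own evaluation set, higher is better


def is_paying_tax_for_nothing(
    single: RunStats,
    multi: RunStats,
    cost_ratio_limit: float = 3.0,    # assumed reading of "severalfold"
    min_quality_gain: float = 0.02,   # assumed smallest gain worth paying for
) -> bool:
    """True when the multi-agent run costs far more without a real quality gain."""
    if single.tokens <= 0:
        raise ValueError("single.tokens must be positive")
    cost_ratio = multi.tokens / single.tokens
    quality_gain = multi.quality - single.quality
    return cost_ratio >= cost_ratio_limit and quality_gain < min_quality_gain


baseline = RunStats(tokens=10_000, quality=0.81)
split = RunStats(tokens=150_000, quality=0.82)
print(is_paying_tax_for_nothing(baseline, split))
```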

In practice

Impressed by a coordinator-and-subagents demo, you refactor your invoice-processing pipeline into five specialist agents that chat to reach consensus. The work is tightly sequential, so they mostly wait on each other while your token bill jumps roughly fifteen-fold for output no better than one well-prompted pass. Multi-agent only earns its keep when the task is high-value and genuinely parallelizes, like fanning out independent research threads. For tightly coupled work, the coordination overhead is pure tax: keep it to a single agent.

Apply it

Reserve multi-agent architectures for high-value tasks that genuinely parallelize into independent threads.
For tightly coupled work, keep a single well-prompted agent rather than paying the coordination tax.
If you do split, share full traces and constraints across sub-agents so they do not make conflicting decisions.
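
The last point, sharing constraints and traces, can be sketched as a single shared record that every sub-agent brief is built from, so no handoff silently drops a rule. `SharedContext`, `build_brief`, and the sample invoice rules are hypothetical names and data for illustration; how the brief reaches a model is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedContext:
    goal: str
    constraints: tuple[str, ...]
    trace: tuple[str, ...] = ()   # decisions already made by earlier steps


def build_brief(shared: SharedContext, subtask: str) -> str:
    """Build a sub-agent brief that always carries the full shared context."""
    lines = [f"Overall goal: {shared.goal}", "Constraints (apply to every sub-agent):"]
    lines += [f"- {c}" for c in shared.constraints]
    lines.append("Decisions so far:")
    lines += [f"- {t}" for t in shared.trace] or ["- (none yet)"]
    lines.append(f"Your subtask: {subtask}")
    return "\n".join(lines)


shared = SharedContext(
    goal="Reconcile the March invoices",
    constraints=("Amounts are in EUR", "Never modify approved invoices"),
    trace=("Vendor names normalized to uppercase",),
)
for subtask in ("Check vendor A", "Check vendor B"):
    print(build_brief(shared, subtask))
    print()
```

Passing the whole record to every sub-agent costs extra tokens per brief, which is part of the tax described above; the trade is fewer conflicting decisions.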

The takeaway

Reserve multi-agent setups for high-value, heavily parallelizable tasks. For everything else, the token tax outweighs the gains.
