Law 11 · Retrieval & Memory

Retrieval Is the Ceiling

Missing evidence becomes a missing answer.

The principle

For facts the model doesn't already know well, the answer can only be as good as the evidence you retrieve. If the right passage never reaches the context, the generator fills the gap from memory and guesswork. Retrieval quality sets the practical ceiling for any grounded answer.

Why it happens

RAG only grounds the model in passages that actually reach the prompt. If the answer-bearing passage is missing, the model can still answer from parametric memory, but that memory is lossy and may be out of date. The failure then looks like a generation problem even though the evidence never arrived. Measure retrieval directly: did the needed passage appear in the top-k, and was it ranked high enough to use? For grounded facts outside the model's reliable memory, retrieval recall sets the practical ceiling.

Watch for

Upgrading to a stronger generation model barely moves end-to-end accuracy on factual questions.
You have never measured whether the answer-bearing passage appears in the retrieved set.
Wrong answers are fluent and confident rather than hedged or empty, suggesting the model is filling a gap.

In practice

You swap in a smarter model to fix wrong support answers, and accuracy barely moves because the refund-policy chunk never reached the top-k. The generator was filling a missing-evidence gap. Before touching prompts or models, log recall@k on labeled questions: did the answer-bearing passage appear, and was it ranked high enough to matter? If not, fix chunking, query expansion, or ranking first. Better generation cannot reliably ground an answer in evidence it never saw.

Apply it

Build a labeled set of queries with known answer passages and measure recall at k before touching prompts or models.
Treat any answer whose supporting evidence was never retrieved as a retrieval failure, not a generation failure.
Fix recall first by tuning chunking, query expansion, and k, then optimize the generator only once evidence reliably lands in context.

The takeaway

Measure retrieval before you touch prompts or models. If the passage that holds the answer isn't showing up, fix recall, chunking, ranking, or query expansion first.

Sources and further reading

Get the audit kit Access the buyer edition Back to all 50 laws

The principle

Why it happens

Watch for

Apply it

Sources and further reading

Related laws