Law 15 · Retrieval & Memory

Memory Is a System, Not a Window

Give the agent a hierarchy, not just a bigger prompt.

The principle

Think of the context window like a computer's RAM. The agent should actively move information between a small in-context working set and large external storage, deciding what to keep, what to evict, and what to recall. Cramming everything into one flat window mixes up working memory with long-term storage and hits hard limits fast. Durable memory needs explicit tiers and self-managed retrieval.
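
The paging idea above can be sketched in a few lines. This is a minimal illustration, not a reference design: the class name TieredMemory, its methods, and the least-recently-used eviction rule are all assumptions made for the example, and a plain dict stands in for whatever database or vector store holds the external tier.

```python
from collections import OrderedDict
from typing import Optional


class TieredMemory:
    """Hypothetical sketch: a small in-context working set backed by a larger store."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity      # max entries allowed in context
        self.working = OrderedDict()  # in-context tier, oldest use first
        self.external = {}            # stand-in for a database or vector store

    def remember(self, key: str, text: str) -> None:
        """Place an entry in the working set, evicting least recently used entries."""
        self.working[key] = text
        self.working.move_to_end(key)
        while len(self.working) > self.capacity:
            old_key, old_text = self.working.popitem(last=False)
            self.external[old_key] = old_text  # evicted from context, not forgotten

    def recall(self, key: str) -> Optional[str]:
        """Return an entry, paging it back into context if it had been evicted."""
        if key in self.working:
            self.working.move_to_end(key)
            return self.working[key]
        if key in self.external:
            text = self.external.pop(key)
            self.remember(key, text)  # may evict something else to make room
            return text
        return None

    def context(self) -> str:
        """Render only the working set; the external store never enters the prompt."""
        return "\n".join(self.working.values())
```

With a capacity of 2, remembering a third entry pushes the oldest one out to the external store, and recalling it later pages it back in at the expense of another entry. The prompt built from context() stays bounded no matter how long the session runs.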

Why it happens

A bigger prompt is not a memory system. It is an expensive pile of tokens that dilutes attention as it grows and re-sends old history every turn. Durable memory needs tiers: a small working set in context, summaries or records outside it, and rules for what to recall. MemGPT (Packer et al., 2023) treats the context window like RAM and pages information in and out of it. The Generative Agents work (Park et al., 2023) adds another lesson: recall is a ranking problem, scored by recency, importance, and relevance. The system needs storage plus policy, not just more room.
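
To make "recall is a ranking problem" concrete, here is a hedged sketch of a blended score. The exponential half-life for recency, the equal default weights, and cosine similarity over embeddings as the relevance signal are assumptions chosen for illustration, and names such as MemoryRecord and recall_top_k are hypothetical. It is not presented as the exact formula from any paper.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class MemoryRecord:
    text: str
    importance: float            # 0..1, assigned when the record is written
    last_access: float           # timestamp in seconds
    embedding: Sequence[float]   # assumed to share dimensions with the query


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(record, query_embedding, now, half_life_s=3600.0, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of recency, importance, and relevance."""
    # Recency halves every `half_life_s` seconds since the last access.
    recency = 0.5 ** ((now - record.last_access) / half_life_s)
    relevance = cosine(record.embedding, query_embedding)
    w_recency, w_importance, w_relevance = weights
    return w_recency * recency + w_importance * record.importance + w_relevance * relevance


def recall_top_k(records, query_embedding, now, k=3):
    """Return the k highest-scoring records for the current query."""
    ranked = sorted(records, key=lambda r: score(r, query_embedding, now), reverse=True)
    return ranked[:k]
```

Under these assumptions, a two-hour-old but important and on-topic decision outranks a fresh, irrelevant greeting. That is the behavior a recency-only buffer cannot provide.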

Watch for

A long-running session degrades over time, forgetting earlier decisions as history accumulates.
Cost and latency climb every turn because the full history is re-sent into the prompt.
The plan for memory growth is a bigger context window rather than eviction and external storage.
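
The cost symptom above is easy to quantify. The sketch below compares total prompt tokens over a session when every turn re-sends the full history against a session whose prompt holds a bounded number of turns. The figures (100 turns, 500 tokens per turn, a 10-turn working set) are illustrative assumptions, and the bounded case ignores the extra tokens spent on summaries or recalled records.

```python
def tokens_resending_full_history(turns: int, tokens_per_turn: int) -> int:
    """Total prompt tokens when turn t re-sends all t turns so far (grows quadratically)."""
    return tokens_per_turn * turns * (turns + 1) // 2


def tokens_with_bounded_working_set(turns: int, tokens_per_turn: int, window_turns: int) -> int:
    """Total prompt tokens when each prompt holds at most `window_turns` turns (grows linearly)."""
    return sum(tokens_per_turn * min(t, window_turns) for t in range(1, turns + 1))


full = tokens_resending_full_history(turns=100, tokens_per_turn=500)
bounded = tokens_with_bounded_working_set(turns=100, tokens_per_turn=500, window_turns=10)
print(full)     # 2525000
print(bounded)  # 477500
```

Doubling the session length roughly quadruples the first number and only doubles the second, which is why per-turn cost and latency that keep climbing are a sign that the window is being used as storage.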

In practice

Your agent's long-running session keeps degrading: by hour two it is forgetting decisions from hour one. The cause is that everything has been appended to one ever-growing prompt, so attention spreads thin and costs balloon. A bigger context window only delays the same wall. Build memory in tiers instead: a small working set in context, summarized notes that can be recalled, and an external store the agent reads and writes deliberately, with explicit policies for what gets promoted, summarized, and evicted. Treat the window like RAM, not a filing cabinet.

Apply it

Separate a small in-context working set from a large external store and page entries between them deliberately.
Define explicit policies for what gets promoted, summarized, and evicted rather than appending everything.
Rank what to recall back into context by a blend of recency, importance, and relevance to the current task.
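
An explicit policy can be as small as one function that is easy to read and test. The sketch below is one possible shape, with every name and threshold (Policy, keep_turns, summarize_min_importance, promote_min_score) invented for illustration; real values would come from measuring your own agent.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP = "keep"            # stays verbatim in the working set
    SUMMARIZE = "summarize"  # replaced in context by a short note
    EVICT = "evict"          # held in the external store only


@dataclass
class Entry:
    text: str
    importance: float     # 0..1
    age_turns: int        # turns since the entry was last used
    pinned: bool = False  # e.g. a hard user constraint that must stay visible


@dataclass
class Policy:
    keep_turns: int = 5                    # assumed threshold
    summarize_min_importance: float = 0.5  # assumed threshold
    promote_min_score: float = 1.5         # assumed threshold on a recall score

    def decide(self, entry: Entry) -> Action:
        """Decide what happens to an entry currently in the working set."""
        if entry.pinned or entry.age_turns <= self.keep_turns:
            return Action.KEEP
        if entry.importance >= self.summarize_min_importance:
            return Action.SUMMARIZE
        return Action.EVICT

    def should_promote(self, recall_score: float) -> bool:
        """Decide whether a stored record earns a place back in context."""
        return recall_score >= self.promote_min_score
```

Keeping the rules in one place means a change in memory behavior is a reviewed change to a threshold, not an accident of how much text happened to be appended.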

The takeaway

Build memory in tiers: working context, recallable summaries, and external stores, each with clear rules for what gets promoted or evicted. Don't lean on raw context length to do the job.

