Giving agents memory: cheatsheet

The one idea

An agent is amnesiac by default. Memory is designed in, not automatic. It lives in two very different places: inside one run, and between runs.

A word-sense note

The word “memory” is overloaded in AI. It can mean a neural network’s internal running state, or the context window a model reads in one pass. This lesson means a third thing: information an agent carries from one run to the next.

The two kinds

	Short-term context	Persistent memory
Scope	One run	Across runs
Holds	Conversation so far, recent tool results, the scratchpad	Preferences, learned facts, conversation summaries
Lifespan	Gone when the run ends	Survives until changed or expired
Feels like	Working notes on a phone call	An assistant that knows you

With vs without persistent memory

USER: Book my usual sync with Sarah.

WITHOUT: "How long? Which day? Recurring?"   (never heard of your "usual")
WITH:    reads stored fact (30 min, Tue PM) -> books it directly

The only difference is whether a fact about you survived from an earlier run.
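
A minimal sketch of that branch, assuming persistent memory is a plain dictionary. The function name `handle_usual_sync`, the key format `usual_sync:<person>`, and the field names are hypothetical, chosen only for illustration:

```python
from typing import Optional


def handle_usual_sync(person: str, memory: Optional[dict]) -> str:
    """Reply to 'Book my usual sync with <person>'."""
    # Hypothetical key format; a real store would define its own schema.
    fact = (memory or {}).get(f"usual_sync:{person}")
    if fact is None:
        # No stored fact: the agent has to ask.
        return "How long? Which day? Recurring?"
    # Stored fact found: the agent can act directly.
    return f"Booked {fact['minutes']} min with {person}, {fact['slot']}."


stored = {"usual_sync:Sarah": {"minutes": 30, "slot": "Tue PM"}}

without_memory = handle_usual_sync("Sarah", None)
with_memory = handle_usual_sync("Sarah", stored)
```

Both calls run the same code; only the presence of the stored entry changes the reply.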

What to retain (the hard part)

Not everything. “Remember everything” fails on three counts:

Context cost: anything the agent uses must be loaded into the finite context window; irrelevant entries crowd out useful ones and add cost and latency.
Staleness: stored facts go out of date.
Privacy: persistent memory is stored personal data; more than you need is a liability.

KEEP    "prefers email over phone"   (durable preference)
KEEP    "account ID 4471"            (stable, reusable)
DISCARD "it's raining today"         (transient)
DISCARD "thanks, have a good one"    (noise)

Decision rule: ask not “can I store this?” but “will this be worth having next time?”
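
One way to picture the decision rule is as a filter over candidate facts. This is a sketch under strong assumptions: the `Candidate` class and its `durable` and `reusable` flags are hypothetical and hand-labelled here, whereas in a real agent judging them is the hard part.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    durable: bool   # assumption: will this still be true in a later run?
    reusable: bool  # assumption: is it likely to help in a later run?


def worth_keeping(c: Candidate) -> bool:
    """Keep a fact only if it is both durable and reusable."""
    return c.durable and c.reusable


candidates = [
    Candidate("prefers email over phone", durable=True, reusable=True),
    Candidate("account ID 4471", durable=True, reusable=True),
    Candidate("it's raining today", durable=False, reusable=False),
    Candidate("thanks, have a good one", durable=False, reusable=False),
]

kept = [c.text for c in candidates if worth_keeping(c)]
discarded = [c.text for c in candidates if not worth_keeping(c)]
```

The filter never asks whether a fact is available to store, only whether it is expected to matter later.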

How the two work together

run start  -> load relevant persistent memory into short-term context
run middle -> work within that context
run end    -> write anything newly worth keeping back to persistent memory

The agent feels continuous because each run reloads what mattered from the last.
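
The three steps above can be sketched as a single function. This assumes persistent memory is a JSON file on disk and stands in a trivial keyword check for the agent's real work; `run_agent`, `load_memory`, `save_memory`, and the `contact_preference` key are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def load_memory(path: Path) -> dict:
    """Read persistent memory, or start empty if nothing was saved yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {}


def save_memory(path: Path, memory: dict) -> None:
    path.write_text(json.dumps(memory))


def run_agent(path: Path, user_message: str) -> list:
    # Run start: load persistent memory into short-term context.
    memory = load_memory(path)
    context = [f"known: {key} = {value}" for key, value in memory.items()]

    # Run middle: work within that context (placeholder for real work).
    context.append(f"user: {user_message}")
    if "email" in user_message:
        memory["contact_preference"] = "email"

    # Run end: write what is worth keeping back to persistent memory.
    save_memory(path, memory)
    return context  # short-term context; nothing else outlives the run


store = Path(tempfile.mkdtemp()) / "memory.json"
first = run_agent(store, "Please reach me by email from now on.")
second = run_agent(store, "Send me the report.")
```

The second run starts with the preference learned in the first, even though the two calls share nothing except the file.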

Pitfalls to dodge

Mistaking the context window for the agent’s memory (it is short-term only).
Remembering everything (bloat, cost, staleness).
Never expiring anything (durable is not permanent).
Ignoring memory’s privacy weight (it is stored personal data).
Assuming memory is automatic (an agent forgets between runs by default).
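
To illustrate the expiry pitfall, here is a sketch that assumes each stored entry carries an `expires_at` timestamp. The field name, the `live_entries` helper, and the sample entries are hypothetical.

```python
from datetime import datetime, timedelta


def live_entries(memory: dict, now: datetime) -> dict:
    """Return only the entries whose expiry time has not yet passed."""
    return {
        key: entry
        for key, entry in memory.items()
        if entry["expires_at"] > now
    }


now = datetime(2024, 1, 10)
memory = {
    # Durable preference: long lifetime, but still not permanent.
    "contact_preference": {
        "value": "email",
        "expires_at": now + timedelta(days=365),
    },
    # Short-lived fact whose expiry has already passed.
    "temporary_address": {
        "value": "hotel, room 12",
        "expires_at": now - timedelta(days=1),
    },
}

kept = live_entries(memory, now)
```

Running a filter like this at the start of each run keeps stale facts out of the context window.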

Words to use precisely

Short-term context / working memory: information present during a single run; discarded when it ends.
Persistent memory: information that survives across runs (preferences, facts, summaries).
Retention decision: choosing what is worth keeping, by future usefulness, not by what is available.