Cheatsheet: Giving agents memory

An agent is amnesiac by default. Memory is designed in, not automatic. It lives in two very different places: inside one run, and between runs.

The word “memory” is overloaded in AI. It can mean a neural network’s internal running state, or the context window a model reads in one pass. This lesson uses it in a third sense: information an agent carries from one run to the next.

           | Short-term context                                       | Persistent memory
Scope      | One run                                                  | Across runs
Holds      | Conversation so far, recent tool results, the scratchpad | Preferences, learned facts, conversation summaries
Lifespan   | Gone when the run ends                                   | Survives until changed or expired
Feels like | Working notes on a phone call                            | An assistant that knows you

USER:           Book my usual sync with Sarah.
WITHOUT memory: "How long? Which day? Recurring?" (has never heard of your "usual")
WITH memory:    reads stored fact (30 min, Tue PM) -> books it directly

The only difference is whether a fact about you survived from an earlier run.
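The contrast above can be sketched in a few lines. This is a minimal illustration, not a real agent: the `stored_facts` dictionary, its key, and the `handle_booking` function are hypothetical names standing in for whatever store and booking logic an actual system would use.

```python
from typing import Optional

# Hypothetical persistent store: facts written by earlier runs, keyed by a label.
stored_facts = {
    "usual sync with Sarah": {"duration_min": 30, "slot": "Tue PM"},
}


def handle_booking(request_key: str, memory: Optional[dict]) -> str:
    """Book directly if an earlier run stored the fact; otherwise ask."""
    fact = (memory or {}).get(request_key)
    if fact is None:
        # Nothing survived from an earlier run, so the agent has to ask.
        return "How long? Which day? Recurring?"
    return f"Booked: {fact['duration_min']} min, {fact['slot']}"


without_memory = handle_booking("usual sync with Sarah", None)
with_memory = handle_booking("usual sync with Sarah", stored_facts)
```

The function is identical in both calls; only the presence of the stored fact changes the outcome.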

An agent should not remember everything. “Remember everything” fails on three counts:

  • Context cost: whatever memory you use must be loaded into the finite context window; noise crowds out useful content and adds cost and latency.
  • Staleness: stored facts go out of date.
  • Privacy: persistent memory is stored personal data; more than you need is a liability.

KEEP     "prefers email over phone"   (durable preference)
KEEP     "account ID 4471"            (stable, reusable)
DISCARD  "it's raining today"         (transient)
DISCARD  "thanks, have a good one"    (noise)

Decision rule: not “can I store this” but “will this be worth having next time.”
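One way to make the decision rule concrete is a small filter over candidate facts. The sketch below is an assumption-laden illustration: the `Candidate` class and its `durable`, `reusable`, and `sensitive` labels are hypothetical, and in practice those labels would have to come from rules or a model judgment rather than being hand-set.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    durable: bool            # hypothetical label: likely still true next time?
    reusable: bool           # hypothetical label: likely to help a future run?
    sensitive: bool = False  # hypothetical label: personal data beyond what is needed


def should_keep(c: Candidate) -> bool:
    """Apply 'will this be worth having next time', not 'can I store this'."""
    if c.sensitive:
        return False  # storing more personal data than needed is a liability
    return c.durable and c.reusable


candidates = [
    Candidate("prefers email over phone", durable=True, reusable=True),
    Candidate("account ID 4471", durable=True, reusable=True),
    Candidate("it's raining today", durable=False, reusable=False),
    Candidate("thanks, have a good one", durable=False, reusable=False),
]

kept = [c.text for c in candidates if should_keep(c)]
```

With these labels, `kept` contains only the two KEEP items from the list above.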

run start -> load relevant persistent memory into short-term context
run middle -> work within that context
run end -> write anything newly worth keeping back to persistent memory

The agent feels continuous because each run reloads what mattered from the last.
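The three steps can be sketched with a JSON file standing in for persistent memory. Everything here is an illustrative assumption: the file layout, the `run_agent` function, and the 30-day default expiry are invented for the example, and a real agent would do its actual work where the comment marks the middle of the run.

```python
import json
import tempfile
import time
from pathlib import Path


def load_memory(path: Path, now: float) -> dict:
    """Run start: read persistent memory, skipping entries that have expired."""
    if not path.exists():
        return {}
    entries = json.loads(path.read_text())
    return {k: v for k, v in entries.items() if v["expires_at"] > now}


def save_memory(path: Path, memory: dict) -> None:
    """Run end: write the memory back to disk."""
    path.write_text(json.dumps(memory, indent=2))


def run_agent(path: Path, new_facts: dict, ttl_seconds: float = 30 * 24 * 3600) -> dict:
    """One run. Returns the short-term context the run started with."""
    now = time.time()
    memory = load_memory(path, now)   # run start: load persistent memory
    context = dict(memory)            # short-term context for this run only
    # run middle: the agent would work within `context` here
    for key, value in new_facts.items():
        # run end: keep anything newly worth keeping, with an expiry time
        memory[key] = {"value": value, "expires_at": now + ttl_seconds}
    save_memory(path, memory)
    return context


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "memory.json"
    first = run_agent(store, {"usual_sync": "30 min, Tue PM"})  # starts with nothing
    second = run_agent(store, {})                               # sees the stored fact
```

The first run starts with an empty context; the second starts with the fact the first one wrote. Because expired entries are dropped at load time, durable facts still age out instead of living forever.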

Common mistakes:

  • Mistaking the context window for the agent’s memory (it is short-term only).
  • Remembering everything (bloat, cost, staleness).
  • Never expiring anything (durable is not permanent).
  • Ignoring memory’s privacy weight (it is stored personal data).
  • Assuming memory is automatic (the loop forgets between runs by default).

Key terms:

  • Short-term context / working memory: information present during a single run; discarded when it ends.
  • Persistent memory: information that survives across runs (preferences, facts, summaries).
  • Retention decision: choosing what is worth keeping, by future usefulness, not by what is available.