Agents that retrieve their own information: agentic RAG

This is lesson 6 of Track 20 (AI Agents and Tool Use) and the third lesson of Phase 2, The design patterns that make agents work. It is about how an agent answers from a body of text it was never trained on: your documents, a manual, a knowledge base. The standard technique is RAG, retrieval-augmented generation, and this lesson is about a sharper version of it.

The ordinary form of RAG is a fixed pipeline: retrieve the relevant passages, read them, answer, the same way every time. The agentic form makes one change that turns that rigid pipeline into a reasoning loop: retrieval becomes a tool the agent decides whether and when to call. That single change lets the agent skip retrieval when it does not need it, retrieve more than once for multi-part questions, and judge whether a result is good enough before answering. The lesson builds the contrast on a worked example, names what the added adaptability costs, and shows that nothing here is new machinery; it is the loop, the tool call, the tool definition, and memory, all pointed at the job of fetching information.
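For readers who like to see the idea in code, the contrast can be sketched in a few lines of Python. This is an optional, minimal illustration and not the lesson's own implementation: the `DOCS` knowledge base, the keyword-matching `retrieve` function, and `toy_policy` (a hard-coded stand-in for the model's decision) are all hypothetical names invented for this sketch. Real retrieval and a real model would replace them. The "is this good enough?" judgment is reduced here to checking whether every topic in the question already has a passage.

```python
from typing import Callable, Optional

# Hypothetical toy knowledge base; stands in for real documents.
DOCS = {
    "refund": "Refunds are issued within 14 days of a return.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
}


def retrieve(query: str) -> list[str]:
    """Toy keyword retrieval: return passages whose topic word appears in the query."""
    q = query.lower()
    return [text for topic, text in DOCS.items() if topic in q]


def classical_rag(question: str) -> dict:
    """Fixed pipeline: always retrieve exactly once, whatever the question is."""
    passages = retrieve(question)
    return {"retrieval_calls": 1, "passages": passages}


def toy_policy(question: str, passages: list[str]) -> Optional[str]:
    """Stand-in for the model's decision (an assumption, not a real model).

    Returns the next query to retrieve, or None when nothing more is needed.
    """
    q = question.lower()
    for topic, text in DOCS.items():
        if topic in q and text not in passages:
            return topic
    return None


def agentic_rag(
    question: str,
    decide: Callable[[str, list[str]], Optional[str]],
    max_steps: int = 5,
) -> dict:
    """Loop: at each step the agent decides whether to call the retrieval tool."""
    passages: list[str] = []
    calls = 0
    for _ in range(max_steps):
        query = decide(question, passages)
        if query is None:  # the agent judges it has enough and stops
            break
        passages.extend(retrieve(query))
        calls += 1
    return {"retrieval_calls": calls, "passages": passages}


no_lookup = agentic_rag("What is 2 + 2?", toy_policy)
multi_part = agentic_rag("How long do refund and shipping take?", toy_policy)
fixed = classical_rag("What is 2 + 2?")
```

In this sketch, `no_lookup` records zero retrieval calls because the question touches nothing in the knowledge base, `multi_part` records two calls (one per topic), and `fixed` records one call that returns nothing useful. The `max_steps` cap is one simple way to bound the extra cost that the loop introduces.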

The track structurally mirrors Microsoft’s “AI Agents for Beginners” (MIT-licensed), with the Berkeley CS294 LLM Agents course as a depth reference. Full attribution is in this lesson’s references.

This lesson sits in the middle of Phase 2, where the track works through the patterns that make agents capable. It builds directly on tool use (the agent emitting a tool call and reading the result) and on memory (where the agent’s reference material lives). Retrieval is how that stored material gets pulled into a run, so this lesson is the natural next step after memory. The next lesson, Planning: breaking a goal into steps, moves up a level. So far the agent has decided one move at a time, including whether to retrieve now. Planning means deciding an ordered sequence of steps before acting, for tasks too large to solve one reaction at a time.

Prerequisites: the earlier Phase 1 and Phase 2 lessons, especially How tool use turns a model into an agent (this lesson treats retrieval as one more tool the agent calls) and Giving agents memory (retrieval is often how persistent memory is fetched). You do not need to code, and you do not need to understand how retrieval matches meaning under the hood; this lesson treats that as a black box on purpose. If you know what it means for an agent to call a tool and read the result, you have the background this lesson assumes.

  • Describe what RAG does and why classical RAG is a fixed retrieve-read-answer pipeline
  • Explain the single change that turns classical RAG into agentic RAG (retrieval becomes a tool the agent decides to call)
  • Name the three behaviors agentic RAG unlocks that a fixed pipeline cannot offer: skipping retrieval, retrieving more than once, and judging a result before answering
  • Judge when a job needs agentic retrieval versus when a fixed pipeline is the better fit
  • Recognize that agentic RAG reuses earlier pieces (the loop, the tool call, the tool definition, memory) rather than adding new machinery
  • Read time: about 10 minutes
  • Practice time: about 15 minutes (a self-check, two applied design exercises, and flashcards)
  • Difficulty: standard