LLM agents: brief

What you’ll learn

Lesson 4 introduced tool use as a four-step loop. This lesson is the deep dive on what happens when you let the model decide whether to take another round. That tiny shift, “the model picks the next call until it picks a final answer,” is the entire content of the agent topic; everything else (patterns, failure modes, build practices) falls out of it. The source curriculum is the agents session of the Full Stack Deep Learning LLM Bootcamp (Spring 2023), with Harrison Chase (LangChain) as the guest instructor, freely available at fullstackdeeplearning.com/llm-bootcamp.

You will state the minimum useful definition of an LLM agent (the L4 tool-use loop with the model deciding when to stop); identify the three foundational patterns (function-calling agents as the 2026 default, ReAct as the predecessor still in the literature, plan-and-execute when you want to verify intent before action); apply the three tests for whether a task should be an agent at all (variable shape + real bounded tools + acceptable cost and latency, all yes); name the five engineering failure modes and their specific mitigations (loops, wrong paths, compound cost, harder evaluation, brittle tool boundaries); and apply trajectory-level evaluation and observability as the scaled-up form of the lesson 7 LLMOps discipline.

§6 framing note: taught at a strictly technical-primer level, same discipline as Track 14 lesson 12 and Track 15 lesson 14, and the same discipline this track has applied across lessons 6, 7, and 9. WHAT an agent is, WHEN to reach for one, WHAT goes wrong, HOW to build one. Out of scope: agent autonomy, agent safety, agent alignment debates, contested safety claims, what agents should or should not be allowed to do, sector-specific compliance for agent deployment. Real and important; they belong in their own forum with the right stakeholders (legal, policy, ethics, security).

Where this fits

This is lesson 10 of 11, the third lesson of Phase 3 (advanced and the field). It is the deep dive on the tool-use side of lesson 4, the third lesson in this track held under the §6 technical-primer discipline (after lessons 6 and 7), and it threads back to lesson 2 (the three productive limits all multiply in agents), lesson 7 (LLMOps as the discipline that scales here), and lesson 8 (the build-vs-buy mix architecture inside an agent). Track 14 lesson 12, Track 15 lesson 14, and the full Track 20 are the companion treatments from the using-side, build-side, and dedicated-track-side; this lesson is the production-shipping primer.

Before you start

Prerequisites: lesson 9 of this track (the deep dive on the fine-tune point of the spectrum; sequential order) and lessons 4 and 7 (the tool-use loop and the LLMOps discipline this lesson scales). Track 14 lesson 12 and Track 20 are direct topical companions; helpful but not required.

About the math

Light. One back-of-envelope cost calculation that shows agents scale closer to (steps)² than (steps) × single-call cost because context grows each step. No derivations, no theory. The decision-making here is criterion-based (the three tests) and practice-based (the five failure modes), not mathematical.

By the end, you’ll be able to

The single capability this lesson builds: decide whether a production task should be an agent, build a function-calling agent if it should, evaluate and operate it like the rest of your LLM application. Concretely, you will be able to:

State the minimum definition of an LLM agent (L4 loop + model decides when to stop)
Identify the three foundational patterns (function-calling, ReAct, plan-and-execute) and when each applies
Apply the three tests for “should this be an agent?” (variable shape + bounded tools + acceptable cost)
Name the five engineering failure modes and their mitigations
Apply trajectory-level evaluation and observability (lesson 7 discipline scaled)

Time and difficulty

Read time: about 14 minutes
Practice time: about 12 minutes (agent-or-not on four scenarios + the cost-of-a-6-step-agent calculation, plus flashcards)
Difficulty: standard (no math beyond arithmetic; the work is internalizing the loop, the three tests, the five failure modes, and the operational discipline)