What makes an AI an 'agent': cheatsheet

The definition (memorize this)

An agent is a model wrapped in a loop: it looks at a goal, decides whether a tool would help, calls the tool, reads the result, and repeats until the goal is met or it gives up.

The perceive-decide-act loop

Move	What happens
Perceive	Read the current state: the goal plus results from previous actions.
Decide	Choose the next move (usually “call tool X with these inputs”, sometimes “answer now”).
Act	Run the action, observe the result, feed it back into the next Perceive.

The loop repeats until the goal is met or the agent stops. That repetition, fed by tool results, is what separates an agent from a chatbot.
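The three moves can be sketched in a few lines of Python. This is a minimal illustration, not a real framework: `decide` is a scripted stand-in for the model, and the `add` tool, the `TOOLS` registry, and `max_steps` are all hypothetical names chosen for the example.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: dict[str, Callable[..., object]] = {
    "add": lambda a, b: a + b,
}

def decide(state: dict) -> dict:
    """Scripted stand-in for the model: choose the next move from the state."""
    if not state["observations"]:
        # Nothing observed yet, so ask for a tool call.
        return {"type": "tool", "name": "add", "args": {"a": 2, "b": 3}}
    # A result is available, so answer now.
    return {"type": "answer", "text": f"The sum is {state['observations'][-1]}"}

def run_agent(goal: str, max_steps: int = 5) -> str:
    state = {"goal": goal, "observations": []}
    for _ in range(max_steps):
        move = decide(state)                          # Perceive + Decide
        if move["type"] == "answer":
            return move["text"]                       # goal met: stop
        result = TOOLS[move["name"]](**move["args"])  # Act
        state["observations"].append(result)          # feed into next Perceive
    return "Stopped: step limit reached."             # the agent gives up
```

Calling `run_agent("add 2 and 3")` takes two passes: one tool call, then the answer `"The sum is 5"`. The step limit is the "or the agent stops" branch.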

Chatbot vs agent

	Chatbot	Agent
Shape	Text in, text out, once	Model in a loop
Tools	None	Yes (the doors to the world)
Iterates	No	Yes, until the goal is done
Weather question	“I cannot check live data”	Calls a weather tool, reads it, answers

The four-part anatomy

Part	Job
Model	The decision-maker. Predicts the next move from the current state.
System prompt	Tells the model what its task is, that it may call tools, and the format for calling them.
Tools	Functions it can invoke (read_file, query_db, send_email, search_web).
Loop	Outside code that runs each tool call and feeds the result back to the model.

Key idea: the agency lives in the scaffolding (prompt + tools + loop); the intelligence lives in the model. Swap the model and the agent keeps its shape; only decision quality changes.
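One way to picture the four parts, and the model swap, is the sketch below. Everything in it is assumed for illustration: the `Agent` class, the two scripted "models", the fake `read_file` tool, and the dictionary move format are hypothetical, not any library's API.

```python
from dataclasses import dataclass
from typing import Callable

# In this sketch a "model" is any function from (system prompt, history) to a move.
Model = Callable[[str, list[str]], dict]

@dataclass
class Agent:
    model: Model                             # part 1: the decision-maker
    system_prompt: str                       # part 2: task + call format
    tools: dict[str, Callable[[str], str]]   # part 3: functions it can invoke

    def run(self, goal: str, max_steps: int = 5) -> str:
        # Part 4: the loop, outside code that runs tools and feeds results back.
        history = [f"goal: {goal}"]
        for _ in range(max_steps):
            move = self.model(self.system_prompt, history)
            if move["type"] == "answer":
                return move["text"]
            result = self.tools[move["tool"]](move["input"])
            history.append(f"{move['tool']} -> {result}")
        return "gave up"

def careful_model(system_prompt: str, history: list[str]) -> dict:
    # Hypothetical model that reads the file before answering.
    if len(history) == 1:
        return {"type": "tool", "tool": "read_file", "input": "notes.txt"}
    return {"type": "answer", "text": f"Based on {history[-1]}"}

def hasty_model(system_prompt: str, history: list[str]) -> dict:
    # Hypothetical model that answers without using any tool.
    return {"type": "answer", "text": "Probably fine."}

tools = {"read_file": lambda path: f"<contents of {path}>"}  # fake tool
prompt = "You may call tools. Reply with a tool call or an answer."

careful = Agent(careful_model, prompt, tools)
hasty = Agent(hasty_model, prompt, tools)  # same scaffolding, different model
```

Both agents share the same prompt, tools, and loop. `careful.run("summarize notes")` returns `"Based on read_file -> <contents of notes.txt>"`, while `hasty.run("summarize notes")` returns `"Probably fine."`: the shape is unchanged and only the decisions differ.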

Loop trace (multi-step task)

Goal: “Book a 30-min meeting with Sarah next week.”

PERCEIVE goal -> DECIDE need calendar -> ACT check Sarah -> observe free slots
PERCEIVE -> DECIDE check my calendar -> ACT check me -> observe a conflict
PERCEIVE -> DECIDE pick the clear slot -> ACT send invite -> observe accepted
PERCEIVE -> DECIDE goal met -> ACT reply to user

Four passes, two tools, one self-correction. No single step is clever; the power is the repetition.
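The trace can be replayed with fake data to make the passes concrete. The calendars, slot names, tool functions, and the scripted `decide` policy below are all invented for illustration; a real agent would have a model making these choices and real calendar APIs behind the tools.

```python
# Fake data: hypothetical free slots for next week.
CALENDARS = {
    "sarah": ["Mon 10:00", "Tue 14:00"],
    "me": ["Tue 14:00", "Wed 09:00"],  # Mon 10:00 is missing: the conflict
}

def check_calendar(person: str) -> list[str]:
    return CALENDARS[person]

def send_invite(slot: str) -> str:
    return "accepted"  # canned response; nothing is actually sent

TOOLS = {"check_calendar": check_calendar, "send_invite": send_invite}

def decide(observations: list) -> tuple[str, str]:
    """Scripted stand-in for the model's Decide step."""
    if len(observations) == 0:
        return ("check_calendar", "sarah")
    if len(observations) == 1:
        return ("check_calendar", "me")
    if len(observations) == 2:
        sarah_free, my_free = observations
        shared = [slot for slot in sarah_free if slot in my_free]
        return ("send_invite", shared[0])  # pick the first slot clear for both
    return ("reply", "Meeting booked.")

def run(max_steps: int = 6) -> tuple[str, list[tuple[str, str]]]:
    observations, trace = [], []
    for _ in range(max_steps):
        action, arg = decide(observations)       # Perceive + Decide
        trace.append((action, arg))
        if action == "reply":
            return arg, trace                    # goal met
        observations.append(TOOLS[action](arg))  # Act, then observe
    return "gave up", trace
```

`run()` returns `"Meeting booked."` with a trace of four moves: two `check_calendar` calls, one `send_invite` for `"Tue 14:00"`, and the final reply. Sarah's Monday slot is skipped because it is not free on the second calendar.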

A bit of history (why now)

Agents predate language models: ELIZA (1960s, rule-based chat), STRIPS (1970s, symbolic planning), Brooks’ subsumption architecture (1980s, reactive robotics). What changed: the “decide” step moved from hand-coded rules and explicit search to decisions that a language model generates from what it learned in training. Brittle logic became flexible language.

When an agent is the right tool

Reach for an agent when the task is:

Open-ended: the steps cannot be fully scripted in advance.
Multi-step: several tool calls across several turns, not one lookup.
Improvable: you want it to use results and feedback to do better as it goes.

A single fixed lookup or one-shot generation is usually better as a plain model call. An agent adds latency and cost and gives up some reliability; pay that price only when the task needs it.
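A rough way to see the cost is to count input tokens. The sketch below assumes, as many loop implementations do, that each pass resends the full history and that each tool result adds a fixed number of tokens. The function name and the numbers (500 base tokens, 200 per result) are made up for illustration, not measurements.

```python
def total_input_tokens(base_tokens: int, tokens_per_result: int, passes: int) -> int:
    """Input tokens summed over all model calls.

    Assumes pass i resends the base prompt plus the i tool results seen so far.
    """
    return sum(base_tokens + i * tokens_per_result for i in range(passes))

plain_call = total_input_tokens(500, 200, passes=1)  # 500
agent_run = total_input_tokens(500, 200, passes=4)   # 500 + 700 + 900 + 1100 = 3200
```

Under these assumptions a four-pass run reads 3200 input tokens against 500 for a plain call, and each pass also adds a model round trip plus a tool call that can fail.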

Pitfalls to dodge

Thinking the model “is” the agent. It is one of four parts.
Believing a separate “agent model” exists. Agency is a property of the system, not a kind of model.
Confusing more autonomy with more intelligence. A longer loop and more tools do not make an agent “smarter.”
Assuming every task wants an agent. For a single fixed step, a plain call is faster and more reliable.

Words to use precisely

Agent: a model running inside a loop with tools, steering toward a goal.
Tool: a function the model can invoke to act on or read from the world.
System prompt: the instructions that, among other things, grant tool use and define the call format.
Loop: the outside code that executes tool calls and returns results to the model.
Single model call: one text-in, text-out exchange with no loop and no tools (a chatbot turn).