Cheatsheet: What makes an AI an "agent"

An agent is a model wrapped in a loop: it looks at a goal, decides whether a tool would help, calls the tool, reads the result, and repeats until the goal is met or it gives up.

Move | What happens
Perceive | Read the current state: the goal plus results from previous actions.
Decide | Choose the next move (usually “call tool X with these inputs”, sometimes “answer now”).
Act | Run the action, observe the result, feed it back into the next Perceive.

The loop repeats until the goal is met or the agent stops. That repetition, with each pass fed by the result of the last, is what separates an agent from a chatbot.
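The Perceive, Decide, Act cycle above can be sketched in a few lines of Python. This is a minimal illustration, not a real framework: `decide` is a hand-written stand-in for the model, and `get_weather`, `TOOLS`, and `run_agent` are hypothetical names invented for the example.

```python
from typing import Callable, Dict, List, Tuple

def get_weather(city: str) -> str:
    # Hypothetical tool: a stand-in for a real weather API, returns canned data.
    return f"18C and cloudy in {city}"

# Hypothetical tool registry: plain functions keyed by name.
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}

def decide(state: List[Tuple[str, str]]) -> dict:
    """Stand-in for the model: choose the next move from the current state."""
    results = [text for kind, text in state if kind == "result"]
    if not results:
        # Nothing observed yet, so ask for a tool call.
        return {"type": "tool", "name": "get_weather", "args": {"city": "Oslo"}}
    # A result is available, so answer now.
    return {"type": "answer", "text": results[-1]}

def run_agent(goal: str, max_steps: int = 5) -> str:
    state = [("goal", goal)]                          # Perceive: goal plus prior results
    for _ in range(max_steps):
        move = decide(state)                          # Decide: next move
        if move["type"] == "answer":
            return move["text"]
        result = TOOLS[move["name"]](**move["args"])  # Act: run the tool
        state.append(("result", result))              # feed the result into the next pass
    return "gave up: step limit reached"

print(run_agent("What is the weather in Oslo?"))
```

The `max_steps` cap is one simple way to model the "or the agent stops" branch; a real system might use other stopping rules.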

Aspect | Chatbot | Agent
Shape | Text in, text out, once | Model in a loop
Tools | None | Yes (the doors to the world)
Iterates | No | Yes, until the goal is done
Weather question | “I cannot check live data” | Calls a weather tool, reads it, answers

Part | Job
Model | The decision-maker. Predicts the next move from the current state.
System prompt | Tells the model what it is doing and that it may call tools, plus the call format.
Tools | Functions it can invoke (read_file, query_db, send_email, search_web).
Loop | Outside code that runs each tool call and feeds the result back to the model.

Key idea: the agency lives in the scaffolding (prompt + tools + loop); the intelligence lives in the model. Swap the model and the agent keeps its shape; only decision quality changes.
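One way to picture the four parts is as fields of a small container where the model is just one swappable slot. The sketch below assumes a hypothetical `Agent` dataclass and two hand-written stub "models" (`careful_model`, `hasty_model`); none of these names come from a real library, and a production model would be an API call rather than a Python function.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Optional, Tuple

History = List[Tuple[str, str]]
Model = Callable[[str, History], dict]   # (system_prompt, history) -> next move

@dataclass
class Agent:
    model: Model                     # the decision-maker
    system_prompt: str               # what it is doing and which tools it may call
    tools: Dict[str, Callable[..., str]]
    max_steps: int = 5

    def run(self, goal: str) -> Optional[str]:
        # The loop: outside code that runs tool calls and feeds results back.
        history: History = [("goal", goal)]
        for _ in range(self.max_steps):
            move = self.model(self.system_prompt, history)
            if "answer" in move:
                return move["answer"]
            result = self.tools[move["tool"]](**move["args"])
            history.append((move["tool"], result))
        return None

def read_file(path: str) -> str:
    # Hypothetical tool backed by canned data instead of the file system.
    return {"notes.txt": "ship on Friday"}.get(path, "")

def careful_model(system_prompt: str, history: History) -> dict:
    # Stub model: reads the file first, then answers from the result.
    if len(history) == 1:
        return {"tool": "read_file", "args": {"path": "notes.txt"}}
    return {"answer": history[-1][1]}

def hasty_model(system_prompt: str, history: History) -> dict:
    # Stub model: answers immediately without using any tool.
    return {"answer": "I do not know"}

agent = Agent(
    model=careful_model,
    system_prompt="You may call read_file(path).",
    tools={"read_file": read_file},
)
swapped = replace(agent, model=hasty_model)   # same scaffolding, different model

print(agent.run("When do we ship?"))
print(swapped.run("When do we ship?"))
```

Both agents share the same prompt, tools, and loop; only the quality of the decisions differs, which is the point of the key idea above.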

Goal: “Book a 30-min meeting with Sarah next week.”

PERCEIVE goal -> DECIDE need calendar -> ACT check Sarah -> observe free slots
PERCEIVE -> DECIDE check my calendar -> ACT check me -> observe a conflict
PERCEIVE -> DECIDE pick the clear slot -> ACT send invite -> observe accepted
PERCEIVE -> DECIDE goal met -> ACT reply to user

Four passes, two tools, one self-correction. No single step is clever; the power is the repetition.
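The trace above can be replayed with a scripted stand-in for the model. Everything here is hypothetical: the calendar data, the slot names, and the `check_calendar` and `send_invite` tools are invented for illustration, and `decide` is a fixed rule set where a real agent would consult a model.

```python
# Hypothetical calendar data: free slots per person next week.
CALENDARS = {
    "sarah": ["Tue 10:00", "Wed 14:00"],
    "me": ["Wed 14:00"],          # Tue 10:00 is a conflict on my side
}

def check_calendar(person):
    return CALENDARS[person]

def send_invite(slot):
    return f"{slot} accepted"

TOOLS = {"check_calendar": check_calendar, "send_invite": send_invite}

def decide(observations):
    """Scripted stand-in for the model; a real model would also read the goal."""
    if "sarah" not in observations:
        return ("check_calendar", "sarah")
    if "me" not in observations:
        return ("check_calendar", "me")
    if "invite" not in observations:
        # Self-correction: keep only the slots that are free for both people.
        clear = [s for s in observations["sarah"] if s in observations["me"]]
        return ("send_invite", clear[0])
    return ("reply", f"Booked: {observations['invite']}")

def book(goal, max_steps=10):
    observations = {"goal": goal}
    trace = []
    for _ in range(max_steps):
        action, arg = decide(observations)       # PERCEIVE + DECIDE
        trace.append(action)
        if action == "reply":
            return arg, trace
        result = TOOLS[action](arg)              # ACT
        key = arg if action == "check_calendar" else "invite"
        observations[key] = result               # observe, feed back
    return "gave up", trace

reply, trace = book("Book a 30-min meeting with Sarah next week.")
print(reply)
print(len(trace), "passes,", len(set(trace) - {"reply"}), "tools")
```

Each pass is a plain dictionary lookup or list filter; the useful behavior comes from running the passes in sequence.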

Agents predate language models: ELIZA (1960s, rule-based chat), STRIPS (1970s, symbolic planning), Brooks’ subsumption architecture (1980s, reactive robotics). What changed: the “decide” step moved from hand-coded rules and explicit search to decisions a language model produces from patterns learned in training. Brittle logic gave way to flexible language.

Reach for an agent when the task is:

  • Open-ended: the steps cannot be fully scripted in advance.
  • Multi-step: several tool calls across several turns, not one lookup.
  • Improvable: you want it to use results and feedback to do better as it goes.

A single fixed lookup or one-shot generation is usually better as a plain model call. An agent adds latency and cost and reduces reliability; pay that price only when the task needs it.

Common mistakes:

  • Thinking the model “is” the agent. It is one of four parts.
  • Believing a separate “agent model” exists. Agency is a property of the system, not a kind of model.
  • Confusing more autonomy with more intelligence. A longer loop and more tools do not make an agent “smarter.”
  • Assuming every task wants an agent. For a single fixed step, a plain call is faster and more reliable.

Key terms:

  • Agent: a model running inside a loop with tools, steering toward a goal.
  • Tool: a function the model can invoke to act on or read from the world.
  • System prompt: the instructions that, among other things, grant tool use and define the call format.
  • Loop: the outside code that executes tool calls and returns results to the model.
  • Single model call: one text-in, text-out exchange with no loop and no tools (a chatbot turn).