Cheatsheet: The tool-use design pattern in depth
The one idea
The model picks tools and fills arguments from descriptions alone. It cannot see the code. A tool the model misuses is almost always a tool you described badly.
The four parts of a tool definition
| Part | Job for the model |
|---|---|
| Name | Short handle (get_weather); the first and shortest description. |
| Description | What it does AND when to use it (and when not to). The model leans on this most. |
| Parameters | Each input with name, type, and its own short description (format + rules). |
| Expected output | What comes back, so the model can plan the next step. |
All four are text the model reads. The definition is the model’s only window onto the tool.
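
As a rough illustration, the four parts can be written down as plain Python data. This is a minimal sketch, assuming a JSON-Schema-style layout for parameters; the key names (`parameters`, `returns`), the wording of the description, and the `render_for_model` helper are all hypothetical and not any specific provider's schema.

```python
# Hypothetical tool definition. The key names and the description wording
# are illustrative assumptions, not a specific provider's schema.
get_weather = {
    "name": "get_weather",
    "description": (
        "Get today's weather for a city. Use for questions about "
        "current-day conditions. Do not use for historical weather."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Oslo'."},
        },
        "required": ["city"],
    },
    "returns": "Object with high_f (number, Fahrenheit) and condition (string).",
}


def render_for_model(tool: dict) -> str:
    """Flatten a definition into the kind of text a model would read."""
    lines = [f"{tool['name']}: {tool['description']}"]
    for name, spec in tool["parameters"]["properties"].items():
        lines.append(f"  {name} ({spec['type']}): {spec['description']}")
    lines.append(f"  returns: {tool['returns']}")
    return "\n".join(lines)


print(render_for_model(get_weather))
```

Printing the rendered text is a quick way to review a definition the way the model meets it: as words only, with no code behind them.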
Bad vs good description
    BAD:
      name: search
      description: "Searches."
      parameters: query

    GOOD:
      name: search_internal_docs
      description: "Search the company knowledge base of support articles and policies. Use for company-specific procedures/products/policies. Do not use for general world knowledge or live data."
      parameters:
        query (string): "User's question as search keywords."

Same tool, same code. Only the words changed, and the words are what the model acts on.
Parameters need descriptions
    WEAK:
      date    # model guesses the format

    STRONG:
      date (string): "Target date in YYYY-MM-DD. Resolve relative dates like 'tomorrow' before calling."

Undescribed parameters cause wrong-argument bugs. Describe format + rules.
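
A parameter description can be backed by a check on the tool side that repeats the same rule when the model gets it wrong. The cheatsheet does not prescribe this; it is a complementary sketch, and `check_date_arg` is a hypothetical helper name.

```python
import re
from datetime import datetime
from typing import Optional

# Exactly four digits, two digits, two digits, separated by hyphens.
DATE_SHAPE = re.compile(r"\d{4}-\d{2}-\d{2}")


def check_date_arg(value: str) -> Optional[str]:
    """Return None for a valid YYYY-MM-DD date, else an error message the model can read."""
    try:
        if not DATE_SHAPE.fullmatch(value):
            raise ValueError("wrong shape")
        # strptime rejects impossible calendar dates such as 2025-02-30.
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return (
            f"Invalid date {value!r}. Expected YYYY-MM-DD. "
            "Resolve relative dates like 'tomorrow' before calling."
        )
    return None
```

Echoing the format rule in the error gives the model a second chance to read it, in the result of the failed call.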
Make outputs legible
The model reads the result to choose its next step (L2’s decide step). Return labeled, self-describing data.
    HARD: { "t": 58, "c": 3 }
    EASY: { "high_f": 58, "condition": "rain" }

Same data; only the second lets the model act without guessing.
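
One way to get from the first shape to the second is a thin translation layer in the tool wrapper. The sketch below assumes a code table for `c`; the source does not say what the codes mean, so `CONDITION_CODES` and `to_legible` are purely illustrative.

```python
# Hypothetical code table: the cheatsheet does not define what "c": 3 means,
# so this mapping is assumed for illustration only.
CONDITION_CODES = {1: "clear", 2: "cloudy", 3: "rain"}


def to_legible(raw: dict) -> dict:
    """Rename terse keys and decode numeric codes before returning data to the model."""
    return {
        "high_f": raw["t"],
        # Unknown codes stay visible instead of being silently dropped.
        "condition": CONDITION_CODES.get(raw["c"], f"unknown_code_{raw['c']}"),
    }


print(to_legible({"t": 58, "c": 3}))  # {'high_f': 58, 'condition': 'rain'}
```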
Disambiguating overlapping tools
When two tools could match the same request, each description must mark its boundary.
    get_current_weather(city)
      "Conditions right now. Use for 'what is it like now'."
    get_forecast(city, days)
      "Predicted future conditions, up to 7 days. Use for 'will it rain tomorrow'. Not for right now."

Negative guidance (high-leverage)
Models over-reach to neighboring cases. A short “do not use this for X” line closes the near-misses that positive descriptions leave open. One of the best sentences you can add.
As the toolbox grows
3 tools: rough descriptions usually fine. 30 tools: overlapping territory is everywhere, and description quality is what keeps selection correct. Unreliable many-tool agent? Check the descriptions first, not the model.
Pitfalls to dodge
- Blaming the model for selection errors (it picks from descriptions).
- Vague names: search, process, handle, do_task.
- Bare, undescribed parameters.
- Only-positive descriptions (no “do not use for X”).
- Two tools that overlap silently.
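
Some of these pitfalls can be caught mechanically. The following is a minimal lint sketch, assuming a simplified dict layout where `parameters` maps each parameter name to its description string. `lint_tool` and the negative-guidance regex are hypothetical and deliberately crude, and the sketch does not detect silently overlapping tools.

```python
import re

VAGUE_NAMES = {"search", "process", "handle", "do_task"}
# Crude heuristic for negative guidance; real descriptions may phrase it differently.
NEGATIVE = re.compile(r"\b(do not use|don't use|not for)\b", re.IGNORECASE)


def lint_tool(tool: dict) -> list[str]:
    """Return a list of warnings for one tool definition (hypothetical layout)."""
    warnings = []
    if tool["name"] in VAGUE_NAMES:
        warnings.append(f"vague name: {tool['name']}")
    if not NEGATIVE.search(tool.get("description", "")):
        warnings.append("no negative guidance in description")
    for param, desc in tool.get("parameters", {}).items():
        if not desc:
            warnings.append(f"undescribed parameter: {param}")
    return warnings


bad = {"name": "search", "description": "Searches.", "parameters": {"query": ""}}
for warning in lint_tool(bad):
    print(warning)
```

A check like this only flags missing words; whether the words that are present mark the right boundary still needs a human read.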
Words to use precisely
- Tool definition: name + description + parameters + expected output; all text the model reads.
- Negative guidance: an explicit “do not use this tool for X” clause.
- Disambiguation: writing each overlapping tool’s description so its boundary with the others is explicit.