Tool-use design pattern: cheatsheet

The one idea

The model picks tools and fills arguments from descriptions alone. It cannot see the code. A tool the model misuses is almost always a tool you described badly.

The four parts of a tool definition

Part	Job for the model
Name	Short handle (`get_weather`); the first and shortest description.
Description	What it does AND when to use it (and when not to). The model leans on this most.
Parameters	Each input with name, type, and its own short description (format + rules).
Expected output	What comes back, so the model can plan the next step.

All four are text the model reads. The definition is the model’s only window onto the tool.
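
As a rough sketch, the four parts can be written down as plain data. The layout below is an assumption for illustration: a JSON-Schema-style `parameters` block plus a free-text `returns` field. Real tool-calling APIs each use their own field names, and the `city` parameter and output fields shown for `get_weather` are hypothetical.

```python
import json

# Hypothetical tool definition. Field names are illustrative,
# not any specific vendor's API.
GET_WEATHER = {
    "name": "get_weather",
    "description": (
        "Get today's weather for one city. Use when the user asks about "
        "conditions in a named place. Do not use for historical weather."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name, e.g. 'Lisbon'. One city per call.",
            }
        },
        "required": ["city"],
    },
    "returns": "JSON object with 'high_f' (number) and 'condition' (string).",
}


def render_for_model(tool: dict) -> str:
    """Serialize the definition; this text is all the model gets to see."""
    return json.dumps(tool, indent=2)


print(render_for_model(GET_WEATHER))
```

Printing the rendered text is a quick way to read the definition the way the model does, with no access to the code behind it.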

Bad vs good description

BAD:  name: search        description: "Searches."          parameters: query
GOOD: name: search_internal_docs
      description: "Search the company knowledge base of support articles
        and policies. Use for company-specific procedures/products/policies.
        Do not use for general world knowledge or live data."
      parameters: query (string): "User's question as search keywords."

Same tool, same code. Only the words changed, and the words are what the model acts on.

Parameters need descriptions

WEAK:   date            # model guesses the format
STRONG: date (string): "Target date in YYYY-MM-DD. Resolve relative dates
                         like 'tomorrow' before calling."

Undescribed parameters cause wrong-argument bugs. Describe format + rules.
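
The tool can also enforce the rule its parameter description states, so a wrong argument fails with a readable message instead of a silent bug. This is a minimal sketch, assuming the tool receives `date` as a string; `check_date_arg` is a hypothetical helper, not part of any library.

```python
import re
from datetime import date

# Strict YYYY-MM-DD shape: four digits, two digits, two digits.
ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def check_date_arg(value: str) -> tuple[bool, str]:
    """Return (ok, message); the message is written for the model to read."""
    if not ISO_DATE.fullmatch(value):
        return False, (
            f"date must be YYYY-MM-DD, got {value!r}; "
            "resolve relative dates like 'tomorrow' before calling"
        )
    try:
        # Rejects well-shaped but impossible dates such as 2025-02-30.
        date.fromisoformat(value)
    except ValueError:
        return False, f"{value!r} is not a real calendar date"
    return True, "ok"


print(check_date_arg("2025-03-09"))
print(check_date_arg("tomorrow"))
```

The error text repeats the format rule from the description, so the model gets a second chance to correct the argument.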

Make outputs legible

The model reads the result to choose its next step (L2’s decide step). Return labeled, self-describing data.

HARD:  { "t": 58, "c": 3 }
EASY:  { "high_f": 58, "condition": "rain" }

Same data; only the second lets the model act without guessing.
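
One way to get from the HARD form to the EASY form is a thin relabeling step before the result goes back to the model. The sketch below is illustrative only: `label_weather` is a hypothetical helper, and the code table is an assumption that contains only the one mapping implied by the example above (`3` meaning rain).

```python
# Assumed code table; only the 3 -> "rain" mapping is implied by the example.
CONDITION_CODES = {3: "rain"}


def label_weather(raw: dict) -> dict:
    """Rename terse keys and expand numeric codes into readable labels."""
    code = raw["c"]
    return {
        "high_f": raw["t"],
        # Unknown codes stay visible instead of being silently dropped.
        "condition": CONDITION_CODES.get(code, f"unknown_code_{code}"),
    }


print(label_weather({"t": 58, "c": 3}))
# prints {'high_f': 58, 'condition': 'rain'}
```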

Disambiguating overlapping tools

When two tools could match the same request, each description must mark its boundary.

get_current_weather(city)  "Conditions right now. Use for 'what is it like now'."
get_forecast(city, days)   "Predicted future conditions, up to 7 days. Use for
                            'will it rain tomorrow'. Not for right now."

Negative guidance (high-leverage)

Models over-reach into neighboring cases. A short “do not use this for X” line closes the near-misses that positive descriptions leave open, which makes it one of the highest-value sentences you can add.

As the toolbox grows

With 3 tools, rough descriptions are usually fine. With 30 tools, territories overlap everywhere, and description quality is what keeps selection correct. If a many-tool agent is unreliable, check the descriptions first, not the model.
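
For a large toolbox, a crude script can shortlist the description pairs worth rereading. The sketch below is a heuristic and an assumption, not an established method: it scores word overlap (Jaccard similarity) between descriptions, with an arbitrary stopword list and threshold. It only flags candidates for a human to check.

```python
from itertools import combinations

# Arbitrary, illustrative stopword list.
STOPWORDS = {"a", "an", "the", "for", "of", "to", "use", "not", "do",
             "and", "or", "in", "is", "it"}


def keywords(description: str) -> set[str]:
    """Lowercase the words, strip punctuation, drop stopwords."""
    words = (w.strip(".,'\"()").lower() for w in description.split())
    return {w for w in words if w and w not in STOPWORDS}


def overlapping_pairs(tools: dict[str, str], threshold: float = 0.3):
    """Return (name_a, name_b, score) for pairs with similar descriptions."""
    flagged = []
    for (name_a, desc_a), (name_b, desc_b) in combinations(tools.items(), 2):
        a, b = keywords(desc_a), keywords(desc_b)
        union = a | b
        score = len(a & b) / len(union) if union else 0.0
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    return flagged


# Hypothetical toolbox: two weather tools with no boundary between them.
toolbox = {
    "get_current_weather": "Weather conditions for a city.",
    "get_forecast": "Weather conditions for a city in the future.",
    "send_email": "Send an email message to one recipient.",
}
print(overlapping_pairs(toolbox))
```

A flagged pair is not necessarily a problem; it is a prompt to confirm that each description states where its territory ends.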

Pitfalls to dodge

Blaming the model for selection errors (it picks from descriptions).
Vague names: search, process, handle, do_task.
Bare, undescribed parameters.
Only-positive descriptions (no “do not use for X”).
Two tools that overlap silently.
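
Three of these pitfalls can be checked mechanically. The sketch below is a minimal lint pass, assuming a simplified tool shape in which `parameters` maps each parameter name to its description. `lint_tool` and the marker phrases are hypothetical, and a clean result does not mean the descriptions are actually good.

```python
VAGUE_NAMES = {"search", "process", "handle", "do_task"}
# Illustrative phrases that suggest negative guidance is present.
NEGATIVE_MARKERS = ("do not use", "don't use", "not for")


def lint_tool(tool: dict) -> list[str]:
    """Return a list of pitfall messages; an empty list means none found."""
    problems = []
    if tool["name"] in VAGUE_NAMES:
        problems.append(f"vague name: {tool['name']}")
    description = tool.get("description", "").lower()
    if not any(marker in description for marker in NEGATIVE_MARKERS):
        problems.append("no negative guidance in description")
    for param, param_description in tool.get("parameters", {}).items():
        if not param_description.strip():
            problems.append(f"undescribed parameter: {param}")
    return problems


bad = {"name": "search", "description": "Searches.", "parameters": {"query": ""}}
good = {
    "name": "search_internal_docs",
    "description": (
        "Search the company knowledge base of support articles and policies. "
        "Do not use for general world knowledge or live data."
    ),
    "parameters": {"query": "User's question as search keywords."},
}
print(lint_tool(bad))
print(lint_tool(good))
```

Silent overlap between two tools and misplaced blame on the model are outside what a per-tool check like this can see.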

Words to use precisely

Tool definition: name + description + parameters + expected output; all text the model reads.
Negative guidance: an explicit “do not use this tool for X” clause.
Disambiguation: writing each overlapping tool’s description so its boundary with the others is explicit.