Function calling, in brief

What you’ll learn

This is lesson 3 of Phase 6, How models reason and act, in Track 5 (AI Foundations). The previous lesson on RAG covered how a model fetches unstructured text that isn’t in its weights. This lesson covers the structured-data sibling: function calling.

When you ask a chat app “find a teddy bear store near me,” the bare LLM has no idea where you are or which stores are open. A function-calling LLM recognizes that the question requires fresh, structured information, picks a predefined function, fills in the arguments, and emits a structured call. Code outside the LLM runs the call, the result comes back, and the LLM produces a natural-language answer that wraps the structured data. That is function calling.

This lesson walks through the three-stage mechanism (tool prediction by the LLM, function execution by code, response formatting by the LLM), explains what the model sees and doesn’t see (the function signature and docstring, but not the implementation), covers the two supervised fine-tuning (SFT) training pairs that produce a function-calling model, and flags the common failure modes (argument hallucination, wrong-tool selection). Course materials are at cme295.stanford.edu.
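
To make the three-stage mechanism above concrete, here is a minimal sketch of the teddy bear example in Python. Everything in it is an assumption made for illustration: the tool `find_stores`, its parameters, the stubbed store data, and the JSON call format are all hypothetical, and the two `fake_llm_*` functions are hard-coded stand-ins for what a real model would generate. Only the shape of the flow (the LLM predicts a call, code executes it, the LLM formats the result) comes from the lesson.

```python
import json


# Hypothetical tool. The body is ordinary code that the model never sees.
def find_stores(query: str, latitude: float, longitude: float) -> list[dict]:
    """Return stores matching `query` near the given coordinates."""
    # Stubbed data standing in for a real store-search service.
    return [{"name": "Bear Necessities", "distance_km": 1.2, "open": True}]


# Registry of functions that the surrounding code is willing to run.
TOOLS = {"find_stores": find_stores}


def fake_llm_predict_tool(user_message: str, context: dict) -> str:
    """Stage 1 stand-in: a real LLM would emit this structured call itself."""
    call = {
        "name": "find_stores",
        "arguments": {
            "query": "teddy bear store",
            "latitude": context["latitude"],
            "longitude": context["longitude"],
        },
    }
    return json.dumps(call)


def execute_call(raw_call: str) -> list[dict]:
    """Stage 2: code outside the model parses the call and runs the function."""
    call = json.loads(raw_call)
    return TOOLS[call["name"]](**call["arguments"])


def fake_llm_format_response(user_message: str, result: list[dict]) -> str:
    """Stage 3 stand-in: a real LLM would wrap the data in natural language."""
    store = result[0]
    status = "open" if store["open"] else "closed"
    return f"{store['name']} is {store['distance_km']} km away and currently {status}."


def answer(user_message: str, context: dict) -> str:
    raw_call = fake_llm_predict_tool(user_message, context)   # stage 1
    result = execute_call(raw_call)                           # stage 2
    return fake_llm_format_response(user_message, result)     # stage 3


if __name__ == "__main__":
    location = {"latitude": 37.43, "longitude": -122.17}  # made-up coordinates
    print(answer("find a teddy bear store near me", location))
```

Note that the model is involved in stages 1 and 3 only. Stage 2 is plain code, which is why the function can reach location data or a live service that the model itself cannot.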

Where this fits

This is lesson 3 of Phase 6, How models reason and act. The previous lesson (How RAG works) covered the unstructured-data version of “give the model what it needs at inference time.” This lesson covers the structured-data version: function calling. The next lesson (How agent loops work) chains multiple function calls into longer-horizon work, and adds tool-selection patterns when more than one tool is available. After that, Phase 6 is complete and Phase 7 picks up.

Before you start

Prerequisites: the RAG lesson is required. We assume you understand that an LLM at inference time has access only to its prompt context (plus its frozen weights), and that adding information to the prompt is how you close knowledge gaps. The reasoning models lesson is useful but not strictly required.

By the end, you’ll be able to

Distinguish function calling from RAG by the kind of data gap each one closes (structured vs unstructured)
Walk through the three-stage mechanism (tool prediction by the LLM, function execution by code, response formatting by the LLM)
Identify what the LLM sees (function signature and docstring, conversation history) and what it does not see (the implementation)
Describe the two SFT training pairs that typically go into a function-calling model (tool prediction and response formatting)
Recognize the common failure modes (argument hallucination, wrong-tool selection) and where to look when debugging
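
As a rough illustration of the third and fifth objectives, the sketch below builds the text a model might be shown for a tool (name, signature, and docstring, with the implementation left out) and runs a structural check on a predicted call. The tool `get_weather`, the description layout, and the helper names are hypothetical. Real systems often describe tools with a JSON schema instead. The check relies on Python's standard `inspect` module, so it only catches unknown tool names and arguments that don't fit the signature. It cannot tell whether an existing tool was the wrong choice or whether a well-formed argument value was hallucinated.

```python
import inspect


# Hypothetical tool used only for illustration.
def get_weather(city: str, unit: str = "celsius") -> dict:
    """Return the current temperature for `city` in the given unit."""
    return {"city": city, "temperature": 18, "unit": unit}  # stubbed value


TOOLS = {"get_weather": get_weather}


def describe_tool(fn) -> str:
    """Build the text shown to the model: name, signature, and docstring only."""
    return f"{fn.__name__}{inspect.signature(fn)}\n    {inspect.getdoc(fn)}"


def check_call(name: str, arguments: dict) -> list[str]:
    """Return structural problems with a predicted call (empty if none found)."""
    if name not in TOOLS:
        return [f"unknown tool: {name}"]
    try:
        # bind() raises TypeError for missing or unexpected arguments.
        inspect.signature(TOOLS[name]).bind(**arguments)
    except TypeError as exc:
        return [f"bad arguments for {name}: {exc}"]
    return []


if __name__ == "__main__":
    print(describe_tool(get_weather))
    print(check_call("get_weather", {"city": "Paris"}))  # well-formed call
    print(check_call("get_weather", {"town": "Paris"}))  # invented argument name
    print(check_call("get_wether", {"city": "Paris"}))   # nonexistent tool
```

When debugging, a check like this is a reasonable first stop: if the predicted call fails it, the problem lies in tool prediction rather than in function execution or response formatting.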

Time and difficulty

Read time: about 13 minutes
Practice time: about 12 minutes (a self-check on the three-stage mechanism, a hands-on exercise tracing a function-call flow on a worked example, and flashcards)
Difficulty: standard