Practice: Your first Claude API call

Self-check

Seven short questions. Answer each before opening the collapsible.

1. Name the one endpoint every call in this lesson hits, and the three required headers.

Show answer

The endpoint is https://api.anthropic.com/v1/messages. The three required headers are x-api-key (the API key from the environment variable ANTHROPIC_API_KEY), anthropic-version (currently 2023-06-01, the API contract date), and Content-Type: application/json.

2. Three fields are required in the request body. What are they?

Show answer

model (the precise model identifier, for example claude-opus-4-8), max-tokens (the hard ceiling on output tokens for this call), and messages (the list of message objects representing the conversation, each with a role of either user or assistant and a content field).

3. The response has a content field. Why does the lesson say you should iterate it, not index it?

Show answer

Because content is an array of blocks, not a single string. For a plain text response you get one block of type text. Later (with tool use, vision, and other features), the same call can return multiple blocks of different types. Code that does response.content[0].text works for a plain text response but breaks the first time the model returns two blocks. The right pattern is to iterate the array and dispatch on each block’s type.

4. What does stop_reason equal to end_turn mean, versus max_tokens?

Show answer

end_turn means the model finished its thought naturally and is yielding the turn back to you, the normal case for a successful response. max_tokens means the max-tokens ceiling you set was hit before the model finished, so the response is truncated; the application should either rerun with a higher cap or break the request into smaller turns. Other values you will see include stop_sequence (a custom stop string was generated) and tool_use (the model is requesting a tool call, covered in lesson 4).

5. The Messages API is stateless. What does that mean practically for code that wants to hold a multi-turn conversation?

Show answer

The API remembers nothing between calls. To have a conversation, your code sends the entire conversation history with every request as the messages list. The roles alternate user and assistant, ending with user (the model generates the next assistant turn). Two consequences: (a) your application owns conversation state, in memory, a database, or a serialized format; (b) every turn pays for the full prior history as input tokens, so long conversations get expensive on input even if the latest turn is short (lesson 7 addresses this with prompt caching).

6. Where do standing instructions belong in a request, and why is that distinct from a message?

Show answer

In the system parameter, separate from the messages list. The messages list is the conversation (alternating user and assistant turns); the system parameter is a sidecar that tells the model how to behave for the whole call (persona, format, constraint). The system parameter is not a message, does not get a role, and does not participate in the user-assistant alternation. Putting the system instruction as a first user message or with role system is the common cross-API mistake; with Claude, the dedicated system parameter exists for exactly this.

7. Name three common failures of a first API call and how to recognize each.

Show answer

Any three of: (a) API key not set, the call fails immediately with an authentication error because the client could not find ANTHROPIC_API_KEY in the environment; (b) wrong model name, the call fails because Anthropic requires precise identifiers like claude-opus-4-8, not colloquial names like “Opus”; (c) max-tokens too small, the response comes back successfully but truncated with stop_reason equal to max_tokens; (d) reading content as a string, the code works on a plain text response but breaks the first time the model returns two blocks; (e) treating the API as stateful, sending only the latest user message and expecting the model to remember earlier turns.

Try it yourself: send your first three calls

About 15 minutes. You will need an Anthropic Console account (free to create at https://platform.claude.com/) and an API key. Costs for the calls below will be a fraction of a cent.

Setup. Create an API key in the Console at https://platform.claude.com/settings/keys. Set it in your shell: export ANTHROPIC_API_KEY='your-key-here'. Install the Python SDK: pip install anthropic. (Or the TypeScript SDK: npm install @anthropic-ai/sdk. The three exercises work in either language.)

Call 1: the simplest possible call. Make a one-message call to claude-opus-4-8 with max-tokens of 200. Ask it any question. Print the whole response object (not just message.content). Note the id, stop_reason, and usage values.

Call 2: a multi-turn conversation. Make a single call that includes three messages: a user greeting, a synthesized assistant reply (“Hello! Happy to help.”), and a follow-up user question that builds on the synthesized assistant reply (something like “Can you explain in three sentences?”). Verify the model responds as if the synthesized reply were real prior context. This is the stateless-history pattern.

Call 3: a system parameter. Make the same call as Call 1 but add a system parameter telling the model to respond in exactly three bullet points starting with verbs. Note how the response shape changes. Try a second system instruction (be terse; answer in a haiku; respond only in code). Note that the model treats the system text as standing instruction across the whole conversation, not a one-time hint.

What you’ll get (an example, not the canonical answer)

For call 1 you will see the full response shape: id (the msg_… identifier you log), type (message), role (assistant), content (one block of type: text), model (echoed back as claude-opus-4-8), stop_reason (almost certainly end_turn for a 200-token answer), and usage (input and output token counts). For call 2 you will see the model continuing the conversation as if the synthesized assistant reply were real, demonstrating that the API treats everything in the messages array as ground truth. For call 3 you will see the model conform to the system instruction across whatever the user asks; the system parameter is the right home for “behave like X for the whole call” requirements, separate from per-turn requests.

The exercise is the value, not the exact response. Get the three working calls in your shell history; you will refer back to them in lessons 2 and 3.

Flashcards

Nine cards. Click any card to reveal the answer. Use the Print flashcards button to lay the set out one card per page for offline review.

Q. The one endpoint for Claude API calls?

https://api.anthropic.com/v1/messages. POST with JSON body. Three required headers: x-api-key (from ANTHROPIC_API_KEY), anthropic-version (2023-06-01), Content-Type: application/json.

Q. The three required request body fields?

model (precise identifier, e.g. claude-opus-4-8), max-tokens (output ceiling), messages (array of role + content objects, roles alternating user and assistant, ending with user).

Q. What is in the response object?

id (log it), type (message), role (assistant), content (array of blocks, iterate not index), model (echoed), stop_reason (check on every response), usage (input_tokens + output_tokens, the cost unit).

Q. The four most common *stop_reason* values?

end_turn (normal natural finish), max_tokens (ceiling hit, response truncated), stop_sequence (custom stop string generated), tool_use (model is requesting a tool call, lesson 4). Application code should check this on every response.

Q. The Messages API is stateless. What does that mean?

The API remembers nothing between calls. To have a conversation, your code sends the entire history as the messages list every call. Your application owns conversation state; every turn pays for the full prior history as input tokens.

Q. Where do standing instructions belong?

In the system parameter, separate from the messages list. The messages list is the conversation; system is a sidecar telling Claude how to behave for the whole call. Not a message, no role, no participation in user-assistant alternation.

Q. Why iterate the content array instead of indexing it?

Because content is an array of blocks, not a string. Plain text response is one block; tool use, vision, and other features return multiple blocks of different types. response.content[0].text breaks the first time the model returns two blocks. Iterate and dispatch on type.

Q. The three current canonical model names?

claude-opus-4-8 (flagship Opus), claude-sonnet-4-6 (frontier intelligence at scale), claude-haiku-4-5 (fastest, near-frontier). Use the precise identifier the docs publish, not colloquial names like “Opus” or “Claude”. Lesson 3 covers selection.

Q. What does this lesson deliberately NOT cover, and where in T22 does each land?

Streaming → lesson 2. Tools → lessons 4 and 5. Model Context Protocol → lesson 6. Prompt caching → lesson 7. Agent loop → lessons 8 onward. Naming them keeps you from trying to learn everything at once; get the simplest call working first.