Cheatsheet: The Messages API in production

Python (context manager):

    with client.messages.stream(...) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)

    # If you want the full Message object instead of chunks:
    with client.messages.stream(...) as stream:
        message = stream.get_final_message()

TypeScript (event-emitter):

    await client.messages
      .stream({...})
      .on("text", (text) => { process.stdout.write(text); });

    // Full Message object:
    const stream = client.messages.stream({...});
    const message = await stream.finalMessage();

| Pattern | When |
| --- | --- |
| Standard non-streaming | Short call, no user waiting, no long generation |
| Streaming | Interactive UI (user is watching) OR request expected to take more than 10 minutes |
| Batches | Bulk non-interactive work (evals, moderation, dataset summarization) |

Decision driver: who is waiting. User → streaming. Nobody → batches.
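The decision rule can be written down as a small function. This is a minimal sketch: the name choose_pattern, its parameters, and the returned strings are illustrative and not part of any SDK; the 10-minute threshold is the one given in the table.

```python
def choose_pattern(user_waiting: bool, expected_minutes: float, bulk: bool = False) -> str:
    """Pick a call pattern from the rule of thumb (hypothetical helper)."""
    # A watching user, or a request expected to run past 10 minutes, means streaming.
    if user_waiting or expected_minutes > 10:
        return "streaming"
    # Nobody is waiting and the work is bulk: batches.
    if bulk:
        return "batches"
    # Short call, no user waiting, no long generation.
    return "standard"
```

For example, choose_pattern(False, 1, bulk=True) selects batches, while any call with user_waiting=True selects streaming.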

| Code | Type | Retry? |
| --- | --- | --- |
| 400 | invalid_request_error | No (your bug; fix the request) |
| 401 | authentication_error | No (API key missing/wrong/revoked) |
| 402 | billing_error | No (fix in Console) |
| 403 | permission_error | No (key lacks permission) |
| 413 | request_too_large | No (trim the request; see size limits below) |
| 429 | rate_limit_error | Yes, backoff (SDKs do this) |
| 500 | api_error | Yes, backoff (SDKs do this) |
| 504 | timeout_error | Switch to streaming or batches for long requests |
| 529 | overloaded_error | Yes, backoff (SDKs do this) |

Rule of thumb: 4xx is your bug (429 excepted), 5xx is the platform’s bug, and 429 and 529 are temporary, so back off and retry.
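The retry column can be encoded as a lookup. This is a sketch, not SDK behavior: retry_decision and its return values are hypothetical names, and the fallback for status codes that are not in the table is an assumption.

```python
# Status codes from the table, split by whether a retry can help.
RETRYABLE = {429, 500, 529}                 # temporary: back off and retry
NOT_RETRYABLE = {400, 401, 402, 403, 413}   # fix the request, key, or billing first


def retry_decision(status: int) -> str:
    """Return "retry", "fix", or "restructure" for an HTTP status (hypothetical helper)."""
    if status in RETRYABLE:
        return "retry"
    if status == 504:
        # timeout_error: the table suggests streaming or batches, not a blind retry.
        return "restructure"
    if status in NOT_RETRYABLE:
        return "fix"
    # Assumption for codes not in the table: other 5xx are retryable, the rest are caller bugs.
    return "retry" if status >= 500 else "fix"
```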

    {
      "type": "error",
      "error": {
        "type": "...",
        "message": "..."
      },
      "request_id": "req_..."
    }

Log all three: error.type, error.message, request_id.
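A minimal sketch of pulling those three fields out of an error body, assuming the body is the JSON envelope shown above. The function name extract_error_fields and the sample values are made up for illustration.

```python
import json


def extract_error_fields(body: str) -> dict:
    """Pull error.type, error.message, and request_id out of an error response body."""
    payload = json.loads(body)
    error = payload.get("error") or {}
    return {
        "error_type": error.get("type"),
        "error_message": error.get("message"),
        "request_id": payload.get("request_id"),
    }


# Example body with made-up values, shaped like the error envelope.
sample = (
    '{"type": "error", '
    '"error": {"type": "rate_limit_error", "message": "Too many requests"}, '
    '"request_id": "req_example"}'
)
fields = extract_error_fields(sample)
```

Missing keys come back as None rather than raising, so the log line is still written when the body is not the expected shape.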

| Endpoint | Max request size |
| --- | --- |
| Messages API | 32 MB |
| Token Counting API | 32 MB |
| Batch API | 256 MB |
| Files API | 500 MB |

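A pre-flight size check can catch a 413 before the request is sent. This is a rough sketch: the helper names are hypothetical, the size is approximated as the UTF-8 length of the JSON-serialized parameters (the bytes an SDK actually sends may differ slightly), and treating MB as 1024 * 1024 bytes is an assumption.

```python
import json

MB = 1024 * 1024  # assumption: limits are treated as binary megabytes here

# Limits from the table, in bytes.
MAX_REQUEST_BYTES = {
    "messages": 32 * MB,
    "token_counting": 32 * MB,
    "batch": 256 * MB,
    "files": 500 * MB,
}


def request_size_bytes(params: dict) -> int:
    """Approximate the request size as the length of its UTF-8 encoded JSON body."""
    return len(json.dumps(params).encode("utf-8"))


def fits(endpoint: str, params: dict, limits: dict = MAX_REQUEST_BYTES) -> bool:
    """True if the serialized request is within the endpoint's limit."""
    return request_size_bytes(params) <= limits[endpoint]
```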

  • Official SDKs automatically retry connection errors and 408, 409, 429, and 5xx responses with exponential backoff and jitter. The default is two retries.
  • Max retries and the request timeout are configurable, per client or per call (max_retries and timeout in Python, maxRetries and timeout in TypeScript).
  • What the SDK cannot decide is whether your request is safe to retry. Make the tool itself idempotent (deduplication key, unique transaction ID) rather than relying on the API call.
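One way to make a tool idempotent is to have it remember results by deduplication key. The sketch below is illustrative only: IdempotentTool, charge_card, and the in-memory dictionary are hypothetical, and a real tool would keep the keys in durable storage.

```python
from typing import Callable, Dict


class IdempotentTool:
    """Wrap a side-effecting function so repeated calls with the same key run once."""

    def __init__(self, func: Callable[..., str]) -> None:
        self._func = func
        self._results: Dict[str, str] = {}  # in-memory store; an assumption for the sketch

    def __call__(self, dedup_key: str, **kwargs) -> str:
        # A retried request carries the same key, so return the stored result
        # instead of repeating the side effect.
        if dedup_key in self._results:
            return self._results[dedup_key]
        result = self._func(**kwargs)
        self._results[dedup_key] = result
        return result


charges = []  # stands in for a real side effect, such as a payment


def charge_card(amount_cents: int) -> str:
    charges.append(amount_cents)
    return f"charged {amount_cents}"


charge = IdempotentTool(charge_card)
first = charge("txn-001", amount_cents=500)
second = charge("txn-001", amount_cents=500)  # simulated retry: no second charge
```

The retry returns the stored result, so the side effect happens once no matter how many times the surrounding API call is retried.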

Streaming can fail AFTER returning 200. Wrap the stream iterator:

    # Python
    import anthropic

    try:
        with client.messages.stream(...) as stream:
            for text in stream.text_stream:
                ...
    except anthropic.APIError as e:
        log_error(e)

    // TypeScript
    await client.messages
      .stream({...})
      .on("text", handleText)
      .on("error", handleError);

Without this handler, a mid-stream failure gets logged as “succeeded” because the HTTP status was already 200.
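The failure mode can be reproduced without a network call. The sketch below uses a fake generator in place of a real stream; FakeStreamError, fake_text_stream, and consume are hypothetical stand-ins, not SDK names. It shows the outcome being recorded from the iterator rather than from the HTTP status.

```python
from typing import Iterable, Iterator, List, Tuple


class FakeStreamError(Exception):
    """Stand-in for an SDK error raised mid-stream (hypothetical)."""


def fake_text_stream() -> Iterator[str]:
    """Simulate a stream that yields two chunks and then fails."""
    yield "Hello, "
    yield "wor"
    raise FakeStreamError("connection dropped")


def consume(stream: Iterable[str]) -> Tuple[str, str]:
    """Read a stream to the end; return (text, outcome), where outcome is "ok" or "failed"."""
    chunks: List[str] = []
    try:
        for text in stream:
            chunks.append(text)
    except FakeStreamError:
        # The HTTP status was already 200; only this handler sees the failure.
        return "".join(chunks), "failed"
    return "".join(chunks), "ok"


partial_text, outcome = consume(fake_text_stream())
```

Keeping the partial text alongside the outcome makes truncated responses visible in logs instead of looking like short successes.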

| Property | Value |
| --- | --- |
| Cost vs standard | 50 percent less per token |
| Typical completion | Most batches finish in under 1 hour |
| Per-batch size limit | 256 MB |
| Per-request size limit | 32 MB (same as Messages API) |

Right for: large-scale evaluations, content moderation, dataset summarization, nightly jobs. Wrong for: anything a user is waiting on (use streaming).

    batch = client.messages.batches.create(
        requests=[
            {
                "custom_id": "doc_001",
                "params": {
                    "model": "claude-opus-4-8",
                    "max_tokens": 1024,
                    "messages": [{"role": "user", "content": "Summarize: ..."}],
                },
            },
            # ... many more
        ],
    )
    # Later: poll until the batch's processing_status is "ended", then stream results
    results = client.messages.batches.results(batch.id)
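For many documents, the requests list is usually generated rather than written by hand. A minimal sketch, assuming the documents are held in a dictionary keyed by the ID you want back in the results; build_batch_requests is a hypothetical helper, and the model name is copied from the example above.

```python
from typing import Dict, List


def build_batch_requests(docs: Dict[str, str], model: str, max_tokens: int = 1024) -> List[dict]:
    """Turn {custom_id: text} into a list of batch request entries."""
    return [
        {
            "custom_id": custom_id,
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": f"Summarize: {text}"}],
            },
        }
        for custom_id, text in docs.items()
    ]


batch_requests = build_batch_requests(
    {"doc_001": "first document", "doc_002": "second document"},
    model="claude-opus-4-8",  # model name copied from the example above
)
```

Keying the input by custom_id keeps the IDs unique within the batch, which is what lets each result be matched back to its source document.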

Log the request ID on every response:

    # Python
    print(message._request_id)

    // TypeScript
    console.log(message._request_id);

A reasonable production log line: timestamp, request_id, model, stop_reason, input_tokens, output_tokens, latency. Quote the request_id in Anthropic Support tickets.
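Such a log line could be emitted as one JSON object per call. This is a sketch: build_log_line is a hypothetical helper, the values are passed in by the caller, and recording latency in milliseconds is an assumption.

```python
import json
from datetime import datetime, timezone


def build_log_line(request_id: str, model: str, stop_reason: str,
                   input_tokens: int, output_tokens: int, latency_ms: float) -> str:
    """Serialize the fields worth logging for one API call as a JSON line."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "request_id": request_id,
        "model": model,
        "stop_reason": stop_reason,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "latency_ms": latency_ms,  # assumption: latency recorded in milliseconds
    }
    return json.dumps(record)


# Made-up values for illustration.
line = build_log_line("req_example", "claude-opus-4-8", "end_turn", 12, 34, 850.0)
```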

| Failure | Recognize by | Fix |
| --- | --- | --- |
| Retrying 4xx | Same 400 / 401 in retry log | Read the error.type; fix the request, do not retry |
| Mid-stream error logged as success | Truncated response in app, no error event | Wrap stream iterator with error handler |
| Batches for user-facing | UX feels async | Switch to streaming for the user path; keep batches for offline jobs |
| Tool not idempotent | Duplicate side effects on retry | Add a deduplication key to the tool, not the API call |
| No request_id logged | Cannot debug a specific failure | Log response._request_id on every call |

What this lesson does NOT cover (and where to find it)

| Topic | Lands at |
| --- | --- |
| Choosing the model + extended thinking + effort | Lesson 3 |
| Tool use (define + handle) | Lesson 4 |
| Server-side tools (web search, code execution) | Lesson 5 |
| Model Context Protocol | Lesson 6 |
| Prompt caching + context management | Lesson 7 |
| Cost monitoring + Usage and Cost API | Lesson 12 |

  • Sources: Anthropic's public Claude docs (Streaming Messages, Batch processing, Errors, and Working with the Messages API) at https://platform.claude.com/docs/. Structural-mirror citation; see references.