The Anthropic Message Batches API lets you send up to 100,000 Claude requests in a single batch and pay 50% less per token. Batches process asynchronously – most finish within an hour, with a maximum window of 24 hours. If you’re running evals, classifying documents, or generating summaries at scale, this is the cheapest way to use Claude.
Here’s a minimal batch with two requests:
You get back a batch ID and a status of in_progress. Each request needs a unique custom_id so you can match results back to inputs – the API doesn’t guarantee result order.
Setting Up
Install the Anthropic Python SDK:
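```shell
pip install anthropic
```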
Set your API key as an environment variable:
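```shell
export ANTHROPIC_API_KEY="sk-ant-..."   # your key from the Anthropic console
```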
The SDK reads ANTHROPIC_API_KEY automatically, so anthropic.Anthropic() works without passing the key explicitly.
Building a Real Batch: Classifying Support Tickets
A two-request example is fine for syntax. Here’s something closer to production – classifying 100 customer support tickets in a single batch.
Each request in the batch takes the same parameters as a regular client.messages.create() call – model, max_tokens, system, messages, temperature, tools, all of it. You can even mix different models and parameter combinations within the same batch.
Polling for Completion
Batches are asynchronous. You submit the work, then check back later. Here’s a polling loop:
The request_counts object gives you a live tally of how many requests are still processing, how many succeeded, and how many errored, were canceled, or expired. The counts update as individual requests finish, so you can track progress while the batch is still running.
Most batches complete in under an hour. The hard ceiling is 24 hours – anything still running after that expires.
Retrieving and Processing Results
Once the batch status is ended, stream the results:
Results come back as a stream of individual entries. Each entry has a custom_id matching your original request and a result with one of four types:
- `succeeded` – the request completed. The `message` field contains the full Claude response.
- `errored` – something went wrong. Check `result.error.type`: `invalid_request` means fix your input; a server error means retry.
- `canceled` – you canceled the batch before this request was processed.
- `expired` – the 24-hour window ran out before this request was reached.
Results are not in the same order as your input requests. Always match on custom_id.
Cost Comparison: Standard vs. Batch
The batch API charges exactly 50% of standard per-token pricing. Here’s what that looks like for Claude Sonnet 4:
| Method | Input Cost | Output Cost |
|---|---|---|
| Standard API | $3 / MTok | $15 / MTok |
| Batch API | $1.50 / MTok | $7.50 / MTok |
If you’re processing 10,000 tickets at roughly 100 input tokens and 10 output tokens each:
- Standard: (1M input tokens * $3/MTok) + (100K output tokens * $15/MTok) = $3.00 + $1.50 = $4.50
- Batch: (1M input tokens * $1.50/MTok) + (100K output tokens * $7.50/MTok) = $1.50 + $0.75 = $2.25
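The arithmetic above as a quick sanity check:

```python
input_tokens = 10_000 * 100   # 1,000,000 input tokens total
output_tokens = 10_000 * 10   # 100,000 output tokens total

def cost(input_per_mtok: float, output_per_mtok: float) -> float:
    """Total dollars at the given per-million-token rates."""
    return (input_tokens / 1_000_000) * input_per_mtok + (
        output_tokens / 1_000_000
    ) * output_per_mtok

standard = cost(3.00, 15.00)
batch = cost(1.50, 7.50)
print(f"standard=${standard}, batch=${batch}")  # standard=$4.5, batch=$2.25
```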
That’s a straight 50% savings with no change to output quality. The tradeoff is latency – you wait minutes to hours instead of getting a response in seconds.
You can stack batch pricing with prompt caching for even deeper discounts. Cache read tokens get their own 90% discount on top of the batch rate, so repeated context ends up costing about 5% of the standard input price.
Common Errors and Fixes
anthropic.BadRequestError: requests: each request must have a unique custom_id
Every custom_id in a single batch must be unique. If you’re generating IDs from a list, make sure there are no duplicates:
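A quick pre-submit check – the sample IDs here deliberately contain a duplicate to show what it catches:

```python
from collections import Counter

# Example ID list with a duplicate slipped in.
custom_ids = ["ticket-1", "ticket-2", "ticket-2", "ticket-3"]

dupes = [cid for cid, n in Counter(custom_ids).items() if n > 1]
print(dupes)  # ['ticket-2'] -- de-duplicate these before calling batches.create
```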
413 request_too_large
Your batch exceeds the 256 MB size limit. Split it into smaller batches:
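One way to split, using a simple slicing generator (the chunk size of 2,000 is an arbitrary safe default, not an API constant):

```python
def chunked(requests: list, size: int = 2000):
    """Yield successive slices of at most `size` requests."""
    for i in range(0, len(requests), size):
        yield requests[i : i + size]

# Example: 5,000 requests split into 2,000 / 2,000 / 1,000.
requests = [{"custom_id": f"req-{i}"} for i in range(5000)]
sizes = [len(chunk) for chunk in chunked(requests)]
print(sizes)  # [2000, 2000, 1000]

# Then submit each chunk as its own batch:
# for chunk in chunked(all_requests):
#     client.messages.batches.create(requests=chunk)
```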
result.error.type == "invalid_request"
One or more requests in the batch had malformed parameters. The batch still processes – other requests aren’t affected. Common causes:
- Missing required fields (`model`, `max_tokens`, `messages`)
- Empty `messages` array
- Invalid model name (typo or deprecated model)
Fix the request params and resubmit just the failed ones in a new batch.
Requests showing as expired
Your batch didn’t finish within 24 hours. This happens during high-demand periods or with very large batches. Solutions:
- Break large batches into smaller ones (2,000-5,000 requests each)
- Retry expired requests in a new batch
- Check the Anthropic status page for capacity issues
anthropic.AuthenticationError
Your ANTHROPIC_API_KEY environment variable isn’t set or is invalid. Double-check:
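A quick check, assuming bash or zsh:

```shell
# Prints the key if set, or "NOT SET" if the variable is missing from this shell.
echo "${ANTHROPIC_API_KEY:-NOT SET}"
```

Remember that the variable must be set in the same shell session (or profile) that launches your Python process.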
Related Guides
- How to Use the Anthropic Extended Thinking API for Complex Reasoning
- How to Use the xAI Grok API for Chat and Function Calling
- How to Use the Anthropic Token Efficient Tool Use API
- How to Use the DeepSeek API for Code and Reasoning Tasks
- How to Use the Anthropic Token Streaming API for Real-Time UIs
- How to Use the Anthropic Citations API for Grounded Responses
- How to Use LiteLLM as a Universal LLM API Gateway
- How to Use the OpenRouter API for Multi-Provider LLM Access
- How to Use the Together AI API for Open-Source LLMs