OpenRouter gives you a single API endpoint that routes to 400+ models across OpenAI, Anthropic, Google, Meta, Mistral, and dozens of other providers. It’s OpenAI SDK-compatible, so you swap one base URL and your existing code works against any model.
Here’s the fastest path to a working call:
That’s it. Same OpenAI SDK, different base_url, access to every major model family.
Authentication and Model Selection
Sign up at openrouter.ai and grab an API key from the dashboard. Keys start with sk-or-v1-. You can optionally pass attribution headers so OpenRouter’s leaderboard tracks your app:
Model IDs follow a provider/model-name pattern. Some commonly used ones:
| Model ID | Provider |
|---|---|
| openai/gpt-4o | OpenAI |
| anthropic/claude-sonnet-4 | Anthropic |
| google/gemini-2.5-pro-preview | Google |
| meta-llama/llama-4-maverick | Meta |
| mistralai/mistral-large | Mistral |
Browse the full list at openrouter.ai/models. Pricing varies per model and OpenRouter adds no markup on most models.
Streaming Responses
Streaming works exactly like the OpenAI SDK. Set stream=True and iterate over chunks:
Each chunk follows the standard OpenAI SSE format. The final chunk has choices[0].finish_reason set to "stop". If you’re already handling OpenAI streams, nothing changes.
Fallback Routing and Provider Preferences
This is where OpenRouter shines over calling providers directly. You can define fallback models so if your primary model is down, rate-limited, or refuses the request, the next model in the list handles it automatically.
Pass the models array through extra_body:
If Anthropic returns an error, OpenRouter tries GPT-4o, then Gemini. You only pay for the model that actually succeeds.
You can also control provider ordering and performance thresholds:
The order field sets which infrastructure providers to try first for that model. allow_fallbacks (default True) controls whether OpenRouter tries other providers if your preferred ones fail. Setting require_parameters to True ensures the provider supports all features you’ve requested (like function calling or JSON mode) before routing to it.
For latency-sensitive workloads, you can sort providers by throughput:
Cost Tracking
Every response includes a usage field with token counts and cost:
For detailed cost breakdowns after the fact, query the generation endpoint with the response ID:
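A sketch using the raw HTTP endpoint; the field names in the trailing comment follow OpenRouter's generation schema and are illustrative:

```python
import json
import os
import urllib.request

def generation_stats(generation_id: str) -> dict:
    """Fetch post-hoc accounting for one generation by its response ID."""
    req = urllib.request.Request(
        f"https://openrouter.ai/api/v1/generation?id={generation_id}",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]

# Example (requires a real key and the "id" from a prior completion):
# stats = generation_stats("gen-...")
# print(stats["total_cost"], stats["native_tokens_prompt"])
```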
This gives you native token counts (the actual tokenizer’s count, not the OpenAI-normalized count), total cost in USD, and generation latency. Useful for building dashboards or setting budget alerts.
Common Errors and Fixes
401 Unauthorized – Your API key is missing or invalid. Double-check the OPENROUTER_API_KEY env var. Keys must start with sk-or-v1-.
402 Payment Required – Your OpenRouter account has insufficient credits. Add credits at openrouter.ai/credits. Free-tier models exist but have rate limits.
404 Model Not Found – The model ID is wrong. Model IDs are case-sensitive and follow provider/model-name format. Check openrouter.ai/models for the exact ID.
429 Rate Limited – You’ve hit the rate limit for a specific provider. Use fallback models to route around this automatically:
Streaming hangs or returns empty chunks – Some models don’t support streaming. Check the model’s page on OpenRouter for supported features. If you’re behind a proxy or CDN, make sure it’s not buffering SSE responses.
extra_body ignored silently – If provider preferences aren’t taking effect, make sure you’re passing them as a top-level extra_body kwarg, not nested inside messages or headers.
Related Guides
- How to Use the Perplexity API for AI-Powered Search
- How to Use the Anthropic Extended Thinking API for Complex Reasoning
- How to Use the Anthropic Message Batches API for Async Workloads
- How to Use the Fireworks AI API for Fast Open-Source LLMs
- How to Use the xAI Grok API for Chat and Function Calling
- How to Use the Anthropic Batch API for High-Volume Processing
- How to Use the Anthropic Token Efficient Tool Use API
- How to Use the DeepSeek API for Code and Reasoning Tasks
- How to Use the Together AI API for Open-Source LLMs
- How to Use the Anthropic Token Counting API for Cost Estimation