Install the SDK and Set Your Key
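The SDK ships on PyPI under the package name mistralai:

```shell
pip install mistralai
```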
Grab your API key from console.mistral.ai and export it:
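A minimal shell setup — substitute your actual key for the placeholder:

```shell
export MISTRAL_API_KEY="your-api-key-here"
```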
The SDK reads MISTRAL_API_KEY from your environment automatically. You can also pass it directly to the client constructor if you prefer explicit configuration.
Chat Completions
The simplest call you can make. This is the bread and butter of the API.
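A minimal sketch using the v1 Python SDK (mistralai 1.x, which exposes chat.complete); the prompt is just a placeholder:

```python
import os

from mistralai import Mistral

# Explicit configuration; you can also rely on the env-var fallback.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="mistral-large-latest",
    messages=[
        {"role": "user", "content": "What is the capital of France?"},
    ],
)

print(response.choices[0].message.content)
```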
The model parameter takes a model alias like mistral-large-latest or a specific version like mistral-large-2407. Stick with the -latest aliases unless you need reproducibility.
Adding System Prompts
System prompts go as the first message with role: "system". Use them to set tone, constraints, and output format.
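A sketch of the same call with a system message prepended and a low temperature (both the system text and the question are placeholders):

```python
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="mistral-large-latest",
    temperature=0.2,  # low temperature for consistent, factual output
    messages=[
        {
            "role": "system",
            "content": "You are a terse assistant. Answer in one sentence, plain text only.",
        },
        {"role": "user", "content": "Explain what an API rate limit is."},
    ],
)

print(response.choices[0].message.content)
```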
Lower temperature values (0.0-0.3) produce more deterministic outputs. For creative tasks, bump it to 0.7-0.9.
Code Generation with Codestral
Codestral is Mistral’s code-specific model. It beats the general-purpose models on code generation tasks and runs faster for code completions. Use it whenever you need to generate, review, or refactor code.
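Using Codestral is the same chat call with a different model name — a sketch with a placeholder coding prompt:

```python
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="codestral-latest",
    messages=[
        {
            "role": "user",
            "content": "Write a Python function that parses an ISO 8601 date string.",
        },
    ],
)

print(response.choices[0].message.content)
```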
Codestral also supports fill-in-the-middle (FIM) completions, which is what IDE integrations like Continue and VS Code use. FIM lets you provide a prefix and suffix, and the model fills the gap:
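A FIM sketch via the fim.complete method mentioned below — the prefix/suffix pair here is an illustrative placeholder:

```python
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.fim.complete(
    model="codestral-latest",
    prompt="def fibonacci(n: int) -> int:\n    ",  # text before the cursor
    suffix="\n\nprint(fibonacci(10))",             # text after the cursor
    max_tokens=128,
)

print(response.choices[0].message.content)
```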
FIM is great for autocomplete-style features. The model sees both what comes before and after the cursor.
Streaming Responses
For anything user-facing, stream the response. Nobody wants to stare at a blank screen waiting for 2000 tokens to generate.
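A streaming sketch, assuming the v1 SDK's chat.stream method, whose events carry the chunk under event.data:

```python
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

stream = client.chat.stream(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Write a haiku about the sea."}],
)

for event in stream:
    delta = event.data.choices[0].delta.content
    if delta:  # the final chunk may carry no content
        print(delta, end="", flush=True)
print()
```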
Each chunk arrives as soon as the model generates it. The delta.content field contains the new text fragment.
Function Calling
Function calling lets the model decide when to call your tools and what arguments to pass. Define your tools as JSON schemas, and Mistral figures out the rest.
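A sketch of the full round trip with a hypothetical get_weather tool — the schema, the fake lookup result, and the exact tool-message field names are illustrative assumptions:

```python
import json
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Hypothetical tool the model can choose to call, described as a JSON schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

response = client.chat.complete(
    model="mistral-large-latest",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = {"city": args["city"], "temp_c": 18}  # your real tool runs here

    # Append the assistant turn, then the tool result, reusing the model's id.
    messages.append(message)
    messages.append(
        {
            "role": "tool",
            "name": call.function.name,
            "content": json.dumps(result),
            "tool_call_id": call.id,
        }
    )

    final = client.chat.complete(model="mistral-large-latest", messages=messages)
    print(final.choices[0].message.content)
```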
Set tool_choice="auto" to let the model decide whether to call a function. Use tool_choice="any" to force a tool call, or tool_choice="none" to prevent them.
Choosing the Right Model
Mistral offers several models, and picking the right one matters for both cost and quality.
Mistral Large (mistral-large-latest) is your best general-purpose model. It handles complex reasoning, multi-step instructions, and nuanced tasks well. Use it when quality matters more than speed. This is the model I reach for first.
Mistral Small (mistral-small-latest) is fast and cheap. It handles straightforward tasks like classification, extraction, and simple Q&A. Use it for high-volume workloads where you need to keep costs down. It’s significantly cheaper per token than Large.
Codestral (codestral-latest) is purpose-built for code. It outperforms Mistral Large on coding benchmarks and supports FIM completions. If your task is code generation, completion, or review, always pick Codestral over the general-purpose models.
Mistral Medium has been deprecated. If you were using it, migrate to Mistral Small for cost savings or Mistral Large for quality.
My recommendation: start with Mistral Large for prototyping, then drop down to Small once you have evals that confirm it works for your use case. For any code task, go straight to Codestral.
Common Errors
MistralAPIStatusException: 401 Unauthorized – Your API key is missing or invalid. Verify it’s exported correctly with echo $MISTRAL_API_KEY. If you just created the key, wait a minute for it to propagate.
MistralAPIStatusException: 429 Too Many Requests – You’ve hit the rate limit. The SDK does not auto-retry by default. Wrap your calls in a retry loop with exponential backoff, or use a library like tenacity:
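A backoff sketch with tenacity, assuming the v1 SDK surfaces HTTP errors as mistralai.models.SDKError with a status_code attribute — adjust the predicate if your SDK version raises a different class:

```python
import os

from mistralai import Mistral, models
from tenacity import retry, retry_if_exception, stop_after_attempt, wait_exponential

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])


def _is_rate_limit(exc: BaseException) -> bool:
    # Only retry 429s; retrying a 401 would never succeed.
    return isinstance(exc, models.SDKError) and getattr(exc, "status_code", None) == 429


@retry(
    retry=retry_if_exception(_is_rate_limit),
    wait=wait_exponential(multiplier=1, min=2, max=30),  # 2s, 4s, 8s, ... capped at 30s
    stop=stop_after_attempt(5),
)
def chat_with_retry(messages):
    return client.chat.complete(model="mistral-large-latest", messages=messages)


reply = chat_with_retry([{"role": "user", "content": "Hello!"}])
print(reply.choices[0].message.content)
```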
MistralAPIStatusException: 400 with tool calls – This usually means the tool_call_id in your tool response message doesn’t match the one the model returned. Always use tool_call.id from the original response; don’t generate your own.
AttributeError: 'NoneType' object has no attribute 'content' – The model returned an empty response. This happens when max_tokens is too low for the task, or when the model decides to call a tool instead of responding with text. Check message.tool_calls before accessing message.content.
Codestral FIM returns garbage – Make sure you’re using the fim.complete method, not chat.complete. FIM uses a different endpoint and prompt format. Also verify your prompt ends where you want the completion to start.
Related Guides
- How to Use the Anthropic Python SDK for Claude
- How to Use the Anthropic Multi-Turn Conversation API with Tool Use
- How to Use the Google Vertex AI Gemini API for Multimodal Tasks
- How to Use the AWS Bedrock Converse API for Multi-Model Chat
- How to Use the Anthropic Token Counting API for Cost Estimation
- How to Use the Anthropic Claude Files API for Large Document Processing
- How to Use the Anthropic PDF Processing API for Document Analysis
- How to Use the Weights and Biases Prompts API for LLM Tracing
- How to Use Claude’s Model Context Protocol (MCP)
- How to Build Apps with the Gemini API and Python SDK