The xAI Grok API is OpenAI-compatible. That means you can use the OpenAI Python SDK you already know, point it at https://api.x.ai/v1, and talk to Grok models without learning a new client library. Same request format, same response objects, same tools parameter for function calling.
Here’s what you need to get started: an xAI API key from console.x.ai and the OpenAI Python package.
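Install the SDK with pip (any recent version with chat completions support works):

```shell
pip install openai
```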
Set your API key as an environment variable:
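The examples below read the key from the environment, so export it once per shell session (the key value here is a placeholder):

```shell
export XAI_API_KEY="xai-your-key-here"
```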
Basic Chat Completion
Point the OpenAI client at xAI’s endpoint and make a standard chat completion call. The grok-3-latest model gives you Grok 3 with a 131K token context window.
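A minimal sketch; only the `base_url` and API key differ from a stock OpenAI call (the prompt text is illustrative):

```python
import os

from openai import OpenAI

# Point the standard OpenAI client at xAI's endpoint.
client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

response = client.chat.completions.create(
    model="grok-3-latest",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum entanglement in two sentences."},
    ],
)

print(response.choices[0].message.content)
print(response.usage)  # prompt, completion, and total token counts
```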
That’s it. The response object is identical to what you’d get from OpenAI. You get response.choices[0].message.content for the text, response.usage for token counts, and everything else you’d expect.
Streaming Responses
For long responses, streaming prints tokens as they arrive instead of waiting for the full completion. Pass stream=True and iterate over chunks.
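A sketch of the streaming loop (client setup is the same as the basic example):

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

stream = client.chat.completions.create(
    model="grok-3-latest",
    messages=[{"role": "user", "content": "Write a short poem about rockets."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:  # some chunks carry no text content
        print(delta, end="", flush=True)
print()
```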
Each chunk carries chunk.choices[0].delta.content. It can be None (the final chunk, for instance, carries only the finish reason), so always check before printing. The flush=True makes sure tokens appear immediately in the terminal.
Multi-Turn Conversations
Grok handles multi-turn conversations the same way as OpenAI. Keep appending messages to your list and send the full history with each request.
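A sketch of a two-turn exchange (the prompts are illustrative):

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

messages = [{"role": "system", "content": "You are a concise assistant."}]

for user_input in ["What is the tallest mountain?", "How tall is it in feet?"]:
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(
        model="grok-3-latest",
        messages=messages,
    )
    reply = response.choices[0].message.content
    # Append the assistant turn so the next request carries the full history.
    messages.append({"role": "assistant", "content": reply})
    print(f"User: {user_input}\nGrok: {reply}\n")
```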
The model sees the entire conversation each time. Let the history grow across turns, then trim the oldest messages when you approach the context limit.
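One simple trimming strategy is to keep the system prompt and drop the oldest turns past a budget. A sketch (a real implementation would budget by tokens rather than message count; message count keeps this dependency-free):

```python
def trim_history(messages, max_messages=20):
    """Keep the system prompt plus the most recent turns.

    `messages` is a list of {"role": ..., "content": ...} dicts in
    OpenAI chat format; the oldest non-system messages are dropped
    once the total exceeds `max_messages`.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-(max_messages - len(system)):]
```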
Function Calling with Tools
Function calling lets Grok request data from your code mid-conversation. You define tools with JSON Schema, the model decides when to call them, and you execute the function locally and return the result.
Here’s a complete weather tool example:
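A sketch of the full loop; the local get_weather function is a stand-in for a real weather API, and the tool schema is illustrative:

```python
import json
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. 'Paris'",
                    },
                },
                "required": ["city"],
            },
        },
    }
]


def get_weather(city: str) -> dict:
    # Stand-in for a real weather API call.
    return {"city": city, "temp_c": 21, "conditions": "clear"}


messages = [{"role": "user", "content": "What's the weather in Paris and Tokyo?"}]

# Step 1: send the user message with the tools parameter.
response = client.chat.completions.create(
    model="grok-3-latest", messages=messages, tools=tools
)
message = response.choices[0].message

# Step 2: execute each requested tool call and append the results.
if message.tool_calls:
    messages.append(message)
    for tool_call in message.tool_calls:
        args = json.loads(tool_call.function.arguments)
        result = get_weather(**args)
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,  # must match the originating call
            "content": json.dumps(result),
        })
    # Step 3: send the updated history back for the final answer.
    response = client.chat.completions.create(
        model="grok-3-latest", messages=messages, tools=tools
    )

print(response.choices[0].message.content)
```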
The flow is three steps. First, send the user message with the tools parameter. Second, if the response contains tool_calls, execute each function and append the results as role: "tool" messages. Third, send the updated messages back so the model can generate a final answer using the tool results.
Grok supports parallel function calling by default. When you ask about multiple cities, it may return multiple tool_calls in a single response. The loop handles that automatically.
Choosing the Right Model
xAI offers several Grok variants at different price points:
| Model | Context | Input Price | Output Price | Best For |
|---|---|---|---|---|
| grok-3-latest | 131K | $3.00/M | $15.00/M | General tasks, high quality |
| grok-3-mini-latest | 131K | $0.30/M | $0.50/M | Fast, cheaper tasks |
The grok-3-mini-latest model is 10x cheaper than grok-3-latest on input tokens (and 30x cheaper on output) and works well for straightforward tasks. Use the full grok-3-latest when you need stronger reasoning or complex function calling chains.
Common Errors and Fixes
openai.AuthenticationError: Error code: 401
Your API key is missing or invalid. Double-check you exported it correctly:
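A quick check (the key value is a placeholder):

```shell
echo $XAI_API_KEY   # should print your key, not a blank line
export XAI_API_KEY="xai-your-key-here"
```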
openai.NotFoundError: Error code: 404
Wrong model name. The xAI API is strict about model IDs. Use exactly grok-3-latest or grok-3-mini-latest. Common mistakes include grok3, grok-3, or grok-latest.
openai.APITimeoutError
Grok can take longer on complex prompts. Increase the timeout:
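The OpenAI client accepts a timeout (in seconds) and a retry count at construction; one way to raise them:

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
    timeout=120.0,   # seconds; raise this for long, complex prompts
    max_retries=3,   # automatic retries on transient failures
)
```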
Tool calls return empty or malformed JSON
If json.loads(tool_call.function.arguments) fails, the model generated invalid JSON. This is rare but happens with vague tool descriptions. Fix it by making your description and parameters fields as specific as possible. Adding enum constraints and explicit descriptions for each property helps the model generate correct arguments.
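A tighter schema might look like this; the explicit per-property descriptions and the enum constrain what the model can emit (the tool and field names are illustrative):

```python
# Tool definition in OpenAI function-calling format. The enum limits
# "units" to two legal values, and each property documents its format.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a single city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name only, no country, e.g. 'Paris'",
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature units for the result",
                },
            },
            "required": ["city", "units"],
        },
    },
}
```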
tool_call_id mismatch
Every tool result message must include the exact tool_call_id from the corresponding tool call. If you mix them up, the API returns a 400 error. Always pair them in a loop like the example above.
Related Guides
- How to Use the Anthropic Batch API for High-Volume Processing
- How to Use the Anthropic Extended Thinking API for Complex Reasoning
- How to Use the Anthropic Message Batches API for Async Workloads
- How to Use the DeepSeek API for Code and Reasoning Tasks
- How to Use the OpenAI Realtime API for Voice Applications
- How to Use the OpenRouter API for Multi-Provider LLM Access
- How to Use the Anthropic Citations API for Grounded Responses
- How to Use the Anthropic Multi-Turn Conversation API with Tool Use
- How to Use the Anthropic Tool Use API for Agentic Workflows
- How to Use LiteLLM as a Universal LLM API Gateway