The OpenAI Agents SDK is OpenAI’s production framework for building multi-agent systems in Python. It replaced Swarm (the experimental repo from late 2024) with a proper package, real documentation, and features you actually need in production: typed tool definitions, agent handoffs, input/output guardrails, and built-in tracing. If you tried Swarm and found it too bare-bones, this is what it grew into.
The SDK is intentionally minimal compared to LangGraph or CrewAI. There’s no graph DSL, no YAML configs, no complex abstractions. You define agents as Python objects, give them tools and instructions, and run them. That’s it.
Install the SDK
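The package name matters — see the Common Errors section below:

```shell
pip install openai-agents
```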
You need an OpenAI API key:
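Set it as an environment variable (the `sk-...` value is a placeholder for your real key):

```shell
export OPENAI_API_KEY="sk-..."
```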
The SDK requires Python 3.9+. It pulls in openai and pydantic as dependencies – if you already have those, the install is fast.
Your First Agent
An Agent is a wrapper around an LLM with instructions, tools, and optional handoff targets. Here’s the simplest possible agent:
Runner.run_sync() is the synchronous entry point. It runs the agent loop – sending messages to the LLM, executing tool calls, handling handoffs – until the agent produces a final text response. There’s also an async version:
Use the async version in production. The sync wrapper is convenient for scripts and notebooks but blocks the event loop.
Define Function Tools
Tools are where agents get useful. The SDK uses a @function_tool decorator that automatically generates the JSON schema from your function’s type hints and docstring.
The type hints matter. The SDK uses them to build the parameters schema that gets sent to the model. If you skip type hints, the model won’t know what arguments your tool expects and will hallucinate them.
Tools with Pydantic Models
For complex inputs, use Pydantic models directly:
This gives you automatic validation. If the model produces a bad date format or missing field, Pydantic catches it before your function runs.
Agent Handoffs
Handoffs are the SDK’s best feature. One agent can transfer control to another agent mid-conversation. The receiving agent picks up the full message history and continues.
The triage agent reads the user’s message, decides it’s a technical issue, and hands off to technical_agent. The user never sees the handoff – they just get an answer from the right specialist. The result object tracks which agent produced the final output.
This pattern scales to any routing problem: language-based routing, tier-based support, domain-specific experts. You can chain handoffs too – technical_agent could hand off to an escalation_agent if it can’t solve the problem.
Guardrails
Guardrails run validation on inputs or outputs before and after agent execution. They’re separate from the agent’s instructions – think of them as programmable safety checks.
Input Guardrails
Input guardrails validate user messages before the agent processes them:
When tripwire_triggered is True, the SDK raises an InputGuardrailTripwireTriggered exception and the agent never processes the message. You can also define output guardrails with the @output_guardrail decorator to validate agent responses before they reach the user.
Tracing
Every agent run is automatically traced. The SDK captures each LLM call, tool execution, handoff, and guardrail check in a structured trace you can inspect in the OpenAI dashboard.
Traces are enabled by default. You’ll see them in your OpenAI dashboard under the Traces tab. For custom trace names:
The trace() context manager groups related runs under a single trace. This is useful when you have multiple agent runs in a single request – a triage agent handing off to a specialist, for example. Both runs appear under one trace.
To disable tracing (for local development or testing):
Common Errors and Fixes
openai.AuthenticationError: Incorrect API key
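The quick check and fix (the key value is a placeholder):

```shell
echo $OPENAI_API_KEY          # should print your key, not a blank line
export OPENAI_API_KEY="sk-..."
```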
Your OPENAI_API_KEY is missing or wrong. The SDK uses the openai client under the hood, so it reads from the same environment variable. Double-check with echo $OPENAI_API_KEY in the same shell where you run your script.
ModuleNotFoundError: No module named 'agents'
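The fix:

```shell
pip uninstall -y agents       # remove the unrelated package if you installed it
pip install openai-agents     # the real SDK, imported as `agents`
```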
You installed the wrong package. The correct install is pip install openai-agents, which gives you the agents module. A common mistake is running pip install agents – that’s a completely different package.
InputGuardrailTripwireTriggered
An input guardrail blocked the message. This is working as intended – catch the exception and return a user-friendly error message instead of letting it bubble up.
ModelBehaviorError on handoffs
If your agent tries to hand off to a target that isn’t in its handoffs list, you’ll get a ModelBehaviorError. Make sure every agent that should be a handoff target is included in the handoffs parameter of the calling agent. Also check that your instructions mention the handoff targets by name – the model needs to know they exist.
Tool schema validation errors
The model called your tool without required arguments. This usually means your tool’s docstring is unclear. Be explicit about what each parameter does and when the tool should be used.
How It Compares to LangGraph and CrewAI
vs. LangGraph: LangGraph gives you a graph-based execution model with conditional edges, checkpointing, and human-in-the-loop patterns. It’s more flexible but more complex. The Agents SDK is simpler – you don’t define a graph, you define agents and let the framework handle routing via handoffs. Pick LangGraph if you need fine-grained control over execution flow. Pick the Agents SDK if you want to get a multi-agent system running fast with less boilerplate.
vs. CrewAI: CrewAI focuses on “crews” of agents with roles and goals, using a sequential or hierarchical process. It’s higher-level and more opinionated. The Agents SDK is lower-level – you control the tool definitions, handoff logic, and guardrails yourself. CrewAI is better for rapid prototyping of role-based workflows. The Agents SDK is better when you need precise control over what each agent can and cannot do.
The Agents SDK’s main advantage is that it’s made by OpenAI. Tracing integrates directly with the OpenAI dashboard, the tool schema matches OpenAI’s function calling format natively, and you don’t need adapter layers between the SDK and the API.
Related Guides
- How to Use the Anthropic Extended Thinking API for Complex Reasoning
- How to Use the Anthropic Message Batches API for Async Workloads
- How to Use the OpenAI Realtime API for Voice Applications
- How to Use the OpenRouter API for Multi-Provider LLM Access
- How to Use the Anthropic Batch API for High-Volume Processing
- How to Use the Anthropic Python SDK for Claude
- How to Use the Anthropic Token Efficient Tool Use API
- How to Use the DeepSeek API for Code and Reasoning Tasks
- How to Use the Together AI API for Open-Source LLMs
- How to Use the Anthropic Token Counting API for Cost Estimation