LangGraph gives you something most agent frameworks don’t: fine-grained control over every decision your agent makes. Instead of dumping a prompt and tools into a black box, you define a graph where each node is a step – call an LLM, run a tool, check a condition – and edges control the flow between them. State persists automatically across the entire execution.
As of February 2026, LangGraph is at version 1.0.8, runs on Python 3.10+, and ships with durable state persistence, human-in-the-loop patterns, and built-in checkpointing. Here’s how to actually use it.
Install LangGraph
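Assuming pip, the core package plus the OpenAI integration is enough for this walkthrough:

```bash
pip install -U langgraph langchain-openai
```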
You’ll also need an OpenAI API key (or swap in any LangChain-compatible model). Export it before running anything:
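For example, in your shell (the key value is a placeholder):

```bash
export OPENAI_API_KEY="sk-..."
```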
Define Your Agent’s State
Every LangGraph agent revolves around a state object. Think of it as the shared notebook your agent carries between steps. You define it as a TypedDict, and LangGraph handles passing it from node to node.
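A minimal schema along these lines works; the AgentState name is just a convention:

```python
from typing import Annotated

from langchain_core.messages import AnyMessage
from langgraph.graph.message import add_messages
from typing_extensions import TypedDict


class AgentState(TypedDict):
    # add_messages is a reducer: new messages get appended, not overwritten
    messages: Annotated[list[AnyMessage], add_messages]
```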
The Annotated[list[AnyMessage], add_messages] bit is critical. It tells LangGraph to append new messages to the list rather than overwriting it. Without that reducer, you’ll hit an InvalidUpdateError the moment two nodes try to write to the same key.
Build a Tool-Calling Agent
Here’s a complete agent that can search the web and answer questions. It uses a StateGraph with three nodes: one for the LLM, one for tools, and a router that decides which to call next.
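Here's a sketch of that graph, building on the AgentState defined above. The search tool is stubbed out so the example is self-contained; in practice you'd wire in a real search integration, and the gpt-4o model choice is an assumption:

```python
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import END, START, StateGraph
from langgraph.prebuilt import ToolNode


@tool
def search(query: str) -> str:
    """Search the web for the query."""
    return f"Top results for: {query}"  # stub, swap in a real search API


tools = [search]
llm = ChatOpenAI(model="gpt-4o").bind_tools(tools)


def agent(state: AgentState):
    # Call the LLM on the running message history; the reducer appends the reply
    return {"messages": [llm.invoke(state["messages"])]}


def should_continue(state: AgentState):
    # Route to the tool node if the LLM requested a tool call, otherwise finish
    last = state["messages"][-1]
    return "tools" if last.tool_calls else END


builder = StateGraph(AgentState)
builder.add_node("agent", agent)
builder.add_node("tools", ToolNode(tools))
builder.add_edge(START, "agent")
builder.add_conditional_edges("agent", should_continue, ["tools", END])
builder.add_edge("tools", "agent")
graph = builder.compile()

result = graph.invoke({"messages": [HumanMessage("Who maintains LangGraph?")]})
print(result["messages"][-1].content)
```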
The flow works like this: the user message enters at START, hits the agent node (LLM call), then the router checks if the LLM wants to call a tool. If yes, it routes to the tools node, which executes the tool and sends results back to agent. If no tool calls, it routes to __end__ and returns the response.
Stream Responses
For real-time output, swap invoke for stream:
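For example, reusing the graph from above:

```python
for state in graph.stream(
    {"messages": [HumanMessage("Who maintains LangGraph?")]},
    stream_mode="values",  # emits the full state after each node
):
    state["messages"][-1].pretty_print()
```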
With stream_mode="values" you see the full state after every node, so you can watch each step of the agent's reasoning as it happens; switch to stream_mode="messages" for token-by-token LLM output instead.
Add Memory with Checkpointing
LangGraph’s killer feature is durable state. Add a checkpointer and your agent remembers previous conversations:
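A sketch with the in-memory checkpointer; the thread ID is illustrative:

```python
from langgraph.checkpoint.memory import MemorySaver

graph = builder.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "user-42"}}

graph.invoke({"messages": [HumanMessage("My name is Ada.")]}, config)
result = graph.invoke({"messages": [HumanMessage("What's my name?")]}, config)
print(result["messages"][-1].content)  # same thread_id, so it remembers "Ada"
```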
The thread_id acts as a session key. Same thread, same memory. Different thread, fresh start. For production, swap MemorySaver for a database-backed checkpointer like SqliteSaver or PostgresSaver.
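For example, the SQLite checkpointer ships separately as the langgraph-checkpoint-sqlite package; the filename here is illustrative:

```python
from langgraph.checkpoint.sqlite import SqliteSaver

# from_conn_string is a context manager that owns the database connection
with SqliteSaver.from_conn_string("checkpoints.db") as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)
    graph.invoke({"messages": [HumanMessage("Hi!")]}, config)
```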
Common Errors and How to Fix Them
GraphRecursionError
This is the one you’ll hit most often. It means your agent looped too many times before reaching END.
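The exception is importable from langgraph.errors if you want to catch it:

```python
from langgraph.errors import GraphRecursionError

try:
    graph.invoke({"messages": [HumanMessage("Research question")]})
except GraphRecursionError:
    # Raised when the graph exceeds recursion_limit before reaching END
    print("Agent hit the step limit without finishing")
```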
The default recursion_limit is 25. If your agent genuinely needs more steps (multi-hop research, complex tool chains), bump it:
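The limit lives in the run config, not the graph definition:

```python
graph.invoke(
    {"messages": [HumanMessage("Multi-hop research question")]},
    {"recursion_limit": 50, "configurable": {"thread_id": "user-42"}},
)
```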
But if you’re hitting this unexpectedly, your routing logic probably has a bug. A common culprit is a should_continue function that never returns "__end__". Double-check that your conditional edge handles the case where tool_calls is empty.
InvalidUpdateError
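The error reads roughly like this (exact wording varies by version):

```
InvalidUpdateError: At key 'messages': Can receive only one value per step.
Use an Annotated key to handle multiple values.
```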
This happens when two nodes running in parallel both try to write to the same state key. Fix it by adding a reducer to that key in your state definition:
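For example, with a list-valued key:

```python
import operator
from typing import Annotated

from typing_extensions import TypedDict


class State(TypedDict):
    # operator.add concatenates lists, so parallel branches merge cleanly
    results: Annotated[list[str], operator.add]
```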
INVALID_GRAPH_NODE_RETURN_VALUE
Your node function returned something LangGraph can’t use. Nodes must return a dictionary with keys that match your state schema:
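For example, using the AgentState schema from earlier:

```python
from langchain_core.messages import AIMessage


def bad_node(state: AgentState):
    return "some text"  # not a dict: triggers INVALID_GRAPH_NODE_RETURN_VALUE


def good_node(state: AgentState):
    # Return a partial state update keyed by schema fields
    return {"messages": [AIMessage(content="some text")]}
```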
Missing API Key Errors
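With the OpenAI SDK, the failure looks something like:

```
openai.OpenAIError: The api_key client option must be set either by passing
api_key to the client or by setting the OPENAI_API_KEY environment variable
```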
This one is self-explanatory but trips people up in containers and CI environments where environment variables aren’t forwarded. Make sure OPENAI_API_KEY (or whatever provider you use) is set in the process running your agent, not just in your shell profile.
When to Use LangGraph vs. Simpler Approaches
LangGraph is the right tool when your agent needs:
- Branching logic – different paths based on intermediate results
- Persistent memory – conversations that survive restarts
- Human-in-the-loop – pausing execution for approval before a tool runs
- Parallel execution – running multiple tool calls simultaneously
- Complex multi-step workflows – chains of 5+ steps with conditional routing
If you just need a single LLM call with one tool, ChatOpenAI.bind_tools() alone is enough. Don’t reach for a graph when a function will do.
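For comparison, the no-graph version is a few lines; this sketch reuses the search tool and model from earlier:

```python
llm = ChatOpenAI(model="gpt-4o").bind_tools([search])
reply = llm.invoke("Search for the LangGraph changelog")
for call in reply.tool_calls:
    # You execute any requested calls yourself: no loop, no state, no graph
    print(call["name"], call["args"])
```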
Production Tips
Set explicit recursion limits. The default of 25 is fine for development, but set it intentionally in production so you know exactly when and why an agent stops.
Use LangSmith for tracing. Debugging agent loops without traces is like debugging distributed systems without logs. LangSmith shows you every node transition, state change, and LLM call in a visual timeline.
Keep your state schema tight. Every key you add to state gets serialized on every checkpoint. Don’t store large blobs or raw API responses – extract what you need and discard the rest.
Handle tool errors gracefully. Wrap tool functions in try/except and return error messages as strings rather than letting exceptions crash the graph. The LLM can often recover from a failed tool call if you tell it what went wrong.
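A sketch of the pattern; fetch_url is a made-up example tool:

```python
import urllib.request

from langchain_core.tools import tool


@tool
def fetch_url(url: str) -> str:
    """Fetch a URL and return the first 1,000 characters of the body."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read(1000).decode("utf-8", errors="replace")
    except Exception as exc:
        # Surface the failure to the LLM instead of crashing the graph
        return f"Tool error: {exc}"
```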