An AI agent that can send emails, delete records, or execute code without asking first is a liability. Human-in-the-loop (HITL) approval lets you insert checkpoints where the agent pauses, surfaces its planned action, and waits for a human to approve, edit, or reject before proceeding. The agent keeps its full state in memory – no lost context, no restarts.
LangGraph’s interrupt() function is the cleanest way to build this. You call interrupt() inside any node or tool, the graph freezes, and you resume it later with Command(resume=...). The return value of interrupt() becomes whatever you passed into resume. Here’s how to wire it up.
Install Dependencies#
```shell
pip install langgraph langchain-anthropic langchain-core
```
Set your API key:
```shell
export ANTHROPIC_API_KEY="sk-ant-your-key-here"
```
Basic Approval Gate#
The simplest pattern: a node that pauses execution and waits for a yes/no decision. You need two things – an interrupt() call and a checkpointer to persist state while the graph is paused.
```python
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.types import Command, interrupt
from langgraph.checkpoint.memory import InMemorySaver


class State(TypedDict):
    action: str
    approved: bool


def plan_action(state: State) -> dict:
    return {"action": "Delete 1,432 inactive user accounts"}


def approval_gate(state: State) -> dict:
    decision = interrupt({
        "question": "Approve this action?",
        "planned_action": state["action"],
    })
    return {"approved": decision == "approve"}


def execute_action(state: State) -> dict:
    if state["approved"]:
        print(f"Executing: {state['action']}")
    else:
        print("Action rejected by reviewer.")
    return {}


builder = StateGraph(State)
builder.add_node("plan", plan_action)
builder.add_node("approve", approval_gate)
builder.add_node("execute", execute_action)
builder.add_edge(START, "plan")
builder.add_edge("plan", "approve")
builder.add_edge("approve", "execute")
builder.add_edge("execute", END)

checkpointer = InMemorySaver()
graph = builder.compile(checkpointer=checkpointer)
```
Run it in two phases. First, the graph executes until it hits the interrupt:
```python
config = {"configurable": {"thread_id": "review-001"}}

# Phase 1: runs plan -> hits interrupt at approval_gate
result = graph.invoke({"action": ""}, config=config)
print(result)
# The graph is now paused. State is saved to the checkpointer.
```
Then, after a human reviews and decides, resume with their answer:
```python
# Phase 2: human approves -> executes the action
result = graph.invoke(Command(resume="approve"), config=config)
print(result)
# Output: Executing: Delete 1,432 inactive user accounts
```
The thread_id ties the two invocations together. Without it, LangGraph has no way to find the saved checkpoint.
Approval Inside Tools#
The real power shows up when you put interrupt() inside a tool definition. The agent calls the tool normally, but execution freezes before anything dangerous happens. The human can approve, modify the arguments, or reject entirely.
```python
from langchain_core.tools import tool
from langchain_anthropic import ChatAnthropic
from langgraph.prebuilt import create_react_agent
from langgraph.types import interrupt
from langgraph.checkpoint.memory import InMemorySaver


@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email to a recipient."""
    response = interrupt({
        "action": "send_email",
        "to": to,
        "subject": subject,
        "body": body,
        "message": "Review this email before sending.",
    })
    if response.get("action") == "reject":
        return f"Email to {to} was rejected: {response.get('reason', 'no reason given')}"
    # Allow the human to override any field
    final_to = response.get("to", to)
    final_subject = response.get("subject", subject)
    final_body = response.get("body", body)
    # In production, actually send the email here
    return f"Email sent to {final_to}: {final_subject}"


@tool
def search_web(query: str) -> str:
    """Search the web. No approval needed."""
    return f"Results for '{query}': LangGraph supports interrupt() for HITL workflows."


model = ChatAnthropic(model="claude-sonnet-4-20250514")
checkpointer = InMemorySaver()
agent = create_react_agent(
    model=model,
    tools=[send_email, search_web],
    checkpointer=checkpointer,
)
```
Now when the agent decides to call send_email, execution pauses automatically:
```python
from langgraph.types import Command

config = {"configurable": {"thread_id": "email-task-42"}}

# Agent plans an email -- hits interrupt inside send_email
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Email [email protected] about the Q1 report."}]},
    config=config,
)

# Human reviews and approves with a subject edit
result = agent.invoke(
    Command(resume={"action": "approve", "subject": "Q1 Report Summary - Final"}),
    config=config,
)
```
The search_web tool has no interrupt() call, so it runs immediately. This selective approval pattern means you only slow down the agent for actions that actually need oversight.
Conditional Approval with Risk Thresholds#
You don’t always want to interrupt. Low-risk actions should flow through automatically, while high-risk ones pause for review. Use a wrapper that checks conditions before deciding whether to interrupt:
```python
from langchain_core.tools import tool
from langgraph.types import interrupt

SENSITIVE_TOOLS = {"send_email", "delete_records", "execute_sql", "deploy_service"}


def require_approval_if_sensitive(tool_name: str, args: dict) -> dict | None:
    """Interrupt only for sensitive operations. Returns human response or None."""
    if tool_name not in SENSITIVE_TOOLS:
        return None
    response = interrupt({
        "tool": tool_name,
        "args": args,
        "message": f"The agent wants to call '{tool_name}'. Approve?",
    })
    return response


@tool
def delete_records(table: str, condition: str) -> str:
    """Delete records from a database table."""
    response = require_approval_if_sensitive(
        "delete_records", {"table": table, "condition": condition}
    )
    if response and response.get("action") == "reject":
        return f"Deletion rejected: {response.get('reason', '')}"
    # Proceed with deletion
    return f"Deleted records from {table} where {condition}"
```
This keeps your agent fast for safe operations and cautious for destructive ones.
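The check itself can also weigh the arguments, not just the tool name. A hypothetical policy sketch (the row_estimate argument and the 100-row threshold are invented for illustration, not part of LangGraph):

```python
SENSITIVE_TOOLS = {"send_email", "delete_records", "execute_sql", "deploy_service"}


def needs_approval(tool_name: str, args: dict) -> bool:
    """Return True when a tool call should pause for human review.

    Hypothetical policy: sensitive tools always pause; any tool
    reporting a bulk operation (a 'row_estimate' above 100) also pauses.
    """
    if tool_name in SENSITIVE_TOOLS:
        return True
    return args.get("row_estimate", 0) > 100


print(needs_approval("search_web", {"query": "langgraph"}))   # False
print(needs_approval("delete_records", {"table": "users"}))   # True
print(needs_approval("bulk_tag", {"row_estimate": 5000}))     # True
```

Inside a tool, you would call interrupt() only when needs_approval(...) returns True and execute immediately otherwise, keeping the decision logic in one testable place.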
Production Checkpointing#
InMemorySaver works for development, but it vanishes when your process dies. For production, use a persistent backend. LangGraph ships adapters for SQLite and PostgreSQL:
```python
# SQLite -- good for single-server deployments
# pip install langgraph-checkpoint-sqlite
from langgraph.checkpoint.sqlite.aio import AsyncSqliteSaver

async with AsyncSqliteSaver.from_conn_string("checkpoints.db") as checkpointer:
    graph = builder.compile(checkpointer=checkpointer)
    # Graph state survives process restarts
```
```shell
# For PostgreSQL (multi-server, production)
pip install langgraph-checkpoint-postgres
```
```python
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

async with AsyncPostgresSaver.from_conn_string(
    "postgresql://user:pass@localhost:5432/agents"
) as checkpointer:
    await checkpointer.setup()  # create the checkpoint tables on first use
    graph = builder.compile(checkpointer=checkpointer)
```
With a persistent checkpointer, an interrupted graph can sit paused for hours or days. When the human finally responds, you resume from the exact checkpoint – even if the server restarted in between.
Common Errors and Fixes#
Graph runs straight through without pausing. You compiled without a checkpointer. The interrupt() call does nothing without one:
```python
# Wrong -- no checkpointer
graph = builder.compile()

# Right
graph = builder.compile(checkpointer=InMemorySaver())
```
ValueError: Checkpointer requires thread_id appears when you forget to pass a config with a thread ID. Every invocation that uses checkpointing needs one:
```python
# Wrong
graph.invoke({"action": "test"})

# Right
graph.invoke({"action": "test"}, {"configurable": {"thread_id": "any-unique-id"}})
```
NodeInterrupt showing as “Error” in LangSmith traces. This is expected behavior, not a bug. LangGraph surfaces interrupts as a special exception type internally. The overall run status will show as succeeded once you resume and the graph completes.
Resuming returns stale state. You’re creating a new checkpointer instance between the invoke and resume calls. The checkpointer must be the same object (in-memory) or point to the same database (persistent) so it can find the saved checkpoint.
When to Use Each Pattern#
interrupt() inside a tool is best when approval depends on what the LLM decided to do. The tool arguments are available for review and editing before execution happens.
interrupt() in a dedicated gate node works when you want a fixed checkpoint in your workflow – every execution pauses there regardless of which tools were called.
Conditional interrupts with a sensitivity check give you the best of both worlds: automatic execution for safe operations and mandatory review for anything destructive.
Pick the pattern that matches your risk profile. Most production agents combine all three – a gate node for high-level plan approval, tool-level interrupts for sensitive actions, and automatic pass-through for reads and searches.