Quick Start: Two Agents Solving a Task

AutoGen’s core idea is simple — you define agents with roles, put them in a conversation, and they collaborate to solve a problem. Here’s the fastest path to a working multi-agent setup:

import autogen

config_list = [
    {"model": "gpt-4o", "api_key": "sk-..."}
]

llm_config = {"config_list": config_list}

# An assistant that writes code
assistant = autogen.AssistantAgent(
    name="coder",
    llm_config=llm_config,
    system_message="You are a Python developer. Write clean, working code."
)

# A proxy that executes code and relays results
user_proxy = autogen.UserProxyAgent(
    name="executor",
    human_input_mode="NEVER",
    code_execution_config={
        "work_dir": "workspace",
        "use_docker": True,
    },
    max_consecutive_auto_reply=5,
)

# Kick off the conversation
user_proxy.initiate_chat(
    assistant,
    message="Write a Python script that fetches the top 5 Hacker News stories and prints their titles."
)

That’s it. The assistant writes code, the executor runs it in a Docker container, and if something fails, the assistant fixes it. This back-and-forth continues until the task succeeds or hits the reply limit.

Understanding the Agent Types

AutoGen ships with a few agent classes, but you’ll mostly use two.

AssistantAgent

This is your LLM-powered worker. It receives messages, reasons about them, and responds — usually with code or analysis. It doesn’t execute anything itself. Think of it as the brain.

UserProxyAgent

Despite the name, this isn’t really about users. It’s an agent that can execute code, call tools, and optionally ask a human for input. The human_input_mode parameter controls this:

  • "ALWAYS" — asks for human approval before every reply
  • "TERMINATE" — asks only when the conversation would end
  • "NEVER" — fully autonomous, no human in the loop
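
For example, a proxy that runs autonomously but pauses for human approval just before the chat would close (a sketch reusing the Quick Start configuration):

```python
import autogen

# Same executor as the Quick Start, but with human_input_mode="TERMINATE":
# it only asks for input when the conversation is about to end.
user_proxy = autogen.UserProxyAgent(
    name="executor",
    human_input_mode="TERMINATE",
    code_execution_config={"work_dir": "workspace", "use_docker": True},
    max_consecutive_auto_reply=5,
)
```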

ConversableAgent

Both AssistantAgent and UserProxyAgent inherit from ConversableAgent. If you need custom behavior, subclass this directly:

class ReviewerAgent(autogen.ConversableAgent):
    def __init__(self, **kwargs):
        super().__init__(
            name="reviewer",
            system_message="Review code for bugs, security issues, and style. Be direct.",
            llm_config=llm_config,
            **kwargs
        )

Group Chat: Multi-Agent Collaboration

Two agents talking is useful. Three or more agents with distinct roles is where AutoGen gets interesting. GroupChat and GroupChatManager handle the orchestration.

import autogen

config_list = [{"model": "gpt-4o", "api_key": "sk-..."}]
llm_config = {"config_list": config_list}

coder = autogen.AssistantAgent(
    name="coder",
    llm_config=llm_config,
    system_message="You write Python code to solve tasks. Only output code blocks.",
)

reviewer = autogen.AssistantAgent(
    name="reviewer",
    llm_config=llm_config,
    system_message="You review code for correctness and edge cases. Point out specific issues.",
)

executor = autogen.UserProxyAgent(
    name="executor",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "workspace", "use_docker": True},
    max_consecutive_auto_reply=3,
)

planner = autogen.AssistantAgent(
    name="planner",
    llm_config=llm_config,
    system_message=(
        "You break down tasks into steps and assign them to coder or reviewer. "
        "Say TERMINATE when the task is complete."
    ),
)

group_chat = autogen.GroupChat(
    agents=[planner, coder, reviewer, executor],
    messages=[],
    max_round=15,
    speaker_selection_method="auto",  # LLM picks the next speaker
)

manager = autogen.GroupChatManager(
    groupchat=group_chat,
    llm_config=llm_config,
)

executor.initiate_chat(
    manager,
    message="Build a CLI tool that converts CSV files to JSON with column type inference."
)

The speaker_selection_method matters. "auto" uses the LLM to decide who speaks next based on conversation context. You can also use "round_robin" for predictable turn order, or pass a custom function for full control.
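
For full control, pass a function as speaker_selection_method. Here is a sketch that forces a fixed planner, coder, executor, reviewer rotation (agent names match the group chat above): AutoGen calls the function with the last speaker and the GroupChat, and expects the next agent back.

```python
def pipeline_speaker(last_speaker, groupchat):
    # Fixed rotation: planner -> coder -> executor -> reviewer -> planner.
    order = ["planner", "coder", "executor", "reviewer"]
    by_name = {agent.name: agent for agent in groupchat.agents}
    next_name = order[(order.index(last_speaker.name) + 1) % len(order)]
    return by_name[next_name]

# Then: autogen.GroupChat(..., speaker_selection_method=pipeline_speaker)
```

Returning None from the function falls back to the default selection, which is handy if you only want to constrain a few transitions.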

Tool Registration

Agents can call Python functions as tools. This is better than having the LLM generate code when you already know exactly what function to call.

import autogen
from typing import Annotated

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config=llm_config,
)

user_proxy = autogen.UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config=False,  # this proxy only runs registered tools
)

@user_proxy.register_for_execution()
@assistant.register_for_llm(description="Search a database by query string")
def search_database(
    query: Annotated[str, "The search query"],
    limit: Annotated[int, "Max results to return"] = 10,
) -> list[dict]:
    # Your actual database search logic
    import sqlite3
    conn = sqlite3.connect("data.db")
    conn.row_factory = sqlite3.Row  # rows become dict-like, keyed by column
    cursor = conn.execute(
        "SELECT * FROM documents WHERE content LIKE ? LIMIT ?",
        (f"%{query}%", limit)
    )
    results = [dict(row) for row in cursor.fetchall()]
    conn.close()
    return results

The decorator pattern is clean — register_for_llm tells the assistant the tool exists, and register_for_execution tells the proxy to actually run it when the assistant calls it.
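
The Annotated descriptions aren't decorative: they become the parameter descriptions in the tool schema the model sees. A rough sketch of that extraction (an illustrative helper, not part of the AutoGen API):

```python
from typing import Annotated, get_args, get_origin, get_type_hints

def param_descriptions(func) -> dict:
    # Pull the Annotated description string for each parameter, the way a
    # tool-schema builder would before sending the schema to the model.
    hints = get_type_hints(func, include_extras=True)
    hints.pop("return", None)
    return {
        name: get_args(hint)[1]
        for name, hint in hints.items()
        if get_origin(hint) is Annotated
    }

def lookup(query: Annotated[str, "The search query"]) -> list:
    return []

print(param_descriptions(lookup))  # {'query': 'The search query'}
```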

Docker Code Execution

Running LLM-generated code on your machine without sandboxing is asking for trouble. AutoGen supports Docker-based execution out of the box.

code_execution_config = {
    "work_dir": "workspace",
    "use_docker": "python:3.12-slim",  # specific image
    "timeout": 120,  # seconds before killing the container
}

executor = autogen.UserProxyAgent(
    name="executor",
    code_execution_config=code_execution_config,
    human_input_mode="NEVER",
)

Make sure Docker is running on your machine. AutoGen pulls the image automatically if it’s not cached. If your agents need specific packages pre-installed, build your own image from a custom Dockerfile and pass its tag as use_docker.

If you set use_docker: False, code runs directly on your host. Only do this in throwaway environments — never on production machines.

AutoGen Studio

If you want to prototype agent workflows visually before writing code, AutoGen Studio is worth a look. It’s a web UI that lets you configure agents, define group chats, and test conversations without touching Python.

Install it with:

pip install autogenstudio
autogenstudio ui --port 8080

It’s useful for experimenting with system messages and agent configurations. Once you’ve nailed the setup, export the config and move to code for production.

Common Errors

Docker is not running or docker.errors.DockerException

AutoGen tries to spin up containers for code execution. If Docker isn’t installed or the daemon isn’t running, you get this error. Start Docker or set use_docker: False (not recommended for untrusted code).

Rate limit exceeded with OpenAI

Multi-agent conversations burn through tokens fast. A 15-round group chat with 4 agents can easily hit rate limits. Use config_list with multiple API keys or add a fallback model:

config_list = [
    {"model": "gpt-4o", "api_key": "sk-key1"},
    {"model": "gpt-4o-mini", "api_key": "sk-key2"},  # fallback
]

Agents loop forever without solving the task

Set max_consecutive_auto_reply on your UserProxyAgent and max_round on your GroupChat. Without these limits, agents can keep talking in circles. Also make sure at least one agent’s system message includes instructions to say TERMINATE when the task is done.
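
A related knob is is_termination_msg, a predicate you pass to the receiving agent; the chat ends when it returns True for an incoming message. A minimal sketch of such a predicate:

```python
def looks_terminated(message: dict) -> bool:
    # True when the message body ends with the TERMINATE sentinel
    # (content can be None for pure tool-call messages, hence the "or").
    content = message.get("content") or ""
    return content.rstrip().endswith("TERMINATE")

# Then: autogen.UserProxyAgent(..., is_termination_msg=looks_terminated)
```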

ModuleNotFoundError inside Docker containers

The default Docker image is minimal. If your generated code imports packages like pandas or requests, execution fails. Either use a custom image with those packages pre-installed or add a pip install step to the generated code. Pointing use_docker at your own image tag looks like this:

code_execution_config = {
    "work_dir": "workspace",
    "use_docker": "my-custom-python:latest",
}
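
If you go the pip-install-at-runtime route instead, the generated script can bootstrap its own dependencies. A sketch that works in any image where pip is available:

```python
import importlib.util
import subprocess
import sys

def ensure_installed(package: str) -> None:
    # Install the package only if it isn't already importable.
    if importlib.util.find_spec(package) is None:
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])

ensure_installed("json")  # already in the stdlib, so this is a no-op
```

This adds latency on the first run and requires network access from inside the container, so a pre-built image is the better choice for anything you run repeatedly.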

No agent can execute the code

This happens when you have AssistantAgent instances but no UserProxyAgent with code_execution_config. At least one agent in the conversation needs execution capabilities.