The Agent Run Loop

In Exercise 02: Tool Loop you built the agent loop by hand — calling the API, checking for tool calls, executing them, appending results, and looping back. From Exercise 03 onward, every exercise uses a shared run() function that encapsulates exactly that loop.

This page walks through run() line by line so there's nothing hidden.

From Manual Loop to `run()`

Here's the connection. In Exercise 02: Tool Loop you wrote something like this:

while True:
    response = client.chat.completions.create(model=model, messages=messages, tools=tools)
    message = response.choices[0].message
    messages.append(message)

    if not message.tool_calls:
        break  # Final answer

    for tool_call in message.tool_calls:
        result = execute_tool(tool_call)
        messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})

The shared run() function in exercises/commons/agent.py does the same thing — with logging, error handling, and a safety limit on iterations. No magic.

The `Agent` Dataclass

Before looking at run(), here's what it operates on:

@dataclass
class Agent:
    name: str              # For logging — "Billing Specialist", "Research Agent"
    system_prompt: str     # The system message that defines behavior
    tools: list            # Tool definitions from pydantic_function_tool()
    tool_functions: dict   # {"get_weather": get_weather_fn, ...}
    model: str = ""        # Model name (e.g., "gpt-4o")
    max_iterations: int = 10  # Safety valve

An Agent is just data — it doesn't do anything on its own. The run() function brings it to life.
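As a rough illustration, an agent with a single tool might be put together like this. The dataclass is repeated so the snippet runs on its own. The get_weather function, its canned return value, and the hand-written tool schema (standing in for what pydantic_function_tool() would generate) are assumptions made for this sketch, not code from the exercises.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    system_prompt: str
    tools: list
    tool_functions: dict
    model: str = ""
    max_iterations: int = 10


def get_weather(city: str) -> dict:
    """Hypothetical tool: returns canned data instead of calling a weather service."""
    return {"city": city, "temp": 18, "condition": "cloudy"}


# Hand-written schema in the general shape of a Chat Completions function tool.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

weather_agent = Agent(
    name="Weather Agent",
    system_prompt="You answer questions about the weather.",
    tools=[weather_tool],
    tool_functions={"get_weather": get_weather},
    model="gpt-4o",
)

# The name in the schema must match a key in tool_functions,
# because run() looks the function up by the name the model returns.
schema_name = weather_agent.tools[0]["function"]["name"]
print(schema_name in weather_agent.tool_functions)
```

The two tool fields describe the same tool from two sides: tools is what the model sees, tool_functions is what Python executes.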

The `run()` Function: Annotated

Here's the full function, broken into phases with explanations.

Signature

def run(
    agent: Agent,
    messages: list[dict],
    client: OpenAI | AzureOpenAI,
    model: str | None = None,
) -> str:

| Parameter | Purpose |
| --- | --- |
| `agent` | The agent definition (prompt, tools, identity) |
| `messages` | The conversation so far; mutated in place as the loop runs |
| `client` | The OpenAI / Azure OpenAI client for API calls |
| `model` | Optional override; falls back to `agent.model` when omitted |

Messages are mutated

The messages list you pass in grows during execution. After run() returns, it contains the full conversation including all assistant responses and tool results. This is by design — it lets you inspect the full trace or continue the conversation.
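The effect of in-place mutation can be seen without any API call. In this sketch, fake_run is a hypothetical stand-in for run() that only appends one assistant reply. It also shows one way to keep the original list unchanged, by passing a shallow copy.

```python
def fake_run(messages: list[dict]) -> str:
    """Stand-in for run(): appends one assistant reply and returns its text."""
    reply = "It's 18°C in Berlin"
    messages.append({"role": "assistant", "content": reply})
    return reply


history = [{"role": "user", "content": "What's the weather in Berlin?"}]

# Passing the list itself: the caller sees the appended message afterwards.
fake_run(history)
print(len(history))  # 2

# Passing a shallow copy: the original list is left untouched.
snapshot = list(history)
fake_run(snapshot)
print(len(history), len(snapshot))  # 2 3
```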

Phase 1: Setup

effective_model = model or agent.model

if not messages or messages[0].get("role") != "system":
    messages.insert(0, {"role": "system", "content": agent.system_prompt})

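If the list is empty or its first message is not a system message, the agent's system prompt is inserted at index 0. Callers therefore don't need to add it themselves and can pass just the user messages. A list that already starts with a system message is left as it is, even when that message differs from agent.system_prompt.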
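The setup check is idempotent, which is what makes multi-turn use safe: calling run() again with the same list does not stack up system prompts. A small sketch, with the check wrapped in a hypothetical helper named ensure_system_prompt:

```python
def ensure_system_prompt(messages: list[dict], system_prompt: str) -> None:
    """Insert the system prompt at index 0 unless a system message is already first."""
    if not messages or messages[0].get("role") != "system":
        messages.insert(0, {"role": "system", "content": system_prompt})


messages = [{"role": "user", "content": "Hi"}]
ensure_system_prompt(messages, "You are a helpful assistant.")
ensure_system_prompt(messages, "You are a helpful assistant.")  # second call is a no-op

print([m["role"] for m in messages])  # ['system', 'user']
```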

Phase 2: The Loop (Reason → Act → Observe)

iteration = 0

while iteration < agent.max_iterations:
    iteration += 1

The loop runs until one of two things happens:

1. The model produces a final text response (no tool calls) → return it
2. We hit max_iterations → safety exit

Phase 3: Build API Call with `**kwargs`

api_kwargs: dict[str, Any] = {
    "model": effective_model,
    "messages": messages,
}
if agent.tools:
    api_kwargs["tools"] = agent.tools

response = client.chat.completions.create(**api_kwargs)

This pattern uses **kwargs (keyword argument unpacking) to build the API call dynamically. Let's break this down since it's a Python pattern you'll see everywhere:

Python **kwargs Explained

In Python, ** before a dictionary unpacks it into keyword arguments. These two calls are identical:

# Explicit keyword arguments
client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    tools=[...],
)

# Equivalent using dict unpacking
api_kwargs = {
    "model": "gpt-4o",
    "messages": [...],
    "tools": [...],
}
client.chat.completions.create(**api_kwargs)

Why use this pattern? It lets you build arguments conditionally. In our case, we only include tools when the agent actually has tools defined:

api_kwargs = {"model": model, "messages": messages}

if agent.tools:           # Only add tools if the agent has them
    api_kwargs["tools"] = agent.tools

client.chat.completions.create(**api_kwargs)

Without this pattern, you'd need an if/else with two separate API calls — one with tools and one without. The dict approach is cleaner and scales to any number of optional parameters.
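To illustrate how the dict approach scales, here is a sketch with a second optional parameter. The helper build_api_kwargs is hypothetical, and passing temperature is an assumption for the example; the run() function shown on this page only adds tools.

```python
def build_api_kwargs(model, messages, tools=None, temperature=None):
    """Collect only the arguments that are actually set."""
    api_kwargs = {"model": model, "messages": messages}
    if tools:                      # skipped for None and for an empty list
        api_kwargs["tools"] = tools
    if temperature is not None:    # 0.0 is a valid value, so compare against None
        api_kwargs["temperature"] = temperature
    return api_kwargs


messages = [{"role": "user", "content": "Hi"}]

print(sorted(build_api_kwargs("gpt-4o", messages)))
# ['messages', 'model']

print(sorted(build_api_kwargs("gpt-4o", messages, tools=[], temperature=0.0)))
# ['messages', 'model', 'temperature']
```

With two optional parameters, the if/else alternative would already need four separate calls; the dict version adds one line per parameter.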

The reverse also exists: a function can receive arbitrary keyword arguments with **kwargs in its signature:

def my_function(**kwargs):
    print(kwargs)  # {'a': 1, 'b': 2}

my_function(a=1, b=2)

Phase 4: Reason: Check the Model's Decision

choice = response.choices[0]
assistant_message = choice.message

messages.append(assistant_message.model_dump())

if not assistant_message.tool_calls:
    return assistant_message.content or ""

The model has two possible responses:

- No tool calls → it has a final text answer → return it (loop ends)
- Has tool calls → it wants to use tools → continue to Phase 5

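Note model_dump(): it converts the Pydantic response object to a plain dict so it can be appended to the messages list for the next API call. This also keeps every entry in messages a dict, which the .get() calls in Phase 1 and Phase 7 rely on.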

Phase 5: Act: Execute Tool Calls

for tool_call in assistant_message.tool_calls:
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)

    if function_name not in agent.tool_functions:
        result = f"Error: Unknown tool '{function_name}'"
    else:
        result = agent.tool_functions[function_name](**arguments)

Here **arguments appears again — the model returns arguments as a JSON object like {"city": "Berlin", "unit": "celsius"}, and **arguments unpacks it into keyword arguments for the Python function: get_weather(city="Berlin", unit="celsius").
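The snippet above assumes the arguments are valid JSON and match the function's parameters. One possible way to make this step more forgiving is sketched below. It is an illustration under stated assumptions, not necessarily what exercises/commons/agent.py does. The helper name execute_tool_call and the error message wording are hypothetical.

```python
import json


def execute_tool_call(name: str, raw_arguments: str, tool_functions: dict):
    """Run one tool call and return either its result or an error string."""
    if name not in tool_functions:
        return f"Error: Unknown tool '{name}'"
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return f"Error: Invalid JSON arguments for '{name}': {exc}"
    if not isinstance(arguments, dict):
        return f"Error: Arguments for '{name}' must be a JSON object"
    try:
        return tool_functions[name](**arguments)
    except Exception as exc:  # report the failure to the model instead of crashing
        return f"Error: Tool '{name}' failed: {exc}"


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Hypothetical tool with canned data."""
    return {"city": city, "temp": 18, "unit": unit}


tools = {"get_weather": get_weather}
print(execute_tool_call("get_weather", '{"city": "Berlin"}', tools))
print(execute_tool_call("get_weather", '{"town": "Berlin"}', tools))  # wrong argument name
print(execute_tool_call("get_weather", '{"city": ', tools))           # truncated JSON
```

Returning the error as the tool result, rather than raising, gives the model a chance to correct its arguments on the next iteration.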

Phase 6: Observe: Return Results to the Model

    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    })

Each tool result is appended with:

- role: "tool" marks the message as a tool result (not user or assistant)
- tool_call_id links this result back to the specific tool call that requested it
- content holds the JSON-serialized return value

Then the while loop continues — back to Phase 3, sending the updated messages (now including tool results) to the model again.
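Because content is produced with json.dumps(), it helps to know what that call does with different return types. The values below are made up for illustration. The last call uses default=str as one possible fallback for values the encoder cannot handle; the page does not say whether run() does this.

```python
import json
from datetime import date

# A dict result becomes a JSON object string.
print(json.dumps({"temp": 18, "condition": "cloudy"}))
# {"temp": 18, "condition": "cloudy"}

# A plain string result gains surrounding quotes, because it is encoded as a JSON string.
print(json.dumps("Error: Unknown tool 'get_time'"))
# "Error: Unknown tool 'get_time'"

# Values json cannot encode raise TypeError unless a fallback is supplied.
try:
    json.dumps({"today": date(2024, 1, 31)})
except TypeError as exc:
    print("not serializable:", exc)

print(json.dumps({"today": date(2024, 1, 31)}, default=str))
# {"today": "2024-01-31"}
```

In practice this means tool functions should return JSON-friendly values such as dicts, lists, strings, and numbers.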

Phase 7: Safety Valve

logger.warning(
    "[%s] Reached maximum iterations (%d)",
    agent.name, agent.max_iterations,
)
return messages[-1].get("content", "")

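If the model is still requesting tools after max_iterations rounds (default: 10), the loop exits, a warning is logged, and run() returns the content of the last message in the list. At that point the last message is normally a tool result rather than a model answer, so callers should not treat the return value as a final response. The limit prevents infinite loops if the model gets stuck.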
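The safety valve can be exercised without a real API. The sketch below reassembles the phases into a condensed loop (no logging) and runs it against a stub client whose model never stops asking for a tool. StuckClient, make_tool_call_message, bounded_loop, and the ping tool are all hypothetical names invented for this example.

```python
import json
from types import SimpleNamespace


def make_tool_call_message(call_id: str) -> SimpleNamespace:
    """Build a stub assistant message that requests the hypothetical 'ping' tool."""
    call = SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name="ping", arguments="{}"),
    )
    as_dict = {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {"id": call_id, "type": "function",
             "function": {"name": "ping", "arguments": "{}"}}
        ],
    }
    return SimpleNamespace(content=None, tool_calls=[call], model_dump=lambda: as_dict)


class StuckClient:
    """Stub client whose 'model' requests a tool on every call."""

    def __init__(self):
        self.calls = 0
        self.chat = SimpleNamespace(completions=SimpleNamespace(create=self._create))

    def _create(self, **kwargs):
        self.calls += 1
        message = make_tool_call_message(f"call_{self.calls}")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def bounded_loop(messages, client, tool_functions, max_iterations=3):
    """Condensed version of the phases above, without logging."""
    iteration = 0
    while iteration < max_iterations:
        iteration += 1
        response = client.chat.completions.create(model="stub", messages=messages)
        message = response.choices[0].message
        messages.append(message.model_dump())
        if not message.tool_calls:
            return message.content or ""
        for tool_call in message.tool_calls:
            arguments = json.loads(tool_call.function.arguments)
            result = tool_functions[tool_call.function.name](**arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result),
            })
    # Safety valve: the last entry is a tool message here, not a model answer.
    return messages[-1].get("content", "")


client = StuckClient()
messages = [{"role": "user", "content": "ping forever"}]
output = bounded_loop(messages, client, {"ping": lambda: "pong"})

print(client.calls)          # 3 API calls, then the loop gives up
print(messages[-1]["role"])  # tool
print(output)                # "pong" (including the JSON quotes)
```

The return value here is the JSON-encoded output of the last tool call, which shows why a caller may want to check the role of the final message before presenting the result as an answer.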

Full Sequence Diagram

Here's a complete picture of what happens when run() processes a query that needs one round of tool calls: two iterations, the first ending in a tool call and the second in the final answer.

sequenceDiagram
    participant Caller
    participant run as run()
    participant API as OpenAI API
    participant Tools as Tool Functions

    Caller->>run: run(agent, messages, client)
    Note over run: Insert system prompt

    rect rgb(240, 248, 255)
        Note over run: Iteration 1
        run->>API: chat.completions.create(**api_kwargs)
        API-->>run: tool_calls: [get_weather("Berlin")]
        run->>run: Append assistant message
        run->>Tools: get_weather(city="Berlin")
        Tools-->>run: {"temp": 18, "condition": "cloudy"}
        run->>run: Append tool result message
    end

    rect rgb(240, 255, 240)
        Note over run: Iteration 2
        run->>API: chat.completions.create(**api_kwargs)
        API-->>run: content: "It's 18°C in Berlin"
        run->>run: Append assistant message
    end

    run-->>Caller: "It's 18°C in Berlin"
    Note over Caller: messages list now has 5 entries

How `messages` Grows

Understanding how the messages list evolves is key. Here's what it looks like across iterations:

| Step | Messages List |
| --- | --- |
| Start | `[{user: "What's the weather in Berlin?"}]` |
| After setup | `[{system: "You are..."}, {user: "What's the weather in Berlin?"}]` |
| Iter 1: model responds | `[..., {assistant: tool_calls=[get_weather]}]` |
| Iter 1: tool result | `[..., {tool: {"temp": 18}}]` |
| Iter 2: model responds | `[..., {assistant: "It's 18°C in Berlin"}]` |

The list only grows — nothing is ever removed. This is why context management matters in long conversations (see Context Management).
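Written out as plain dicts, the finished Berlin conversation might look roughly like the list below. The system prompt text and the call_1 id are placeholders, and summarize is a hypothetical helper for inspecting a trace after run() returns.

```python
import json

# The five messages from the Berlin example, written out as plain dicts.
messages = [
    {"role": "system", "content": "You are a weather assistant."},
    {"role": "user", "content": "What's the weather in Berlin?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Berlin"}'}}
    ]},
    {"role": "tool", "tool_call_id": "call_1",
     "content": json.dumps({"temp": 18, "condition": "cloudy"})},
    {"role": "assistant", "content": "It's 18°C in Berlin"},
]


def summarize(messages: list[dict]) -> list[str]:
    """Return one short label per message: the role, plus tool names if any."""
    labels = []
    for message in messages:
        calls = message.get("tool_calls") or []
        names = [call["function"]["name"] for call in calls]
        labels.append(f"{message['role']}({', '.join(names)})" if names else message["role"])
    return labels


print(summarize(messages))
# ['system', 'user', 'assistant(get_weather)', 'tool', 'assistant']

# Rough character count of the context that the next API call would carry.
print(sum(len(json.dumps(m)) for m in messages))
```

Every entry is resent on each API call, so the character count is a crude indicator of how quickly the context fills up.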

Where `run()` Is Used

Every exercise from 03 onward uses this function:

| Exercise | How it uses `run()` |
| --- | --- |
| 03: Single Agent | One agent, multiple turns, messages list grows across turns |
| 04: Sequential | Each agent gets a fresh messages list with only the previous output |
| 05: Concurrent | Multiple `run()` calls in parallel (via `ThreadPoolExecutor`) |
| 06: Brainstorm | Multiple agents share one messages list |
| 07: Handoff | Triage agent → structured output → specialist agent gets a new messages list |
| 08: Magentic | Manager dispatches tasks → workers run independently → results feed back |

The function is always the same — only the orchestration around it changes. That's the whole point of this workshop.
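As one example of orchestration around an unchanged function, the concurrent pattern could be sketched as follows. How Exercise 05 actually wires this up is not shown on this page, so treat it as an assumption. fake_run stands in for run(agent, messages, client), and the agent names are invented.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_run(agent_name: str, messages: list[dict]) -> str:
    """Stand-in for run(agent, messages, client): appends a reply and returns it."""
    question = messages[-1]["content"]
    reply = f"{agent_name} answer to: {question}"
    messages.append({"role": "assistant", "content": reply})
    return reply


question = "Should we migrate to Postgres?"
agent_names = ["Optimist", "Skeptic", "Pragmatist"]  # hypothetical agents

# One fresh list per call: run() mutates its messages argument, so sharing
# a single list across threads would interleave the conversations.
histories = {name: [{"role": "user", "content": question}] for name in agent_names}

with ThreadPoolExecutor(max_workers=len(agent_names)) as pool:
    futures = {name: pool.submit(fake_run, name, histories[name]) for name in agent_names}
    results = {name: future.result() for name, future in futures.items()}

for name in agent_names:
    print(name, "->", results[name], "| messages:", len(histories[name]))
```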
