Exercise 07: Handoff Pattern

Objective

Implement dynamic routing where a triage agent analyzes queries and hands off to specialist agents.

Concepts Covered

  • Triage / routing agent with classification
  • Structured handoff context (dataclass with query, category, relevant info)
  • Specialist agents with focused capabilities
  • Context passing strategies (full history vs. summary vs. structured object)

How It Works

This is the first exercise that uses structured output for inter-agent communication. A Triage Agent classifies the incoming query using client.chat.completions.parse() with a Pydantic model, producing a structured TriageDecision. This decision is packaged into a HandoffContext dataclass and routed to the appropriate specialist.
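
As a rough illustration of the schema side of this step, the sketch below defines a TriageDecision model and validates a JSON payload of the kind the parse call would be expected to return. The field names follow the diagrams in this exercise; the Literal value sets and the sample payload are assumptions, and no LLM call is made here.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class TriageDecision(BaseModel):
    """Assumed schema; field names mirror the diagrams in this exercise."""

    category: Literal["billing", "technical", "account"]
    priority: Literal["high", "medium", "low"]
    reasoning: str
    extracted_info: dict[str, str]


# Stand-in for the JSON the model would return for response_format=TriageDecision.
raw = (
    '{"category": "billing", "priority": "high", '
    '"reasoning": "duplicate charge", '
    '"extracted_info": {"issue_type": "duplicate_charge"}}'
)

decision = TriageDecision.model_validate_json(raw)
print(decision.category)  # billing

# A category outside the schema is rejected instead of silently misrouting.
try:
    TriageDecision.model_validate_json(raw.replace("billing", "shipping"))
except ValidationError:
    print("rejected: unknown category")
```

Constraining category and priority to fixed value sets means a bad classification fails at validation time, before any routing happens.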

flowchart TB
    User["User Query:<br/>'I was charged twice<br/>for my subscription'"]

    subgraph Triage["Triage Agent"]
        T_SYS["System prompt:<br/>'Classify this query'"]
        T_PARSE["chat.completions.parse()<br/>response_format=TriageDecision"]
        T_DECISION["TriageDecision:<br/>category='billing'<br/>priority='high'<br/>reasoning='duplicate charge'<br/>extracted_info={...}"]
        T_SYS --> T_PARSE --> T_DECISION
    end

    subgraph Handoff["HandoffContext"]
        HC["original_query<br/>category<br/>priority<br/>context_summary<br/>extracted_info"]
    end

    subgraph Router["Dynamic Routing"]
        direction LR
        Billing["Billing<br/>Specialist<br/>Tools: lookup_invoice,<br/>process_refund"]
        Tech["Technical<br/>Specialist<br/>Tools: check_status,<br/>restart_service"]
        Account["Account<br/>Specialist<br/>Tools: get_account,<br/>update_account"]
    end

    User --> Triage
    T_DECISION --> Handoff
    Handoff --> |"category='billing'"| Billing
    Handoff --> |"category='technical'"| Tech
    Handoff --> |"category='account'"| Account

    style Triage fill:#e8f4fd,stroke:#4a90d9
    style Handoff fill:#fef9e7,stroke:#f39c12
    style Router fill:#eafaf1,stroke:#2ecc71

The full message flow:

sequenceDiagram
    participant User
    participant Triage as Triage Agent
    participant LLM as LLM Provider
    participant Router as Route Logic
    participant Spec as Specialist Agent

    User->>Triage: "I was charged twice..."

    Note over Triage,LLM: Structured Output (parse)
    Triage->>LLM: chat.completions.parse()<br/>response_format=TriageDecision
    LLM-->>Triage: TriageDecision(category="billing",<br/>priority="high", reasoning="...",<br/>extracted_info={...})

    Triage->>Router: Build HandoffContext from TriageDecision
    Router->>Router: Match category → select specialist

    Note over Router,Spec: Fresh context for specialist
    Router->>Spec: Fresh messages: [specialist_system_prompt,<br/>user(handoff_context summary)]
    Spec->>LLM: messages + specialist tools
    LLM-->>Spec: tool_calls (lookup_invoice)
    Spec->>LLM: tool results
    LLM-->>Spec: "I found the duplicate charge..."
    Spec-->>User: Final resolution
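
The specialist's tool loop in the diagram above can be sketched without a real provider. In this minimal sketch, ScriptedLLM is a hypothetical stand-in that replays canned assistant turns, the tool bodies are stubs, the customer_id argument is invented for illustration, and the tool-call dictionaries use a simplified shape rather than any provider's exact wire format.

```python
import json


# Hypothetical tool stubs; real ones would query billing systems.
def lookup_invoice(customer_id: str) -> dict:
    return {"customer_id": customer_id, "charges": 2}


def process_refund(customer_id: str) -> dict:
    return {"customer_id": customer_id, "refund_status": "approved"}


TOOLS = {"lookup_invoice": lookup_invoice, "process_refund": process_refund}


class ScriptedLLM:
    """Stand-in for the LLM provider: replays canned assistant turns in order."""

    def __init__(self, turns):
        self._turns = iter(turns)

    def complete(self, messages):
        return next(self._turns)


def run_specialist(llm, messages, max_turns=5):
    """Call the LLM, execute any requested tools, repeat until a text reply."""
    for _ in range(max_turns):
        reply = llm.complete(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls")
        if not tool_calls:
            return reply["content"]
        for call in tool_calls:
            args = json.loads(call["arguments"])
            result = TOOLS[call["name"]](**args)
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": json.dumps(result)}
            )
    raise RuntimeError("specialist did not finish within max_turns")


llm = ScriptedLLM([
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {"id": "call_1", "name": "lookup_invoice", "arguments": '{"customer_id": "C-1"}'}
        ],
    },
    {"role": "assistant", "content": "I found the duplicate charge."},
])
messages = [
    {"role": "system", "content": "You are a billing specialist."},
    {"role": "user", "content": "Customer query: I was charged twice..."},
]
answer = run_specialist(llm, messages)
print(answer)  # I found the duplicate charge.
```

The max_turns cap is a design choice for the sketch: it keeps a misbehaving model from looping on tool calls indefinitely.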

Context sharing: Structured handoff. The triage agent's internal reasoning and raw messages are NOT passed to the specialist. Instead, only a structured HandoffContext object crosses the boundary, containing the original query, category, priority, and extracted information. The specialist receives a fresh messages list with this structured context as its input. This is a deliberate design choice — the specialist doesn't need to know how the triage agent reasoned, only what it concluded.

Structured output: Yes. This is the key feature of this exercise. client.chat.completions.parse() with response_format=TriageDecision constrains the LLM's reply to the TriageDecision schema and returns it parsed into a typed Pydantic object. Routing can then read the category field directly instead of parsing strings, and the details the specialist needs arrive already extracted.

Why structured handoff matters

Compared to passing raw conversation history, a structured handoff object provides: (1) Reliable routing — category is a typed field, not a substring match. (2) Context compaction — the specialist gets only relevant info, not the full triage conversation. (3) Auditability — the reasoning field documents why the triage decision was made.
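
To make point (1) concrete, here is a small sketch contrasting a typed lookup with substring matching. The Category enum, the specialist functions, and the sample sentence are hypothetical names introduced for illustration and are not taken from 01_support_triage.py.

```python
from enum import Enum
from typing import Callable


class Category(str, Enum):
    BILLING = "billing"
    TECHNICAL = "technical"
    ACCOUNT = "account"


# Hypothetical specialists; each just tags the query so the route is visible.
def billing_specialist(query: str) -> str:
    return f"[billing] {query}"


def technical_specialist(query: str) -> str:
    return f"[technical] {query}"


def account_specialist(query: str) -> str:
    return f"[account] {query}"


SPECIALISTS: dict[Category, Callable[[str], str]] = {
    Category.BILLING: billing_specialist,
    Category.TECHNICAL: technical_specialist,
    Category.ACCOUNT: account_specialist,
}


def route(category: str, query: str) -> str:
    """Typed routing: Category(...) raises ValueError for unknown values."""
    return SPECIALISTS[Category(category)](query)


def naive_route(reply_text: str) -> str:
    """Substring routing on free text: the first keyword found wins."""
    for name in ("account", "technical", "billing"):
        if name in reply_text.lower():
            return name
    return "unknown"


free_text = "This is a billing issue on the customer's account."
print(naive_route(free_text))  # account (misrouted: "account" is checked first)
print(route("billing", "I was charged twice"))  # [billing] I was charged twice
```

The substring version depends on keyword order and on how the model happens to phrase its reply, while the typed version either finds exactly one specialist or fails loudly.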

An example trace for a billing query:

  • Triage system prompt: You are a support triage agent. Classify the customer query by category (billing, account, technical) and priority (high, medium, low). Extract relevant details for the specialist.
  • Customer query: I was charged twice for order ORD-1001. I need a refund for the duplicate charge.
  • TriageDecision: category: billing | priority: high | reasoning: Duplicate charge for a specific order requires immediate financial correction | extracted_info: order_id=ORD-1001, issue=duplicate charge
  • HandoffContext: customer_query: I was charged twice for order ORD-1001... | category: billing | priority: high | extracted_info: {order_id: ORD-1001, issue: duplicate charge}
  • Specialist system prompt: You are a billing specialist. Resolve billing issues using your tools: lookup_order, process_refund.
  • Handoff message to specialist: Customer reports duplicate charge for order ORD-1001 and requests a refund. Priority: high. Please investigate and resolve.
  • Tool calls: lookup_order(order_id='ORD-1001') then process_refund(order_id='ORD-1001', reason='Duplicate charge')
  • Tool result: {"refund_status": "approved", "refund_amount": 114.98, "estimated_days": 6, "reference": "REF-32879"}
  • Specialist reply: The duplicate charge for order ORD-1001 has been refunded. Refund amount: $114.98. Reference: REF-32879. You should see it in your account within 6 business days.

Message Flow: A Practical Example

This example follows a single customer query through the full handoff lifecycle. Unlike the group chat pattern (one shared messages list) or the concurrent pattern (isolated lists), the handoff pattern puts a structured boundary between the triage agent and the specialist: they never share a messages list.

Step 1: Triage agent receives the query

The triage agent gets a fresh messages list with its system prompt and the customer's raw query:

# Triage agent's messages (completely self-contained)
triage_messages = [
    {
        "role": "system",
        "content": "You are a customer support triage agent. Analyze the query and route to billing, technical, or account..."
    },
    {
        "role": "user",
        "content": "I was charged twice for my subscription last month and I want a refund"
    }
]

Step 2: Triage agent returns structured output

Instead of a free-text reply, the triage agent returns a Pydantic-validated structured object via client.chat.completions.parse(response_format=TriageDecision):

# TriageDecision — the LLM's structured output (not a chat message!)
TriageDecision(
    category="billing",
    priority="high",
    reasoning="Customer reports duplicate charge, requesting refund — route to billing specialist",
    extracted_info={
        "issue_type": "duplicate_charge",
        "timeframe": "last month",
        "desired_resolution": "refund"
    }
)

Step 3: Structured context is packaged for handoff

The orchestrator converts the triage decision into a HandoffContext dataclass — a clean boundary between agents:

# HandoffContext — the only data that crosses the agent boundary
HandoffContext(
    customer_query="I was charged twice for my subscription last month and I want a refund",
    category="billing",
    priority="high",
    extracted_info={
        "issue_type": "duplicate_charge",
        "timeframe": "last month",
        "desired_resolution": "refund"
    }
)
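
One way the orchestrator's conversion could look is sketched below. The build_handoff helper is a hypothetical name, and TriageDecision is written here as a plain dataclass stand-in for the Pydantic model so the example runs with the standard library only. The reasoning field is logged and deliberately not copied into the handoff.

```python
import logging
from dataclasses import asdict, dataclass, field

logger = logging.getLogger("triage")


@dataclass
class TriageDecision:
    """Plain stand-in for the Pydantic model returned by the triage agent."""

    category: str
    priority: str
    reasoning: str
    extracted_info: dict = field(default_factory=dict)


@dataclass
class HandoffContext:
    """The only data that crosses the agent boundary."""

    customer_query: str
    category: str
    priority: str
    extracted_info: dict = field(default_factory=dict)


def build_handoff(query: str, decision: TriageDecision) -> HandoffContext:
    # Reasoning stays on the triage side: log it, do not forward it.
    logger.info("triage reasoning: %s", decision.reasoning)
    return HandoffContext(
        customer_query=query,
        category=decision.category,
        priority=decision.priority,
        extracted_info=dict(decision.extracted_info),  # copy, not a shared reference
    )


decision = TriageDecision(
    category="billing",
    priority="high",
    reasoning="Customer reports duplicate charge, requesting refund",
    extracted_info={"issue_type": "duplicate_charge", "timeframe": "last month"},
)
ctx = build_handoff("I was charged twice for my subscription last month", decision)
print(sorted(asdict(ctx)))  # ['category', 'customer_query', 'extracted_info', 'priority']
```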

Step 4: Specialist gets a fresh messages list

The specialist does NOT inherit any messages from the triage agent. Instead, it receives a brand-new list built from the handoff context:

# Billing specialist's messages — completely fresh, no triage history
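specialist_messages = [
    {
        # The specialist's own system prompt (abbreviated), not the triage agent's
        "role": "system",
        "content": "You are a billing specialist. Resolve billing issues using your tools..."
    },
    {
        "role": "user",
        "content": (
            "Customer query: I was charged twice for my subscription last month and I want a refund\n"
            "Priority: high\n"
            "Extracted details: {\"issue_type\": \"duplicate_charge\", "
            "\"timeframe\": \"last month\", \"desired_resolution\": \"refund\"}\n\n"
            "Please resolve this customer's issue."
        )
    }
]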

The specialist then runs its own tool loop (e.g., calling lookup_invoice) with this context until it reaches a resolution.
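
The fresh list in Step 4 could be produced by a small helper like the one sketched here. build_specialist_messages and the SPECIALIST_PROMPTS table are hypothetical names, and the prompt strings are placeholders. The user message follows the layout shown in Step 4, with the extracted details serialized by json.dumps.

```python
import json
from dataclasses import dataclass, field


@dataclass
class HandoffContext:
    customer_query: str
    category: str
    priority: str
    extracted_info: dict = field(default_factory=dict)


# Placeholder prompts; each specialist keeps its own system prompt.
SPECIALIST_PROMPTS = {
    "billing": "You are a billing specialist.",
    "technical": "You are a technical specialist.",
    "account": "You are an account specialist.",
}


def build_specialist_messages(ctx: HandoffContext) -> list:
    """Build a brand-new messages list; nothing from the triage agent is reused."""
    user_content = (
        f"Customer query: {ctx.customer_query}\n"
        f"Priority: {ctx.priority}\n"
        f"Extracted details: {json.dumps(ctx.extracted_info)}\n\n"
        "Please resolve this customer's issue."
    )
    return [
        {"role": "system", "content": SPECIALIST_PROMPTS[ctx.category]},
        {"role": "user", "content": user_content},
    ]


ctx = HandoffContext(
    customer_query="I was charged twice for my subscription last month and I want a refund",
    category="billing",
    priority="high",
    extracted_info={
        "issue_type": "duplicate_charge",
        "timeframe": "last month",
        "desired_resolution": "refund",
    },
)
messages = build_specialist_messages(ctx)
print(messages[1]["content"])
```

Because the helper takes only a HandoffContext, the specialist cannot accidentally see the triage prompt or the triage reasoning.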

What crosses the boundary

Data | Passed to specialist? | How?
Customer's original query | Yes | Via HandoffContext.customer_query
Triage category + priority | Yes | Via HandoffContext.category / .priority
Extracted details (issue type, timeframe, etc.) | Yes | Via HandoffContext.extracted_info dict
Triage agent's system prompt | No | Specialist has its own
Triage agent's internal reasoning | No | Stays in TriageDecision.reasoning for logging only
Triage agent's raw messages list | No | Specialist starts fresh

Files

  1. 01_support_triage.py — Triage agent routes to billing, technical, or account specialists

How to Run

python exercises/07_handoff/01_support_triage.py

Expected Output

Logging showing the triage classification, handoff decision with reasoning, structured context passed to the specialist, and the specialist's resolution.

Next

→ Next: Exercise 08: Magentic Pattern