Exercise 07: Handoff Pattern
Objective
Implement dynamic routing where a triage agent analyzes queries and hands off to specialist agents.
Concepts Covered
- Triage / routing agent with classification
- Structured handoff context (dataclass with query, category, relevant info)
- Specialist agents with focused capabilities
- Context passing strategies (full history vs. summary vs. structured object)
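The three context passing strategies in the last bullet can be contrasted with a small sketch. Everything here is illustrative: `HandoffSketch` and the three function names are hypothetical, and dropping the triage system prompt in the full-history variant is an assumption rather than something the exercise specifies.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class HandoffSketch:
    """Hypothetical stand-in for a structured handoff object."""
    query: str
    category: str
    relevant_info: dict


def pass_full_history(triage_messages: list[dict]) -> list[dict]:
    """Full history: forward every triage message except the system prompt."""
    return [m for m in triage_messages if m["role"] != "system"]


def pass_summary(summary: str) -> list[dict]:
    """Summary: forward a free-text recap only."""
    return [{"role": "user", "content": f"Summary of the case so far: {summary}"}]


def pass_structured(handoff: HandoffSketch) -> list[dict]:
    """Structured object: forward typed fields serialized as JSON."""
    return [{"role": "user", "content": json.dumps(asdict(handoff))}]
```

The structured variant is the one this exercise uses; the other two are shown only for comparison.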
How It Works
This is the first exercise that uses structured output for inter-agent communication. A Triage Agent classifies the incoming query using client.chat.completions.parse() with a Pydantic model, producing a structured TriageDecision. This decision is packaged into a HandoffContext dataclass and routed to the appropriate specialist.
flowchart TB
User["User Query:<br/>'I was charged twice<br/>for my subscription'"]
subgraph Triage["Triage Agent"]
T_SYS["System prompt:<br/>'Classify this query'"]
T_PARSE["chat.completions.parse()<br/>response_format=TriageDecision"]
T_DECISION["TriageDecision:<br/>category='billing'<br/>priority='high'<br/>reasoning='duplicate charge'<br/>extracted_info={...}"]
T_SYS --> T_PARSE --> T_DECISION
end
subgraph Handoff["HandoffContext"]
HC["original_query<br/>category<br/>priority<br/>context_summary<br/>extracted_info"]
end
subgraph Router["Dynamic Routing"]
direction LR
Billing["Billing<br/>Specialist<br/>Tools: lookup_invoice,<br/>process_refund"]
Tech["Technical<br/>Specialist<br/>Tools: check_status,<br/>restart_service"]
Account["Account<br/>Specialist<br/>Tools: get_account,<br/>update_account"]
end
User --> Triage
T_DECISION --> Handoff
Handoff --> |"category='billing'"| Billing
Handoff --> |"category='technical'"| Tech
Handoff --> |"category='account'"| Account
style Triage fill:#e8f4fd,stroke:#4a90d9
style Handoff fill:#fef9e7,stroke:#f39c12
style Router fill:#eafaf1,stroke:#2ecc71
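The routing step in the diagram amounts to an exact lookup on the category field. A minimal sketch, assuming each specialist can be represented as a callable that takes a context summary; the handler functions and the `SPECIALISTS` registry are hypothetical names, not taken from `01_support_triage.py`.

```python
from typing import Callable


# Hypothetical handlers; in the exercise each would run an agent with its
# own system prompt and tools.
def billing_specialist(summary: str) -> str:
    return f"[billing] handling: {summary}"


def technical_specialist(summary: str) -> str:
    return f"[technical] handling: {summary}"


def account_specialist(summary: str) -> str:
    return f"[account] handling: {summary}"


SPECIALISTS: dict[str, Callable[[str], str]] = {
    "billing": billing_specialist,
    "technical": technical_specialist,
    "account": account_specialist,
}


def route(category: str, summary: str) -> str:
    """Select a specialist by exact category match; fail loudly otherwise."""
    try:
        handler = SPECIALISTS[category]
    except KeyError:
        raise ValueError(f"No specialist registered for {category!r}") from None
    return handler(summary)
```

Raising on an unknown category is one possible design choice; a fallback specialist or a human escalation path would be equally reasonable.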
The full message flow:
sequenceDiagram
participant User
participant Triage as Triage Agent
participant LLM as LLM Provider
participant Router as Route Logic
participant Spec as Specialist Agent
User->>Triage: "I was charged twice..."
Note over Triage,LLM: Structured Output (parse)
Triage->>LLM: chat.completions.parse()<br/>response_format=TriageDecision
LLM-->>Triage: TriageDecision(category="billing",<br/>priority="high", reasoning="...",<br/>extracted_info={...})
Triage->>Router: Build HandoffContext from TriageDecision
Router->>Router: Match category → select specialist
Note over Router,Spec: Fresh context for specialist
Router->>Spec: Fresh messages: [specialist_system_prompt,<br/>user(handoff_context summary)]
Spec->>LLM: messages + specialist tools
LLM-->>Spec: tool_calls (lookup_invoice)
Spec->>LLM: tool results
LLM-->>Spec: "I found the duplicate charge..."
Spec-->>User: Final resolution
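The specialist's half of the sequence above is a tool loop: call the model, execute any requested tool, feed the result back, and stop when the model answers in text. The sketch below replaces the LLM with a scripted stand-in so it runs offline. The message shapes are simplified and are not the provider's wire format, `lookup_invoice` returns canned data, and the customer ID is made up.

```python
import json


def lookup_invoice(customer_id: str) -> dict:
    """Hypothetical tool: returns canned data instead of querying billing."""
    return {"customer_id": customer_id, "charges": [19.99, 19.99]}


TOOLS = {"lookup_invoice": lookup_invoice}


def scripted_model(messages: list[dict]) -> dict:
    """Stand-in for the LLM: requests one tool call, then answers in text."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "lookup_invoice",
                              "arguments": {"customer_id": "c-1"}}}
    return {"content": "I found the duplicate charge."}


def run_specialist(messages: list[dict], model, max_turns: int = 5) -> str:
    """Call the model until it returns text instead of a tool call."""
    for _ in range(max_turns):
        reply = model(messages)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]
        # Execute the requested tool and append its result for the next turn.
        result = TOOLS[call["name"]](**call["arguments"])
        messages.append({"role": "tool", "name": call["name"],
                         "content": json.dumps(result)})
    raise RuntimeError("specialist did not finish within max_turns")
```

The `max_turns` cap is an added safeguard so a model that keeps requesting tools cannot loop forever.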
Context sharing: Structured handoff. The triage agent's internal reasoning and raw messages are NOT passed to the specialist. Instead, only a structured HandoffContext object crosses the boundary, containing the original query, category, priority, and extracted information. The specialist receives a fresh messages list with this structured context as its input. This is a deliberate design choice — the specialist doesn't need to know how the triage agent reasoned, only what it concluded.
Structured output: Yes — this is the key feature of this exercise. client.chat.completions.parse() with response_format=TriageDecision ensures the LLM returns a valid, typed Pydantic object. This enables reliable routing (no string parsing for category) and structured extraction of relevant details for the specialist.
Why structured handoff matters
Compared to passing raw conversation history, a structured handoff provides:
- Reliable routing: category is a typed field, not a substring match.
- Context compaction: the specialist gets only the relevant info, not the full triage conversation.
- Auditability: the reasoning field on TriageDecision records why the triage decision was made and can be logged, even though it is not passed to the specialist.
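The reliable-routing point is easiest to see with a counterexample. In this sketch, a free-text reply that mentions an incidental word gets misrouted by substring matching, while a typed field either matches an allowed value exactly or is rejected. Both function names are hypothetical, and the substring result depends on the order in which names are checked.

```python
from typing import Literal, get_args

Category = Literal["billing", "technical", "account"]


def route_by_substring(free_text_reply: str) -> str:
    """Brittle: returns the first category name found anywhere in the reply."""
    for name in ("account", "billing", "technical"):
        if name in free_text_reply.lower():
            return name
    return "unknown"


def route_by_field(category: str) -> str:
    """Typed: accept the field only if it is exactly an allowed value."""
    if category not in get_args(Category):
        raise ValueError(f"unexpected category: {category!r}")
    return category


reply = "This concerns a charge on the customer's account, so it is a billing issue."
print(route_by_substring(reply))  # -> account (matched an incidental word)
print(route_by_field("billing"))  # -> billing
```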
Message Flow: A Practical Example
This example follows a single customer query through the full handoff lifecycle. Unlike the group chat pattern (one shared messages list) or the concurrent pattern (isolated lists), the handoff pattern places a structured boundary between the triage agent and the specialist: they never share a messages list.
Step 1: Triage agent receives the query
The triage agent gets a fresh messages list with its system prompt and the customer's raw query:
# Triage agent's messages (completely self-contained)
triage_messages = [
    {
        "role": "system",
        "content": "You are a customer support triage agent. Analyze the query and route to billing, technical, or account..."
    },
    {
        "role": "user",
        "content": "I was charged twice for my subscription last month and I want a refund"
    }
]
Step 2: Triage agent returns structured output
Instead of a free-text reply, the triage agent returns a Pydantic-validated structured object via client.chat.completions.parse(response_format=TriageDecision):
# TriageDecision: the LLM's structured output (not a chat message!)
TriageDecision(
    category="billing",
    priority="high",
    reasoning="Customer reports duplicate charge, requesting refund; route to billing specialist",
    extracted_info={
        "issue_type": "duplicate_charge",
        "timeframe": "last month",
        "desired_resolution": "refund"
    }
)
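For reference, a schema that would accept the object above might look like the following. This is a sketch assuming Pydantic v2; the field types, and in particular the allowed priority values, are inferred from the example and are not taken from the exercise file. It validates a plain dict locally rather than calling the LLM.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class TriageDecision(BaseModel):
    """Assumed schema, inferred from the fields shown in the example."""
    category: Literal["billing", "technical", "account"]
    priority: Literal["low", "medium", "high"]  # allowed values are an assumption
    reasoning: str
    extracted_info: dict[str, str]


raw = {
    "category": "billing",
    "priority": "high",
    "reasoning": "Customer reports duplicate charge, requesting refund",
    "extracted_info": {"issue_type": "duplicate_charge"},
}
decision = TriageDecision.model_validate(raw)

# A category outside the allowed set is rejected before any routing happens.
try:
    TriageDecision.model_validate({**raw, "category": "sales"})
    rejected = False
except ValidationError:
    rejected = True
```

Constraining category with `Literal` is what lets the router rely on an exact match instead of parsing text.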
Step 3: Structured context is packaged for handoff
The orchestrator converts the triage decision into a HandoffContext dataclass — a clean boundary between agents:
# HandoffContext: the only data that crosses the agent boundary
HandoffContext(
    customer_query="I was charged twice for my subscription last month and I want a refund",
    category="billing",
    priority="high",
    extracted_info={
        "issue_type": "duplicate_charge",
        "timeframe": "last month",
        "desired_resolution": "refund"
    }
)
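The conversion in this step can be sketched as a small function that copies the triage conclusions and leaves the reasoning behind. The dataclass fields mirror the example above; `build_handoff` is a hypothetical helper name, and `SimpleNamespace` stands in for the `TriageDecision` returned by the triage agent so the snippet runs without an LLM call.

```python
from dataclasses import asdict, dataclass, field
from types import SimpleNamespace


@dataclass(frozen=True)
class HandoffContext:
    """Fields follow the Step 3 example; the real class may differ."""
    customer_query: str
    category: str
    priority: str
    extracted_info: dict = field(default_factory=dict)


def build_handoff(customer_query: str, decision) -> HandoffContext:
    """Copy only the conclusions; decision.reasoning is intentionally dropped."""
    return HandoffContext(
        customer_query=customer_query,
        category=decision.category,
        priority=decision.priority,
        extracted_info=dict(decision.extracted_info),
    )


# Stand-in for the triage agent's structured output.
decision = SimpleNamespace(
    category="billing",
    priority="high",
    reasoning="Customer reports duplicate charge",
    extracted_info={"issue_type": "duplicate_charge"},
)
handoff = build_handoff("I was charged twice for my subscription", decision)
```

Making the dataclass frozen is an optional choice that prevents a specialist from rebinding the handed-off fields.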
Step 4: Specialist gets a fresh messages list
The specialist does NOT inherit any messages from the triage agent. Instead, it receives a brand-new list built from the handoff context:
# Billing specialist's messages: completely fresh, no triage history
specialist_messages = [
    # The specialist's own system prompt comes first (defined elsewhere)
    {"role": "system", "content": specialist_system_prompt},
    {
        "role": "user",
        "content": (
            "Customer query: I was charged twice for my subscription last month and I want a refund\n"
            "Priority: high\n"
            "Extracted details: {\"issue_type\": \"duplicate_charge\", "
            "\"timeframe\": \"last month\", \"desired_resolution\": \"refund\"}\n\n"
            "Please resolve this customer's issue."
        )
    }
]
The specialist then runs its own tool loop (e.g., calling lookup_invoice) with this context until it reaches a resolution.
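Building that fresh list from the handoff data can be sketched as below, with the specialist's own system prompt prepended as in the sequence diagram. `build_specialist_messages` is a hypothetical helper, the handoff is passed as a plain dict to keep the snippet self-contained, and the system prompt text in the usage lines is a placeholder.

```python
import json


def build_specialist_messages(system_prompt: str, handoff: dict) -> list[dict]:
    """Render handoff fields into a new messages list with no triage history."""
    user_content = (
        f"Customer query: {handoff['customer_query']}\n"
        f"Priority: {handoff['priority']}\n"
        f"Extracted details: {json.dumps(handoff['extracted_info'])}\n\n"
        "Please resolve this customer's issue."
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]


specialist_messages = build_specialist_messages(
    "You are a billing specialist.",  # placeholder prompt
    {
        "customer_query": "I was charged twice for my subscription last month and I want a refund",
        "priority": "high",
        "extracted_info": {"issue_type": "duplicate_charge", "timeframe": "last month"},
    },
)
```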
What crosses the boundary
| Data | Passed to specialist? | How? |
|---|---|---|
| Customer's original query | Yes | Via HandoffContext.customer_query |
| Triage category + priority | Yes | Via HandoffContext.category / .priority |
| Extracted details (issue type, timeframe, etc.) | Yes | Via HandoffContext.extracted_info dict |
| Triage agent's system prompt | No | Specialist has its own |
| Triage agent's internal reasoning | No | Stays in TriageDecision.reasoning for logging only |
| Triage agent's raw messages list | No | Specialist starts fresh |
Files
01_support_triage.py: Triage agent routes to billing, technical, or account specialists
How to Run
Expected Output
Logging showing the triage classification, handoff decision with reasoning, structured context passed to the specialist, and the specialist's resolution.
Next
→ Next: Exercise 08: Magentic Pattern