ADR-0022 — Delegation as tool
The unification trick. The single design choice that keeps the runtime small enough to fit in your head.
What this decision settles
Multi-agent systems need a way for one agent to invoke another. “Triage” needs to delegate to “refund_decision”; “refund_decision” needs to delegate to “communication.” How does that work mechanically?
The instinct is to add a separate API: a delegate() function,
maybe a Delegation type, a special control flow path in the
runtime that branches on “is this a delegation or a tool call?”
That’s two execution paths to maintain and two sets of failure
modes to reason about.
This ADR settles a different answer: delegation IS a tool
call. There is no separate delegation API. When agent A lists
agent B as a sub-agent, the runtime synthesizes a tool named
delegate_to_<B> whose input is the task spec for B. The
runtime executes that tool the same way it executes any other
tool — recursively invoking itself with B’s definition.
Why this matters
The unification has three downstream consequences that compound:
One execution path. The tool loop in ADR-0021 doesn’t need a “is this a delegation?” branch. Every iteration just asks: did the LLM stop, or did it ask for a tool? If a tool, run it. The fact that the tool happens to be a recursive runtime call is the toolResolver’s concern, not the loop’s.
One set of safety primitives. Budgets, allow-lists,
iteration caps, error semantics: they all apply uniformly.
A sub-agent’s time_budget_ms is just the parent tool call’s
remaining time budget. A sub-agent’s allowed_tools is enforced
the same way the parent’s was. There’s no separate “sub-agent
budget” concept; everything is just nested tool calls with
budget-aware composition. The one delegation-specific guard is
the global max-depth described later in this ADR.
One set of failure modes. A sub-agent that throws gets caught the same way a tool that throws gets caught. A sub-agent that exceeds budgets surfaces the same error type as any other budget exceedance. Debugging is uniform.
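The single execution path can be sketched roughly as follows. This is a simplified, synchronous illustration, not the loop specified in ADR-0021: the Tool, ModelStep and Model types, the scripted stand-in for the LLM, and this runTurn signature are all assumptions made for the example.

```typescript
// Hypothetical, simplified types; the real runtime's shapes may differ.
type ToolResult = { content: string; is_error: boolean };
type Tool = { name: string; handler: (input: string) => ToolResult };
type ModelStep =
  | { kind: "stop"; text: string }
  | { kind: "tool_call"; tool: string; input: string };
// Scripted stand-in for the LLM: given the tool results so far, pick the next step.
type Model = (results: ToolResult[]) => ModelStep;

function runTurn(model: Model, tools: Tool[], maxIterations = 8): string {
  const results: ToolResult[] = [];
  for (let i = 0; i < maxIterations; i++) {
    const step = model(results);
    if (step.kind === "stop") return step.text;
    const toolName = step.tool;
    const tool = tools.filter((t) => t.name === toolName)[0];
    if (!tool) {
      results.push({ content: `unknown tool: ${toolName}`, is_error: true });
      continue;
    }
    try {
      // No "is this a delegation?" branch: every tool runs the same way.
      results.push(tool.handler(step.input));
    } catch (err) {
      // A throwing sub-agent and a throwing tool are caught identically.
      results.push({ content: String(err), is_error: true });
    }
  }
  throw new Error("iteration cap exceeded");
}

// A sub-agent whose scripted model answers immediately.
const subAgentModel: Model = () => ({ kind: "stop", text: "refund approved" });

const tools: Tool[] = [
  {
    name: "lookup_order",
    handler: (id) => ({ content: `order ${id}: paid`, is_error: false }),
  },
  {
    // The delegation tool's handler simply recurses into runTurn.
    name: "delegate_to_refund_decision",
    handler: () => ({ content: runTurn(subAgentModel, []), is_error: false }),
  },
];

// Parent script: look up the order, delegate, then stop and report both results.
const parentModel: Model = (results) => {
  if (results.length === 0) {
    return { kind: "tool_call", tool: "lookup_order", input: "A17" };
  }
  if (results.length === 1) {
    return { kind: "tool_call", tool: "delegate_to_refund_decision", input: "decide refund for A17" };
  }
  return { kind: "stop", text: results.map((r) => r.content).join(" | ") };
};

const finalText = runTurn(parentModel, tools);
```

The loop body never inspects whether a tool is a delegation; the recursion lives entirely inside the handler of delegate_to_refund_decision.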
The decision
When an agent definition lists another agent as a sub-agent, the platform synthesizes a tool named
delegate_to_<sub_agent_name> whose input is a structured task spec. The tool’s handler invokes the runtime recursively with the sub-agent’s definition.

    // Auto-generated tool for an agent whose sub_agents list
    // includes "refund_decision":
    {
      name: 'delegate_to_refund_decision',
      description: '<from refund_decision YAML>',
      inputSchema: TaskSpecSchema,
      handler: async (input, ctx) => {
        return await runtime.runTurn(
          refundDecisionDefinition,
          taskFromSpec(input),
          ctx.contextBundleForSubAgent(),
        );
      },
    }

The agent’s LLM sees delegate_to_refund_decision in its tool
list. It calls it. The runtime executes it. The execution
happens to recurse. The agent doesn’t know or care.
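How the synthesis step might map a sub_agents list onto tool definitions is sketched below. Everything here is an assumption for illustration: the AgentDefinition and TaskSpec shapes (the TaskSpec fields are borrowed loosely from the Trade-offs section), the synthesizeDelegationTools helper, the registry object, and the runAgent callback that stands in for the recursive runtime call.

```typescript
// Hypothetical shapes; field names follow the prose, not a confirmed schema.
interface TaskSpec {
  instructions: string;
  payload: unknown;
  expected_output: string;
  time_budget_ms: number;
}
interface AgentDefinition {
  name: string;
  description: string;
  sub_agents: string[];
}
interface Tool {
  name: string;
  description: string;
  handler: (input: TaskSpec) => string;
}
// Stand-in for the recursive runtime entry point.
type RunAgent = (def: AgentDefinition, task: TaskSpec) => string;

function synthesizeDelegationTools(
  parent: AgentDefinition,
  registry: Record<string, AgentDefinition | undefined>,
  runAgent: RunAgent,
): Tool[] {
  return parent.sub_agents.map((subName): Tool => {
    const sub = registry[subName];
    if (!sub) throw new Error(`unknown sub-agent: ${subName}`);
    return {
      name: `delegate_to_${subName}`,
      description: sub.description, // reused from the sub-agent's own definition
      handler: (input) => runAgent(sub, input), // the recursion hides in the handler
    };
  });
}

const registry: Record<string, AgentDefinition | undefined> = {
  refund_decision: {
    name: "refund_decision",
    description: "Decides refunds.",
    sub_agents: ["communication"],
  },
  communication: {
    name: "communication",
    description: "Drafts customer messages.",
    sub_agents: [],
  },
};
const triage: AgentDefinition = {
  name: "triage",
  description: "Routes tickets.",
  sub_agents: ["refund_decision"],
};

// Fake runtime that only reports which agent ran.
const fakeRun: RunAgent = (def, task) => `${def.name} handled: ${task.instructions}`;
const triageTools = synthesizeDelegationTools(triage, registry, fakeRun);
```

Because the tool description is copied from the sub-agent's definition, the parent's prompt needs no delegation-specific wording.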
Why not a separate API
We considered three alternatives:
A separate delegate() function in the runtime. Forces a
branch in the tool loop (“is this a delegation or a tool call?”).
Doubles the code path. Doubles the testing surface. Three
months later, every new feature has to ask “do we apply this to
delegations or tools or both?”
A Delegation type the LLM emits explicitly. The LLM has
to know about delegation as a first-class concept. Means the
prompt has to teach the LLM the distinction. Adds words to
every system prompt for no benefit.
Multi-agent orchestration above the runtime. Move delegation out of the runtime entirely; have the application layer parse agent reports and decide whether to invoke another agent. Possible, but loses the one safety property that matters: you can’t enforce budget composition across agents that don’t share an execution context.
What this enables
- Arbitrary delegation depth with budget-aware composition. Agent A delegates to B, which delegates to C, which delegates to D; each layer’s time_budget_ms is correctly composed as a remaining budget. The platform enforces a configurable max-depth (default 3) to prevent runaway recursion.
- The LLM picks the right sub-agent. The agent’s prompt says “you have a delegate_to_refund_decision tool that handles refund decisions.” The LLM treats it like any other tool: it uses the description to decide when to call. No new prompting pattern.
- Sub-agents are pluggable. Adding a new sub-agent means editing the YAML’s sub_agents list. The runtime synthesizes the new delegation tool at the next turn.
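Remaining-budget composition and the depth guard can be illustrated with a small sketch. The ExecContext shape, the rootContext and childContext helpers, and the explicit nowMs clock parameter are hypothetical. Capping the child at the smaller of its requested budget and the parent's remaining time is an assumption about how a task spec's own time budget would interact with the parent's, and so is counting depth in delegation hops.

```typescript
// Hypothetical context carried through nested tool calls.
interface ExecContext {
  deadlineMs: number; // absolute time by which this layer must finish
  depth: number; // 0 for the top-level agent, +1 per delegation hop
  maxDepth: number; // the text gives 3 as the default
}

function rootContext(nowMs: number, timeBudgetMs: number, maxDepth = 3): ExecContext {
  return { deadlineMs: nowMs + timeBudgetMs, depth: 0, maxDepth };
}

// Derive the context for a sub-agent invoked at time nowMs.
function childContext(
  parent: ExecContext,
  nowMs: number,
  requestedBudgetMs: number,
): ExecContext {
  if (parent.depth + 1 > parent.maxDepth) {
    throw new Error(`max delegation depth ${parent.maxDepth} exceeded`);
  }
  const remainingMs = parent.deadlineMs - nowMs;
  if (remainingMs <= 0) throw new Error("time budget exhausted");
  // The child never gets more time than the parent has left.
  const grantedMs = Math.min(requestedBudgetMs, remainingMs);
  return {
    deadlineMs: nowMs + grantedMs,
    depth: parent.depth + 1,
    maxDepth: parent.maxDepth,
  };
}

// A starts at t=0 with a 10 s budget.
const ctxA = rootContext(0, 10000);
// B is invoked at t=4 s asking for 8 s; it is capped to A's remaining 6 s.
const ctxB = childContext(ctxA, 4000, 8000);
// C is invoked at t=7 s asking for 1 s; that fits inside B's remaining 3 s.
const ctxC = childContext(ctxB, 7000, 1000);
```

Under these assumptions B's deadline coincides with A's (t=10 s) and C's falls at t=8 s, so no layer can outlive the one that called it.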
Trade-offs
- The synthesized tool’s input schema must be expressive. A task spec needs instructions, payload, expected output, time budget, autonomy bounds. We picked a schema that handles Phase 1’s needs; complex multi-step delegation chains might push on it.
- Delegation depth is global, not per-agent. A misconfigured agent that recurses into itself hits the global max-depth. Per-agent depth limits are a future refinement.
- Per-tool error categorization applies. A sub-agent that fails for an AutonomyBoundaryError reason returns a tool_result with is_error=true. The parent agent can recover. This is sometimes the right behavior and sometimes not: a security-critical sub-agent failure probably should not be recoverable. Today’s answer is “code your sub-agent to throw a hard error type”; future ADRs might add a declarative way to mark some sub-agent failures as un-recoverable.
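The recoverable-versus-hard split in the last bullet might look like the sketch below. AutonomyBoundaryError and the tool_result / is_error shape come from the text; the HardError class, the executeTool wrapper, and the choice to tell the two apart by error name are assumptions made for illustration.

```typescript
// AutonomyBoundaryError is named in the text; HardError is a hypothetical
// stand-in for "a hard error type" that must not be recoverable.
class AutonomyBoundaryError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "AutonomyBoundaryError";
  }
}
class HardError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "HardError";
  }
}

type ToolResult = { type: "tool_result"; content: string; is_error: boolean };

// Run any tool handler (delegation or not) and categorize its failure.
function executeTool(handler: () => string): ToolResult {
  try {
    return { type: "tool_result", content: handler(), is_error: false };
  } catch (err) {
    // Hard errors propagate past the parent agent instead of becoming a result.
    if (err instanceof Error && err.name === "HardError") throw err;
    const message = err instanceof Error ? err.message : String(err);
    // Everything else is reported back so the parent LLM can try to recover.
    return { type: "tool_result", content: message, is_error: true };
  }
}

// A soft failure: the parent sees an error result and keeps running.
const softFailure = executeTool(() => {
  throw new AutonomyBoundaryError("refund exceeds approval limit");
});
```

A sub-agent that throws HardError through the same wrapper aborts the parent's turn as well, which is the behavior the bullet suggests for security-critical failures.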
Where to next
For the original ADR with full Context / Decision / Consequences / Alternatives sections, see ADR-0022 source.
Related decisions: