ADR-0021 — Agent runtime and tool loop
The execution engine. How an LLM call becomes an agent turn.
What this decision settles
By the time this ADR was written, the platform had an LLM adapter interface (ADR-0019) and a concrete Anthropic adapter (ADR-0020). It could make LLM calls but couldn’t run an agent.
The missing piece is the tool loop. An agent turn is rarely one LLM call — the model requests tools, the runtime executes them, the results feed back into another LLM call, repeat until done.
Every consumer that runs agents would otherwise write this loop themselves, with inconsistent budget handling, inconsistent error translation, inconsistent observability. That’s a recipe for the platform’s quality bars eroding one consumer at a time.
This ADR settles:
- Where the tool loop lives. `@agent-platform/runtime`.
- How tool resolution works. A `ToolResolver` interface in `core`; a concrete registry in a sibling package.
- The factory shape. `createAgentRuntime({ adapter, toolResolver, logger, maxIterations? })`.
- The loop’s contract. Iteration cap; turn-level budget enforcement; structured `AgentReport` output; deterministic stop conditions.
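To make the resolution contract concrete, here is a minimal sketch of what the `ToolResolver` interface in `core` and a sibling-package registry might look like. The method name `resolve`, the `ToolHandler` shape, and `ToolRegistry` are illustrative assumptions; this ADR does not spell them out.

```typescript
// Hypothetical sketch of the ToolResolver contract. Only the name
// ToolResolver comes from the ADR; everything else is illustrative.
interface ToolHandler {
  name: string;
  execute(input: unknown): Promise<string>;
}

interface ToolResolver {
  // Returns null for an unknown tool name (a soft failure the
  // runtime turns into a tool_result, not a throw).
  resolve(name: string): ToolHandler | null;
}

// A concrete in-memory registry, as a sibling package might provide:
class ToolRegistry implements ToolResolver {
  private tools = new Map<string, ToolHandler>();

  register(tool: ToolHandler): void {
    this.tools.set(tool.name, tool);
  }

  resolve(name: string): ToolHandler | null {
    return this.tools.get(name) ?? null;
  }
}
```

The `null` return is deliberate: it lets the runtime distinguish "no such tool" (recoverable) from a resolver that throws ("I am broken").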
Why this matters
The tool loop is where the safety properties become real. The six-layer context model (ADR-0006) prescribes how context is assembled. The agent runtime is what runs the LLM with that context, executes the tools the LLM asks for, enforces budgets on every iteration, and produces a structured turn record.
Without a centralized runtime:
- Budgets get checked once per LLM call, not once per turn — meaning a turn with five LLM calls can blow through 5x the intended budget.
- Tool errors get translated differently per consumer — some raise, some return, some swallow. Debugging gets harder.
- Iteration limits are ad-hoc. A buggy tool that always asks for another tool can run forever.
The runtime is the single point where these invariants get enforced.
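The turn-level (rather than per-call) budget invariant can be sketched as a small tracker that is charged once per iteration, so every LLM call in a turn draws from one shared allowance. The class name `TurnBudget` and its method names are assumptions; `TurnBudgetExceededError` is the ADR's own error name.

```typescript
// Illustrative turn-level budget tracker. TurnBudgetExceededError is
// named in the ADR; the rest of this shape is an assumption.
class TurnBudgetExceededError extends Error {}

class TurnBudget {
  private spentUsd = 0;

  constructor(
    private readonly limitUsd: number,
    private readonly timeBudgetMs: number,
    private readonly startedAt: number = Date.now(),
  ) {}

  // Called once per iteration boundary, so a turn with five LLM calls
  // still draws from a single allowance.
  checkOrThrow(now: number = Date.now()): void {
    if (this.spentUsd >= this.limitUsd) {
      throw new TurnBudgetExceededError("cost budget exhausted");
    }
    if (now - this.startedAt > this.timeBudgetMs) {
      throw new TurnBudgetExceededError("time budget exhausted");
    }
  }

  charge(costUsd: number): void {
    this.spentUsd += costUsd;
  }

  // What the runtime can pass down as the remaining budget for one call.
  remainingUsd(): number {
    return Math.max(0, this.limitUsd - this.spentUsd);
  }
}
```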
The decision
The shape is intentionally small:

```typescript
interface AgentRuntime {
  runTurn(
    definition: AgentDefinition,
    task: Task,
    context: ContextBundle,
  ): Promise<AgentReport>;
}

function createAgentRuntime(opts: {
  adapter: ModelAdapter;
  toolResolver: ToolResolver;
  logger: Logger;
  maxIterations?: number;
}): AgentRuntime;
```

Consumers see one method: `runTurn`. Everything else — the loop, the budget bookkeeping, the error translation — is internal.
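A self-contained toy can show how small the consumer-facing surface is. Everything here except the names `createAgentRuntime` and `runTurn` is a stand-in: the types are stubs and the real loop is elided to a single adapter call.

```typescript
// Toy illustrating the surface area only. The real runtime's loop,
// budgets, and error translation are deliberately elided here.
type AgentDefinition = { name: string };
type Task = { goal: string };
type ContextBundle = { layers: string[] };
type AgentReport = { text: string; iterations: number };

interface ModelAdapter {
  generate(prompt: string): Promise<{ text: string; stopReason: string }>;
}

function createAgentRuntime(opts: {
  adapter: ModelAdapter;
  maxIterations?: number;
}) {
  return {
    async runTurn(
      _def: AgentDefinition,
      task: Task,
      _ctx: ContextBundle,
    ): Promise<AgentReport> {
      // Stand-in for the whole tool loop: one call, one report.
      const res = await opts.adapter.generate(task.goal);
      return { text: res.text, iterations: 1 };
    },
  };
}

// A consumer sees exactly one method:
const runtime = createAgentRuntime({
  adapter: { generate: async () => ({ text: "done", stopReason: "end_turn" }) },
  maxIterations: 10,
});
```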
The loop, in pseudocode

```
for iteration in 0..maxIterations:
  check turn-level budgets (time, cost) → throw TurnBudgetExceededError if exceeded
  build ModelRequest with remaining-budget-for-this-call
  call adapter.generate(...)

  if stop_reason == 'end_turn' or 'stop_sequence':
    return AgentReport (success)
  if stop_reason == 'max_tokens':
    return AgentReport with truncation risk
  if stop_reason == 'tool_use':
    for each tool_use block:
      enforce allowed_tools → AutonomyBoundaryError on violation
      resolve tool → if null, tool_result with is_error=true (soft)
      execute tool → capture result or error message
    feed tool_results back into the next iteration's ModelRequest

if iteration cap hit without end_turn:
  throw IterationCapExceededError
```

A few design choices baked into this:
- Budgets are checked at iteration boundaries, not just per LLM call. This is what makes `task.constraints.time_budget_ms` actually enforceable.
- Unknown tool name → soft error, not a throw. The runtime returns a `tool_result` with `is_error=true` so the LLM can recover (“you tried to call `does_not_exist`; try a different tool”). Throwing would be the resolver saying “I am broken,” which is a different signal.
- `AutonomyBoundaryError` is a hard throw. If the agent tries to call a tool not in its allow-list, that’s a definition violation, not a recoverable LLM mistake. The agent’s allow-list is enforced at the runtime layer, not trusted to the LLM.
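The soft/hard split can be sketched as a single dispatch function. `AutonomyBoundaryError` and the `is_error` flag come from the ADR; the function name `handleToolUse` and the handler shapes are illustrative assumptions.

```typescript
// Sketch of the soft/hard error split. AutonomyBoundaryError and
// is_error are from the ADR; the rest is illustrative.
class AutonomyBoundaryError extends Error {}

type ToolResult = { tool: string; content: string; is_error: boolean };

function handleToolUse(
  name: string,
  allowedTools: string[],
  resolve: (n: string) => ((input: unknown) => string) | null,
): ToolResult {
  // Hard boundary: the allow-list is enforced here, never trusted to the LLM.
  if (!allowedTools.includes(name)) {
    throw new AutonomyBoundaryError(`tool "${name}" not in allow-list`);
  }
  // Soft failure: unknown tool becomes a tool_result the LLM can see and
  // recover from on the next iteration.
  const handler = resolve(name);
  if (handler === null) {
    return {
      tool: name,
      content: `you tried to call ${name}; try a different tool`,
      is_error: true,
    };
  }
  try {
    return { tool: name, content: handler({}), is_error: false };
  } catch (e) {
    // A tool that throws is also softened, so the LLM can react.
    return { tool: name, content: String(e), is_error: true };
  }
}
```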
What gets returned
`AgentReport` is the structured turn record. It includes:
- The agent’s final response (text)
- The tool calls it made and their results
- The total LLM cost (in tokens and dollars)
- The total wall time
- The number of iterations
- Any errors caught and softened
Every turn produces one of these; observability hangs off it.
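As a type, the record above might look like the following. The ADR lists the contents but not the exact property names, so every field name here is a hypothetical choice.

```typescript
// Hypothetical field names for AgentReport; the ADR specifies the
// contents of the record but not its exact shape.
interface ToolCallRecord {
  tool: string;
  input: unknown;
  result: string;
  is_error: boolean;
}

interface AgentReport {
  finalText: string;           // the agent's final response (text)
  toolCalls: ToolCallRecord[]; // tool calls made and their results
  costTokens: number;          // total LLM cost in tokens
  costUsd: number;             // total LLM cost in dollars
  wallTimeMs: number;          // total wall time
  iterations: number;          // number of loop iterations
  softenedErrors: string[];    // errors caught and converted to tool_results
}
```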
What this decision does NOT do
- Does not handle delegation. Sub-agent calls are tools (see ADR-0022). The runtime doesn’t know about them; the toolResolver does. This is the unification that keeps the runtime small.
- Does not handle streaming. A future runtime variant could stream; today’s runtime returns the full report at the end. The interface is shaped so streaming can be added without breaking changes.
- Does not handle multi-agent orchestration above the agent level. That’s the application layer. The runtime runs one agent turn; what to do with the report is the caller’s problem.
Trade-offs
- Iteration caps are a blunt instrument. A pathological agent can hit the cap; it gets an `IterationCapExceededError`. We picked a reasonable default (10) and made it configurable per call. Real abuse cases probably need cost/time budgets rather than iteration counts.
- The runtime is opinionated about error semantics. Some consumers might prefer different soft/hard categorizations. We picked one set and committed to it; alternative runtime implementations can ship with different choices.
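Because the cap surfaces as a hard error, callers need a policy for it. A hedged sketch of one such policy follows; the error class names come from the ADR, while `runWithFallback` and the handling choices are illustrative.

```typescript
// Sketch of a caller discriminating the runtime's hard errors.
// The error names are the ADR's; the fallback policy is illustrative.
class IterationCapExceededError extends Error {}
class TurnBudgetExceededError extends Error {}

async function runWithFallback(
  runTurn: () => Promise<string>,
): Promise<string> {
  try {
    return await runTurn();
  } catch (e) {
    if (e instanceof IterationCapExceededError) {
      // Pathological loop: surface it; a caller might retry with a
      // tighter prompt or a smaller tool set.
      return "turn aborted: iteration cap";
    }
    if (e instanceof TurnBudgetExceededError) {
      // Budgets, not iteration counts, are the real guard for abuse.
      return "turn aborted: budget exhausted";
    }
    throw e; // anything else is unexpected and should propagate
  }
}
```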
Where to next
For the original ADR with full Context / Decision / Consequences / Alternatives sections, see ADR-0021 source.
Related decisions: