ADR-0031 — YAML agent definition format

What an agent IS in this system.

What this decision settles

Until this ADR, every agent definition was authored in TypeScript: apps/example/src/agents.ts, apps/worker/src/agents.ts, apps/worker/src/merchandising.ts. This worked for scaffolding, but it isn’t the platform vision.

The vision is agents are data: editable by operators, managed in version control, and (eventually) configurable through a UI without recompiling the Worker.

This ADR settles the on-disk format and the loader contract that produces validated AgentDefinition instances from that format.

Why this matters

If an agent is TypeScript code, only engineers can write one. Adding a new agent means a code review, a deploy, a release. The platform’s audience — operators, business owners, eventually non-technical authors — is locked out.

If an agent is data, a YAML file and a sensible loader can be:

Reviewed by anyone who can read the schema
Edited in a UI without compiling
Hot-reloaded in dev without restarting
Tested independently of the runtime
Versioned alongside business decisions, not platform code

This is the difference between “a platform people use” and “a codebase engineers contribute to.”

The decision

Agent definitions live as YAML files on disk. A new @agent-platform/agent-loader package reads, validates, and resolves them into AgentDefinition instances at load time (Worker startup, or build time for bundled deployments).

The shape is fixed by the existing AgentDefinition type in @agent-platform/core. The loader’s job is to validate YAML input against AgentDefinitionSchema and produce the typed, frozen instance the runtime expects.

What an agent file looks like

A representative example — the actual triage.yaml in apps/worker/agents/:

metadata:
  id: agent-triage
  name: triage
  version: 0.1.0
  role: main

core_context:
  system_prompt:
    file: ./triage.prompt.md
  hard_constraints:
    - 'Never auto-approve a refund over $500.'
    - 'Never disclose customer data to other customers.'

characteristics:
  decision_style: 'concise; favor escalation on uncertainty'
  tone: 'professional, neutral'

tools:
  allowed:
    - shopify_get_order_by_email
    - emit_event

sub_agents:
  - refund_decision

memory_config:
  long_term_enabled: false
  working_memory_window: 20

autonomy:
  max_delegation_depth: 2
  human_approval_required_for: []

model:
  tier: main
  provider: anthropic
  name: claude-sonnet-4-5-20250929

That’s a complete agent. No code; just data.

The seven format choices

The ADR deliberated seven specific decisions:

1. One YAML file per agent. Not multi-agent files. Multi- agent files invite ordering games (one agent’s escalation rule references a sibling agent in the same file), tangle PR review (a one-line edit is buried in a long file), and don’t map cleanly to a future “list of agents → click to edit” UI.

2. System prompt: schema accepts string OR { file: <path> }. Production system prompts are often 100+ lines. Editing them inside YAML’s quoting rules is painful; markdown editors give syntax highlighting; PR diffs of prompt edits read cleanly when the prompt is its own file. But forcing a separate file for every two-line sub-agent is overkill. Both forms are supported.

3. Schema validation: Zod. Consistent with the rest of the codebase. The existing AgentDefinitionSchema in @agent-platform/schemas is the validation target — the loader adapts YAML input to that schema’s shape rather than adding new schemas.

4. Tool references by name. The agent says tools.allowed: [shopify_get_order_by_email]. The loader doesn’t resolve names to handlers — that happens at runtime via the ToolResolver. Loose binding; tools and agents are independently editable.

5. Sub-agent references by name. Same shape as tools. The agent’s YAML lists sub_agents: [refund_decision]; the runtime synthesizes the delegation tool at turn time.

6. Model tier as data, not behavior. model.tier: main or sub_agent; model.name: claude-sonnet-4-5-20250929. The agent’s YAML doesn’t decide whether to call the LLM — that’s the runtime’s job. It decides which model.

7. Hard constraints as a string array. Not a structured DSL. The constraints get composed into the system prompt verbatim; the LLM enforces them. We considered a structured constraint language and rejected it as premature — Phase 1 has no use case that the LLM can’t handle in plain English.

The loader

const loader = createAgentLoader({ logger });
const def = await loader.loadFromFile('./agents/triage.yaml');

The loader:

Reads the YAML file from disk
Resolves any { file: ./prompts/x.md } references to strings
Validates the resolved object against AgentDefinitionSchema
Returns a frozen AgentDefinition instance

If validation fails, it throws AgentDefinitionError with the full Zod issue list. The Worker’s startup catches this and refuses to come up — better to crash on startup than to deploy with a broken agent.

What got considered and rejected

TypeScript agent definitions (the status quo). Worked for scaffolding; doesn’t scale to non-engineers.
JSON instead of YAML. YAML’s multi-line strings make prompt-inline scenarios bearable; JSON’s \n escapes don’t. YAML wins on author ergonomics.
A custom DSL. Premature. YAML + Zod gets us 95% of the way there with no custom parser to maintain.
One big agents.yaml for all agents in a deployment. See decision 1. Rejected.

Where to next

For the original ADR with full Context / Decision / Consequences / Alternatives sections, see ADR-0031 source.

Related decisions:

ADR-0033 — Worker bundling for YAML files (how the agents reach the deployed Worker)
ADR-0006 — six-layer context system (the agent owns layers 1 and 2)
ADR-0022 — delegation as tool (the runtime mechanic that consumes the sub_agents list)