# Core concepts
The platform is built around four primitives. Once you understand them, every example, every scenario, and every architectural choice makes sense as a variation on these four ideas.
## Agents

An agent is a definition file that says who an LLM-powered actor is and what it’s allowed to do. In the platform, agent definitions are YAML — one file per agent.
Every agent has:
- An identity. A name, a role (`main` for top-level agents that receive tasks; `sub_agent` for ones that only get delegated to), and a model tier (`main` agents typically run on Claude Sonnet; sub-agents on Haiku to save cost).
- A system prompt. The persistent instructions that define how the agent thinks. This is where business expertise gets encoded: tone, decision criteria, escalation policy, what counts as “good” work for this agent’s role.
- A characteristics block. Personality, decision style, tone. Slightly fuzzier than the system prompt but reaches the model the same way; mostly useful for keeping agent voice consistent across responses.
- A tools list. The names of tools this agent is permitted to use. Even if a tool exists in the platform, an agent can only call it if it’s listed here. This is structural, not advisory — the runtime enforces it.
- A sub-agents list. Other agents this agent can delegate to. Delegation is a tool call: when `triage` decides it needs the refund-decision agent’s judgment, it calls a `delegate_to_refund_decision` tool with structured inputs, and the runtime spins up the sub-agent with its own context, runs its own loop, and returns a structured report.
- A memory configuration. Whether long-term memory is enabled for this agent (most aren’t — only agents that benefit from persistent recall, like `refund_decision`, opt in), and what the working-memory window size is.
- Autonomy bounds. How deep the delegation chain can go from this agent (so a buggy agent can’t recurse forever), and what human approvals are required before certain actions can complete (Phase 2 territory).
That’s the entire agent. No code, no class hierarchy, no orchestration logic. The runtime reads the YAML, assembles the right context, and runs the loop.
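Put together, such a file might look like the following. This is a hypothetical sketch to make the fields concrete — the field names and values are illustrative assumptions, not the platform’s exact schema:

```yaml
# Hypothetical agent definition — field names are illustrative,
# not the platform's exact schema.
name: refund_decision
role: sub_agent            # only receives delegated tasks
model_tier: haiku          # sub-agents run on a cheaper model
system_prompt: |
  You decide whether a refund request should be auto-approved,
  denied, or escalated to a human.
characteristics:
  tone: concise
  decision_style: conservative
tools:
  - recall_memory
  - store_memory
  - emit_event
sub_agents:
  - communication
memory:
  long_term: true          # opt-in; most agents leave this off
  working_window: 20
autonomy:
  max_delegation_depth: 2  # bounds recursion through sub-agents
```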
## Tools

A tool is one specific thing an agent can do. The runtime defines the interface: a tool has a name, a description (the LLM reads this to decide whether to call it), an input schema (Zod-validated), and a handler function that does the work and returns structured output.
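That interface can be sketched in TypeScript. The platform validates inputs with Zod; a plain `validate` function stands in here so the sketch stays dependency-free, and all names below are illustrative assumptions:

```typescript
// Illustrative sketch of the tool interface described above.
// The real platform uses Zod schemas; a plain validator stands in here.
type ToolResult = { ok: boolean; data?: unknown; error?: string };

interface Tool<Input> {
  name: string;
  description: string;               // the LLM reads this to decide on a call
  validate: (raw: unknown) => Input; // throws on bad input (Zod's role)
  handler: (input: Input) => Promise<ToolResult>;
}

// A hypothetical lookup tool in the shape of shopify_get_order_by_email,
// with a stubbed handler.
const getOrderByEmail: Tool<{ email: string }> = {
  name: "shopify_get_order_by_email",
  description: "Look up a customer's recent orders by email address.",
  validate: (raw) => {
    const r = raw as { email?: unknown };
    if (typeof r.email !== "string" || !r.email.includes("@")) {
      throw new Error("email must be a valid email string");
    }
    return { email: r.email };
  },
  handler: async (input) => ({
    ok: true,
    data: { email: input.email, orders: [] }, // stubbed response
  }),
};
```

The point of the shape: the model only ever sees `name` and `description`; the runtime owns validation and execution.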
Three categories of tools exist on the platform:
- Built-in tools that every agent can opt into: `recall_memory` (semantic search over the agent’s long-term memory), `store_memory` (write a new entry), and `emit_event` (publish an event onto the bus for downstream handling).
- Business-pack tools that come from a vertical-specific package. In the e-commerce pack, `shopify_get_order_by_email` looks up recent orders for a customer. Future Phase 2 tools will mutate Shopify (refund, cancel, annotate) once the human-approval gate is designed.
- Delegation tools that are auto-generated from each agent’s `sub_agents` list. If `triage` lists `refund_decision` as a sub-agent, the runtime synthesizes a `delegate_to_refund_decision` tool whose input is the task spec for the sub-agent.
Tools are the security boundary. An agent can only do what its tools permit. Adding a new capability is shipping a new tool — not editing the agent.
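The allowlist plus auto-generated delegation tools can be sketched as follows. This is a minimal illustration of the idea, not the platform’s implementation; all names are assumptions:

```typescript
// Sketch: build an agent's effective tool set from its definition.
// Delegation tools are synthesized from sub_agents; anything not in
// the resulting set is structurally uncallable. Names are illustrative.
interface AgentDef {
  name: string;
  tools: string[];      // explicitly permitted tool names
  sub_agents: string[]; // delegation targets
}

function effectiveTools(def: AgentDef): Set<string> {
  const synthesized = def.sub_agents.map((s) => `delegate_to_${s}`);
  return new Set([...def.tools, ...synthesized]);
}

function assertCallable(def: AgentDef, toolName: string): void {
  if (!effectiveTools(def).has(toolName)) {
    // Structural enforcement: the runtime refuses the call outright,
    // regardless of what the model asked for.
    throw new Error(`${def.name} is not permitted to call ${toolName}`);
  }
}
```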
## Memory

Memory is the platform’s hardest, most distinctive idea. Every agent reads from up to six layers of context, in strict priority order. Higher layers can never be overridden by lower ones. The runtime enforces this.
| Layer | What it is | Mutability |
|---|---|---|
| 1. Core context | System prompt + hard constraints from the YAML | Immutable. Compiled once, frozen. |
| 2. Characteristics | Personality, decision style, tone | Immutable. Same. |
| 3. Shared context | Read-only data shared across agents (e.g., today’s date, the current tenant) | Read-only. Set per request. |
| 4. Delegated context | Per-task input from a parent agent (the sub-agent’s instructions, payload, expected output schema) | Per-task. Set when delegated. |
| 5. Working memory | The current conversation: messages, tool results, intermediate reasoning | Sliding window per turn. |
| 6. Long-term memory | Persistent vector search per agent, across all past turns | Read-write across turns; opt-in per agent. |
The priority order is the security model. A hostile or buggy sub-agent cannot override the core context of its parent. A user input cannot override the agent’s system prompt. A retrieved memory cannot override the agent’s hard constraints. The platform calls this the `validateNoOverride()` guarantee, and the runtime applies it on every context assembly.
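One way such a guarantee might work is to merge layers in priority order and reject any lower layer that tries to redefine a key a higher layer already set. This is a sketch of the idea only — the platform’s actual merge strategy and key granularity are not documented here:

```typescript
// Sketch of a no-override check in the spirit of validateNoOverride():
// layers are merged highest-priority first; a lower-priority layer may
// add keys but never redefine one a higher layer already set.
// The key-level merge strategy is an assumption for illustration.
type ContextLayer = Record<string, unknown>;

function assembleContext(layers: ContextLayer[]): ContextLayer {
  const merged: ContextLayer = {};
  for (let i = 0; i < layers.length; i++) {
    for (const [key, value] of Object.entries(layers[i])) {
      if (key in merged) {
        throw new Error(
          `layer ${i + 1} attempted to override '${key}' set by a higher layer`
        );
      }
      merged[key] = value;
    }
  }
  return merged;
}
```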
Long-term memory specifically is implemented as Cloudflare Vectorize (vector search for semantic recall) plus Cloudflare D1 (the source-of-truth row store). When an agent calls `recall_memory("refund history for sara@example.com")`, the runtime embeds the query, searches Vectorize for the top matches, hydrates the full content from D1, and hands the results back to the agent as a structured tool result. The agent then reasons over what it found.
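The shape of that flow can be sketched locally. Here an in-memory array stands in for Vectorize, a map stands in for D1, and a toy bag-of-characters embedding stands in for a real embedding model — all of these are assumptions for illustration, not the platform’s code:

```typescript
// Local sketch of the recall_memory flow: embed the query, rank against
// an in-memory vector index (standing in for Vectorize), then hydrate
// full rows from a map (standing in for D1).
function embed(text: string): number[] {
  // Toy embedding: letter-frequency vector, purely for illustration.
  const v = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const i = ch.charCodeAt(0) - 97;
    if (i >= 0 && i < 26) v[i] += 1;
  }
  return v;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2;
  }
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

const index: { id: string; vector: number[] }[] = []; // "Vectorize"
const rows = new Map<string, string>();               // "D1"

function storeMemory(id: string, content: string): void {
  index.push({ id, vector: embed(content) });
  rows.set(id, content);
}

function recallMemory(query: string, topK = 3): string[] {
  const q = embed(query);
  return index
    .map((e) => ({ id: e.id, score: cosine(q, e.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((m) => rows.get(m.id)!); // hydrate source-of-truth content
}
```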
This is why agents remember — not because the model has long context windows, but because the platform persists the right things and surfaces them on demand.
## Events

An agent never directly mutates the outside world. When `refund_decision` decides Sara’s refund should be auto-approved, it doesn’t call Shopify’s refund API. It emits an event:

```
topic: shopify_actions
payload: {
  action_type: "refund",
  order_id: "gid://shopify/Order/12345",
  amount: "49.00",
  currency_code: "SEK",
  reason: "Auto-approve: under $50, first refund, within 30 days",
  decided_by_agent: "agent-refund-decision"
}
```

That event lands in a Cloudflare Queue. A separate consumer (today logs-only; Phase 2 will execute the mutation) picks it up, validates the payload, and acts on it.
This separation is the platform’s safety property. The agent recommends; the consumer acts. A human approval gate fits naturally between them: the consumer can route certain events to a human-review queue, wait for approval, and only then execute. The agent’s reasoning is captured for audit; the action’s authorization is captured separately.
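A consumer with such a gate might route events like this. The event shape and routing rules below are illustrative assumptions, not the platform’s implementation:

```typescript
// Sketch of a queue consumer sitting between agent and action.
// Event shape, thresholds, and routing rules are illustrative assumptions.
interface AgentEvent {
  topic: "shopify_actions" | "human_review";
  payload: { action_type?: string; amount?: string; [k: string]: unknown };
}

type Route = "execute" | "await_human_approval" | "log_only";

function routeEvent(event: AgentEvent, phase2Enabled: boolean): Route {
  if (!phase2Enabled) return "log_only";        // v1: consumer only logs
  if (event.topic === "human_review") return "await_human_approval";
  // Even for shopify_actions, gate large refunds behind a human.
  const amount = Number(event.payload.amount ?? 0);
  if (event.payload.action_type === "refund" && amount >= 50) {
    return "await_human_approval";
  }
  return "execute";
}
```

The agent never learns whether the gate exists; its job ends at emitting a well-formed event.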
Two topics exist today:
- `human_review` — events emitted when an agent decides a case needs human attention. Today’s consumer just logs them; Phase 2 introduces a real review UI.
- `shopify_actions` — events emitted when an agent decides something should happen in Shopify. Same logs-only treatment in v1; Phase 2 wires real mutations.
## How they fit together

A single triage scenario weaves all four:
- An HTTP request hits `/run` with `{ agent_name: "triage", instructions: "...", payload: {...} }`. The runtime loads the triage agent from YAML.
- The agent’s system prompt and characteristics (layers 1+2) are baked into its context.
- Shared context (current date, etc.) and delegated context (the request’s instructions and payload) get added (layers 3+4).
- Working memory starts empty (layer 5).
- Long-term memory isn’t enabled for `triage` itself — it’s enabled for `refund_decision` (layer 6 only matters when the chain reaches that sub-agent).
- The triage agent’s LLM turn runs. It sees its allowed tools: `shopify_get_order_by_email`, `emit_event`, and the auto-generated `delegate_to_refund_decision`.
- It calls `shopify_get_order_by_email` to look up the customer’s recent orders.
- It then calls `delegate_to_refund_decision` with a structured task. The runtime spins up `refund_decision` as a sub-agent with its OWN six-layer context (its own system prompt, its own tools, its own long-term memory).
- `refund_decision` calls `recall_memory` against its long-term memory store. The runtime embeds the query, searches Vectorize, hydrates from D1, and returns the matches.
- `refund_decision` reasons, then either calls `delegate_to_communication` (delegate further), or `emit_event` (publish to the bus), or both. It returns a structured report to its parent.
- `triage` sees the report, may emit its own events, and returns a final summary to the caller.
Every step uses one or more of the four primitives. There is no fifth primitive. Once you have these, you have multi-agent business automation.
## Where to go next

- Glossary — quick reference for terms used across the docs
- Tech section (commit 2) — how each of these primitives is implemented in code
- Scenarios section (commits 5–7) — concrete walkthroughs that build intuition for the patterns