Order Triage
The platform’s most-exercised scenario. A customer email lands; within a single agent run, it gets classified, a refund decision gets made (informed by the customer’s history), and a customer-facing reply gets drafted — all without a human in the loop.
This is the scenario that makes the platform’s distinctive features concrete: multi-agent delegation, long-term memory, event-driven side effects, and YAML-defined agents.
The setup
Section titled “The setup”A small e-commerce store gets dozens of customer emails per day. Most are mundane: shipping questions, product questions, the occasional complaint. A meaningful slice are refund requests. Triaging them costs time; deciding on them costs judgment (small refunds are obvious, larger or repeat-customer cases are not); and replying to customers in the right tone costs care.
The order-triage scenario automates the first two of these and drafts the third for human review. Three agents collaborate:
triage— the front door. Classifies every incoming email; routes refund cases to the decision sub-agent; forwards everything else to a human queue.refund_decision— the policy authority. Recalls past cases for this customer, applies refund policy, decides: auto-approve, escalate, or deny.communication— the customer-facing voice. Drafts the reply email matching the decision, in the right tone, under 120 words.
The flow
Section titled “The flow”A typical refund case, end to end:
Triage receives the request:
- HTTP client → triage:
POST /runwith{ instructions: "Anna says her towel arrived damaged. She wants a refund.", payload: { email: "anna@example.com" } } - triage runs LLM turn 1, classifies the email as “refund or return request”
- triage runs LLM turn 2, calls
shopify_get_order_by_emailforanna@example.com— finds order#1042, total $49 SEK, ordered 12 days ago - triage runs LLM turn 3, decides to delegate. Calls
delegate_to_refund_decisionwith the order details and the customer’s reasoning
The runtime synthesizes the delegation tool from the sub-agent list and invokes runTurn() recursively:
- triage → refund_decision: sub-agent gets a fresh six-layer context — its own system prompt (the policy authority), its own characteristics (careful, evidence-citing), its own tool list
- refund_decision → long-term memory:
recall_memory("refund history for anna@example.com")returns 1 prior match: a successful $32 refund 4 months ago for a different item - refund_decision runs LLM turn 1, reasons:
- Order total $49 SEK, under the $50 auto-approve threshold ✓
- One prior refund 4 months ago, not a pattern ✓
- Reason given (damaged on arrival), on the auto-approve list ✓
- Order is 12 days old, well within the 30-day window ✓
- Decision:
auto_approve
- refund_decision →
shopify-actionsqueue:emit_eventwith{ action: "refund", order_id: "#1042", amount: 49, reason: "damaged on arrival" } - refund_decision → long-term memory:
store_memory("Approved $49 refund for order #1042, anna@example.com, reason: damaged on arrival, no prior pattern") - refund_decision delegates to
communication
The communication sub-agent drafts the reply:
- refund_decision → communication: sub-agent gets the decision (
auto_approve) and the context (order number, amount, reason) - communication runs LLM turn 1, picks the apologetic-warm-action-oriented tone block, drafts a 47-word reply confirming the refund and setting expectations for the 3-5 business day return
The chain unwinds:
- communication → refund_decision: returns the email body
- refund_decision → triage:
AgentReport { decision: "auto_approve", summary, email_draft } - triage → HTTP client:
200 OKwith the consolidated report
Side effects: the shopify-actions queue now has one event ready for a human-or-machine consumer to act on. In Phase 1 the consumer logs the event and stops; in Phase 2 it’ll execute the actual Shopify mutation behind a human approval gate.
What this demonstrates
Section titled “What this demonstrates”| Platform feature | How this scenario uses it |
|---|---|
| YAML agent definitions (ADR-0031) | All three agents are defined as YAML in apps/worker/agents/; prompts are separate Markdown files |
| Six-layer context (ADR-0006) | Each agent gets fresh layers 1-2 (its identity); layer 4 carries the delegated task; layers 5-6 carry runtime state and recalled memories |
| Delegation as tool (ADR-0022) | delegate_to_refund_decision and delegate_to_communication are synthesized at runtime from the sub-agent lists |
| Long-term memory (ADR-0030) | refund_decision calls recall_memory (read) and store_memory (write); memories scoped per-agent and per-tenant |
| Event emission (ADR-0032) | emit_event puts side-effect requests on the shopify-actions queue; consumers handle them async |
| Custom Shopify tool | shopify_get_order_by_email is a hand-registered tool in the Worker’s tool registry |
| Hard constraints in prompts | Each agent’s YAML lists its hard_constraints; the runtime composes them into the system prompt; the LLM enforces them |
| Model tiering (ADR-0008) | triage and refund_decision use model_tier: main (Sonnet); communication uses model_tier: sub (Haiku — faster and cheaper for short drafting) |
The artifacts
Section titled “The artifacts”The actual YAML and Markdown files that define this scenario. These are bundled into the deployed Worker (ADR-0033); they ship as part of the codebase, not as runtime config.
apps/worker/agents/triage.yaml
Section titled “apps/worker/agents/triage.yaml”apiVersion: agent-platform/v1kind: Agent
metadata: id: agent-triage name: triage version: 0.1.0 role: main tags: [crm, scenario-b]
model_tier: main
core_context: system_prompt: file: ./prompts/triage.md identity: an order-triage agent that classifies incoming customer emails and routes them hard_constraints: - never expose internal Shopify order IDs (gid://shopify/Order/...) in any output - always cite which tool returned the data you reference - never speculate about refund eligibility — that is the refund_decision agent's job
characteristics: personality: methodical, polite, neutral decision_style: balanced tone: professional
tools: - shopify_get_order_by_email - emit_event - delegate_to_refund_decision
sub_agents: - refund_decision
memory_config: working_memory_window: 10 long_term_enabled: false shared_context_scopes: []
autonomy: max_delegation_depth: 2 requires_human_approval: [] allowed_sub_agents: - refund_decision
escalation_rules: - condition: order_not_found target: human reason: cannot find an order matching the customer's email — human should verify the customer - condition: language_uncertain target: human reason: email language is unclear and confidence is lowapps/worker/agents/prompts/triage.md (excerpt)
Section titled “apps/worker/agents/prompts/triage.md (excerpt)”# Order Triage Agent
You are the triage agent for an e-commerce store. You receivecustomer emails and decide what kind of attention each one needs.You are the first point of contact in the platform'sorder-handling flow.
## Your job
For every incoming email, you must:
1. **Identify the customer's intent.** Most emails are one of: - Refund or return request — customer wants money back - Shipping question — where is my order - Product question — features, compatibility, sizing - Complaint — dissatisfaction without a specific request - Other — anything that doesn't fit cleanly
2. **Find the relevant order, if any.** Use `shopify_get_order_by_email` with the customer's email.
3. **Hand off appropriately.** - **Refund or return requests** with a found order → delegate to the `refund_decision` sub-agent - **Anything else** → emit a `human_review` event
## Hard rules
- Never expose internal order IDs (`gid://shopify/Order/...`)- Never speculate about refund eligibility yourself- Always cite which tool you called when explaining what you found
## Tone
Methodical. Polite. Neutral. You're a router, not a salesperson.The full prompt is ~50 lines and lives in the repo at
apps/worker/agents/prompts/triage.md.
apps/worker/agents/refund-decision.yaml
Section titled “apps/worker/agents/refund-decision.yaml”apiVersion: agent-platform/v1kind: Agent
metadata: id: agent-refund-decision name: refund_decision version: 0.1.0 role: sub_agent tags: [crm, scenario-b, policy]
model_tier: main
core_context: system_prompt: file: ./prompts/refund-decision.md identity: a refund-decision sub-agent that applies policy and recalls past cases hard_constraints: - always recall memory before deciding — non-negotiable - never auto-approve refunds over $50 - never deny without escalating to human review first - cite memory results in your reasoning - store the decision before completing
characteristics: personality: careful, evidence-citing, slightly skeptical of unusual cases decision_style: balanced tone: precise
tools: - recall_memory - store_memory - emit_event - delegate_to_communication
sub_agents: - communication
memory_config: working_memory_window: 15 long_term_enabled: true shared_context_scopes: []
autonomy: max_delegation_depth: 1 requires_human_approval: [] allowed_sub_agents: - communication
escalation_rules: - condition: amount_over_threshold target: human reason: order total is above the auto-approve threshold - condition: prior_refund_pattern target: human reason: customer has multiple prior refund requests — pattern needs review - condition: tracking_disputed target: human reason: customer claims non-receipt but tracking shows delivery — needs carrier follow-upapps/worker/agents/prompts/refund-decision.md (excerpt)
Section titled “apps/worker/agents/prompts/refund-decision.md (excerpt)”# Refund Decision Agent
You decide what to do about refund and return requests. You arethe policy authority.
## Your job
Produce one of three decisions:
1. **`auto_approve`** — refund issued without human review: - Order total under $50 - Customer's first refund (check long-term memory) - Reason on the auto-approve list (broken on arrival, wrong item, never received) - Order is within 30 days
2. **`escalate`** — send to human review: - Order total over $50 - Customer has prior refund requests - Reason is unusual or sounds like a policy edge case - Order is 30–90 days old - Anything that smells off
3. **`deny`** — refund refused (rare; only for clear cases): - Order older than 90 days - Customer has been refunded for the same item before - Reason is clearly unreasonable
## Process
1. **Recall memory.** Always start with `recall_memory`.2. **Reason explicitly.** Walk through the decision branch.3. **Emit the decision event.**4. **Store the decision.**
## Hard rules
- Always recall memory before deciding. Non-negotiable.- Never auto-approve over $50.- Never deny without escalating first.- Cite memory in your reasoning.- Store the decision before completing.apps/worker/agents/communication.yaml
Section titled “apps/worker/agents/communication.yaml”apiVersion: agent-platform/v1kind: Agent
metadata: id: agent-communication name: communication version: 0.1.0 role: sub_agent
model_tier: sub
core_context: system_prompt: file: ./prompts/communication.md identity: a customer-communication sub-agent that drafts reply emails matching the upstream decision hard_constraints: - never expose internal Shopify order IDs in customer-facing output - never reference the platform or that an AI is involved - sign every email "The team" — no fake personal names - never invent specifics like refund amounts not in the decision context - cap output at 120 words
characteristics: personality: warm but precise, never over-explains decision_style: conservative tone: professional, brief
tools: []
sub_agents: []
memory_config: working_memory_window: 10 long_term_enabled: false
autonomy: max_delegation_depth: 0 allowed_sub_agents: []The communication agent is intentionally minimal: no tools, no sub-agents, no memory. Pure drafting. Its job is to translate a decision into a customer-appropriate email.
Run it yourself
Section titled “Run it yourself”The full end-to-end demo against deployed Cloudflare
infrastructure is scripted in
apps/worker/scripts/e2e-demo.sh. Operationally:
# 1. Deploy the Worker (one-time)cd apps/workerpnpm wrangler deploy
# 2. Set required secrets (one-time)echo "your-anthropic-key" | pnpm wrangler secret put ANTHROPIC_API_KEYecho "your-openai-key" | pnpm wrangler secret put OPENAI_API_KEYecho "your-shopify-token" | pnpm wrangler secret put SHOPIFY_ACCESS_TOKENecho "$(openssl rand -hex 32)" | pnpm wrangler secret put WORKER_AUTH_TOKEN# (also: SHOPIFY_SHOP_DOMAIN as a plain config var)
# 3. Seed long-term memory with refund history fixtures (one-time)curl -X POST https://<your-worker>.workers.dev/admin/seed-memory \ -H "Authorization: Bearer $WORKER_AUTH_TOKEN"
# 4. Run the order-triage scenarioRUN_E2E=1 \WORKER_URL=https://<your-worker>.workers.dev \WORKER_AUTH_TOKEN="<your-token>" \./apps/worker/scripts/e2e-demo.shThe script asserts the expected tool_calls and decision
shape; it exits non-zero on regression. Total cost per run:
~$0.05 (Anthropic + OpenAI). Total wall time: ~30-60 seconds.
What’s next
Section titled “What’s next”This scenario sits at a deliberate boundary in Phase 1: agents
emit action events but don’t execute them. The
shopify-actions queue consumer logs and stops. Phase 2 closes
that loop:
- Real Shopify mutations — the queue consumer becomes a
Shopify writer.
refundevents trigger actual refunds. - Human approval gates — high-stakes decisions (large refunds, denial communications) wait for human approval before the consumer executes.
- Idempotency keys per
event.id— a delivered-twice event doesn’t refund twice. - Dead letter queues + retry policy — failed mutations get retried with exponential backoff; persistent failures land in a DLQ for human inspection.
The order-triage scenario stays the same; what changes is what happens after the agent emits the event.
Where to next
Section titled “Where to next”- Merchandising — the simpler, already-deployed scenario
- B2B SaaS hypothetical — what a different vertical looks like on the same platform
- Architecture — the synthesis page showing how this scenario maps to the platform’s surface
- ADR-0030 — long-term memory
— the design behind
recall_memory, the most distinctive part of this scenario