Demo
How to deploy the platform yourself, seed it with fixtures, and verify the full order-triage flow runs against real Cloudflare infrastructure (D1, Vectorize, Queues, Durable Objects) plus real Anthropic, OpenAI, and Shopify APIs.
This is the page that turns “the platform exists” into “I just ran it.” Total cost: ~$0.05 per end-to-end run. Total wall time: ~30-60 seconds.
What you’ll see
After the deploy and the four-step verification, you’ll have:
- A Worker deployed at <your-worker>.<your-account>.workers.dev, answering HTTP requests
- 10 refund-history fixtures seeded into long-term memory (Vectorize + D1)
- A successful refund_decision agent run that recalls the fixtures and reasons over them
- A successful triage agent run that delegates the full chain (triage → refund_decision → communication) and returns a structured AgentReport
The script asserts the expected shape at every step. It exits non-zero on regression, which makes it usable as a smoke test on every deploy.
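Because the script signals regressions through its exit code, a deploy step can gate on it directly. The following is a minimal sketch, not part of the repo: `smoke_gate` is a hypothetical helper name, and the commented invocation assumes the script path and environment variables used in the run step further down.

```shell
#!/usr/bin/env bash
# smoke_gate: run a command and turn its exit code into a pass/fail verdict.
# Hypothetical helper; the repo does not ship it.
smoke_gate() {
  if "$@"; then
    echo "smoke test passed"
    return 0
  else
    local code=$?
    echo "smoke test failed (exit $code)" >&2
    return "$code"
  fi
}

# Example (not run here): deploy, then gate on the demo script.
# pnpm wrangler deploy && smoke_gate env RUN_E2E=1 \
#   WORKER_URL="$WORKER_URL" WORKER_AUTH_TOKEN="$WORKER_AUTH_TOKEN" \
#   ./scripts/e2e-demo.sh
```

The helper preserves the original exit code, so a CI job wrapping it still fails with whatever code the script returned.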
What you’ll need
Before starting:
- A Cloudflare account: the free tier is fine for the demo; Workers + D1 + Vectorize + Queues + Durable Objects all have generous free quotas
- An Anthropic API key: the agents use Claude Sonnet 4.5 (main) and Haiku 4.5 (sub-agents). Demo cost: ~$0.04 per run.
- An OpenAI API key: embeddings only (text-embedding-3-small). Demo cost: ~$0.001 per run.
- A Shopify Admin access token: read-only scopes are enough. The demo script uses a single test order lookup. If you don’t have a Shopify store, you can stub the tool; most of the demo runs without it.
- Local toolchain: Node 22+, pnpm 10+, wrangler CLI, jq, and curl
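A quick preflight can catch a missing tool before any Cloudflare resources are created. This is a sketch under assumptions: `major_version` and `check_toolchain` are hypothetical helper names, and wrangler is left out of the loop because the commands on this page run it through pnpm.

```shell
#!/usr/bin/env bash
# major_version: extract the leading major number from "v22.3.0" or "10.4.1".
major_version() {
  local v="${1#v}"
  echo "${v%%.*}"
}

# check_toolchain: report missing tools and versions below the listed minimums.
# Returns 0 when everything is present, 1 otherwise.
check_toolchain() {
  local ok=0 tool
  for tool in node pnpm jq curl; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; ok=1; }
  done
  if command -v node >/dev/null 2>&1 &&
     [ "$(major_version "$(node --version)")" -lt 22 ]; then
    echo "node 22+ required" >&2
    ok=1
  fi
  if command -v pnpm >/dev/null 2>&1 &&
     [ "$(major_version "$(pnpm --version)")" -lt 10 ]; then
    echo "pnpm 10+ required" >&2
    ok=1
  fi
  return "$ok"
}

# Example (not run here):
# check_toolchain && echo "toolchain ok"
```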
The demo, in five sections
1. Clone and install

    git clone https://github.com/cellmanguney/agent-platform.git
    cd agent-platform
    pnpm install

This installs all 16 packages and 2 apps. The first install takes ~1 minute; subsequent installs are fast because pnpm reuses its content-addressable store.
2. Provision Cloudflare resources
The platform’s Worker needs five kinds of Cloudflare binding: a D1 database, a Vectorize index, two Queues, a KV namespace, and a Durable Object class. The wrangler.toml declares them; you create the underlying resources once.

    cd apps/worker

    # 1. D1 database (SQL: long-term memory rows + jobs)
    pnpm wrangler d1 create agent-platform
    # Note the database_id; paste it into wrangler.toml under
    # [[d1_databases]] -> database_id

    # 2. Apply the schema
    pnpm wrangler d1 execute agent-platform --file=./schema/long_term_memory.sql --remote

    # 3. Vectorize index (semantic search over memory)
    pnpm wrangler vectorize create agent-platform-lt-memory \
      --dimensions=1536 \
      --metric=cosine

    # 4. Vectorize metadata indexes (for tenant_id and agent_id filtering)
    pnpm wrangler vectorize create-metadata-index agent-platform-lt-memory \
      --property-name=tenant_id --type=string
    pnpm wrangler vectorize create-metadata-index agent-platform-lt-memory \
      --property-name=agent_id --type=string

    # 5. Queues (event bus)
    pnpm wrangler queues create human-review
    pnpm wrangler queues create shopify-actions

    # 6. KV namespace (jobs index)
    pnpm wrangler kv namespace create JOBS_INDEX
    # Note the id; paste it into wrangler.toml under [[kv_namespaces]] -> id

The Durable Object class (AgentJob) doesn’t need creation: it’s declared in code and bound by wrangler.toml’s [[durable_objects.bindings]] entry.
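The two pasted IDs are easy to forget, so a quick check before moving on may save a failed deploy. The sketch below is a line-oriented grep, not a TOML parser: `toml_key_is_set` is a hypothetical helper, it only confirms that a key has some non-empty quoted value, and it cannot tell a real ID from a leftover placeholder.

```shell
#!/usr/bin/env bash
# toml_key_is_set FILE KEY: succeed if KEY is assigned a non-empty quoted
# value somewhere in FILE. Hypothetical helper; it does not parse TOML tables.
toml_key_is_set() {
  local file="$1" key="$2"
  grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"[^\"]+\"" "$file"
}

# Example (not run here), from apps/worker:
# toml_key_is_set wrangler.toml database_id || echo "D1 database_id not set"
# toml_key_is_set wrangler.toml id          || echo "KV namespace id not set"
```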
3. Set the secrets
The Worker needs four secrets (one per external API key, plus the auth token).

    # Generate the auth token first (you'll need it for the demo)
    WORKER_AUTH_TOKEN=$(openssl rand -hex 32)
    echo "Save this for later: $WORKER_AUTH_TOKEN"

    # Set the secrets on the deployed Worker
    echo "$WORKER_AUTH_TOKEN" | pnpm wrangler secret put WORKER_AUTH_TOKEN
    echo "your-anthropic-key" | pnpm wrangler secret put ANTHROPIC_API_KEY
    echo "your-openai-key" | pnpm wrangler secret put OPENAI_API_KEY
    echo "your-shopify-token" | pnpm wrangler secret put SHOPIFY_ACCESS_TOKEN

There’s also one plain config var (not a secret): the Shopify shop subdomain. Set it in wrangler.toml:

    [vars]
    SHOPIFY_SHOP_DOMAIN = "your-shop-name"  # the subdomain only,
                                            # not "your-shop-name.myshopify.com"

4. Deploy
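A missing secret only surfaces later as a failed demo step, so it can be worth confirming all four names are present around deploy time. This sketch assumes that `wrangler secret list` prints each secret's name in double quotes (current Wrangler versions emit JSON); `missing_secrets` is a hypothetical helper that reads that output on stdin.

```shell
#!/usr/bin/env bash
# missing_secrets: read `wrangler secret list` output on stdin and print each
# required secret name that does not appear in it. Hypothetical helper; it
# relies only on names being double-quoted in the listing.
missing_secrets() {
  local listing name
  listing="$(cat)"
  for name in WORKER_AUTH_TOKEN ANTHROPIC_API_KEY OPENAI_API_KEY SHOPIFY_ACCESS_TOKEN; do
    case "$listing" in
      *"\"$name\""*) ;;
      *) echo "$name" ;;
    esac
  done
}

# Example (not run here):
# pnpm wrangler secret list | missing_secrets
```

Empty output means all four names were found; the check says nothing about whether the values themselves are valid.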

    pnpm wrangler deploy

Wrangler bundles the Worker (including the YAML agents, per ADR-0033), uploads it, and prints the deployment URL. Copy that URL.

    WORKER_URL="https://<your-worker>.<your-account>.workers.dev"

A first sanity check:

    curl "$WORKER_URL/health"
    # Expected: {"status":"ok","ts":"..."}

5. Run the end-to-end demo
The repo ships an e2e-demo.sh script that exercises the full order-triage flow. It’s gated behind RUN_E2E=1 to prevent accidental cost-incurring runs in CI.

    RUN_E2E=1 \
    WORKER_URL="$WORKER_URL" \
    WORKER_AUTH_TOKEN="$WORKER_AUTH_TOKEN" \
    ./scripts/e2e-demo.sh

The script runs four steps:
| Step | What it does | What it asserts |
|---|---|---|
| 1. health | GET /health (no auth) | Worker is reachable; returns status: "ok" |
| 2. seed memory | POST /admin/seed-memory | 10 refund-history fixtures inserted into Vectorize + D1; 0 failures |
| 3. refund_decision direct | POST /run refund_decision for a known customer | The agent calls recall_memory; the recall returns the seeded fixtures; the agent’s summary mentions at least 5 of 8 expected signals (customer email, prior refunds, dates, decision keywords) |
| 4. triage full chain | POST /run triage with a customer email | The full delegation chain runs; the report returns status: "completed" with a non-empty summary |
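The "at least 5 of 8 expected signals" check in step 3 is a threshold match rather than an exact comparison, which tolerates variation in the model's wording. One way such a check could be written is sketched below; the helper names and the example signals are hypothetical, and the actual script may implement it differently.

```shell
#!/usr/bin/env bash
# count_signals SUMMARY SIGNAL...: print how many of the signals occur in
# SUMMARY (case-insensitive, fixed-string match).
count_signals() {
  local summary="$1" hits=0 signal
  shift
  for signal in "$@"; do
    if printf '%s' "$summary" | grep -qiF -- "$signal"; then
      hits=$((hits + 1))
    fi
  done
  echo "$hits"
}

# assert_min_signals MIN SUMMARY SIGNAL...: succeed when at least MIN of the
# signals are present in SUMMARY.
assert_min_signals() {
  local min="$1"
  shift
  [ "$(count_signals "$@")" -ge "$min" ]
}

# Example (not run here), with eight hypothetical signals s1..s8:
# assert_min_signals 5 "$summary" "$s1" "$s2" "$s3" "$s4" "$s5" "$s6" "$s7" "$s8"
```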
If all four steps pass, you’ve just verified:
- The Worker handles HTTP and authentication
- The agent loader bundled the YAMLs correctly
- The agent runtime executes a turn with tool calls
- The Anthropic adapter makes real LLM calls and parses real responses
- The OpenAI adapter embeds memory queries correctly
- Vectorize stores and queries vectors with metadata filters
- D1 persists and hydrates memory rows
- The delegation tool synthesis works
- The recursive runtime invocation works for the triage → refund_decision → communication chain
- Event emission to the queues works (the script doesn’t assert the consumer; it asserts the emit succeeded)
That’s roughly the entire Phase 1 surface, exercised in 30-60 seconds.
What if it fails
The script’s output is structured per-step with green checkmarks and red error blocks. Common failures:
| Symptom | Likely cause |
|---|---|
| Step 1 fails: 403 Forbidden | Worker is deployed but /health is in your custom path matchers; check wrangler.toml |
| Step 2 fails: failed to embed | OPENAI_API_KEY secret is missing or invalid |
| Step 2 fails: vectorize: index not found | The Vectorize index wasn’t created or has a different name |
| Step 3 fails: recall returned 0 matches | Step 2 succeeded but the metadata indexes weren’t created; run the create-metadata-index commands from section 2 (Provision Cloudflare resources), then re-run the seed step, because Vectorize only indexes metadata on vectors written after the index exists |
| Step 4 fails: delegation tool not found | The agent YAMLs weren’t bundled; check that apps/worker/agents/*.yaml is being included by Wrangler |
| Any step fails: invalid bearer token | WORKER_AUTH_TOKEN env var doesn’t match the deployed secret |
For deeper debugging, tail the Worker’s logs:

    pnpm wrangler tail

Every agent turn logs structured JSON: agent_name, task_id, iteration, tool_calls, cost, duration_ms. Search by task_id to follow a single request through the chain.
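Following one request can be as simple as filtering the tail output for its task_id. This is a sketch assuming each log entry lands on a single output line that contains the ID verbatim; `follow_task` is a hypothetical helper, and because it matches a plain substring, an ID such as t-1 would also match t-10.

```shell
#!/usr/bin/env bash
# follow_task TASK_ID: pass through only the stdin lines that mention TASK_ID.
# Fixed-string match, so characters in the ID are never treated as a regex.
follow_task() {
  grep -F -- "$1"
}

# Example (not run here):
# pnpm wrangler tail | follow_task "<task_id>"
```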
Beyond the demo
After the e2e demo passes, three things you can try:

Trigger the merchandising cron manually (exercises the single-agent, no-delegation, no-memory path):

    curl -X POST "$WORKER_URL/admin/run-merchandising" \
      -H "Authorization: Bearer $WORKER_AUTH_TOKEN"

Submit an async job (exercises the Durable Object path):

    curl -X POST "$WORKER_URL/jobs" \
      -H "Authorization: Bearer $WORKER_AUTH_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{
        "agent_name": "triage",
        "instructions": "Customer email: lost shipping label",
        "payload": {"email": "test@example.com"}
      }'
    # Returns { "job_id": "..." }

    # Poll for completion
    curl "$WORKER_URL/jobs/<job_id>" \
      -H "Authorization: Bearer $WORKER_AUTH_TOKEN"

Inspect the seeded memory directly (read what the agent will recall):

    curl "$WORKER_URL/admin/list-memory?agent_id=agent-refund-decision&limit=5" \
      -H "Authorization: Bearer $WORKER_AUTH_TOKEN"

Each of these exercises a different platform feature: cron triggers, async job execution with DO state, admin-mode memory read access.
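For the async job, polling by hand gets tedious, so a small loop can wait for a terminal state. The sketch below makes several assumptions: `poll_until_done` and `job_status` are hypothetical helpers, the `.status` field is a guess at the job response shape, and "failed" is an assumed counterpart to the "completed" status the demo reports.

```shell
#!/usr/bin/env bash
# poll_until_done MAX_ATTEMPTS CMD...: run CMD repeatedly until it prints a
# terminal status, sleeping POLL_INTERVAL seconds (default 2) between tries.
# Prints the terminal status, or "timeout" with a non-zero return.
poll_until_done() {
  local max_attempts="$1" attempt=1 status
  shift
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$("$@")"
    case "$status" in
      completed|failed)
        echo "$status"
        return 0
        ;;
    esac
    sleep "${POLL_INTERVAL:-2}"
    attempt=$((attempt + 1))
  done
  echo "timeout"
  return 1
}

# Hypothetical status fetcher; .status is an assumed field name.
# job_status() {
#   curl -s "$WORKER_URL/jobs/$1" \
#     -H "Authorization: Bearer $WORKER_AUTH_TOKEN" | jq -r '.status'
# }
# poll_until_done 30 job_status "<job_id>"
```

Passing the fetch command as arguments keeps the loop independent of the response format, so only `job_status` needs adjusting once the real shape is known.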
Cleaning up
To remove the deployment when you’re done:

    pnpm wrangler delete --name agent-platform-worker
    pnpm wrangler d1 delete agent-platform
    pnpm wrangler vectorize delete agent-platform-lt-memory
    pnpm wrangler queues delete human-review
    pnpm wrangler queues delete shopify-actions
    pnpm wrangler kv namespace delete --namespace-id "<kv-id>"

The Cloudflare side leaves no residual cost. The deleted Worker and resources are gone within a minute.
Operator notes
A few things worth knowing if you’ll run this regularly:
The demo is self-contained. It seeds the memory it needs at step 2, so you can run it on a fresh deployment without any prep. Re-running step 2 is idempotent (the same fixtures with the same IDs overwrite cleanly).
Embedding costs scale with seed size. Step 2 embeds 10 entries (~$0.0001). A real customer-success or large-store deployment might seed thousands; that’s still pennies, but worth budgeting if you’re seeding tens of thousands.
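To put rough numbers on that, the step 2 figure can be extrapolated linearly. This is a back-of-the-envelope sketch: `seed_cost` is a hypothetical helper, and it assumes every entry is about the same size as the demo fixtures, so 10 entries at ~$0.0001 works out to roughly $0.00001 per entry.

```shell
#!/usr/bin/env bash
# seed_cost ENTRIES: estimate embedding cost in dollars by scaling the demo
# figure (10 entries ~ $0.0001, i.e. about $0.00001 per entry) linearly.
seed_cost() {
  awk -v n="$1" 'BEGIN { printf "%.4f\n", n * 0.00001 }'
}

# Example (not run here):
# seed_cost 50000
```

Under that assumption, 50,000 entries come to about $0.50; longer entries would cost proportionally more.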
The Worker is region-free. Cloudflare runs it at every edge location automatically. There’s no “eu-west-1” decision to make. Latency is roughly: ~10ms Worker overhead + ~3-10s LLM call + ~50-200ms Vectorize/D1 per memory op.
Logs persist for 7 days on Cloudflare’s free tier. Enough
for development; production deployments wire wrangler tail
output to a log sink (Logpush, an HTTPS endpoint, etc.).
Where to next
If you got the demo running:
- Order Triage scenario — the full walkthrough of what just executed
- Architecture — what each Cloudflare resource is doing and why
- Testing — how this e2e demo fits with the other test layers
If you want to build on top:
- YAML agent definitions — how to add a new agent
- B2B SaaS hypothetical — what a different vertical looks like; what you’d ship to build it
If something didn’t work or you have feedback, the repo is github.com/cellmanguney/agent-platform.