Demo

How to deploy the platform yourself, seed it with fixtures, and verify the full order-triage flow runs against real Cloudflare infrastructure (D1, Vectorize, Queues, Durable Objects) plus real Anthropic, OpenAI, and Shopify APIs.

This is the page that turns “the platform exists” into “I just ran it.” Total cost: ~$0.05 per end-to-end run. Total wall time: ~30-60 seconds.

What you’ll see

After the deploy and the four-step verification, you’ll have:

A Worker deployed at <your-worker>.<your-account>.workers.dev answering HTTP requests
10 refund-history fixtures seeded into long-term memory (Vectorize + D1)
A successful refund_decision agent run that recalls the fixtures and reasons over them
A successful triage agent run that delegates the full chain — triage → refund_decision → communication — and returns a structured AgentReport

The demo script (e2e-demo.sh, section 5) asserts the expected shape at every step and exits non-zero on regression, so it doubles as a smoke test after every deploy.

What you’ll need

Before starting:

A Cloudflare account — free tier is fine for the demo; Workers + D1 + Vectorize + Queues + Durable Objects all have generous free quotas
An Anthropic API key — the agents use Claude Sonnet 4.5 (main) and Haiku 4.5 (sub-agents). Demo cost: ~$0.04 per run.
An OpenAI API key — embeddings only (text-embedding-3-small). Demo cost: ~$0.001 per run.
A Shopify Admin access token — read-only scopes are enough. The demo script uses a single test order lookup. If you don’t have a Shopify store, you can stub the tool; most of the demo runs without it.
Local toolchain: Node 22+, pnpm 10+, wrangler CLI, jq, and curl
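
A quick way to confirm the local toolchain before starting is a small preflight check. This is a minimal sketch, not part of the repo: the function names (major_of, require_major, preflight) are hypothetical, and it assumes `node --version` and `pnpm --version` print a version string such as `v22.3.0` or `10.4.1`. Wrangler is invoked through pnpm in this guide, so it is not checked on PATH here.

```shell
# Extract the major version from strings like "v22.3.0" or "10.4.1".
major_of() {
  local v="${1#v}"            # drop a leading "v" if present
  printf '%s\n' "${v%%.*}"    # keep everything before the first dot
}

# Succeed if <tool> --version reports a major version of at least <min>.
require_major() {
  local tool="$1" min="$2" found
  found="$(major_of "$("$tool" --version 2>/dev/null | head -n 1)")"
  [ "${found:-0}" -ge "$min" ] 2>/dev/null
}

# Report missing tools and outdated versions; return non-zero on any problem.
preflight() {
  local status=0 tool
  for tool in node pnpm jq curl; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      status=1
    fi
  done
  require_major node 22 || { echo "need Node 22+" >&2; status=1; }
  require_major pnpm 10 || { echo "need pnpm 10+" >&2; status=1; }
  return "$status"
}

# Usage: preflight && echo "toolchain looks ready"
```

After `pnpm install`, `pnpm wrangler --version` confirms that Wrangler resolves as well.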

The demo, in five sections

1. Clone and install

git clone https://github.com/cellmanguney/agent-platform.git
cd agent-platform
pnpm install

This installs all 16 packages and 2 apps. First install takes ~1 minute; subsequent installs are fast because pnpm reuses packages from its content-addressable store.

2. Provision Cloudflare resources

The platform’s Worker needs five kinds of Cloudflare resources: a D1 database, a Vectorize index, two Queues, a KV namespace, and a Durable Object class. wrangler.toml declares the bindings; you create the resources once.

cd apps/worker

# 1. D1 database (SQL — long-term memory rows + jobs)
pnpm wrangler d1 create agent-platform
# Note the database_id; paste it into wrangler.toml under
# [[d1_databases]] -> database_id

# 2. Apply the schema
pnpm wrangler d1 execute agent-platform --file=./schema/long_term_memory.sql --remote

# 3. Vectorize index (semantic search over memory)
pnpm wrangler vectorize create agent-platform-lt-memory \
  --dimensions=1536 \
  --metric=cosine

# 4. Vectorize metadata indexes (for tenant_id and agent_id filtering)
pnpm wrangler vectorize create-metadata-index agent-platform-lt-memory \
  --property-name=tenant_id --type=string
pnpm wrangler vectorize create-metadata-index agent-platform-lt-memory \
  --property-name=agent_id --type=string

# 5. Queues (event bus)
pnpm wrangler queues create human-review
pnpm wrangler queues create shopify-actions

# 6. KV namespace (jobs index)
pnpm wrangler kv namespace create JOBS_INDEX
# Note the id; paste it into wrangler.toml under [[kv_namespaces]] -> id

The Durable Object class (AgentJob) doesn’t need creation — it’s declared in code and bound by wrangler.toml’s [[durable_objects.bindings]] entry.
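
Before deploying, it can help to confirm that wrangler.toml declares every kind of binding listed above. The sketch below only looks for section headers, not binding names or IDs. The function name is hypothetical, and it assumes the file uses the array-of-tables syntax (`[[vectorize]]`, `[[queues.producers]]`, and so on); if the repo's file uses a different but equivalent TOML layout, adjust the patterns.

```shell
# Print each binding section that the given wrangler.toml does not declare.
# Prints nothing when all five kinds are present.
missing_binding_sections() {
  local file="$1" section
  for section in \
    '[[d1_databases]]' \
    '[[vectorize]]' \
    '[[queues.producers]]' \
    '[[kv_namespaces]]' \
    '[[durable_objects.bindings]]'
  do
    # -F treats the brackets literally instead of as a regex.
    grep -qF -- "$section" "$file" || printf '%s\n' "$section"
  done
}

# Usage (from apps/worker):
#   missing_binding_sections wrangler.toml
#
# To confirm the remote resources exist:
#   pnpm wrangler d1 list
#   pnpm wrangler vectorize list
#   pnpm wrangler vectorize list-metadata-index agent-platform-lt-memory
#   pnpm wrangler queues list
#   pnpm wrangler kv namespace list
```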

3. Set the secrets

The Worker needs four secrets (one per external API key, plus the auth token).

# Generate the auth token first (you'll need it for the demo)
WORKER_AUTH_TOKEN=$(openssl rand -hex 32)
echo "Save this for later: $WORKER_AUTH_TOKEN"

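# Set the secrets on the Worker. It is not deployed yet at this point,
# so Wrangler may ask to create the Worker before storing the first secret.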
echo "$WORKER_AUTH_TOKEN" | pnpm wrangler secret put WORKER_AUTH_TOKEN
echo "your-anthropic-key" | pnpm wrangler secret put ANTHROPIC_API_KEY
echo "your-openai-key"    | pnpm wrangler secret put OPENAI_API_KEY
echo "your-shopify-token" | pnpm wrangler secret put SHOPIFY_ACCESS_TOKEN
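
To double-check that all four secrets landed, you can compare the output of `wrangler secret list` against the expected names. This is a sketch under one assumption: that the listing contains each secret name in double quotes (it is JSON in the Wrangler versions we're aware of). The function name is hypothetical.

```shell
# Read a `wrangler secret list` listing on stdin and print every required
# secret name that does not appear in it.
missing_secrets() {
  local listing name
  listing="$(cat)"
  for name in WORKER_AUTH_TOKEN ANTHROPIC_API_KEY OPENAI_API_KEY SHOPIFY_ACCESS_TOKEN; do
    case "$listing" in
      *"\"$name\""*) ;;                 # found as a quoted name
      *) printf '%s\n' "$name" ;;       # not found: report it
    esac
  done
}

# Usage (from apps/worker):
#   pnpm wrangler secret list | missing_secrets
```

An empty result means every secret is present. Secret values are never printed by `wrangler secret list`, so this is safe to run in shared terminals.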

There’s also one plain config var (not a secret): the Shopify shop subdomain. Set it in wrangler.toml:

[vars]
SHOPIFY_SHOP_DOMAIN = "your-shop-name"  # the subdomain only,
                                        # not "your-shop-name.myshopify.com"
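
If you only have the full store URL to hand, the subdomain can be derived with plain parameter expansion. A small sketch; the function name is hypothetical.

```shell
# Reduce a Shopify store URL or hostname to the bare subdomain.
shop_subdomain() {
  local s="$1"
  s="${s#http://}"            # drop the scheme, if any
  s="${s#https://}"
  s="${s%%/*}"                # drop any path
  s="${s%.myshopify.com}"     # drop the Shopify domain suffix
  printf '%s\n' "$s"
}

# Usage:
#   shop_subdomain "https://your-shop-name.myshopify.com/admin"
#   # prints: your-shop-name
```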

4. Deploy

pnpm wrangler deploy

Wrangler bundles the Worker (including the bundled YAML agents per ADR-0033), uploads it, and prints the deployment URL. Copy that URL.

WORKER_URL="https://<your-worker>.<your-account>.workers.dev"

A first sanity check:

curl "$WORKER_URL/health"
# Expected: {"status":"ok","ts":"..."}
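
A fresh deployment may take a short while to start answering, so a single curl can fail spuriously. The sketch below retries the same health check a few times. The function names and the retry defaults are assumptions; the only thing taken from the demo is the `status: "ok"` response shape.

```shell
# Succeed if a /health response body reports status "ok".
# Tolerates optional whitespace around the colon.
is_healthy() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# Poll <url>/health until it is healthy or the attempts run out.
# Arguments: base URL, attempts (default 10), delay in seconds (default 3).
wait_for_health() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}" body i=1
  while [ "$i" -le "$attempts" ]; do
    body="$(curl -fsS --max-time 10 "$url/health" 2>/dev/null)" || body=""
    if is_healthy "$body"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "not healthy after $attempts attempts" >&2
  return 1
}

# Usage: wait_for_health "$WORKER_URL"
```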

5. Run the end-to-end demo

The repo ships an e2e-demo.sh script that exercises the full order-triage flow. It’s gated behind RUN_E2E=1 to prevent accidental cost-incurring runs in CI.

RUN_E2E=1 \
WORKER_URL="$WORKER_URL" \
WORKER_AUTH_TOKEN="$WORKER_AUTH_TOKEN" \
./scripts/e2e-demo.sh
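
The script's internals aren't reproduced on this page. As an illustration of the opt-in pattern only, a gate like the one described might look like the sketch below; the function names are hypothetical, and the real script may check things differently.

```shell
# Succeed only when the caller has explicitly opted in with RUN_E2E=1.
e2e_enabled() {
  [ "${RUN_E2E:-0}" = "1" ]
}

# Print the name of each required variable that is unset or empty.
missing_env() {
  [ -n "${WORKER_URL:-}" ] || echo WORKER_URL
  [ -n "${WORKER_AUTH_TOKEN:-}" ] || echo WORKER_AUTH_TOKEN
}

# Usage at the top of a cost-incurring script:
#   e2e_enabled || { echo "skipped: set RUN_E2E=1 to run"; exit 0; }
#   [ -z "$(missing_env)" ] || { echo "missing: $(missing_env)" >&2; exit 2; }
```

Exiting 0 when the gate is closed keeps CI green by default, while a missing variable fails loudly.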

The script runs four steps:

Step	What it does	What it asserts
1. health	`GET /health` (no auth)	Worker is reachable; returns `status: "ok"`
2. seed memory	`POST /admin/seed-memory`	10 refund-history fixtures inserted into Vectorize + D1; 0 failures
3. refund_decision direct	`POST /run refund_decision` for a known customer	The agent calls `recall_memory`; the recall returns the seeded fixtures; the agent’s summary mentions at least 5 of 8 expected signals (customer email, prior refunds, dates, decision keywords)
4. triage full chain	`POST /run triage` with a customer email	The full delegation chain runs; the report returns `status: "completed"` with a non-empty summary
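
The "at least 5 of 8 expected signals" check in step 3 is a simple substring count. Here is a sketch of that idea; the function name and the example signals are made up for illustration and are not the script's actual list.

```shell
# Count how many of the given signals occur in a summary.
# Matching is case-insensitive and literal (no regex).
# Arguments: the summary text, then one argument per signal.
count_signals() {
  local summary="$1" hits=0 signal
  shift
  for signal in "$@"; do
    if printf '%s' "$summary" | grep -qiF -- "$signal"; then
      hits=$((hits + 1))
    fi
  done
  printf '%s\n' "$hits"
}

# Usage with a threshold of 5:
#   hits="$(count_signals "$summary" "$email" "prior refunds" ...)"
#   [ "$hits" -ge 5 ] || { echo "only $hits signals found" >&2; exit 1; }
```

A threshold below the full count tolerates variation in how the model phrases its summary while still catching a run where recall returned nothing useful.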

If all four steps pass, you’ve just verified:

The Worker handles HTTP and authentication
The agent loader bundled the YAMLs correctly
The agent runtime executes a turn with tool calls
The Anthropic adapter makes real LLM calls and parses real responses
The OpenAI adapter embeds memory queries correctly
Vectorize stores and queries vectors with metadata filters
D1 persists and hydrates memory rows
The delegation tool synthesis works
The recursive runtime invocation works for the triage → refund_decision → communication chain
Event emission to the queues works (the script doesn’t assert the consumer; it asserts the emit succeeded)

That’s roughly the entire Phase 1 surface, exercised in 30-60 seconds.

What if it fails

The script’s output is structured per-step with green checkmarks and red error blocks. Common failures:

Symptom	Likely cause
Step 1 fails: `403 Forbidden`	Worker is deployed but `/health` is in your custom path matchers — check `wrangler.toml`
Step 2 fails: `failed to embed`	`OPENAI_API_KEY` secret is missing or invalid
Step 2 fails: `vectorize: index not found`	The Vectorize index wasn’t created or has a different name
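Step 3 fails: `recall returned 0 matches`	Step 2 succeeded but the Vectorize metadata indexes didn’t exist when the fixtures were seeded. Create them with the `create-metadata-index` commands in section 2 (Provision Cloudflare resources), then re-run the seed step so the vectors are re-indexed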
Step 4 fails: `delegation tool not found`	The agent YAMLs weren’t bundled — check that `apps/worker/agents/*.yaml` is being included by Wrangler
Any step fails: `invalid bearer token`	`WORKER_AUTH_TOKEN` env var doesn’t match the deployed secret

For deeper debugging, tail the Worker’s logs:

pnpm wrangler tail

Every agent turn logs structured JSON: agent_name, task_id, iteration, tool_calls, cost, duration_ms. Search by task_id to follow a single request through the chain.
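
To follow one request through the chain, filter the log stream by its task_id. The sketch below is a thin wrapper around a literal grep that works on both a live stream and a saved log file; the function name is hypothetical.

```shell
# Keep only the log lines on stdin that mention the given task_id.
# -F matches the ID literally, so characters like "." are not treated as regex.
filter_task() {
  grep -F -- "$1"
}

# Usage on a live stream (runs until interrupted):
#   pnpm wrangler tail --format json | filter_task "<task_id>"
#
# Usage on saved logs:
#   filter_task "<task_id>" < worker.log
#
# Wrangler can also filter server-side:
#   pnpm wrangler tail --search "<task_id>"
```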

Beyond the demo

After the e2e demo passes, three things you can try:

Trigger the merchandising cron manually — exercises the single-agent, no-delegation, no-memory path:

curl -X POST "$WORKER_URL/admin/run-merchandising" \
  -H "Authorization: Bearer $WORKER_AUTH_TOKEN"

Submit an async job — exercises the Durable Object path:

curl -X POST "$WORKER_URL/jobs" \
  -H "Authorization: Bearer $WORKER_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_name": "triage",
    "instructions": "Customer email: lost shipping label",
    "payload": {"email": "test@example.com"}
  }'
# Returns { "job_id": "..." }

# Poll for completion
curl "$WORKER_URL/jobs/<job_id>" \
  -H "Authorization: Bearer $WORKER_AUTH_TOKEN"

Inspect the seeded memory directly — read what the agent will recall:

curl "$WORKER_URL/admin/list-memory?agent_id=agent-refund-decision&limit=5" \
  -H "Authorization: Bearer $WORKER_AUTH_TOKEN"

Each of these exercises a different platform feature: cron triggers, async job execution with DO state, admin-mode memory read access.
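
The "poll for completion" step of the async job example can be wrapped in a loop with a timeout. This is a sketch under assumptions: it assumes the job response carries a top-level `status` field, that `completed` (seen in the demo's reports) and `failed` (a guess) are the terminal values, and the function names are hypothetical. It uses jq, which is already in the toolchain list.

```shell
# Succeed when a job status string is terminal.
# "completed" appears in the demo's reports; "failed" is an assumed value.
is_terminal() {
  case "$1" in
    completed|failed) return 0 ;;
    *) return 1 ;;
  esac
}

# Read a JSON body on stdin and print its top-level "status", if any.
job_status() {
  jq -r '.status // empty'
}

# Poll GET /jobs/<job_id> until the job is terminal or the timeout elapses.
# Arguments: job id, timeout in seconds (default 120).
poll_job() {
  local job_id="$1" timeout="${2:-120}" start status
  start="$(date +%s)"
  while [ $(( $(date +%s) - start )) -lt "$timeout" ]; do
    status="$(curl -fsS "$WORKER_URL/jobs/$job_id" \
      -H "Authorization: Bearer $WORKER_AUTH_TOKEN" | job_status)"
    if is_terminal "$status"; then
      echo "$status"
      [ "$status" = "completed" ]   # non-zero exit for any other terminal state
      return
    fi
    sleep 2
  done
  echo "timed out after ${timeout}s" >&2
  return 1
}

# Usage: poll_job "<job_id>" 180
```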

Cleaning up

To remove the deployment when you’re done:

pnpm wrangler delete --name agent-platform-worker
pnpm wrangler d1 delete agent-platform
pnpm wrangler vectorize delete agent-platform-lt-memory
pnpm wrangler queues delete human-review
pnpm wrangler queues delete shopify-actions
pnpm wrangler kv namespace delete --namespace-id <kv-id>

The Cloudflare side leaves no residual cost. The deleted Worker resources are gone within a minute.

Operator notes

A few things worth knowing if you’ll run this regularly:

The demo is self-contained. It seeds the memory it needs at step 2, so you can run it on a fresh deployment without any prep. Re-running step 2 is idempotent (the same fixtures with the same IDs overwrite cleanly).

Embedding costs scale with seed size. Step 2 embeds 10 entries (~$0.0001). A real customer-success or large-store deployment might seed thousands; that’s still pennies, but worth budgeting if you’re seeding tens of thousands.

The Worker is region-free. Cloudflare runs it at every edge location automatically. There’s no “eu-west-1” decision to make. Latency is roughly 10ms of Worker overhead + 3-10s per LLM call + 50-200ms per Vectorize/D1 memory op.

Logs persist for 7 days on Cloudflare’s free tier. Enough for development; production deployments send logs to a sink (Logpush, an HTTPS endpoint, etc.) rather than relying on wrangler tail, which only streams live output.
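
Going back to the embedding-cost note above, the budget is simple arithmetic: entries × tokens per entry × price per token. The sketch below makes that explicit. Both inputs are assumptions to replace with your own numbers: ~500 tokens per fixture is a guess, and $0.02 per million tokens is the list price for text-embedding-3-small as we understand it at the time of writing. The function name is hypothetical.

```shell
# Estimate embedding cost in dollars.
# Arguments: number of entries, tokens per entry, price per 1M tokens.
embed_cost() {
  LC_ALL=C awk -v n="$1" -v t="$2" -v p="$3" \
    'BEGIN { printf "%.6f\n", n * t * p / 1000000 }'
}

# Usage:
#   embed_cost 10 500 0.02       # prints 0.000100 (the demo's 10 fixtures)
#   embed_cost 50000 500 0.02    # prints 0.500000 (a 50,000-entry seed)
```

Under these assumptions, even a 50,000-entry seed costs about fifty cents, so rate limits and wall time are likely to matter before the embedding bill does.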

Where to next

If you got the demo running:

Order Triage scenario — the full walkthrough of what just executed
Architecture — what each Cloudflare resource is doing and why
Testing — how this e2e demo fits with the other test layers

If you want to build on top:

YAML agent definitions — how to add a new agent
B2B SaaS hypothetical — what a different vertical looks like; what you’d ship to build it

If something didn’t work or you have feedback, the repo is github.com/cellmanguney/agent-platform.