Cloudflare Queues
Cloudflare’s managed message queue service. Used as the platform’s event bus — when an agent decides something needs to happen (escalate to a human, refund an order), it doesn’t act directly; it emits an event onto a queue, and a separate consumer picks it up.
This separation is the platform’s central safety property. The agent recommends; the consumer acts. A human approval gate fits naturally between them.
What we use it for
Two queues:
- `human-review` — events that need a human eye. Today the consumer logs them; Phase 2 wires a real review UI.
- `shopify-actions` — Shopify mutations the platform wants to make (refund, cancel, annotate). Today the consumer logs them; Phase 2 executes them.
Each topic has a typed Zod schema (in packages/event-bus/src/topics/)
that the producer side validates against before publishing. The
consumer side re-validates on receipt. Mismatches bounce to the
consumer’s dead-letter handler.
A single Worker is both producer (via the queue binding’s send)
and consumer (via the queue handler). One deploy handles both
sides.
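A minimal sketch of that dual-role Worker (the binding name `HUMAN_REVIEW`, the event shape, and the inline type stubs are assumptions; the real consumer entry point is `apps/worker/src/queue-consumer.ts`):

```typescript
// Minimal structural types standing in for @cloudflare/workers-types,
// so this sketch is self-contained.
interface Queue { send(body: unknown): Promise<void>; }
interface Message { body: unknown; ack(): void; retry(): void; }
interface MessageBatch { queue: string; messages: Message[]; }
interface Env { HUMAN_REVIEW: Queue; } // assumed binding name

const worker = {
  // Producer side: the agent emits an event instead of acting directly.
  async fetch(_req: Request, env: Env): Promise<Response> {
    await env.HUMAN_REVIEW.send({ topic: "human-review", reason: "refund over threshold" });
    return new Response("queued", { status: 202 });
  },

  // Consumer side: the same deploy also receives batches from the queue.
  async queue(batch: MessageBatch): Promise<void> {
    for (const msg of batch.messages) {
      console.log(`[${batch.queue}]`, msg.body); // Phase 1: logs only
      msg.ack();
    }
  },
};

export default worker;
```

The point of the shape is that `fetch` (or any agent-side code path) only ever calls `send`; nothing in the producer path touches Shopify or a human-facing side effect.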
Why we picked it
The choice was: how do agents and consumers communicate asynchronously?
| Option | Verdict |
|---|---|
| Cloudflare Queues | Chosen. Native Worker binding for both producer and consumer; per-message ack/retry semantics; included in Workers Paid. |
| AWS SQS / EventBridge | Adds REST API + auth tokens + a separate billing surface. Loses the binding-only ergonomics. |
| RabbitMQ self-hosted | We’d run a cluster. Wrong scale for a platform-as-a-service play. |
| In-process events / direct calls | Defeats the safety property. The agent and the action would share a transaction; a buggy agent could cause real-world side effects. |
| Cloudflare Workflows | Strong fit for orchestrated workflows but heavier than what we need; queues are right-sized. |
Queues won because they make the producer and consumer fully decoupled (different deploy units in principle; same Worker in practice today) without adding any external service. See ADR-0032 for the full async-coordination decision.
What it costs
Cloudflare Queues free tier (included with Workers Paid):
- 1M operations per month (publish + consume each count)
Phase 1’s demo emits at most 1-2 events per agent run, and each event costs roughly two operations (one publish, one consume). Even 50,000 runs a month at 2 events per run is 50,000 × 2 × 2 = 200,000 operations — well inside the free tier.
After free tier: $0.40 per million operations.
What it replaces
Section titled “What it replaces”A dedicated message broker (RabbitMQ, AWS SQS, GCP Pub/Sub) with
its own auth, network hop, monitoring, and billing. Queues
reduces this to a [[queues.producers]] and [[queues.consumers]]
declaration in wrangler.toml.
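A sketch of what those blocks look like (queue names, binding names, and batch sizes here are assumptions; the real ones are in `apps/worker/wrangler.toml`):

```toml
# Producer side: gives the Worker a binding it can call .send() on.
[[queues.producers]]
queue = "human-review"
binding = "HUMAN_REVIEW"

[[queues.producers]]
queue = "shopify-actions"
binding = "SHOPIFY_ACTIONS"

# Consumer side: the same Worker receives batches from both queues.
[[queues.consumers]]
queue = "human-review"
max_batch_size = 10

[[queues.consumers]]
queue = "shopify-actions"
max_batch_size = 10
```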
Where to look
- `packages/event-bus/` — the `EventBus` interface plus the Cloudflare Queues implementation and a Mock for tests
- `apps/worker/src/queue-consumer.ts` — the consumer entry point that reads off both queues
- `apps/worker/wrangler.toml` — the `[[queues.producers]]` and `[[queues.consumers]]` blocks
Trade-offs we accepted
- At-least-once delivery, not exactly-once. A message might be delivered to the consumer more than once if the consumer fails after processing but before acking. Idempotency is the consumer’s responsibility — Phase 2’s Shopify mutations will need an idempotency key per `event.id`. Tracked as part of the Phase 2 design.
- No DLQ + retry policy yet. Failed messages today just bubble up; Phase 1’s consumer is logs-only so there’s nothing to retry. Phase 2 wires the DLQ — tracked as follow-up #10.
- Consumer batching. Each invocation receives a batch of messages; the consumer must process all of them and ack individually. Already wired in `apps/worker/src/queue-consumer.ts`.
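Per-message ack and the idempotency concern can be sketched together. This is illustrative only: the in-memory `seen` set does not survive across invocations, so real deduplication would need durable storage (KV, D1, or a Durable Object), and the event shape is assumed:

```typescript
interface Message { body: { id: string }; ack(): void; retry(): void; }
interface MessageBatch { queue: string; messages: Message[]; }

// Illustrative only: an in-memory set is lost between invocations;
// real idempotency needs durable storage keyed by event.id.
export const seen = new Set<string>();

export async function handleBatch(batch: MessageBatch): Promise<void> {
  for (const msg of batch.messages) {
    if (seen.has(msg.body.id)) {
      msg.ack(); // duplicate delivery: acknowledge without re-processing
      continue;
    }
    try {
      seen.add(msg.body.id);
      console.log(`[${batch.queue}] processing`, msg.body.id);
      msg.ack(); // ack each message individually, not the whole batch
    } catch {
      msg.retry(); // leave it on the queue for redelivery
    }
  }
}
```

Acking individually means one poison message doesn’t force redelivery of its whole batch — the healthy messages stay acked and only the failure is retried.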