# Cloudflare D1
Cloudflare’s edge-replicated SQLite. Used as the source-of-truth row store for long-term memory, paired with Vectorize for the search index.
## What we use it for

A single table in a single database (`agent_platform`):
```sql
CREATE TABLE long_term_memory (
  id TEXT PRIMARY KEY,           -- ULID
  tenant_id TEXT NOT NULL,       -- structural tenant scope
  agent_id TEXT NOT NULL,        -- which agent this memory belongs to
  content TEXT NOT NULL,         -- the actual memory text
  metadata TEXT NOT NULL,        -- JSON-encoded
  embedding_provider TEXT,       -- e.g. 'openai'
  embedding_model TEXT,          -- e.g. 'text-embedding-3-small'
  embedding_dimensions INTEGER,
  created_at TEXT NOT NULL
);
```

Every long-term memory entry exists as a row here. The vector embedding for the same entry lives in Vectorize. An entry's `id` is the same in both stores: Vectorize's metadata holds just `tenant_id` and `agent_id` for filtering; the full content is hydrated from D1 after the vector search returns the matching IDs.
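The hydration flow above can be sketched in Worker-style TypeScript. The interfaces are minimal stand-ins for the real Workers binding types, and the function names, parameter names, and `topK` of 5 are illustrative assumptions, not the project's actual code:

```typescript
// Sketch of the recall path: Vectorize returns matching ids,
// then D1 hydrates the full rows. Shapes below are minimal
// stand-ins for the Workers runtime bindings.
interface VectorizeMatch { id: string; score: number }
interface VectorizeIndex {
  query(vector: number[], opts: { topK: number; filter?: Record<string, string> }):
    Promise<{ matches: VectorizeMatch[] }>;
}
interface D1Database {
  prepare(sql: string): {
    bind(...args: unknown[]): { all<T>(): Promise<{ results: T[] }> };
  };
}

interface MemoryRow {
  id: string; tenant_id: string; agent_id: string;
  content: string; metadata: string; created_at: string;
}

// Build "?, ?, ?" for an IN (...) clause with n bound parameters.
export function placeholders(n: number): string {
  return Array.from({ length: n }, () => "?").join(", ");
}

export async function recall(
  vectorIndex: VectorizeIndex,
  db: D1Database,
  queryEmbedding: number[],
  tenantId: string,
  agentId: string,
): Promise<MemoryRow[]> {
  // 1. Vector search, filtered on the two metadata keys Vectorize holds.
  const { matches } = await vectorIndex.query(queryEmbedding, {
    topK: 5, // illustrative
    filter: { tenant_id: tenantId, agent_id: agentId },
  });
  if (matches.length === 0) return [];

  // 2. Hydrate full rows from D1; ids are shared across both stores.
  const ids = matches.map((m) => m.id);
  const { results } = await db
    .prepare(`SELECT * FROM long_term_memory WHERE id IN (${placeholders(ids.length)})`)
    .bind(...ids)
    .all<MemoryRow>();
  return results;
}
```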
## Why we picked it

The choice was: where do we keep the source-of-truth row data for long-term memory?
| Option | Verdict |
|---|---|
| Cloudflare D1 | Chosen. Native Worker binding (no auth/network hop), SQL semantics, free-tier covers Phase 1 and well into Phase 2, edge-replicated for read latency. |
| Vectorize alone (store content in vector metadata) | Vectorize metadata has size limits and isn’t designed as a primary store. Loses queryability. |
| External SQL (Postgres on Neon / Supabase / RDS) | Adds a network hop from the Worker, separate auth, separate billing. Loses the no-infra story. |
| Cloudflare KV | Key-value, no queries. Doesn’t fit a row store with metadata filters. |
| R2 (object storage) | Wrong shape entirely; would need a query layer on top. |
D1 won because it’s the closest fit to “I want a relational table available from the Worker with no setup.” See ADR-0032 for the full storage-primitives decision.
## What it costs

D1 free tier (included with Workers Paid):
- 5 GB total storage
- 5M rows read per day
- 100K rows written per day
Phase 1’s seed inserts 10 rows. Each agent run reads at most a handful (the matches from a Vectorize search, hydrated from D1). Tens of thousands of agent runs per day stay within free tier comfortably.
After free tier: $0.75 per GB-month storage; $0.001 per 1K rows read; $1.00 per 1M rows written.
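As a back-of-envelope check, the rates above can be folded into a small overage estimator. The 30-day month, and treating the daily read/write allowances as strictly per-day, are simplifying assumptions on my part:

```typescript
// Rough monthly overage in USD, using the rates and free allowances
// quoted above (5 GB storage, 5M reads/day, 100K writes/day).
// Assumes a 30-day month; not an official billing formula.
export function monthlyOverageUSD(
  storageGB: number,
  rowsReadPerDay: number,
  rowsWrittenPerDay: number,
): number {
  const days = 30;
  const storage = Math.max(0, storageGB - 5) * 0.75;               // $0.75 per GB-month
  const reads =
    (Math.max(0, rowsReadPerDay - 5_000_000) * days / 1_000) * 0.001;   // $0.001 per 1K rows
  const writes =
    (Math.max(0, rowsWrittenPerDay - 100_000) * days / 1_000_000) * 1.0; // $1.00 per 1M rows
  return storage + reads + writes;
}

// A Phase 1-style workload sits entirely inside the free tier:
// monthlyOverageUSD(1, 50_000, 1_000) → 0
```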
## What it replaces

A dedicated relational database (Postgres on Neon, RDS, or similar) with its own connection pool, auth, monitoring, and billing. D1 reduces this to a binding declaration in `wrangler.toml` and a SQL migration file in `apps/worker/migrations/`.
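For reference, that binding declaration looks roughly like this; the binding name `DB`, the id placeholder, and the `migrations_dir` value are illustrative, not copied from the repo:

```toml
# Sketch of the D1 binding in wrangler.toml (values are placeholders).
[[d1_databases]]
binding = "DB"                       # exposed as env.DB in the Worker
database_name = "agent_platform"
database_id = "<uuid-from-dashboard>"
migrations_dir = "migrations"
```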
## Where to look

- `apps/worker/migrations/0001_long_term_memory.sql`: the schema migration; only one migration today
- `packages/memory/src/long-term-memory-storage.ts`: the D1 access layer (typed prepared statements; no ORM)
- `apps/worker/wrangler.toml`: the `[[d1_databases]]` binding
## Trade-offs we accepted

- No ORM. We use raw prepared statements. Drizzle was considered but adds a dependency for one table. Will revisit if the schema grows past ~5 tables.
- Edge replication is read-only. Writes go to a primary region (set at database creation; ours is EEUR per the deploy log). Read replicas in other regions catch up within milliseconds to seconds. For a memory subsystem, eventual consistency on reads is fine: a recall query that misses a just-written entry simply doesn't surface it on this turn.
- No delete-by-metadata. Like Vectorize, D1 doesn't have a declarative metadata index for fast deletes. We use a plain `DELETE ... WHERE tenant_id = ?`, which filters on `tenant_id`. The `tenant_id` column is indexed; this is fine for tens of thousands of rows. Will need re-architecture at multi-million-row scale, which is far away.
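The delete path in the last bullet amounts to two plain SQL statements. The index name below is illustrative (the section says `tenant_id` is indexed but doesn't name the index), and cleaning up the matching Vectorize entries is a separate step:

```sql
-- Index backing the tenant-scoped delete (name is illustrative).
CREATE INDEX IF NOT EXISTS idx_long_term_memory_tenant
  ON long_term_memory (tenant_id);

-- Remove every memory row for one tenant. The corresponding
-- Vectorize entries must be deleted separately by id.
DELETE FROM long_term_memory WHERE tenant_id = ?;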