# ADR-0019: Provider-agnostic LLM adapter interface
Status: Accepted
Date: 2026-04-21
## Context

ADR-0008 committed the platform to `ModelTier` as the agent-facing abstraction — agents ask for `'critical' | 'main' | 'sub'`, never a specific model name. That ADR left the adapter shape itself undefined: what does a call look like, what does it return, how are errors structured, and where does the tier-to-model mapping happen?
ADR-0013 bars 5 and 10 set hard requirements for anything that makes LLM calls:

- Bar 5: every LLM call produces a traceable structured record (agent id, task id, model, token counts, latency, cost, outcome).
- Bar 10: `TaskConstraints.time_budget_ms` and `cost_budget_usd` must be enforced, not merely received.
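For concreteness, a record satisfying Bar 5 might be shaped like the following sketch. The `LLMCallRecord` name and exact field types are assumptions for illustration, not part of ADR-0013:

```ts
// Hypothetical shape for Bar 5's per-call record; names are illustrative.
interface LLMCallRecord {
  agent_id: string;
  task_id: string;
  model: string;          // concrete model, post tier resolution
  input_tokens: number;
  output_tokens: number;
  latency_ms: number;
  cost_usd: number;
  outcome: 'ok' | 'error';
}
```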
With the foundation layer in place (errors, logger, config) and the LLM adapter being the next component to build, the interface shape can no longer be deferred. A component that doesn’t exist yet cannot cause problems; a component that exists with the wrong shape contaminates every downstream consumer.
This ADR commits the shape. The concrete Anthropic implementation is covered in a separate ADR that will be written alongside the code.
## Decision

Ship two packages:

- `@agent-platform/llm` — the interface, types, error classes, and `MockAdapter`.
- `@agent-platform/llm-<provider>` — one per concrete provider (`-anthropic`, future `-openai`, etc.).
### ModelAdapter interface

```ts
interface ModelAdapter {
  readonly provider: string;
  generate(request: ModelRequest): Promise<ModelResponse>;
}
```

One method. Non-streaming for now (see Alternatives considered). Throws a typed subclass of `AgentPlatformError` for every failure mode; it never resolves with a partial result or an untyped error.
### Request shape

```ts
interface ModelRequest {
  tier: ModelTier;                  // ADR-0008
  system?: string;
  messages: readonly Message[];
  tools?: readonly ModelTool[];
  tool_choice?: ToolChoice;
  max_tokens: number;
  temperature?: number;
  stop_sequences?: readonly string[];
  time_budget_ms?: number;          // enforced, not forwarded
  cost_budget_usd?: number;         // enforced, not forwarded
  abort_signal?: AbortSignal;
}
```

`Message` carries either a string (shorthand) or an array of `ContentBlock`s. The content-block discriminated union is `text | tool_use | tool_result`. The shape deliberately mirrors Anthropic’s Messages API wire format — it is also the cleanest common ground with OpenAI’s Responses API, so translation for non-Anthropic providers is field renaming rather than reshaping.
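A sketch of those two shapes as described. The discriminants (`text | tool_use | tool_result`) come from this ADR; the per-variant field names mirror Anthropic’s wire format and are assumptions here:

```ts
// Sketch only: variant field names beyond the `type` discriminant are assumed.
type ContentBlock =
  | { type: 'text'; text: string }
  | { type: 'tool_use'; id: string; name: string; input: unknown }
  | { type: 'tool_result'; tool_use_id: string; content: string };

interface Message {
  role: 'user' | 'assistant';
  content: string | readonly ContentBlock[]; // string is shorthand for one text block
}
```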
### Response shape

```ts
interface ModelResponse {
  model: string;                    // post tier resolution — for logs
  content: readonly ContentBlock[];
  stop_reason: 'end_turn' | 'max_tokens' | 'stop_sequence' | 'tool_use';
  usage: { input_tokens: number; output_tokens: number; cost_usd: number };
  latency_ms: number;
}
```

Always complete. `cost_usd` is the adapter’s best estimate based on published pricing at call time; pricing changes are handled by redeploying the adapter with an updated internal table.
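A hypothetical consumer sketch showing that every field Bar 5 needs is on the response. An `adapter` in scope is assumed, and the `logger.info` call shape follows the foundation-layer logger only by assumption:

```ts
const response = await adapter.generate({
  tier: 'main',
  messages: [{ role: 'user', content: 'Summarize the incident report.' }],
  max_tokens: 1024,
});

// Bar 5 record, assembled entirely from the response; no extra instrumentation.
logger.info('llm_call', {
  model: response.model,            // concrete model, post tier resolution
  input_tokens: response.usage.input_tokens,
  output_tokens: response.usage.output_tokens,
  cost_usd: response.usage.cost_usd,
  latency_ms: response.latency_ms,
  stop_reason: response.stop_reason,
});
```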
### Error taxonomy — seven concrete classes

Every LLM error extends `AgentPlatformError` (ADR-0017). Each class maps to a distinct caller action:

| Class | `code` | Default severity | Caller action |
|---|---|---|---|
| `LLMAuthError` | `LLM_AUTH_ERROR` | fatal | Operator problem; no retry will help |
| `LLMRateLimitError` | `LLM_RATE_LIMIT` | error | Retry with backoff; `context.retry_after_ms` when known |
| `LLMTimeoutError` | `LLM_TIMEOUT` | error | Our budget fired; retry with a larger budget or degrade |
| `LLMContextLengthError` | `LLM_CONTEXT_LENGTH` | error | Retry only after shrinking the input |
| `LLMUnavailableError` | `LLM_UNAVAILABLE` | error | Retry with backoff; 5xx / network failures |
| `LLMInvalidRequestError` | `LLM_INVALID_REQUEST` | error | Fix the request before retrying; 4xx other than the above |
| `LLMBudgetExceededError` | `LLM_BUDGET_EXCEEDED` | warn | Pre-flight refusal; no provider call was made |
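A minimal sketch of class-based dispatch against this taxonomy. `retryWithBackoff` and `trimMessages` are hypothetical helpers, not part of this ADR, and the `retry_after_ms` cast assumes an untyped `context` bag:

```ts
import { LLMRateLimitError, LLMContextLengthError } from '@agent-platform/llm';

// Hypothetical helpers, declared for the sketch only.
declare function retryWithBackoff<T>(fn: () => Promise<T>, retryAfterMs?: number): Promise<T>;
declare function trimMessages(messages: readonly Message[]): readonly Message[];

async function generateResilient(adapter: ModelAdapter, request: ModelRequest): Promise<ModelResponse> {
  try {
    return await adapter.generate(request);
  } catch (err) {
    if (err instanceof LLMRateLimitError) {
      // context.retry_after_ms is populated when the provider reports it
      return retryWithBackoff(
        () => adapter.generate(request),
        err.context.retry_after_ms as number | undefined,
      );
    }
    if (err instanceof LLMContextLengthError) {
      // Retry only after shrinking the input
      return adapter.generate({ ...request, messages: trimMessages(request.messages) });
    }
    throw err; // auth, invalid request, budget: not retryable here
  }
}
```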
### Platform budgets are enforced, not forwarded

- `time_budget_ms` → the adapter creates an `AbortController` with a scheduled `abort()` call and passes the signal to the SDK. On timeout the adapter throws `LLMTimeoutError` with `elapsed_ms` and `budget_ms` in context.
- `cost_budget_usd` → the adapter computes a conservative pre-flight estimate, using its internal pricing table and an input-token count derived from a rough character-count heuristic (accurate pre-call tokenization is not free). If the estimate already exceeds the budget, the adapter throws `LLMBudgetExceededError` without making the network call.
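A minimal sketch of both checks inside a concrete adapter. `estimateCostUsd`, `callProviderSdk`, and the error constructor shapes are assumptions for illustration:

```ts
import { LLMBudgetExceededError, LLMTimeoutError } from '@agent-platform/llm';

// Hypothetical internals, declared for the sketch only.
declare function estimateCostUsd(request: ModelRequest): number;
declare function callProviderSdk(request: ModelRequest, signal: AbortSignal): Promise<ModelResponse>;

async function generateWithBudgets(request: ModelRequest): Promise<ModelResponse> {
  // Pre-flight cost check: pricing table + char-count heuristic, no network call.
  if (request.cost_budget_usd !== undefined) {
    const estimate_usd = estimateCostUsd(request);
    if (estimate_usd > request.cost_budget_usd) {
      throw new LLMBudgetExceededError({ estimate_usd, budget_usd: request.cost_budget_usd });
    }
  }

  // Time budget: scheduled abort, signal handed to the SDK.
  const controller = new AbortController();
  const timer = request.time_budget_ms !== undefined
    ? setTimeout(() => controller.abort(), request.time_budget_ms)
    : undefined;
  const started = Date.now();
  try {
    return await callProviderSdk(request, controller.signal);
  } catch (err) {
    if (controller.signal.aborted) {
      throw new LLMTimeoutError({ elapsed_ms: Date.now() - started, budget_ms: request.time_budget_ms });
    }
    throw err;
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```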
### Tier-to-model mapping lives inside each concrete adapter

Each `createAnthropicAdapter({ apiKey, modelMap })` / `createOpenAIAdapter({ apiKey, modelMap })` takes its own `modelMap: Record<ModelTier, string>`. There is no central `ModelRouter` today; if runtime provider A/B testing ever matters, a thin wrapper over N adapters is additive work.
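Construction then looks like the following sketch; the model IDs in the map are illustrative placeholders, not a commitment to specific models:

```ts
import { createAnthropicAdapter } from '@agent-platform/llm-anthropic';

// Each adapter instance owns its tier-to-model map; no central router.
const adapter = createAnthropicAdapter({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  modelMap: {
    critical: 'example-large-model-id', // placeholder
    main: 'example-mid-model-id',       // placeholder
    sub: 'example-small-model-id',      // placeholder
  },
});
```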
## Consequences

- ADR-0008 now has a concrete realization. `ModelTier` flows through `ModelRequest.tier`; concrete adapters own the mapping.
- Bar 5 is mechanizable. Every consumer can log `response.latency_ms`, `response.model`, and `response.usage` without additional instrumentation.
- Bar 10 is enforced at the single trust boundary where it matters. Budgets cannot be accidentally bypassed by forgetting to wire them through; the adapter either enforces them or throws.
- Consumers write provider-agnostic code. A task-running function takes `ModelAdapter`, not `AnthropicAdapter`. When a second provider lands, zero consumer-code changes are needed. (See the sketch after this list.)
- Tests are offline by default. `MockAdapter` covers every consumer’s test surface. The Anthropic package will have its own integration tests gated behind an env flag, never spending real API credit on unit-test runs.
- Seven error classes is more surface than a single `LLMError`. Justified because each maps to a distinct caller action. A consumer that handles `LLMRateLimitError` with exponential backoff and retry but handles `LLMContextLengthError` with trim-and-retry needs them as distinct classes, not as `switch (err.code)` strings. The string-switch alternative is strictly worse ergonomics for the same information density.
- Non-streaming only, today. Every consumer we have (and every consumer Phase 1 plans) generates one response per turn. When streaming is actually wanted, it becomes a separate method on `ModelAdapter` (`generateStream`) returning an async iterable — not a reshape of the current `generate`.
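A sketch of the provider-agnostic and offline-testing consequences together. `MockAdapter`’s constructor options are an assumption here; the package’s actual API may differ:

```ts
import { MockAdapter } from '@agent-platform/llm';

// Consumer code depends only on the interface, never on a concrete provider.
async function summarize(adapter: ModelAdapter, text: string): Promise<string> {
  const res = await adapter.generate({
    tier: 'sub',
    messages: [{ role: 'user', content: `Summarize: ${text}` }],
    max_tokens: 256,
  });
  const first = res.content[0];
  return first?.type === 'text' ? first.text : '';
}

// In unit tests the same function runs offline against the mock:
// no provider SDK, no network, no API credit.
const mock: ModelAdapter = new MockAdapter();
await summarize(mock, 'Quarterly report text...');
```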
## Consequences for the repo

- New workspace package: `packages/llm/`. Depends on `@agent-platform/core` (for `ModelTier`) and `@agent-platform/errors`. No external runtime dependencies.
- 80 new tests: taxonomy conformance for the 7 error classes plus `MockAdapter` behavior. Workspace total: 243.
- Next session ships `packages/llm-anthropic/`, which depends on `@anthropic-ai/sdk`. A separate ADR covers the concrete implementation.
## Alternatives considered

- One `LLMError` class with a string `code` field. One class instead of seven; a simpler taxonomy to add to. Rejected: consumers reach for `switch (err.code)` and re-derive the same information; class-based dispatch is strictly richer for the same shape. The seven classes are the distinctions a caller cares about — collapsing them moves work from design time to every call site.
- Streaming-first interface (`AsyncIterable<ModelResponseChunk>`). Every provider supports streaming, and a streaming interface can always be collected into a non-streaming one. Rejected because the non-streaming interface has a clearer error model (one throw or one resolution), and no current or planned consumer needs streaming. Adding `generateStream` later as a separate method is cleaner than having today’s consumers collect-to-complete an async iterable for no reason.
- A central `ModelRouter` today. `createRouter({ anthropic, openai })` returns something that dispatches per tier / per request. Rejected: we have one provider today. Building a router before the second provider exists encodes assumptions we don’t yet have. When the second provider lands, a router is a thin wrapper — not worth the ceremony now.
- Use Vercel’s `ai` SDK (`@ai-sdk/anthropic`, `generateText`, etc.). Widely used; covers streaming, multi-provider support, and tool use out of the box. Rejected because the AI SDK’s abstraction is opinionated in ways that don’t match our needs: its error model is flatter, it couples to React / Next.js idioms in spots, and adopting it means giving up the ADR-0013 bar 5 / bar 10 enforcement guarantees (budgets, per-call audit records) that we need to own at the adapter boundary. When we’ve stabilized enough to know we won’t need to wedge enforcement in at a deeper level, we can reevaluate — but that’s a retrofit worth ~5 days of work, not a 20-minute port.
- Single package (`@agent-platform/llm` with an Anthropic adapter inside). Smaller dependency graph. Rejected because test-setup consumers would pull in the Anthropic SDK transitively whether they use it or not. A two-package split keeps unit tests free of external SDK code and makes future provider additions symmetric (each is its own package, not a fork in a shared package).
- Keep the content-block shape fully generic (`content: string`). Simpler, but it loses tool-use support. Rejected because every realistic agent turn hits tool use, and a fully generic shape would force every caller to round-trip through a typed layer of their own. The Anthropic-aligned shape loses nothing today and lets tool-using agents work natively.
- Model-registry lookup via a separate `ModelRegistry` service. Decouples tier mapping from adapter construction. Rejected: adapter-owns-its-map is simpler, each adapter instance is already provider-specific, and the registry pattern earns its ceremony only when we have enough adapters to need centralized configuration. Today we have none.