ADR-0023: Tool registry — concrete `ToolResolver`

Status: Accepted
Date: 2026-04-23

The agent runtime (ADR-0021) depends on the ToolResolver interface from @agent-platform/core, not a concrete registry. Tests inject mocks. Production has nothing yet.

With delegation shipped (ADR-0022), the platform can execute multi-agent work end-to-end — but callers wiring up an agent still have no prebuilt component that says “here’s the set of tools this agent can call.” Every consumer would have to either roll their own resolver or inline the map. That’s a recipe for inconsistency on what is a security-adjacent surface: a misconfigured resolver that silently shadows a tool, ignores autonomy rules, or accepts unknown tool names in ways that differ from one deployment to another.

This ADR commits the first concrete ToolResolver as its own package. Small by design — the real question is not “how big is the registry” but “what invariants does it enforce on its consumers.”

Ship @agent-platform/tools with a single class ToolRegistry (factory variant createToolRegistry), implementing ToolResolver.

Tools are passed to the constructor; the resulting registry exposes resolve(name), list(), and size but has no register() or unregister() method at runtime.

Why: this is a security property. A compromised or buggy code path at runtime cannot add a tool that bypasses autonomy boundaries, because the set of tools is closed before any request runs. To change the tool set, the operator redeploys with a different constructor argument. The usual argument for a mutable registry, that “tools can be added dynamically based on user input,” describes a feature we deliberately don’t want.
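A minimal sketch of what the closed-at-construction shape could look like. The `Tool` and `ToolResolver` shapes below are hypothetical stand-ins for the types in @agent-platform/core (only `name`, `input_schema`, and a handler taking `unknown` are mentioned in this ADR; the handler's return type is an assumption), and the duplicate check uses a plain `Error` for brevity.

```typescript
// Hypothetical minimal shapes. The real Tool and ToolResolver are defined in
// @agent-platform/core and may carry more fields.
interface Tool {
  readonly name: string;
  readonly input_schema: Readonly<Record<string, unknown>>;
  readonly handler: (input: unknown) => Promise<unknown>;
}

interface ToolResolver {
  resolve(name: string): Tool | null;
}

class ToolRegistry implements ToolResolver {
  // Filled once in the constructor; no method writes to it afterwards.
  private readonly tools: ReadonlyMap<string, Tool>;

  constructor(tools: readonly Tool[]) {
    const byName = new Map<string, Tool>();
    for (const tool of tools) {
      if (byName.has(tool.name)) {
        // Simplified for this sketch: the ADR specifies a ConfigError here.
        throw new Error(`Duplicate tool name: ${tool.name}`);
      }
      byName.set(tool.name, tool);
    }
    this.tools = byName;
  }

  resolve(name: string): Tool | null {
    return this.tools.get(name) ?? null;
  }

  list(): string[] {
    return Array.from(this.tools.keys()).sort();
  }

  get size(): number {
    return this.tools.size;
  }
}

function createToolRegistry(tools: readonly Tool[]): ToolRegistry {
  return new ToolRegistry(tools);
}
```

Because the constructor copies the array into a private map, mutating the caller's array after construction does not change what the registry resolves. There is no `register()` to call, so the only way to widen the tool set is to build a new registry.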

Alternatives considered:

  • Mutable registry with register(). Flexible but opens a real attack surface. Rejected on security grounds.
  • Freeze after a registration phase. Common pattern (registry.seal()). Rejected as ceremony: the constructor-takes-an-array shape is simpler and makes the invariant unmissable at construction sites.
  • Tool loading from a manifest file. Premature; the YAML loader ships in a later session and will compose over this registry, not replace it.

Constructing a registry with two tools that share the same name throws ConfigError, with the duplicated name in context.tool_name.

Why: a configuration bug should fail at startup, not silently shadow a tool at runtime. The scenario this prevents: a team member adds a send_email tool, not realizing another module already registered one under the same name; one of them silently wins. That’s a bug that surfaces as “my tool doesn’t run” days later with no error trail. Fatal-at-construction turns it into a deploy-time failure with a clear error.
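The fail-at-construction check can be sketched as follows. The `ConfigError` class here is a hypothetical stand-in (the platform's real class and constructor signature are not shown in this ADR); only the `context.tool_name` field is taken from the text above, and `indexByName` is an illustrative helper name.

```typescript
// Hypothetical stand-in for the platform's ConfigError.
class ConfigError extends Error {
  readonly context: Readonly<Record<string, unknown>>;

  constructor(message: string, context: Readonly<Record<string, unknown>>) {
    super(message);
    this.name = "ConfigError";
    this.context = context;
  }
}

interface NamedTool {
  readonly name: string;
}

// Builds the name -> tool index, refusing to continue on the first duplicate
// instead of letting either entry silently win.
function indexByName<T extends NamedTool>(tools: readonly T[]): Map<string, T> {
  const byName = new Map<string, T>();
  for (const tool of tools) {
    if (byName.has(tool.name)) {
      throw new ConfigError(`Duplicate tool name: ${tool.name}`, {
        tool_name: tool.name,
      });
    }
    byName.set(tool.name, tool);
  }
  return byName;
}
```

With this shape, two modules that both contribute a `send_email` tool cause the process to fail at startup, and the error names the offending tool.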

Alternatives considered:

  • Last-write-wins. Easiest to implement, worst to debug. Rejected.
  • First-write-wins. Same problem, opposite direction. Rejected.
  • Warn and pick one. Warnings are easy to miss in production logs, so in practice this behaves like silent shadowing. Rejected.

resolve("Get_Weather") returns null even if get_weather is registered. Tool names are an exact API surface.

Why: tools appear in two places — the model’s visible tool list, and the registry lookup — and they must match exactly for the tool loop to work. Fuzzy matching would create confusing behavior: the model might call getWeather, the registry resolves it to get_weather, but the model’s next tool_use block uses the original casing, creating an id mismatch or a second unexpected lookup path. Exact matching is the only option that preserves the invariant “the name the model sees is the name the registry answers to.”
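As a small illustration of exact matching, assuming the registry is backed by a plain `Map` keyed on the tool name (an assumption about the implementation, not something this ADR states):

```typescript
// Hypothetical registry contents; only the lookup rule is the point here.
const toolsByName = new Map<string, { readonly name: string }>([
  ["get_weather", { name: "get_weather" }],
]);

// Exact lookup: no lowercasing, trimming, or camelCase/snake_case conversion.
function resolveExact(name: string): { readonly name: string } | null {
  return toolsByName.get(name) ?? null;
}

const exactMatch = resolveExact("get_weather"); // the registered tool
const wrongCase = resolveExact("Get_Weather"); // null
const camelCase = resolveExact("getWeather"); // null
```

A miss returns `null` rather than a near match, so a misnamed call is reported as an unknown tool rather than being rerouted.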

Today, tool handlers receive unknown. The registry does not parse tool input against the tool’s input_schema before calling the handler. Each tool is responsible for validating its own inputs.

Why defer: input validation is a real ADR on its own. The Tool.input_schema field is currently Readonly<Record<string, unknown>> (a JSON Schema) — the natural validator is a JSON Schema library like ajv. Alternatively, we could change input_schema to accept a Zod schema and auto-derive JSON Schema via z.toJSONSchema for the model-facing representation. Both options have tradeoffs. Rushing the decision into this session risks a worse design than doing it with intent in a dedicated ADR.

What this means in practice: tools defending themselves is fine for the first real apps. The weather tool in apps/example demonstrates the pattern — a handful of lines checking that input is an object and the required field is a string. Not ideal but not dangerous. A misbehaving tool that blindly trusts its input fails in its own handler, not in the registry.
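A sketch of the defensive pattern described above, under stated assumptions: the field name `city`, the error messages, and the handler's return type are hypothetical and not taken from apps/example.

```typescript
// Hypothetical input shape for the weather tool.
interface WeatherInput {
  readonly city: string;
}

// Narrows `unknown` to WeatherInput or throws. The error is raised inside the
// tool's own code path; the registry is not involved.
function parseWeatherInput(input: unknown): WeatherInput {
  if (typeof input !== "object" || input === null) {
    throw new Error("get_weather: input must be an object");
  }
  const city = (input as Record<string, unknown>)["city"];
  if (typeof city !== "string") {
    throw new Error("get_weather: 'city' must be a string");
  }
  return { city };
}

// The handler takes `unknown`, as handlers do today; the return type is assumed.
async function getWeatherHandler(input: unknown): Promise<string> {
  const { city } = parseWeatherInput(input);
  return `Forecast lookup for ${city} would happen here`;
}
```

Keeping the check in a separate synchronous function makes it easy to unit-test without invoking the handler.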

Tracking: new “Tool input validation” entry in open-questions.md. Expected to land within the next 3-4 sessions, likely alongside the first real Business Pack tools.

list() returns a sorted array of tool names. Intended for debug logs, dashboards, and test assertions. The runtime never calls it on the hot path.

Why not hide it: introspection is legitimate for deployments that want to show operators which tools an agent has access to. Debugging “why didn’t the agent use tool X” requires seeing what was registered. Making this observable beats making consumers reach into private fields.

Sorted: deterministic output for tests and diff-friendly logs. Not a contract consumers should depend on for ordering, but useful in practice.
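One way to get the deterministic ordering, assuming names are held in a `Map` (the helper name `listToolNames` is illustrative):

```typescript
function listToolNames(tools: ReadonlyMap<string, unknown>): string[] {
  // Array.prototype.sort with no comparator orders strings by UTF-16 code
  // units, so the result depends on neither locale nor construction order.
  return Array.from(tools.keys()).sort();
}

const listed = listToolNames(
  new Map<string, unknown>([
    ["send_email", {}],
    ["get_weather", {}],
  ]),
);
// listed is ["get_weather", "send_email"]
```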

  • First concrete ToolResolver exists. The first real app can ship.
  • Security-adjacent invariants are enforced uniformly. Every consumer using @agent-platform/tools gets the same immutability and duplicate-detection guarantees. No “every team rolls their own resolver” drift.
  • Input validation deferred. Tools that trust their input can misbehave. Mitigated by: (a) handlers in apps/example demonstrate the defensive pattern, (b) the runtime wraps tool exceptions as soft tool_result errors (ADR-0021) so a bad input at worst fails one iteration, not the whole turn, (c) new tracked open question.
  • No hot-reload. Operators changing the tool set must redeploy. For Worker deploys this is seconds; not a real cost.
  • New package: packages/tools with ToolRegistry + createToolRegistry + 12 tests + README.
  • New app: apps/example — research assistant CLI consuming the registry end-to-end. First thing in the repo that actually runs against the live Anthropic API. See the app’s README for usage.
  • Tests: 14 new across the registry and the example’s smoke test. Workspace total: 376 passed + 2 skipped.

Alternatives considered:

  • Skip the package; inline the map in consumers. Would work, but drifts toward inconsistency as more consumers ship. The registry is ~50 lines of real code plus 100 lines of tests — the abstraction earns its place by making the invariants uniform.
  • Hybrid mutable/immutable with freeze(). Two-phase construction is a common pattern in registries (Spring, ASP.NET). Rejected as ceremony: there’s no good reason in our use case to have a mutable phase; it only exists to support hypothetical future use cases we don’t want.
  • JSON Schema validation at registry level with ajv. Considered; rejected as out of scope for session 9. Ships with its own ADR later.
  • Zod-native Tool.input_schema. Considered; would require changing Tool in core and deriving JSON Schema for the model via z.toJSONSchema. Real option but requires deliberation about whether the provider-agnostic LLM adapter’s ModelTool.input_schema should also change (it expects whatever the provider wants, which is JSON Schema for Anthropic). Deferred to the input-validation ADR.
  • Expose register() gated by a “development mode” flag. Unconditional immutability is simpler and hasn’t cost us anything. If a real use case for dev-mode dynamic registration emerges, revisit then.
  1. Input validation ADR + implementation. Likely the session after the YAML loader; by then the example app will have more than one tool and the case for a uniform validator will be stronger.
  2. Tool manifest loader. The YAML agent definition loader (next session) will reference tool names in agent YAML; a tool manifest YAML could co-exist. Alternatively, tools stay code-only forever (a defensible choice — tools are executable, code is the right medium). Decide alongside the YAML ADR.