STORY-F-014: MCP Tool behaviour + ToolRegistry + sample tool

Epic: Agent Infrastructure
Priority: Must Have
Story Points: 3
Status: Not Started
Assigned To: Unassigned
Created: 2026-04-17
Sprint: 3


User Story

As a developer adding agent tools later, I want the MCP Tool behaviour + ToolRegistry infrastructure working with at least one end-to-end sample tool, so that ADR-004-F "MCP at every domain boundary" is structurally real — every subsequent domain just implements more tools using the gold-standard pattern.


Description

Background

ADR-004-F commits to MCP (Model Context Protocol) as the contract between agents and domain OTP apps. Inside one BEAM node, MCP tool invocation is a direct function call for µs latency (Part 6 decision); the protocol shape (typed schemas, category metadata, context injection) is preserved so domain extraction in Phase 3+ is a transport swap, not a rewrite.

This story lands the behaviour, the registry, and one sample tool (FinnestCore.MCP.Tools.GetCurrentUser, registered as core_get_current_user) that the Orchestrator can call end-to-end.

Scope

In scope:

  • FinnestAgents.MCP.Tool behaviour/macro:
      • use FinnestAgents.MCP.Tool, name: "roster_list_shifts", domain: :roster, category: :read, description: "..."
      • input :field_name, :type, required: true, description: "..." macro for declaring input fields
      • output_schema %{...} macro for declaring output shape
      • def call(params, context) callback, provided by the implementer
      • Compile-time generation of: typed input struct, JSON schema for Anthropic tool description, registry entry
  • FinnestAgents.MCP.Server — per-domain server module that registers its tools into the ToolRegistry on boot
  • FinnestAgents.ToolRegistry GenServer (flesh out F-012 skeleton):
      • On boot, discovers all MCP Tool modules via :code.all_loaded/0 + behaviour matching
      • Indexes by: name, domain, category, per-role permission matrix
      • list/0, list_by_category/1, find/1 (by name), invoke/3 (by name + params + context)
      • invoke/3 is the critical path: validates params against input schema, injects org_id from context (AI-03), calls tool's call/2, logs to agents.tool_audit (stub persistence until agents schema lands)
  • Category enforcement (AW-12): :read → no restrictions; :propose → returns proposal, never writes; :execute → writes but must be human-initiated via session; :restricted → agent cannot invoke autonomously (requires explicit session action)
  • Sample tool: FinnestCore.MCP.Tools.GetCurrentUser, name: "core_get_current_user", category: :read, returns the authenticated user (by session user_id); proves the end-to-end path
  • Pattern-match routing in Orchestrator (F-012) extended: "who am I" / "current user" / "me" → core_get_current_user (Tier 1, cost $0)
  • @callback for every domain MCP server to export its tools — registered at app boot
  • Architecture test mcp_org_id_injection_test.exs: attempt to pass org_id as param to a tool → rejected (context-only per AI-03)
  • Architecture test mcp_tool_audit_test.exs: every invoke produces a tool_audit entry (stub persistence ok for now)
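
The use/input DSL listed above could be wired roughly as follows. This is a minimal sketch under stated assumptions, not the story's implementation: Sketch.MCP.Tool is a hypothetical stand-in for FinnestAgents.MCP.Tool, the __tool__/0 function name is invented for illustration, and the :core domain, the :date field and the description strings are placeholders. It leaves out output_schema, the typed input struct and Ecto-based validation, and shows only attribute accumulation plus a compile-time check that call/2 exists.

```elixir
defmodule Sketch.MCP.Tool do
  @moduledoc "Hypothetical, simplified stand-in for FinnestAgents.MCP.Tool."

  @callback call(params :: map(), context :: map()) :: {:ok, term()} | {:error, term()}

  defmacro __using__(opts) do
    quote do
      @behaviour Sketch.MCP.Tool
      import Sketch.MCP.Tool, only: [input: 2, input: 3]
      # Each `input` declaration appends one tuple to this attribute.
      Module.register_attribute(__MODULE__, :mcp_inputs, accumulate: true)
      @mcp_opts unquote(opts)
      @before_compile Sketch.MCP.Tool
    end
  end

  # Declares one input field: input :name, :type, required: true, description: "..."
  defmacro input(name, type, opts \\ []) do
    quote do
      @mcp_inputs {unquote(name), unquote(type), unquote(opts)}
    end
  end

  defmacro __before_compile__(env) do
    # Fail the build (not just warn) when the implementer forgot call/2.
    unless Module.defines?(env.module, {:call, 2}) do
      raise CompileError,
        file: env.file,
        line: env.line,
        description: "#{inspect(env.module)} uses Sketch.MCP.Tool but does not define call/2"
    end

    quote do
      # Registry metadata: the `use` options plus inputs in declaration order.
      def __tool__ do
        @mcp_opts |> Map.new() |> Map.put(:inputs, Enum.reverse(@mcp_inputs))
      end
    end
  end
end

defmodule Sketch.Tools.GetCurrentUser do
  use Sketch.MCP.Tool,
    name: "core_get_current_user",
    domain: :core,
    category: :read,
    description: "Returns the authenticated user for the current session"

  @impl true
  def call(_params, context), do: {:ok, %{user_id: context.user_id}}
end

defmodule Sketch.Tools.ListShifts do
  use Sketch.MCP.Tool,
    name: "roster_list_shifts",
    domain: :roster,
    category: :read,
    description: "Lists shifts (placeholder)"

  input :date, :string, required: true, description: "Hypothetical field"

  @impl true
  def call(_params, _context), do: {:ok, []}
end
```

A registry could then read Sketch.Tools.ListShifts.__tool__() to obtain the name, domain, category and declared inputs without calling the tool itself.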

Out of scope:

  • JSON-RPC transport (Part 6 decision: defer — in-process function calls for Phase 0)
  • Persistence of agents.tool_audit to real DB table (agents schema lands in a later sprint with real migration; stub now)
  • Domain-specific MCP servers (land as each domain's sprints pick up)
  • Tool permission matrix per user role (scaffolded; full rules in Scout+Verify sprints)

Technical Notes

  • Module namespaces: FinnestAgents.MCP.* for behaviour/registry modules (flat under the finnest_agents OTP app); domain sample tool lives under its owning app's top-level module, so FinnestCore.MCP.Tools.GetCurrentUser (not Finnest.Core.MCP.Tools.*). Same Boundary rationale as F-003 and F-012: Boundary cannot classify dotted namespaces under a flat boundary block.
  • The use macro approach mimics Ecto's schema DSL — familiar to Elixir developers
  • Input validation: use Ecto.Changeset under the hood for robust coercion + error messages (stringly-typed LLM output needs rigorous input validation)
  • JSON schema generation: Anthropic expects {name, description, input_schema: {type: "object", properties: {...}, required: [...]}} — generate from the input macro declarations
  • invoke/3 signature: ToolRegistry.invoke(tool_name, params_map, context_map) :: {:ok, output} | {:error, reason}
  • context_map must contain org_id, user_id, session_id, correlation_id — asserted by guard
  • Sample tool call path: Orchestrator pattern match "who am I" → ToolRegistry.invoke("core_get_current_user", %{}, context) → returns {:ok, %{user_id: ..., name: ..., role: ...}} (from current session)
  • agents.tool_audit stub: use a GenServer with ETS backing; persist to DB when the real table exists (later sprint). Keep API stable so swap is trivial.
  • Boot-time tool discovery: call ToolRegistry.scan/0 from FinnestAgents.Supervisor.init/1 after all deps have started; re-scan on hot code upgrade
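
The invoke/3 contract in the notes above can be sketched as follows, assuming a hard-coded tool map instead of GenServer state and a bare ETS table instead of the GenServer-backed audit stub. Sketch.ToolRegistry, start_audit/0, audit_entries/0 and the {:error, :org_id_not_allowed} shape are hypothetical; the story only says that org_id in params is rejected. :erlang.phash2/1 stands in for whatever hashing the real audit log would use, and this sketch audits only calls that reach the tool.

```elixir
defmodule Sketch.ToolRegistry do
  @moduledoc "Hypothetical, simplified stand-in for FinnestAgents.ToolRegistry.invoke/3."

  @audit_table :sketch_tool_audit

  # Creates the in-memory audit table; the calling process owns it.
  def start_audit, do: :ets.new(@audit_table, [:named_table, :public, :duplicate_bag])

  def audit_entries, do: :ets.tab2list(@audit_table)

  # The function head asserts that all four context keys are present.
  def invoke(name, params, %{org_id: _, user_id: _, session_id: _, correlation_id: _} = context) do
    with :ok <- reject_org_id(params),
         {:ok, tool} <- find(name) do
      started = System.monotonic_time(:millisecond)
      result = tool.(params, context)
      audit(name, params, result, System.monotonic_time(:millisecond) - started, context)
      result
    end
  end

  # Any context missing one of the required keys lands here.
  def invoke(_name, _params, _context), do: {:error, :missing_context}

  # org_id may only come from context (AI-03); the error shape is an assumption.
  defp reject_org_id(params) do
    if Map.has_key?(params, :org_id) or Map.has_key?(params, "org_id") do
      {:error, :org_id_not_allowed}
    else
      :ok
    end
  end

  defp find(name) do
    case Map.fetch(tools(), name) do
      {:ok, tool} -> {:ok, tool}
      :error -> {:error, :tool_not_found}
    end
  end

  # Stand-in for the scanned registry: tool name => 2-arity function.
  defp tools do
    %{"core_get_current_user" => fn _params, ctx -> {:ok, %{user_id: ctx.user_id}} end}
  end

  defp audit(name, params, result, duration_ms, context) do
    entry = %{
      tool_name: name,
      input_hash: :erlang.phash2(params),
      output_hash: :erlang.phash2(result),
      duration_ms: duration_ms,
      correlation_id: context.correlation_id
    }

    :ets.insert(@audit_table, {context.correlation_id, entry})
  end
end
```

Keeping audit/5 behind a private function is one way to keep the API stable when the ETS stub is later swapped for a real table.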

Dependencies

  • Blocked by: STORY-F-012 (AgentSupervisor + Orchestrator + placeholder ToolRegistry)
  • Blocks: Sample tool execution path; Scout+Verify sprints' domain MCP implementations

Acceptance Criteria

  • FinnestAgents.MCP.Tool macro compiles a tool module cleanly
  • Deliberately broken tool (missing call/2) → compile error with clear message
  • ToolRegistry.scan/0 discovers the sample tool module on boot
  • ToolRegistry.list/0 returns a map containing at least {"core_get_current_user", %{...metadata...}}
  • ToolRegistry.invoke("core_get_current_user", %{}, context) returns {:ok, %{user_id: ..., ...}}
  • Attempt to invoke with missing context fields → {:error, :missing_context}
  • Attempt to pass org_id in params → rejected; context-only enforcement verified
  • Invalid tool name → {:error, :tool_not_found}
  • Invalid input params → {:error, {:invalid_input, [field: reason, ...]}}
  • Every successful invoke writes an entry to agents.tool_audit (ETS stub for now): tool_name, input_hash, output_hash, duration_ms, correlation_id
  • Orchestrator pattern match routes "who am I" → core_get_current_user tool → returns user info (end-to-end path works)
  • JSON schema generation: the sample tool's input_schema matches the expected Anthropic tool-definition shape (unit test asserts)
  • Architecture tests pass (org_id injection + audit entry creation)
  • mix format, credo --strict, dialyzer, mix boundary all green
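
For the JSON schema criterion above, a unit test could compare against output shaped like the following sketch. It assumes input declarations arrive as {field, type, opts} tuples and that only four scalar types are supported; the story does not list the supported types, so the type mapping and the module name Sketch.JsonSchema are hypothetical.

```elixir
defmodule Sketch.JsonSchema do
  @moduledoc "Hypothetical helper: input declarations -> Anthropic tool-definition map."

  # Assumed mapping from DSL types to JSON Schema types.
  @types %{string: "string", integer: "integer", float: "number", boolean: "boolean"}

  def tool_definition(name, description, inputs) do
    # Only fields declared with `required: true` end up in the required list.
    required =
      for {field, _type, opts} <- inputs, opts[:required] do
        Atom.to_string(field)
      end

    %{
      name: name,
      description: description,
      input_schema: %{
        type: "object",
        properties: Map.new(inputs, &property/1),
        required: required
      }
    }
  end

  # Builds one "field name" => schema pair; raises on an unmapped type.
  defp property({field, type, opts}) do
    schema = %{type: Map.fetch!(@types, type)}

    schema =
      case opts[:description] do
        nil -> schema
        text -> Map.put(schema, :description, text)
      end

    {Atom.to_string(field), schema}
  end
end
```

For the sample tool, which declares no inputs, this yields an input_schema with empty properties and an empty required list.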

Testing Requirements

  • Unit: Tool macro correctly generates input struct + JSON schema
  • Unit: ToolRegistry list/find/invoke happy paths + error paths
  • Unit: Category enforcement (call a :restricted tool via invoke/3 → rejected unless explicitly authorised)
  • Integration: Orchestrator → ToolRegistry.invoke → Sample tool → response — full chain
  • Property: random input maps → invoke/3 always returns {:ok, _} or {:error, _}; no raises
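
The category enforcement unit test listed above needs a decision function to exercise. One possible shape, assuming the caller is tagged as :agent (autonomous) or :human_session (explicit session action), is sketched below. The initiator tag, the module name Sketch.CategoryPolicy and the error tuples are hypothetical; the rule that :propose tools never write cannot be checked here and would have to be enforced in the tools themselves.

```elixir
defmodule Sketch.CategoryPolicy do
  @moduledoc "Hypothetical AW-12 check: may this initiator invoke a tool of this category?"

  # :read has no restrictions; :propose only returns a proposal.
  def authorize(category, _initiator) when category in [:read, :propose], do: :ok

  # :execute and :restricted require an explicit, human-initiated session action.
  def authorize(category, :human_session) when category in [:execute, :restricted], do: :ok

  # An agent acting autonomously is refused for both write-capable categories.
  def authorize(category, :agent) when category in [:execute, :restricted] do
    {:error, {:forbidden, category}}
  end

  # Anything else is rejected rather than raising, in line with the no-raise property.
  def authorize(_category, _initiator), do: {:error, :unknown_category}
end
```

A registry could call authorize/2 before dispatching to call/2, so a :restricted tool invoked autonomously never reaches its implementation.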

References