STORY-F-012: finnest_agents (Orchestrator + AgentSupervisor + BudgetGuard)
- Epic: Agent Infrastructure
- Priority: Must Have
- Story Points: 3
- Status: Not Started
- Assigned To: Unassigned
- Created: 2026-04-17
- Sprint: 3

User Story
As a user (eventually) and developer (now),
I want the agent infrastructure scaffold live — Orchestrator routing intents, a DynamicSupervisor spawning per-session agents with restart: :temporary, and a BudgetGuard circuit breaker gating AI calls by per-org spend,
so that Sprint 4's Agent Chat UI has a working backend and the three-tier agent pattern (ADR-003-F) is structurally real.
Description
Background
ADR-003-F commits to three-tier agents (Conversational / Workflow / Autonomous). ADR-004-F commits to MCP at every domain boundary. This story scaffolds the Tier 1 Conversational infrastructure — Orchestrator, AgentSupervisor, BudgetGuard. Actual conversational behaviour + MCP tool wiring lands in F-013 (AiProvider) and F-014 (MCP Tool behaviour). F-017 in Sprint 4 adds the LiveView UI.
This is infrastructure, not a feature. No user-visible change yet.
Scope
In scope:
- FinnestAgents.Orchestrator: singleton GenServer:
  - route/2: takes {intent :: String.t(), context :: %{org_id, user_id, session_id, correlation_id}} and returns {:pattern, tool_call} OR {:llm, classified_intent} OR {:error, reason}
  - Tier 1 (pattern match): stubs return :no_match for now; real patterns are added in the F-014 sample tool
  - Tier 2 (LLM fallback): stubbed to return {:no_match, :llm_not_wired_yet}; wired in F-013 when AiProvider lands
  - start_session/2: spawns a new conversational agent GenServer via FinnestAgents.AgentSupervisor.start_child/1
  - end_session/1: terminates a session GenServer cleanly
- FinnestAgents.AgentSupervisor: DynamicSupervisor, strategy: :one_for_one, restart: :temporary
- FinnestAgents.Session: GenServer holding conversation state %{session_id, org_id, user_id, messages: [], started_at}
  - handle_cast({:user_message, text}): routes via Orchestrator
  - Stores messages async to the agents.messages table (schema not yet landed; a stub persistence module no-ops for now, with real persistence when the agents schema lands)
  - Idle timeout: 10 min of no messages → graceful shutdown (persist state, then exit)
- FinnestAgents.BudgetGuard: GenServer per-org budget circuit breaker:
  - check/1: takes org_id; returns :ok, {:warning, pct} (≥80%), or {:error, :budget_exceeded} (≥100%)
  - record_spend/3: cast (non-blocking) to accumulate spend for an org; checks thresholds; emits events at the 50/80/95/100% thresholds (PRD E4.7 AC3)
  - Persists budget state to the agents.budget_limits table (stub if the schema has not yet landed; full persistence when the agents schema lands)
  - Budget limits sourced from org settings or config defaults
- FinnestAgents.ToolRegistry: GenServer (F-014 adds real content; this story adds the empty registry + boot-time discovery scaffold)
- Supervision tree: FinnestAgents.Supervisor, one_for_one, starts Orchestrator, AgentSupervisor, ClaudeClient (placeholder until F-013), ToolRegistry, BudgetGuard
- Per AI-03/AI-04: the org_id injection pattern is implemented as part of session context; agents can never override it
- Architecture test agent_supervisor_restart_test.exs: spawn a session, exit(pid, :kill) → confirm the supervisor does NOT restart it (it's :temporary); the session is gone
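The AgentSupervisor and Session items above can be illustrated with a minimal sketch. It assumes hypothetical module names under Sketch.* (not the real FinnestAgents modules) and a simplified session state. Note that in OTP the :temporary restart value is set on the child spec of the session process, while the DynamicSupervisor itself only takes the :one_for_one strategy.

```elixir
defmodule Sketch.Session do
  # Hypothetical stand-in for FinnestAgents.Session.
  # restart: :temporary goes into the generated child spec, so a supervisor
  # never restarts this process, whatever the exit reason.
  use GenServer, restart: :temporary

  def start_link(ctx), do: GenServer.start_link(__MODULE__, ctx)

  @impl true
  def init(ctx) do
    # ctx is assumed to carry session_id, org_id and user_id.
    {:ok, Map.merge(%{messages: [], started_at: DateTime.utc_now()}, ctx)}
  end
end

defmodule Sketch.AgentSupervisor do
  # Hypothetical stand-in for FinnestAgents.AgentSupervisor.
  use DynamicSupervisor

  def start_link(_opts \\ []) do
    DynamicSupervisor.start_link(__MODULE__, :ok, name: __MODULE__)
  end

  # Spawns one session process per conversation.
  def start_session(ctx) do
    DynamicSupervisor.start_child(__MODULE__, {Sketch.Session, ctx})
  end

  @impl true
  def init(:ok), do: DynamicSupervisor.init(strategy: :one_for_one)
end
```

With this shape, starting a session, killing it with Process.exit(pid, :kill), and then calling DynamicSupervisor.count_children(Sketch.AgentSupervisor) reports zero active children, which is the behaviour the architecture test is meant to verify.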
Out of scope:
- Actual Claude API calls (F-013 AiProvider)
- Actual MCP tools (F-014)
- Agent Chat LiveView UI (F-017)
- Tier 2 Workflow agents (Onboarding pipeline, Pay run — Migration Phase 1+)
- Tier 3 Autonomous agents (Compliance Monitor etc. — Migration Phase 1+)
- Agent memory L2/L3 persistence (defer to post-Phase-0)
- Prompt cache GenServer (F-013 adds; observability only in F-020)
Technical Notes
- Module namespace: FinnestAgents.* (flat), not Finnest.Agents.* (dotted). Boundary 0.10.x cannot classify Finnest.Agents.* under a FinnestAgents boundary block, the same constraint F-003 hit for FinnestCore.* and F-006/F-007 carried forward. The finnest_agents OTP app's top-level module is FinnestAgents; all submodules nest under it.
- Orchestrator is a singleton (name: __MODULE__). At scale, multiple nodes each have their own; PubSub can coordinate if needed.
- AgentSupervisor restart: :temporary per architecture §Supervision tree: agents die cleanly when sessions end; no auto-restart with stale context
- Session GenServer state held in memory during process lifetime; persisted to DB when the process exits (or on a checkpoint every N messages, which is deferred)
- BudgetGuard handle_cast/2 for record_spend is async, so the hot path is not blocked. The threshold check happens after the spend is accumulated. Use ETS for fast per-org lookup.
- Correlation IDs propagated into session context and to any subsequent LLM/MCP call
- Max 10 events per correlation chain (AI-05): Orchestrator tracks chain depth via :causation_id and refuses to fire further events beyond the limit
- Idle timeout implementation: Process.send_after/3 with a :timeout message; reset on each user message
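The idle-timeout note can be sketched as follows. This is a minimal illustration under stated assumptions: the module name Sketch.IdleSession and the :idle_ms option are hypothetical, persistence is left as a comment, and the timeout message carries a token ({:timeout, token} rather than a bare :timeout) so that a timer that fired just before being cancelled is ignored instead of shutting down an active session.

```elixir
defmodule Sketch.IdleSession do
  # Hypothetical session process that stops itself after a period of silence.
  use GenServer, restart: :temporary

  # The story specifies 10 minutes; :idle_ms is an illustrative override.
  @default_idle_ms :timer.minutes(10)

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)
  def user_message(pid, text), do: GenServer.cast(pid, {:user_message, text})

  @impl true
  def init(opts) do
    idle_ms = Keyword.get(opts, :idle_ms, @default_idle_ms)
    {:ok, arm(%{messages: [], idle_ms: idle_ms, timer: nil, token: nil})}
  end

  @impl true
  def handle_cast({:user_message, text}, state) do
    # Every user message re-arms the idle timer.
    {:noreply, arm(%{state | messages: [text | state.messages]})}
  end

  @impl true
  def handle_info({:timeout, token}, %{token: token} = state) do
    # Current token: the session really was idle. State would be persisted
    # here before a normal exit; :temporary means no restart follows.
    {:stop, :normal, state}
  end

  def handle_info({:timeout, _stale_token}, state) do
    # A timer from before the latest message; ignore it.
    {:noreply, state}
  end

  # Cancels any pending timer and schedules a fresh one with a new token.
  defp arm(state) do
    if state.timer, do: Process.cancel_timer(state.timer)
    token = make_ref()
    timer = Process.send_after(self(), {:timeout, token}, state.idle_ms)
    %{state | timer: timer, token: token}
  end
end
```

Passing a short idle_ms in tests keeps the lifecycle test (start → message → idle timeout → terminate) fast without changing the production default.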
Dependencies
- Blocked by: STORY-F-003 (Tenant + CorrelationId)
Acceptance Criteria
- finnest_agents app compiles; Application.ensure_all_started(:finnest_agents) returns {:ok, _}
- FinnestAgents.Supervisor starts Orchestrator + AgentSupervisor + ToolRegistry + BudgetGuard (ClaudeClient is a placeholder that F-013 replaces)
- FinnestAgents.Orchestrator.route/2 returns {:error, :no_match} for unknown intents (graceful fallback until F-013/F-014 wire real routing)
- FinnestAgents.Orchestrator.start_session/2 spawns a new Session GenServer; end_session/1 terminates it
- Session GenServer handles the :user_message cast and routes via Orchestrator
- Session GenServer idle timeout: after 10 min of no messages, gracefully terminates; supervisor does NOT restart (:temporary)
- BudgetGuard.check(org_id) returns :ok when under limit; {:warning, 80..99} at 80-99%; {:error, :budget_exceeded} at ≥100%
- BudgetGuard.record_spend/3 is non-blocking (cast); updates accumulated spend; emits threshold events
- Architecture test: kill a session GenServer → AgentSupervisor does NOT restart (restart: :temporary verified)
- Architecture test: spawn 100 concurrent session GenServers → all 100 alive; memory budget <150 MB (1.3 KB base per GenServer × 100 = 130 KB, plus message queue + state)
- Correlation ID propagates from Orchestrator.route → Session.handle_cast → (future) LLM/MCP call
- Max-10 chain limit: attempt 11th causation event → Orchestrator refuses with :chain_limit_exceeded
- mix format, credo --strict, dialyzer, mix boundary all green
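The BudgetGuard criteria above can be sketched as a small GenServer that owns an ETS table. Several details are assumptions made for illustration only: the module name Sketch.BudgetGuard, the record_spend/3 argument order (org_id, amount, metadata), integer amounts such as cents, limits passed in at start rather than read from org settings, threshold events delivered as plain messages to a :notify pid, and unknown orgs treated as :ok.

```elixir
defmodule Sketch.BudgetGuard do
  # Hypothetical per-org budget circuit breaker backed by ETS.
  use GenServer

  @table :sketch_budget_spend
  @thresholds [50, 80, 95, 100]

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  # Reads go straight to ETS, so callers never wait on the GenServer mailbox.
  def check(org_id) do
    case :ets.lookup(@table, org_id) do
      [{^org_id, spent, limit}] -> classify(pct(spent, limit))
      # Assumption: orgs without a configured limit are not gated.
      [] -> :ok
    end
  end

  # Non-blocking: the caller returns immediately; accumulation happens later.
  def record_spend(org_id, amount, meta) do
    GenServer.cast(__MODULE__, {:record_spend, org_id, amount, meta})
  end

  @impl true
  def init(opts) do
    :ets.new(@table, [:named_table, :set, :protected, read_concurrency: true])

    for {org_id, limit} <- Keyword.fetch!(opts, :limits) do
      :ets.insert(@table, {org_id, 0, limit})
    end

    {:ok, %{notify: Keyword.get(opts, :notify)}}
  end

  @impl true
  def handle_cast({:record_spend, org_id, amount, _meta}, state) do
    case :ets.lookup(@table, org_id) do
      [{^org_id, spent, limit}] ->
        new_spent = spent + amount
        :ets.insert(@table, {org_id, new_spent, limit})

        # Threshold check runs after accumulation; one event per threshold crossed.
        for t <- crossed(pct(spent, limit), pct(new_spent, limit)), state.notify != nil do
          send(state.notify, {:budget_threshold, org_id, t})
        end

      [] ->
        :ok
    end

    {:noreply, state}
  end

  # Integer percentage of the limit used; limits are assumed to be positive.
  defp pct(spent, limit), do: div(spent * 100, limit)

  defp classify(pct) when pct >= 100, do: {:error, :budget_exceeded}
  defp classify(pct) when pct >= 80, do: {:warning, pct}
  defp classify(_pct), do: :ok

  defp crossed(old_pct, new_pct) do
    Enum.filter(@thresholds, fn t -> old_pct < t and new_pct >= t end)
  end
end
```

Keeping check/1 as a direct ETS read means the gate on the AI call path does not queue behind pending record_spend casts; the trade-off is that a check can briefly lag spend that has been cast but not yet accumulated.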
Testing Requirements
- Unit: Orchestrator.route happy path + error paths
- Unit: Session GenServer full lifecycle (start → message → idle timeout → terminate)
- Unit: BudgetGuard threshold behaviour (parameterised for 50/80/95/100%)
- Integration: start agent session, send 5 messages, verify routing, verify correlation propagation
- Chaos: 100 concurrent sessions; random kills; supervisor stays alive; no restart of dead sessions
- Property: random message sequences never leak memory (run for 1000 iterations)
References
- ../architecture/agents.md §Three Tiers, §Infrastructure, §Governance
- ../adrs/adr-003-F-three-tier-ai-agent-architecture.md
- ../10-GUARDRAILS.md AI-01, AI-03, AI-04, AI-05, AI-06, AI-08
- ../brainstorms/brainstorm-03-ai-agent-design.md