STORY-F-012: finnest_agents — Orchestrator + AgentSupervisor + BudgetGuard

Epic: Agent Infrastructure
Priority: Must Have
Story Points: 3
Status: Not Started
Assigned To: Unassigned
Created: 2026-04-17
Sprint: 3


User Story

As a user (eventually) and developer (now), I want the agent infrastructure scaffold live — Orchestrator routing intents, a DynamicSupervisor spawning per-session agents with restart: :temporary, and a BudgetGuard circuit breaker gating AI calls by per-org spend, so that Sprint 4's Agent Chat UI has a working backend and the three-tier agent pattern (ADR-003-F) is structurally real.


Description

Background

ADR-003-F commits to three-tier agents (Conversational / Workflow / Autonomous). ADR-004-F commits to MCP at every domain boundary. This story scaffolds the Tier 1 Conversational infrastructure — Orchestrator, AgentSupervisor, BudgetGuard. Actual conversational behaviour + MCP tool wiring lands in F-013 (AiProvider) and F-014 (MCP Tool behaviour). F-017 in Sprint 4 adds the LiveView UI.

This is infrastructure, not a feature. No user-visible change yet.

Scope

In scope:

  • FinnestAgents.Orchestrator — singleton GenServer:
  • route/2 — takes intent :: String.t() and context :: %{org_id, user_id, session_id, correlation_id} as two separate arguments; returns {:pattern, tool_call}, {:llm, classified_intent}, or {:error, reason}
  • Tier 1 (pattern match): stubs return :no_match for now — real patterns added in F-014 sample tool
  • Tier 2 (LLM fallback): stubbed to return {:no_match, :llm_not_wired_yet} — wired in F-013 when AiProvider lands
  • start_session/2 — spawns a new conversational agent GenServer via FinnestAgents.AgentSupervisor.start_child/1
  • end_session/1 — terminates a session GenServer cleanly
  • FinnestAgents.AgentSupervisor: DynamicSupervisor with strategy: :one_for_one; children are started with restart: :temporary
  • FinnestAgents.Session — GenServer holding conversation state %{session_id, org_id, user_id, messages: [], started_at}
  • handle_cast({:user_message, text}) — routes via Orchestrator
  • Stores messages async to agents.messages table (schema not yet landed — stub persistence module that no-ops for now; real persistence when agents schema lands)
  • Idle timeout: 10 min of no messages → graceful shutdown (persist state, then exit)
  • FinnestAgents.BudgetGuard — GenServer per-org budget circuit breaker:
  • check/1 — takes org_id; returns :ok, {:warning, pct} (≥80%), or {:error, :budget_exceeded} (≥100%)
  • record_spend/3 — cast (non-blocking) to accumulate spend for org; checks thresholds; emits events at 50/80/95/100% thresholds (PRD E4.7 AC3)
  • Persists budget state to agents.budget_limits table (stub if schema not yet landed; full persistence when agents schema lands)
  • Budget limits sourced from org settings or config defaults
  • FinnestAgents.ToolRegistry — GenServer (F-014 adds real content; this story adds the empty registry + boot-time discovery scaffold)
  • Supervision tree: FinnestAgents.Supervisor (strategy :one_for_one) starts Orchestrator, AgentSupervisor, ClaudeClient (placeholder until F-013), ToolRegistry, and BudgetGuard
  • Per AI-03/AI-04: the org_id injection pattern is implemented as part of session context; agents can never override it
  • Architecture test agent_supervisor_restart_test.exs: spawn a session, call Process.exit(pid, :kill), then confirm the supervisor does NOT restart it (the child is :temporary) and the session is gone
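
The :temporary restart behaviour described above can be sketched as follows. This is a minimal illustration, not the story's implementation: Sketch.Session and Sketch.AgentSupervisor are hypothetical stand-ins for FinnestAgents.Session and FinnestAgents.AgentSupervisor, and the context map carries only the fields named in this story.

```elixir
defmodule Sketch.Session do
  # Hypothetical stand-in for FinnestAgents.Session.
  # `restart: :temporary` becomes the default in the generated child_spec/1.
  use GenServer, restart: :temporary

  def start_link(ctx), do: GenServer.start_link(__MODULE__, ctx)

  @impl true
  def init(ctx), do: {:ok, Map.put(ctx, :messages, [])}
end

defmodule Sketch.AgentSupervisor do
  # Hypothetical stand-in for FinnestAgents.AgentSupervisor.
  use DynamicSupervisor

  def start_link(opts \\ []) do
    DynamicSupervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  def start_child(ctx) do
    DynamicSupervisor.start_child(__MODULE__, {Sketch.Session, ctx})
  end

  @impl true
  def init(_opts), do: DynamicSupervisor.init(strategy: :one_for_one)
end

{:ok, _sup} = Sketch.AgentSupervisor.start_link()
ctx = %{session_id: "s-1", org_id: "org-1", user_id: "u-1"}
{:ok, pid} = Sketch.AgentSupervisor.start_child(ctx)

# Kill the session and wait until its exit has been observed.
ref = Process.monitor(pid)
Process.exit(pid, :kill)

receive do
  {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
after
  1_000 -> raise "session did not exit"
end

# Give the supervisor a moment to handle the child's exit signal,
# then inspect how many children it still tracks.
Process.sleep(100)
IO.inspect(DynamicSupervisor.count_children(Sketch.AgentSupervisor))
```

Setting the restart value in `use GenServer` keeps the policy next to the session module, so every caller of start_child/1 gets it without repeating the option. Passing an explicit child spec map from the supervisor would work as well.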

Out of scope:

  • Actual Claude API calls (F-013 AiProvider)
  • Actual MCP tools (F-014)
  • Agent Chat LiveView UI (F-017)
  • Tier 2 Workflow agents (Onboarding pipeline, Pay run — Migration Phase 1+)
  • Tier 3 Autonomous agents (Compliance Monitor etc. — Migration Phase 1+)
  • Agent memory L2/L3 persistence (defer to post-Phase-0)
  • Prompt cache GenServer (F-013 adds; observability only in F-020)

Technical Notes

  • Module namespace: FinnestAgents.* (flat), not Finnest.Agents.* (dotted). Boundary 0.10.x cannot classify Finnest.Agents.* under a FinnestAgents boundary block — the same constraint F-003 hit for FinnestCore.* and F-006/F-007 carried forward. The finnest_agents OTP app's top-level module is FinnestAgents; all submodules nest under it.
  • Orchestrator is a singleton (name: __MODULE__). At scale, multiple nodes each have their own; PubSub can coordinate if needed.
  • AgentSupervisor restart: :temporary per architecture §Supervision tree — agents die cleanly when sessions end; no auto-restart with stale context
  • Session GenServer state held in memory during process lifetime; persisted to DB when process exits (or on checkpoint every N messages — defer)
  • BudgetGuard handles record_spend in handle_cast/2, so the call is asynchronous and the hot path is not blocked. The threshold check runs after the spend has been accumulated. Use ETS for fast per-org lookup.
  • Correlation IDs propagated into session context and to any subsequent LLM/MCP call
  • Max 10 events per correlation chain (AI-05) — Orchestrator tracks chain depth via :causation_id and refuses to fire further events beyond the limit
  • Idle timeout implementation: Process.send_after/3 with a :timeout message; cancel and re-arm the timer on each user message
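
One way the idle timer from the last note could look is sketched below. Sketch.IdleSession, the :idle_ms option, and the user_message/2 helper are hypothetical names introduced for illustration; the sketch also assumes persistence is still a no-op, as stated in the scope.

```elixir
defmodule Sketch.IdleSession do
  # Hypothetical stand-in for FinnestAgents.Session, reduced to the idle timer.
  use GenServer, restart: :temporary

  @default_idle_ms :timer.minutes(10)

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)
  def user_message(pid, text), do: GenServer.cast(pid, {:user_message, text})

  @impl true
  def init(opts) do
    idle_ms = Keyword.get(opts, :idle_ms, @default_idle_ms)
    {:ok, arm(%{idle_ms: idle_ms, timer: nil, messages: []})}
  end

  @impl true
  def handle_cast({:user_message, text}, state) do
    # Every user message pushes the deadline out again.
    {:noreply, arm(%{state | messages: [text | state.messages]})}
  end

  @impl true
  def handle_info(:timeout, state) do
    # State would be persisted here before exiting (stubbed in this story).
    {:stop, :normal, state}
  end

  # Cancel any pending timer and start a fresh one.
  defp arm(state) do
    cancel(state.timer)
    %{state | timer: Process.send_after(self(), :timeout, state.idle_ms)}
  end

  defp cancel(nil), do: :ok

  defp cancel(timer) do
    # cancel_timer/1 returns false when the timer has already fired;
    # in that case drop the stale :timeout message from the mailbox.
    if Process.cancel_timer(timer) == false do
      receive do
        :timeout -> :ok
      after
        0 -> :ok
      end
    end

    :ok
  end
end

# Usage with a short timeout so the exit is quick to observe.
{:ok, pid} = Sketch.IdleSession.start_link(idle_ms: 200)
Sketch.IdleSession.user_message(pid, "hello")
ref = Process.monitor(pid)

receive do
  {:DOWN, ^ref, :process, ^pid, reason} -> IO.inspect(reason, label: "session exited")
after
  1_000 -> raise "session never timed out"
end
```

Flushing a stale :timeout after a failed cancel avoids the race where the old timer fires just as a new message arrives, which would otherwise end an active session.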

Dependencies

  • Blocked by: STORY-F-003 (Tenant + CorrelationId)

Acceptance Criteria

  • finnest_agents app compiles; Application.ensure_all_started(:finnest_agents) returns {:ok, _}
  • FinnestAgents.Supervisor starts Orchestrator + AgentSupervisor + ToolRegistry + BudgetGuard (ClaudeClient placeholder — F-013 replaces)
  • FinnestAgents.Orchestrator.route/2 returns {:error, :no_match} for unknown intents (graceful fallback until F-013/F-014 wire real routing)
  • FinnestAgents.Orchestrator.start_session/2 spawns a new Session GenServer; end_session/1 terminates it
  • Session GenServer handles :user_message cast and routes via Orchestrator
  • Session GenServer idle timeout: after 10 min of no messages, gracefully terminates; supervisor does NOT restart (:temporary)
  • BudgetGuard.check(org_id) returns :ok when under limit; {:warning, 80..99} at 80-99%; {:error, :budget_exceeded} at ≥100%
  • BudgetGuard.record_spend/3 is non-blocking (cast); updates accumulated spend; emits threshold events
  • Architecture test: kill a session GenServer → AgentSupervisor does NOT restart (restart: :temporary verified)
  • Architecture test: spawn 100 concurrent session GenServers → all 100 alive; memory budget <150 MB (1.3 KB base per GenServer × 100 = 130 KB, plus message queue + state)
  • Correlation ID propagates from Orchestrator.route → Session.handle_cast → (future) LLM/MCP call
  • Max-10 chain limit: attempt 11th causation event → Orchestrator refuses with :chain_limit_exceeded
  • mix format, credo --strict, dialyzer, mix boundary all green
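
The max-10 chain criterion can be illustrated with a small pure function. This is a sketch under assumptions: Sketch.ChainGuard is a hypothetical helper, the depth is keyed by correlation ID in a plain map, and the refusal is wrapped as {:error, :chain_limit_exceeded}, a tuple shape the story does not spell out.

```elixir
defmodule Sketch.ChainGuard do
  # Hypothetical helper; the Orchestrator would keep `depths` in its own state.
  @max_events 10

  # depths maps a correlation ID to the number of events fired so far.
  def register(depths, correlation_id) do
    case Map.get(depths, correlation_id, 0) do
      n when n >= @max_events -> {:error, :chain_limit_exceeded}
      n -> {:ok, Map.put(depths, correlation_id, n + 1)}
    end
  end
end

# Attempt eleven events on the same chain and collect each outcome.
{results, _depths} =
  Enum.map_reduce(1..11, %{}, fn _n, depths ->
    case Sketch.ChainGuard.register(depths, "corr-1") do
      {:ok, new_depths} -> {:ok, new_depths}
      {:error, reason} -> {{:error, reason}, depths}
    end
  end)

# The first ten entries are :ok; the eleventh is {:error, :chain_limit_exceeded}.
IO.inspect(results)
```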

Testing Requirements

  • Unit: Orchestrator.route happy path + error paths
  • Unit: Session GenServer full lifecycle (start → message → idle timeout → terminate)
  • Unit: BudgetGuard threshold behaviour (parameterised for 50/80/95/100%)
  • Integration: start agent session, send 5 messages, verify routing, verify correlation propagation
  • Chaos: 100 concurrent sessions; random kills; supervisor stays alive; no restart of dead sessions
  • Property: random message sequences never leak memory (run for 1000 iterations)
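
The parameterised BudgetGuard unit test might be shaped like the table-driven sketch below. Sketch.Budget is a hypothetical pure helper, not the story's GenServer; spend and limit are assumed to be integers in the same unit (for example cents), and percentages are truncated toward zero.

```elixir
ExUnit.start()

defmodule Sketch.Budget do
  # Hypothetical pure helper that check/1 and record_spend/3 could delegate to.
  @thresholds [50, 80, 95, 100]

  # Classify current spend against the limit.
  def status(spend, limit) when limit > 0 do
    pct = div(spend * 100, limit)

    cond do
      pct >= 100 -> {:error, :budget_exceeded}
      pct >= 80 -> {:warning, pct}
      true -> :ok
    end
  end

  # Thresholds newly crossed when spend moves from prev to curr.
  def crossed(prev, curr, limit) do
    Enum.filter(@thresholds, fn t ->
      prev * 100 < t * limit and curr * 100 >= t * limit
    end)
  end
end

defmodule Sketch.BudgetTest do
  use ExUnit.Case, async: true

  # {spend against a limit of 100, expected status}
  @cases [
    {0, :ok},
    {50, :ok},
    {79, :ok},
    {80, {:warning, 80}},
    {95, {:warning, 95}},
    {99, {:warning, 99}},
    {100, {:error, :budget_exceeded}},
    {150, {:error, :budget_exceeded}}
  ]

  # One generated test per row of the table.
  for {spend, expected} <- @cases do
    test "spend of #{spend} against a limit of 100" do
      assert Sketch.Budget.status(unquote(spend), 100) == unquote(Macro.escape(expected))
    end
  end

  test "each threshold is reported only when it is first crossed" do
    assert Sketch.Budget.crossed(40, 85, 100) == [50, 80]
    assert Sketch.Budget.crossed(85, 90, 100) == []
    assert Sketch.Budget.crossed(90, 120, 100) == [95, 100]
  end
end
```

Keeping the classification pure means the 50/80/95/100% boundaries can be tested without starting the GenServer or touching ETS; the process-level tests then only need to cover the cast and event emission.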

References