STORY-F-012: `finnest_agents` — Orchestrator + AgentSupervisor + BudgetGuard

Epic: Agent Infrastructure
Priority: Must Have
Story Points: 3
Status: Not Started
Assigned To: Unassigned
Created: 2026-04-17
Sprint: 3

User Story

As a user (eventually) and developer (now), I want the agent infrastructure scaffold live — Orchestrator routing intents, a DynamicSupervisor spawning per-session agents with restart: :temporary, and a BudgetGuard circuit breaker gating AI calls by per-org spend, so that Sprint 4's Agent Chat UI has a working backend and the three-tier agent pattern (ADR-003-F) is structurally real.

Description

Background

ADR-003-F commits to three-tier agents (Conversational / Workflow / Autonomous). ADR-004-F commits to MCP at every domain boundary. This story scaffolds the Tier 1 Conversational infrastructure — Orchestrator, AgentSupervisor, BudgetGuard. Actual conversational behaviour + MCP tool wiring lands in F-013 (AiProvider) and F-014 (MCP Tool behaviour). F-017 in Sprint 4 adds the LiveView UI.

This is infrastructure, not a feature. No user-visible change yet.

Scope

In scope:

FinnestAgents.Orchestrator — singleton GenServer:
  route/2: takes intent :: String.t() and context :: %{org_id, user_id, session_id, correlation_id} as its two arguments; returns {:pattern, tool_call}, {:llm, classified_intent}, or {:error, reason}
  Routing tier 1 (pattern match): stubs return :no_match for now; real patterns are added with the F-014 sample tool
  Routing tier 2 (LLM fallback): stubbed to return {:no_match, :llm_not_wired_yet}; wired in F-013 when AiProvider lands
  start_session/2: spawns a new conversational agent GenServer via FinnestAgents.AgentSupervisor.start_child/1
  end_session/1: terminates a session GenServer cleanly
FinnestAgents.AgentSupervisor: DynamicSupervisor with strategy: :one_for_one; session children use restart: :temporary in their child spec
FinnestAgents.Session — GenServer holding conversation state %{session_id, org_id, user_id, messages: [], started_at}
  handle_cast({:user_message, text}, state): routes the message via Orchestrator
  Stores messages asynchronously to the agents.messages table. The schema has not landed yet, so this story ships a stub persistence module that no-ops; real persistence follows when the agents schema lands
  Idle timeout: 10 min without messages → graceful shutdown (persist state, then exit)
FinnestAgents.BudgetGuard — GenServer per-org budget circuit breaker:
  check/1: takes org_id; returns :ok, {:warning, pct} (≥80%), or {:error, :budget_exceeded} (≥100%)
  record_spend/3: cast (non-blocking) that accumulates spend for the org, then checks thresholds and emits events at 50/80/95/100% (PRD E4.7 AC3)
  Persists budget state to the agents.budget_limits table (stubbed if the schema has not landed; full persistence when the agents schema lands)
  Budget limits are sourced from org settings or config defaults
FinnestAgents.ToolRegistry — GenServer (F-014 adds real content; this story adds the empty registry + boot-time discovery scaffold)
Supervision tree: FinnestAgents.Supervisor (strategy: :one_for_one) starts Orchestrator, AgentSupervisor, ClaudeClient (placeholder until F-013), ToolRegistry, BudgetGuard
Per AI-03/AI-04: the org_id injection pattern is implemented as part of the session context; agents can never override it
Architecture test agent_supervisor_restart_test.exs: spawn a session, call Process.exit(pid, :kill), and confirm the supervisor does NOT restart it (the child is :temporary); the session is gone
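
The restart behaviour that the architecture test checks could be sketched roughly as follows. This is a minimal illustration, not the story's implementation: the `Sketch.*` module names, the `start_session/1` helper and the context map are hypothetical stand-ins, and the real modules would carry routing and persistence that are omitted here.

```elixir
defmodule Sketch.Session do
  # Hypothetical stand-in for FinnestAgents.Session.
  # `restart: :temporary` lands in the generated child spec, so a supervisor
  # never restarts this process, whatever the exit reason.
  use GenServer, restart: :temporary

  def start_link(context), do: GenServer.start_link(__MODULE__, context)

  @impl true
  def init(context), do: {:ok, Map.put(context, :messages, [])}
end

defmodule Sketch.AgentSupervisor do
  # Hypothetical stand-in for FinnestAgents.AgentSupervisor.
  use DynamicSupervisor

  def start_link(opts \\ []) do
    DynamicSupervisor.start_link(__MODULE__, opts, name: __MODULE__)
  end

  def start_session(context) do
    DynamicSupervisor.start_child(__MODULE__, {Sketch.Session, context})
  end

  @impl true
  def init(_opts), do: DynamicSupervisor.init(strategy: :one_for_one)
end

defmodule Sketch.RestartCheck do
  # Kills one session and reports {supervisor_alive?, active_children}.
  def run do
    {:ok, sup} = Sketch.AgentSupervisor.start_link()
    {:ok, pid} = Sketch.AgentSupervisor.start_session(%{session_id: "s1", org_id: "org-1"})

    ref = Process.monitor(pid)
    Process.exit(pid, :kill)

    receive do
      {:DOWN, ^ref, :process, ^pid, _reason} -> :ok
    after
      1_000 -> raise "session did not exit"
    end

    # Brief pause so the supervisor has handled the child's exit signal.
    Process.sleep(50)

    %{active: active} = DynamicSupervisor.count_children(sup)
    {Process.alive?(sup), active}
  end
end
```

Note that `restart: :temporary` sits on the child spec rather than on the DynamicSupervisor itself; the supervisor only declares its strategy. In this sketch, `Sketch.RestartCheck.run/0` should report a live supervisor with no active children after the kill.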

Out of scope:

Actual Claude API calls (F-013 AiProvider)
Actual MCP tools (F-014)
Agent Chat LiveView UI (F-017)
Tier 2 Workflow agents (Onboarding pipeline, Pay run — Migration Phase 1+)
Tier 3 Autonomous agents (Compliance Monitor etc. — Migration Phase 1+)
Agent memory L2/L3 persistence (defer to post-Phase-0)
Prompt cache GenServer (F-013 adds; observability only in F-020)

Technical Notes

Module namespace: FinnestAgents.* (flat), not Finnest.Agents.* (dotted). Boundary 0.10.x cannot classify Finnest.Agents.* under a FinnestAgents boundary block — the same constraint F-003 hit for FinnestCore.* and F-006/F-007 carried forward. The finnest_agents OTP app's top-level module is FinnestAgents; all submodules nest under it.
Orchestrator is a per-node singleton (name: __MODULE__). In a multi-node deployment each node runs its own instance; PubSub can coordinate them if needed.
AgentSupervisor children use restart: :temporary per architecture §Supervision tree: agents die cleanly when sessions end, with no auto-restart on stale context
Session GenServer state held in memory during process lifetime; persisted to DB when process exits (or on checkpoint every N messages — defer)
BudgetGuard handles record_spend in handle_cast/2, so the hot path is not blocked. The threshold check runs after the spend is accumulated. Use ETS for fast per-org lookup.
Correlation IDs propagated into session context and to any subsequent LLM/MCP call
Max 10 events per correlation chain (AI-05) — Orchestrator tracks chain depth via :causation_id and refuses to fire further events beyond the limit
Idle timeout implementation: Process.send_after/3 with a :timeout message; on each user message, cancel the pending timer (Process.cancel_timer/1) and schedule a new one
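
The idle-timeout note could be sketched as below. This is an illustration under assumptions rather than the planned code: `Sketch.IdleSession`, the `:idle_ms` option and the `{:idle_timeout, token}` message shape are hypothetical. The sketch tags the timer message with a token instead of sending a bare `:timeout`, so that a timer message already sitting in the mailbox when the timer is reset gets ignored.

```elixir
defmodule Sketch.IdleSession do
  # Hypothetical session process that stops itself after a quiet period.
  use GenServer, restart: :temporary

  # 10 minutes per the story; overridable so the sketch can be exercised quickly.
  @default_idle_ms 10 * 60 * 1_000

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)

  def user_message(pid, text), do: GenServer.cast(pid, {:user_message, text})

  @impl true
  def init(opts) do
    idle_ms = Keyword.get(opts, :idle_ms, @default_idle_ms)
    state = %{messages: [], idle_ms: idle_ms, timer: nil, token: nil}
    {:ok, schedule_idle(state)}
  end

  @impl true
  def handle_cast({:user_message, text}, state) do
    # Every user message pushes the idle deadline out again.
    {:noreply, schedule_idle(%{state | messages: [text | state.messages]})}
  end

  @impl true
  def handle_info({:idle_timeout, token}, %{token: token} = state) do
    # Current timer fired: a real session would persist its state here.
    {:stop, :normal, state}
  end

  def handle_info({:idle_timeout, _stale_token}, state) do
    # Message from a timer that was replaced after it had already fired.
    {:noreply, state}
  end

  defp schedule_idle(state) do
    if state.timer, do: Process.cancel_timer(state.timer)
    token = make_ref()
    timer = Process.send_after(self(), {:idle_timeout, token}, state.idle_ms)
    %{state | timer: timer, token: token}
  end
end
```

Stopping with reason `:normal` keeps the shutdown quiet for linked processes and, combined with `restart: :temporary`, means nothing brings the session back.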

Dependencies

Blocked by: STORY-F-003 (Tenant + CorrelationId)

Acceptance Criteria

Testing Requirements

Unit: Orchestrator.route happy path + error paths
Unit: Session GenServer full lifecycle (start → message → idle timeout → terminate)
Unit: BudgetGuard threshold behaviour (parameterised for 50/80/95/100%)
Integration: start agent session, send 5 messages, verify routing, verify correlation propagation
Chaos: 100 concurrent sessions; random kills; supervisor stays alive; no restart of dead sessions
Property: random message sequences never leak memory (run for 1000 iterations)
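
For the parameterised BudgetGuard threshold test listed above, the threshold logic could be kept as pure functions that a test drives from a table of cases. The sketch below is only one possible shape: `Sketch.BudgetThresholds`, its function names, and the choice to track spend as integer cents are assumptions, not part of the story. Integer arithmetic is used so that comparisons at exactly 50/80/95/100% do not depend on float rounding.

```elixir
defmodule Sketch.BudgetThresholds do
  # Hypothetical pure helpers; the real BudgetGuard would call something
  # similar from check/1 and from the record_spend cast handler.
  @thresholds [50, 80, 95, 100]

  # True when `spent` has reached `threshold` percent of `limit`.
  defp reached?(spent, limit, threshold), do: spent * 100 >= threshold * limit

  # Mirrors the check/1 contract: :ok, {:warning, pct}, or {:error, :budget_exceeded}.
  def check(spent, limit) when limit > 0 do
    cond do
      reached?(spent, limit, 100) -> {:error, :budget_exceeded}
      reached?(spent, limit, 80) -> {:warning, div(spent * 100, limit)}
      true -> :ok
    end
  end

  # Thresholds newly crossed when spend moves from old_spent to new_spent.
  # Each returned value would correspond to one emitted event.
  def crossed(old_spent, new_spent, limit) when limit > 0 do
    Enum.filter(@thresholds, fn t ->
      not reached?(old_spent, limit, t) and reached?(new_spent, limit, t)
    end)
  end
end
```

With a limit of 10_000 cents, `crossed(4_000, 9_600, 10_000)` yields `[50, 80, 95]`, while `crossed(5_000, 6_000, 10_000)` yields `[]` because the 50% threshold had already been reached.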

References

../architecture/agents.md §Three Tiers, §Infrastructure, §Governance
../adrs/adr-003-F-three-tier-ai-agent-architecture.md
../10-GUARDRAILS.md AI-01, AI-03, AI-04, AI-05, AI-06, AI-08
../brainstorms/brainstorm-03-ai-agent-design.md
