ADR-003-F: Three-Tier AI Agent Architecture

Status: Accepted
Date: 2026-04-16
Decision Makers: Gautham Chellappa
Depends on: ADR-001-F (Elixir), ADR-002-F (Modular Monolith), ADR-004-F (MCP)

Context

Finnest is AI-native — agents are architecture, not a feature. They handle three distinct classes of problems with different durability, cost, and oversight characteristics:

  1. User-facing conversation (short-lived, interactive, needs low latency)
  2. Multi-step business processes (hours-to-weeks, needs checkpointing and retry)
  3. Always-on monitoring (scheduled, cost-bounded, read-only-by-design)

Trying to serve all three with a single agent pattern either overbuilds for conversation (paying Oban overhead for chat) or underbuilds for workflows (losing state on node restart). Using three distinct patterns, each tuned to its problem, is the correct approach.

Decision

Finnest has three tiers of AI agents, each with different durability and cost profiles:

Tier 1 — Conversational (GenServer, short-lived)

  • One GenServer process per user session (via Finnest.Agents.AgentSupervisor, a DynamicSupervisor, with restart: :temporary)
  • Holds conversation state in process memory
  • Dies when session ends; hydrates from agents.sessions on reconnect
  • Cost profile: variable (pattern-match $0, Claude $0.01–0.05)
  • Types: User Chat Agent, Reach Agent, Admin Assistant
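
The Tier 1 pattern can be sketched as below. This is illustrative only: apart from Finnest.Agents.AgentSupervisor, the module names, the Registry, and the stubbed helpers are assumptions, not the actual Finnest code.

```elixir
defmodule Finnest.Agents.SessionAgent do
  # restart: :temporary — the process is not restarted when the session dies.
  use GenServer, restart: :temporary

  # One process per user session, registered by session_id (Registry name assumed).
  def start_link(session_id) do
    GenServer.start_link(__MODULE__, session_id,
      name: {:via, Registry, {Finnest.Agents.Registry, session_id}})
  end

  @impl true
  def init(session_id) do
    # Hydrate prior conversation from agents.sessions on (re)connect.
    {:ok, %{session_id: session_id, history: load_history(session_id)}}
  end

  @impl true
  def handle_call({:message, text}, _from, state) do
    # Conversation state lives in process memory and dies with the session.
    reply = respond(text, state.history)
    {:reply, reply, %{state | history: [text | state.history]}}
  end

  defp load_history(_session_id), do: []  # stub: would read agents.sessions
  defp respond(_text, _history), do: :ok  # stub: pattern-match first, Claude if needed
end
```

A session would then be started on demand with `DynamicSupervisor.start_child(Finnest.Agents.AgentSupervisor, {Finnest.Agents.SessionAgent, session_id})`.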

Tier 2 — Workflow (Oban, medium-lived)

  • Oban jobs chaining multi-step business processes
  • Checkpoints to onboard.pipeline_steps (or per-domain equivalent) between steps
  • Survives node restarts; distributes across cluster
  • Cost profile: higher per invocation but bounded per outcome
  • Types: Onboarding Pipeline, Super Onboarding Wizard, Scoring & Matching, Pay Run Processing, Incident Response
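
A Tier 2 workflow step might look roughly like the following Oban worker. The worker name, queue, and helpers are assumptions; the point is the shape: run one step, checkpoint it, then enqueue the next job so a node restart loses at most the in-flight step.

```elixir
defmodule Finnest.Onboard.PipelineWorker do
  use Oban.Worker, queue: :workflows, max_attempts: 5

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"pipeline_id" => id, "step" => step}}) do
    :ok = run_step(id, step)
    :ok = checkpoint(id, step)  # persist progress to onboard.pipeline_steps

    case next_step(step) do
      nil ->
        :ok  # pipeline complete

      next ->
        # Each step is a fresh job, so the chain survives restarts
        # and can be picked up by any node in the cluster.
        {:ok, _job} =
          %{"pipeline_id" => id, "step" => next}
          |> new()
          |> Oban.insert()

        :ok
    end
  end

  defp run_step(_id, _step), do: :ok   # stub: the actual business step
  defp checkpoint(_id, _step), do: :ok # stub: write to onboard.pipeline_steps
  defp next_step(_step), do: nil       # stub: step graph lookup
end
```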

Tier 3 — Autonomous (Oban cron, long-lived)

  • Always-on cron-triggered jobs
  • PROPOSE-only — READ, NOTIFY, REPORT allowed; WRITE, UPDATE, DELETE forbidden (AI-06)
  • Types: Compliance Monitor, Anomaly Detector, Roster Optimiser, Data Quality Agent
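
Tier 3 agents are plain Oban workers triggered by the cron plugin. A minimal sketch, assuming illustrative schedules and module names:

```elixir
# config/config.exs — cron expressions here are assumptions, not real schedules
config :finnest, Oban,
  plugins: [
    {Oban.Plugins.Cron,
     crontab: [
       {"*/15 * * * *", Finnest.Agents.AnomalyDetector},
       {"0 2 * * *", Finnest.Agents.ComplianceMonitor}
     ]}
  ]
```

The PROPOSE-only rule (AI-06) then shows up in the worker body: the agent reads and reports, but never mutates domain state directly.

```elixir
defmodule Finnest.Agents.ComplianceMonitor do
  use Oban.Worker, queue: :autonomous

  @impl Oban.Worker
  def perform(_job) do
    # READ + NOTIFY + REPORT only: findings become proposals
    # for a human to act on, never direct writes (AI-06).
    findings = scan()
    Enum.each(findings, &propose/1)
    :ok
  end

  defp scan, do: []               # stub: read-only queries
  defp propose(_finding), do: :ok # stub: notify / create proposal record
end
```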

All three tiers interact with domains via typed MCP tools (ADR-004-F). org_id injected by MCP framework — agents cannot provide it (AI-03). Per-org budget circuit breaker prevents cost explosion (AI-08).
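
The guardrail flow described above can be sketched as a single dispatch function. Every name below (Finnest.Budget, Finnest.MCP.dispatch) is a hypothetical placeholder for whatever the MCP framework actually exposes:

```elixir
# Sketch of tool dispatch with guardrails applied before the call.
def call_tool(session, tool, args) do
  with :ok <- Finnest.Budget.check(session.org_id) do
    # AI-03: org_id comes from the authenticated session, never from the
    # agent's own arguments — overwrite anything the agent supplied.
    args = Map.put(args, :org_id, session.org_id)
    Finnest.MCP.dispatch(tool, args)
  else
    # AI-08: per-org budget circuit breaker trips before cost explodes.
    {:error, :budget_exceeded} -> {:error, :circuit_open}
  end
end
```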

Alternatives Considered

| Alternative | Rejected because |
| --- | --- |
| Single-tier agents (everything is a GenServer) | Workflows lose state on restart; autonomous jobs can't schedule reliably |
| Single-tier agents (everything is an Oban job) | Conversational latency unacceptable (job pickup delay); session state must round-trip the DB |
| External agent framework (LangChain / LlamaIndex) | Brings a Python dependency; misaligned with BEAM supervision; duplicates Oban |
| Microservice agent tier (separate service) | Violates ADR-002-F modular-monolith principle; adds ops cost for no benefit at this scale |

See ../brainstorms/brainstorm-03-ai-agent-design.md and ../architecture/agents.md for full design detail.

Consequences

Positive:

  • Each tier tuned to its problem — conversation is fast, workflows are resilient, autonomous is cost-bounded
  • Clear mental model: "is this interactive? → Tier 1; business process? → Tier 2; monitoring? → Tier 3"
  • Local-first routing in Orchestrator catches 70%+ of queries before Claude (AI-01)
  • Per-org cost budget with circuit breaker contains blast radius (AI-08)
  • Tier 3 PROPOSE-only rule prevents autonomous destructive actions — human stays in the loop
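
The local-first routing mentioned above (AI-01) amounts to cheap pattern matches answering common intents before any LLM call. The clauses below are purely illustrative:

```elixir
# Sketch: handle frequent intents locally at $0; fall through to Claude
# only when no pattern matches (the ~30% of queries).
def route(query) do
  case query do
    "balance" <> _ -> {:local, :show_balance}
    "help" <> _ -> {:local, :show_help}
    _other -> {:llm, :claude}
  end
end
```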

Negative:

  • Three patterns to learn and maintain
  • Orchestrator routing logic must correctly classify which tier handles which intent
  • Mental overhead: developers must choose the right tier when adding a new agent

Tipping points for re-evaluation:

  • Routing overhead becomes measurable cost (unlikely — classification completes in milliseconds)
  • A fourth tier emerges (e.g. on-device agents) — then revisit

Relationship to Guardrails

Enforces / is enforced by: AI-01 through AI-08 (8 AI-specific guardrails), AW-12 (confidence framework), AW-13 (budget limits), AW-14 (agent audit trail).