STORY-F-021: `agents` schema migration + real persistence (sessions, messages, budget, audit)¶

Epic: Agent Infrastructure Priority: Must Have Story Points: 2 Status: Not Started Assigned To: Unassigned Created: 2026-04-18 Sprint: 4

User Story¶

As a developer building on top of the Phase 0 agent stack, I want the agents Postgres schema materialised with real tables for sessions, messages, budget limits, and tool audit — and F-012/F-013/F-014's ETS / stub persistence layers swapped for real reads + writes, so that F-017 Agent Chat survives BEAM restarts, budgets persist across deploys, and the AW-14 audit trail is durable from day one of Phase 1.

Description¶

Background¶

F-012 (finnest_agents Orchestrator + BudgetGuard), F-013 (AiProvider + PromptCache), and F-014 (MCP ToolRegistry + sample tool) all deliberately shipped with stub / ETS persistence because the agents Postgres schema wasn't yet scheduled. The Sprint 3 solutioning gate check (2026-04-18) surfaced this as tech-debt item TD1 — without a real schema landing in Phase 0, F-017's Agent Chat LiveView would ship against volatile memory and the AW-14 audit trail wouldn't survive a pod restart.

This story lands the schema and swaps the stubs. Sprint 4 has 2pt headroom (13/15 committed before this story); F-021 fits without reducing the rest of Sprint 4's scope.

Scope¶

In scope:

Migration (apps/finnest_core/priv/repo/migrations/20260508000001_create_agents_schema.exs) — reversible (CQ-11). Creates agents schema + 4 tables:
agents.sessions — id UUID, org_id UUID NOT NULL, user_id UUID NOT NULL, started_at TIMESTAMPTZ, ended_at TIMESTAMPTZ, last_active_at TIMESTAMPTZ, metadata JSONB DEFAULT '{}', inserted_at, updated_at, deleted_at. Indexes on (org_id, started_at DESC) and (user_id, last_active_at DESC).
agents.messages — id UUID, session_id UUID FK → agents.sessions(id) ON DELETE CASCADE, org_id UUID NOT NULL, role VARCHAR (user | assistant | system | tool), content TEXT, tool_calls JSONB DEFAULT '[]', tokens_in INT, tokens_out INT, cache_stats JSONB DEFAULT '{}', cost_aud DECIMAL(12,4), correlation_id UUID, inserted_at. Indexes on (session_id, inserted_at) and (org_id, inserted_at DESC).
agents.budget_limits — id UUID, org_id UUID UNIQUE NOT NULL, daily_aud DECIMAL(12,4), weekly_aud DECIMAL(12,4), monthly_aud DECIMAL(12,4), daily_spent_aud DECIMAL(12,4), weekly_spent_aud DECIMAL(12,4), monthly_spent_aud DECIMAL(12,4), daily_reset_at TIMESTAMPTZ, weekly_reset_at TIMESTAMPTZ, monthly_reset_at TIMESTAMPTZ, inserted_at, updated_at. One row per org.
agents.tool_audit — id UUID, session_id UUID, org_id UUID NOT NULL, tool_name VARCHAR, category VARCHAR, input_hash BYTEA (BLAKE2b), output_hash BYTEA, duration_ms INT, cost_aud DECIMAL(12,4), correlation_id UUID, causation_id UUID, model VARCHAR, tokens_in INT, tokens_out INT, cache_read_tokens INT, cache_write_tokens INT, error_reason VARCHAR, inserted_at. Indexes on (org_id, inserted_at DESC), (tool_name, inserted_at DESC), (correlation_id).
Ecto schemas — FinnestAgents.Session, FinnestAgents.Message, FinnestAgents.BudgetLimit, FinnestAgents.ToolAuditEntry. All inherit the F-008 tenant-scoping hook automatically via :org_id in __schema__(:fields) — no explicit allowlisting.
Context modules — FinnestAgents.Sessions, FinnestAgents.Messages, FinnestAgents.Budgets, FinnestAgents.Audit. Standard CRUD + tenant-scoped queries. Budgets.record_spend/3 becomes an atomic UPDATE on agents.budget_limits with daily_spent_aud = daily_spent_aud + amount semantics (no read-modify-write race).
Swap stubs in F-012/F-013/F-014 code:
FinnestAgents.Session GenServer — replace in-memory history flush stub with FinnestAgents.Messages.persist_async/1 (Oban job queued via FinnestCore.ObanWorker from F-008; wraps with with_tenant/2).
FinnestAgents.BudgetGuard — replace ETS-only accumulator with ETS hot path + periodic flush to agents.budget_limits (every 60s or on clean shutdown). On boot, rehydrate ETS from agents.budget_limits.
FinnestAgents.MCP.ToolRegistry — replace ETS-stub tool_audit with real INSERT to agents.tool_audit (still via cast/async so hot path stays non-blocking — acceptable drop-on-crash because the message itself ends up in agents.messages).
FinnestAgents.AiProvider.AnthropicDirect — cost/cache stats now persist through agents.tool_audit rather than PromptCache GenServer memory.
PromptCache rework — stays as an in-memory hit-rate aggregator (OP-03 telemetry); stops pretending to be durable. Persistence moved to agents.tool_audit. Removes ~60 LOC of "will persist later" comments.
Boot-time rehydrate — FinnestAgents.Supervisor.init/1 order becomes: Repo → Vault → BudgetGuard (rehydrates from DB) → ToolRegistry → Orchestrator → AgentSupervisor. PromptCache stays late (no durable state).
Reset jobs — Oban cron for daily/weekly/monthly budget resets (FinnestAgents.Workers.ResetBudgets). Idempotent: checks reset_at timestamp; no-ops if already reset this window.

Out of scope:

Full retention policy (90-day commercial / 7-year IRAP purge) — F-020 ships the cron; this story just creates the tables.
Event-store tie-in (writing agent.* domain events) — F-016 + F-017 wire this.
agents.memories L2/L3 memory tables — agents.md §Memory System defers L2/L3 to post-Phase-0; not blocking any Phase 0 story.
Retroactive backfill — no production data exists yet; the stub window (F-012 → F-021) is pre-launch, nothing to migrate.

Technical Notes¶

Pattern match with F-007: same tenant-rooted schema pattern — org_id NOT NULL, deleted_at where soft-delete makes sense (Sessions), hard-delete where it doesn't (Messages cascade from Sessions; BudgetLimit single-row-per-org).
Hash-not-plaintext for audit (Commandment #24 / agents.md §Audit Trail): input_hash + output_hash are BLAKE2b(16) of the serialised input/output maps. Plaintext lives briefly in agents.messages.content but is redacted/aged-off per retention policy.
DECIMAL precision for cost_aud: DA-15 mandates DECIMAL(12,4) for money; ex_money library already in the dep set from F-007 context; re-use here.
BudgetGuard ETS + flush pattern preserves the non-blocking hot path from F-012 — the DB is the authoritative store on reboot, ETS is the per-request cache. On crash-recovery the worst case is 60s of in-flight spend not yet persisted, which is acceptable (the LLM call itself logged to agents.tool_audit on completion, so cost reconciliation is still auditable).
Keeps F-008 tenant hook honest — every table has org_id in __schema__(:fields), so FinnestCore.Repo.prepare_query/3 injects WHERE org_id = ? automatically. No skip_org_scope: true calls required for any of this story's context functions. A stray cross-tenant query raises RuntimeError loud and early.
Module namespace (symmetric with F-016/F-017/F-019): modules live under FinnestAgents.* / FinnestCore.* (flat). This story was authored post-Sprint-3-D4-resolution and is mostly already consistent; one slip (Finnest.Agents.Workers.ResetBudgets → FinnestAgents.Workers.ResetBudgets) fixed 2026-04-18 alongside the Sprint 4 gate-check resolution PR.

Dependencies¶

Blocked by: STORY-F-012 (finnest_agents scaffold — schema modules reference FinnestAgents.Session etc.), STORY-F-007 (Cloak vault + schema patterns), STORY-F-008 (tenant enforcement — story relies on prepare_query injecting org_id).
Blocks: STORY-F-017 (Agent Chat LiveView — requires durable agents.messages), STORY-F-020 (production smoke — asserts restart-survival of session state).

Acceptance Criteria¶

Testing Requirements¶

Unit: each context module (CRUD + error paths)
Unit: BudgetGuard boot rehydrate; flush; reset window; atomic record_spend
Unit: ResetBudgets worker idempotency
Unit: Messages.persist_async enqueues correct Oban job with tenant args
Integration: full session lifecycle — start → 3 user messages → MCP tool calls → end → simulate crash mid-session → rehydrate new GenServer → assert prior messages retrievable
Integration: cross-tenant attempt — tenant A creates a session; tenant B tries to load it; expect RuntimeError from prepare_query / empty result set
Property: random message sequences always produce consistent token + cost totals between session and tool_audit sum

References¶

../architecture/agents.md §Memory System (L1 schema), §Audit Trail, §Budget Management
../architecture/data.md §Tenant enforcement (re-used pattern)
../adrs/adr-003-F-three-tier-ai-agent-architecture.md
../10-GUARDRAILS.md AI-08, AW-13, AW-14, DA-11..DA-13, DA-15, QJ-01, OP-03
../_bmad-output/solutioning-gate-check-sprint-3-2026-04-18.md §TD1 — story provenance

STORY-F-021: agents schema migration + real persistence (sessions, messages, budget, audit)¶