STORY-F-021: agents schema migration + real persistence (sessions, messages, budget, audit)¶
Epic: Agent Infrastructure Priority: Must Have Story Points: 2 Status: Not Started Assigned To: Unassigned Created: 2026-04-18 Sprint: 4
User Story¶
As a developer building on top of the Phase 0 agent stack,
I want the agents Postgres schema materialised with real tables for sessions, messages, budget limits, and tool audit — and F-012/F-013/F-014's ETS / stub persistence layers swapped for real reads + writes,
so that F-017 Agent Chat survives BEAM restarts, budgets persist across deploys, and the AW-14 audit trail is durable from day one of Phase 1.
Description¶
Background¶
F-012 (finnest_agents Orchestrator + BudgetGuard), F-013 (AiProvider + PromptCache), and F-014 (MCP ToolRegistry + sample tool) all deliberately shipped with stub / ETS persistence because the agents Postgres schema wasn't yet scheduled. The Sprint 3 solutioning gate check (2026-04-18) surfaced this as tech-debt item TD1 — without a real schema landing in Phase 0, F-017's Agent Chat LiveView would ship against volatile memory and the AW-14 audit trail wouldn't survive a pod restart.
This story lands the schema and swaps the stubs. Sprint 4 has 2pt headroom (13/15 committed before this story); F-021 fits without reducing the rest of Sprint 4's scope.
Scope¶
In scope:
- Migration (
apps/finnest_core/priv/repo/migrations/20260508000001_create_agents_schema.exs) — reversible (CQ-11). Createsagentsschema + 4 tables: agents.sessions— id UUID, org_id UUID NOT NULL, user_id UUID NOT NULL, started_at TIMESTAMPTZ, ended_at TIMESTAMPTZ, last_active_at TIMESTAMPTZ, metadata JSONB DEFAULT '{}', inserted_at, updated_at, deleted_at. Indexes on (org_id, started_at DESC) and (user_id, last_active_at DESC).agents.messages— id UUID, session_id UUID FK → agents.sessions(id) ON DELETE CASCADE, org_id UUID NOT NULL, role VARCHAR (user | assistant | system | tool), content TEXT, tool_calls JSONB DEFAULT '[]', tokens_in INT, tokens_out INT, cache_stats JSONB DEFAULT '{}', cost_aud DECIMAL(12,4), correlation_id UUID, inserted_at. Indexes on (session_id, inserted_at) and (org_id, inserted_at DESC).agents.budget_limits— id UUID, org_id UUID UNIQUE NOT NULL, daily_aud DECIMAL(12,4), weekly_aud DECIMAL(12,4), monthly_aud DECIMAL(12,4), daily_spent_aud DECIMAL(12,4), weekly_spent_aud DECIMAL(12,4), monthly_spent_aud DECIMAL(12,4), daily_reset_at TIMESTAMPTZ, weekly_reset_at TIMESTAMPTZ, monthly_reset_at TIMESTAMPTZ, inserted_at, updated_at. One row per org.agents.tool_audit— id UUID, session_id UUID, org_id UUID NOT NULL, tool_name VARCHAR, category VARCHAR, input_hash BYTEA (BLAKE2b), output_hash BYTEA, duration_ms INT, cost_aud DECIMAL(12,4), correlation_id UUID, causation_id UUID, model VARCHAR, tokens_in INT, tokens_out INT, cache_read_tokens INT, cache_write_tokens INT, error_reason VARCHAR, inserted_at. Indexes on (org_id, inserted_at DESC), (tool_name, inserted_at DESC), (correlation_id).- Ecto schemas —
FinnestAgents.Session,FinnestAgents.Message,FinnestAgents.BudgetLimit,FinnestAgents.ToolAuditEntry. All inherit the F-008 tenant-scoping hook automatically via:org_id in __schema__(:fields)— no explicit allowlisting. - Context modules —
FinnestAgents.Sessions,FinnestAgents.Messages,FinnestAgents.Budgets,FinnestAgents.Audit. Standard CRUD + tenant-scoped queries.Budgets.record_spend/3becomes an atomic UPDATE onagents.budget_limitswithdaily_spent_aud = daily_spent_aud + amountsemantics (no read-modify-write race). - Swap stubs in F-012/F-013/F-014 code:
FinnestAgents.SessionGenServer — replace in-memory history flush stub withFinnestAgents.Messages.persist_async/1(Oban job queued viaFinnestCore.ObanWorkerfrom F-008; wraps withwith_tenant/2).FinnestAgents.BudgetGuard— replace ETS-only accumulator with ETS hot path + periodic flush toagents.budget_limits(every 60s or on clean shutdown). On boot, rehydrate ETS fromagents.budget_limits.FinnestAgents.MCP.ToolRegistry— replace ETS-stubtool_auditwith real INSERT toagents.tool_audit(still viacast/asyncso hot path stays non-blocking — acceptable drop-on-crash because the message itself ends up inagents.messages).FinnestAgents.AiProvider.AnthropicDirect— cost/cache stats now persist throughagents.tool_auditrather thanPromptCacheGenServer memory.- PromptCache rework — stays as an in-memory hit-rate aggregator (OP-03 telemetry); stops pretending to be durable. Persistence moved to
agents.tool_audit. Removes ~60 LOC of "will persist later" comments. - Boot-time rehydrate —
FinnestAgents.Supervisor.init/1order becomes: Repo → Vault → BudgetGuard (rehydrates from DB) → ToolRegistry → Orchestrator → AgentSupervisor. PromptCache stays late (no durable state). - Reset jobs — Oban cron for daily/weekly/monthly budget resets (
FinnestAgents.Workers.ResetBudgets). Idempotent: checks reset_at timestamp; no-ops if already reset this window.
Out of scope:
- Full retention policy (90-day commercial / 7-year IRAP purge) — F-020 ships the cron; this story just creates the tables.
- Event-store tie-in (writing
agent.*domain events) — F-016 + F-017 wire this. agents.memoriesL2/L3 memory tables — agents.md §Memory System defers L2/L3 to post-Phase-0; not blocking any Phase 0 story.- Retroactive backfill — no production data exists yet; the stub window (F-012 → F-021) is pre-launch, nothing to migrate.
Technical Notes¶
- Pattern match with F-007: same tenant-rooted schema pattern —
org_id NOT NULL,deleted_atwhere soft-delete makes sense (Sessions), hard-delete where it doesn't (Messages cascade from Sessions; BudgetLimit single-row-per-org). - Hash-not-plaintext for audit (Commandment #24 / agents.md §Audit Trail):
input_hash+output_hashare BLAKE2b(16) of the serialised input/output maps. Plaintext lives briefly inagents.messages.contentbut is redacted/aged-off per retention policy. - DECIMAL precision for cost_aud: DA-15 mandates
DECIMAL(12,4)for money;ex_moneylibrary already in the dep set from F-007 context; re-use here. - BudgetGuard ETS + flush pattern preserves the non-blocking hot path from F-012 — the DB is the authoritative store on reboot, ETS is the per-request cache. On crash-recovery the worst case is 60s of in-flight spend not yet persisted, which is acceptable (the LLM call itself logged to
agents.tool_auditon completion, so cost reconciliation is still auditable). - Keeps F-008 tenant hook honest — every table has
org_idin__schema__(:fields), soFinnestCore.Repo.prepare_query/3injectsWHERE org_id = ?automatically. Noskip_org_scope: truecalls required for any of this story's context functions. A stray cross-tenant query raises RuntimeError loud and early. - Module namespace (symmetric with F-016/F-017/F-019): modules live under
FinnestAgents.*/FinnestCore.*(flat). This story was authored post-Sprint-3-D4-resolution and is mostly already consistent; one slip (Finnest.Agents.Workers.ResetBudgets→FinnestAgents.Workers.ResetBudgets) fixed 2026-04-18 alongside the Sprint 4 gate-check resolution PR.
Dependencies¶
- Blocked by: STORY-F-012 (finnest_agents scaffold — schema modules reference
FinnestAgents.Sessionetc.), STORY-F-007 (Cloak vault + schema patterns), STORY-F-008 (tenant enforcement — story relies onprepare_queryinjecting org_id). - Blocks: STORY-F-017 (Agent Chat LiveView — requires durable
agents.messages), STORY-F-020 (production smoke — asserts restart-survival of session state).
Acceptance Criteria¶
- Migration creates
agentsschema + 4 tables with field shapes per §Scope;mix ecto.migrate+mix ecto.rollbackboth succeed (CQ-11) - Ecto schemas compile;
Repo.insert/1+Repo.all/1+Repo.delete/1round-trip on each ofSession,Message,BudgetLimit,ToolAuditEntry - Every context function passes through
FinnestCore.Repo.prepare_query/3with noskip_org_scope: truebypass (grep-guarded by an architecture test) -
FinnestAgents.Messages.persist_async/1queues an Oban job viaFinnestCore.ObanWorker; job executes in tenant scope and inserts the row; retry-on-failure withmax_attempts: 3 -
FinnestAgents.Budgets.record_spend/3performs an atomic UPDATE (single round-trip, no read-modify-write) -
BudgetGuardrehydrates fromagents.budget_limitson boot; in-memory ETS reflects DB state after startup -
BudgetGuardflushes in-memory spend to DB every 60s (configurable) + on clean supervisor shutdown -
FinnestAgents.Workers.ResetBudgetsidempotent — running twice in the same window produces the same final state (QJ-01) -
ToolRegistry.invoke/3writes a row toagents.tool_auditfor every successful call; error path writes a row witherror_reasonpopulated -
AiProvider.AnthropicDirect.generate/3writes prompt + response hashes (not plaintext) + token counts + cost toagents.tool_audit -
PromptCacheGenServer no longer claims persistence in its module doc; responsibilities narrowed to in-memory hit-rate aggregation - F-012/F-013/F-014 stub comments (
@TODO: persistence when schema lands) are all removed - Architecture test: every
agents.*schema hasorg_id; tenant enforcement fires; no allowlist entries added (Sessions/Messages/BudgetLimit/ToolAudit are not exceptions — they all carry tenant) -
mix format,credo --strict,dialyzer,mix boundaryall green
Testing Requirements¶
- Unit: each context module (CRUD + error paths)
- Unit:
BudgetGuardboot rehydrate; flush; reset window; atomic record_spend - Unit:
ResetBudgetsworker idempotency - Unit:
Messages.persist_asyncenqueues correct Oban job with tenant args - Integration: full session lifecycle — start → 3 user messages → MCP tool calls → end → simulate crash mid-session → rehydrate new GenServer → assert prior messages retrievable
- Integration: cross-tenant attempt — tenant A creates a session; tenant B tries to load it; expect RuntimeError from
prepare_query/ empty result set - Property: random message sequences always produce consistent token + cost totals between session and
tool_auditsum
References¶
../architecture/agents.md§Memory System (L1 schema), §Audit Trail, §Budget Management../architecture/data.md§Tenant enforcement (re-used pattern)../adrs/adr-003-F-three-tier-ai-agent-architecture.md../10-GUARDRAILS.mdAI-08, AW-13, AW-14, DA-11..DA-13, DA-15, QJ-01, OP-03../_bmad-output/solutioning-gate-check-sprint-3-2026-04-18.md§TD1 — story provenance