PRD: Scout + Verify Go-Live on Finnest
Date: 2026-04-16
Author: Gautham Chellappa (gchellappa@fortigrid.com)
Version: 1.0
Status: Draft
Delivery phase: Scout + Verify Go-Live (weeks 5–12 of the 44-week roadmap, post Phase 0)
Format: Finnest-lean — focuses on user stories + acceptance criteria + success metrics + cutover criteria. References architecture/architecture.md, 10-GUARDRAILS.md, brainstorms, and ADRs for content already decided elsewhere.
1. Context & Goal
Scout (recruitment) and Verify (document verification) are the two most production-hardened domains in the AgenticAI-app Laravel codebase — 46+ sprints, 1,220 tests, 52 schemas in the Elixir PoC already. They are also the two domains with the clearest path to user-visible value for Ashley Services Group (ASG).
Go-live strategy (per brainstorm-11 Topic 2 and ADR-001-F): build Scout + Verify directly on Finnest/Elixir, skip Laravel deployment entirely, go live for ASG in ~8 weeks. This is not a rewrite — the PoC is 70% there (52 schemas, 327 tests). It is production-hardening of an existing implementation while migrating execution from Laravel to Elixir.
What goes live:
- Scout recruitment surface — job orders, candidate pool refresh, deterministic scoring, assessment system (native — replaces Typeform), three-track pipeline, outreach via SMSGlobal + AWS SES, placement flow, Admin Central write-back.
- Verify document verification pipeline — four AI agents (classify, extract, compare faces, cross-reference), five-stage pipeline, manual review queue, OTP verification, identity checklist, per-org per-stage AI budget with circuit breaker, Admin Central write-back.
- Agent chat as primary UX (new — Laravel has no equivalent).
- Command bar (Cmd+K) navigation + actions (new).
- MCP servers for Scout and Verify (new protocol contract per ADR-004-F).
What does NOT go live in this phase: anything outside Scout + Verify domains. See §8 Out of Scope.
Why this order: Scout and Verify are the two domains that have been battle-tested in Laravel. Shipping them first on Finnest proves the Elixir stack in production, generates revenue (AgenticAI credits), and creates the data-in handoff pattern that the subsequent migration phases (Recruitment → Onboarding → Roster+CMS → Timesheet+Reporting → Mobile+Pact) depend on.
Decision gate
Week 2 of Phase 0 is the commit point. If external MySQL connections (admin_central, admin_atslive via MyXQL) + KeyPay API validation + deployment pipeline are all working, commit to Elixir Direct. If blockers emerge, fall back to shipping Laravel AgenticAI-app as Plan B safety net (this PRD does not describe that fallback).
2. Strategic Context
Summary references — full content elsewhere:
| Topic | Reference |
|---|---|
| Technology stack + rationale | architecture.md Part 3; ADR-001-F |
| Architecture (supervised modular monolith) | architecture.md Parts 2, 4; ADR-002-F |
| AI agent architecture (three-tier) | architecture.md Part 4; agents.md; ADR-003-F |
| MCP contracts | agents.md; ADR-004-F |
| Event-driven cross-domain | ADR-005-F |
| AI data residency (Bedrock Sydney for Verify) | ADR-0010 |
| Hexagonal ports (AI provider, SMS, email) | architecture.md Part 3; ADR-006-F |
| NFRs (19 drivers) | architecture.md Part 7 |
| 42 Commandments + 177 guardrails | 42-COMMANDMENTS.md, 10-GUARDRAILS.md; ADR-013-F |
| Laravel scope being ported (de-facto spec) | AgenticAI-app/app/Scout/*, app/Verify/*; 84 feature tests |
| Migration order (after go-live) | brainstorm-11 Topic 3; ADR-010-F |
Every story in this PRD obeys the architectural invariants in architecture.md §Architectural Invariants (Repo.transaction, events cross-domain, org_id everywhere, ex_money for money, etc.). Stories do not re-state these invariants.
3. Personas
| # | Persona | Primary role | Scout? | Verify? | Key workflows |
|---|---|---|---|---|---|
| P1 | Recruitment Consultant | ASG consultant who owns job orders and places candidates | ✓ | reads | Create job orders, review pool, trigger scoring, send outreach, confirm placements |
| P2 | Onboarding Coordinator | ASG staff who processes candidate onboarding and identity verification | reads | ✓ | Review verification queue, approve/reject documents, handle escalations, override fields with audit |
| P3 | Operations Manager | Senior staff who oversees both domains and monitors operational health | ✓ | ✓ | Review dashboards, manage AI budget, respond to alerts, review audit logs |
| P4 | Org Admin | Super-user who configures the org, offices, users, product access | ✓ | ✓ | Org setup, office/user management, product access, rate cards (future), IP allowlist (IRAP) |
| P5 | Candidate (external) | Worker being recruited; interacts via token-based public routes only | ✓ | ✓ | Complete assessment (token), upload identity documents (token), receive OTPs |
| P6 | Ops / SRE | Infrastructure operator | — | — | Deploy, monitor, respond to incidents, manage provider failover |
All authenticated personas (P1–P4, P6) have org_id scoped by JWT. Tenant isolation enforced per architecture invariants (no cross-org data leakage).
4. Success Metrics
Measurable at go-live (Week 8 of go-live window) and after first month in production.
Functional success
| Metric | Target at go-live | Target at month 1 |
|---|---|---|
| Scout parity with Laravel (84 Laravel feature tests) | Equivalent Elixir tests: 400+ passing | Regression count vs Laravel: 0 |
| Verify parity with Laravel | Equivalent Elixir tests: 300+ passing | Regression count vs Laravel: 0 |
| Verification pipeline success rate (auto-complete, no manual review) | ≥70% | ≥75% |
| Manual review SLA (escalation → decision) | p95 < 24 hours (business days) | p95 < 12 hours |
| Assessment completion rate (invited candidates) | ≥60% | ≥65% |
| Agent chat task-completion rate (Tier-1 pattern match) | ≥70% queries handled at $0 | ≥75% |
Operational success
| Metric | Target |
|---|---|
| LiveView mount p95 | <300 ms (PF-01) |
| REST API p95 | <200 ms (PF-01) |
| Clock-in endpoint p95 (not Scout/Verify but shared infra) | <100 ms |
| Agent chat first token | <1 s |
| Agent chat full response | <3 s for short queries |
| Verify pipeline (document upload → completion) p95 | <60 s for auto-path |
| AI spend per org per month at 5K employees | <$4K baseline; <$2K with prompt caching in effect (AI-09) |
| Zero stack traces in production response bodies (SE-10) | 0 incidents |
| Event-store integrity: hash-chain verification | Monthly run with zero breaks |
| Tenant isolation architecture tests | 100% pass |
| Uptime (from Phase 1 commercial target) | 99.5% Phase 1; 99.9% Phase 3 |
Commercial success
- ASG accepts production cutover within go-live window (no rollback)
- Plan A P1 Build ($100K) milestone signed off at end of Week 12
5. Epics & User Stories
Stories are grouped by user-value epic. Each story follows the form "As a <persona>, I want <capability>, so that <value>", followed by acceptance criteria (AC). Each AC is directly testable.
Stories do not re-state architectural invariants; they inherit them from architecture.md.
Epic 1 — Foundation & Access (7 stories)
E1.1 — As an Org Admin, I want to provision a new organisation with offices, industry profile, and initial users, so that recruitment and verification operations can begin.
- AC1: Org creation form captures name, slug, ABN, industry profile selection.
- AC2: Creating an org provisions the default product access flags (scout, verify) per subscription tier.
- AC3: Creating an org emits organisation_created event to events.domain_events.
- AC4: Slug uniqueness enforced (partial index on deleted_at IS NULL).
- AC5: Soft-deleting an org cascades to revoke all user sessions in that org within 60 s.
E1.2 — As an Org Admin, I want to add offices and assign users to them, so that staff can be scoped to the offices they serve.
- AC1: Office CRUD with name, location, timezone.
- AC2: User-office assignment with is_primary flag.
- AC3: Scout views filter by user's primary office unless user has manager+ role (which can select office).
- AC4: Office deletion blocked if any user, pool, or job order references it; must reassign first.
E1.3 — As any authenticated user, I want to log in with email + password + optional MFA, so that my session is secure.
- AC1: Login via phx.gen.auth equivalent with Argon2 hashing (SE-08).
- AC2: MFA: TOTP optional; mandatory FIDO2 for admin/payroll/director roles per Part 8 decision in architecture.
- AC3: Session timeout 60 min rolling (commercial). The 15 min fixed timeout applies to IRAP mode only; IRAP is Phase 3 and outside this PRD's scope.
- AC4: After 5 failed login attempts in 10 min, account locked for 15 min; Ops notified.
- AC5: JWT for mobile/API use carries org_id, user_id, role claims; 1-hour access + 7-day refresh token.
- AC6: Microsoft 365 SSO option (ported from Laravel STORY-022). Users with M365 tenants log in via Entra ID OAuth2; JIT-provision on first login with role matching via email domain + org membership rules. MFA inherited from the M365 tenant. In IRAP mode M365 Entra is the only SSO path (IR-13).
E1.4 — As an Org Admin, I want to manage product access (Scout vs Verify) per user with per-office overrides, so that personas only see the features they need and office-scoped access is enforced.
- AC1: product:scout and product:verify feature flags per user with optional per-office override (ported from Laravel STORY-109). Nullable office_id on product_access row allows org-wide default with office-specific overrides.
- AC2: UI routes behind product:scout middleware reject non-Scout users with 403.
- AC3: Navigation sidebar shows only enabled products for the current office context.
- AC4: Superadmins bypass product restrictions with an explicit visible indicator ("Viewing as superadmin").
- AC5: Product access matrix admin UI (in E1.6 Admin Console) shows per-office product grid with bulk edit.
E1.5 — As a developer or auditor, I want every request, event, and job to carry a correlation ID, so that cross-system traces are possible.
- AC1: FinnestWeb.Plugs.CorrelationId generates UUID at edge or propagates inbound X-Correlation-ID.
- AC2: Correlation ID appears in events.domain_events.metadata, Oban job args, structured logs (OP-01), and agents.tool_audit.
- AC3: Response includes X-Correlation-ID header.
- AC4: Agent-initiated events propagate correlation ID; chain bounded at 10 events (AI-05).
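Illustrative sketch for E1.5 (not the FinnestWeb.Plugs.CorrelationId implementation): the generate-or-propagate rule in AC1 and the chain bound in AC4 can be expressed as plain functions. The module name, the choice to accept only UUID-shaped inbound values, and the reading of "bounded at 10" as "stop once 10 events exist in the chain" are assumptions made for this example.

```elixir
defmodule CorrelationIdSketch do
  # Hypothetical module; uses only the Elixir standard library and OTP :crypto.

  # AC1: reuse a well-formed inbound X-Correlation-ID, otherwise mint a new one.
  # Rejecting non-UUID inbound values is an assumption, not a stated requirement.
  def resolve(inbound) when is_binary(inbound) do
    if Regex.match?(uuid_pattern(), inbound), do: String.downcase(inbound), else: generate()
  end

  def resolve(_missing), do: generate()

  # Random (version 4) UUID built from 16 cryptographically strong bytes.
  def generate do
    <<a::48, _::4, b::12, _::2, c::62>> = :crypto.strong_rand_bytes(16)
    # Overwrite the version nibble with 4 and the variant bits with binary 10.
    <<u0::32, u1::16, u2::16, u3::16, u4::48>> = <<a::48, 4::4, b::12, 2::2, c::62>>

    [{u0, 8}, {u1, 4}, {u2, 4}, {u3, 4}, {u4, 12}]
    |> Enum.map_join("-", fn {int, width} ->
      int |> Integer.to_string(16) |> String.downcase() |> String.pad_leading(width, "0")
    end)
  end

  # AC4 / AI-05: an agent-initiated chain may grow only while it holds fewer than 10 events.
  def may_emit?(chain_depth) when is_integer(chain_depth), do: chain_depth < 10

  defp uuid_pattern do
    ~r/\A[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\z/i
  end
end
```

In a real plug the resolved value would also be written to the response header (AC3) and to logger metadata (AC2); that wiring depends on Plug and is left out here.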
E1.6 — As an Org Admin, I want a unified Admin Console for managing organisation, users, offices, product access, and feature flags, so that platform configuration is centralised.
- AC1: Admin Console section in navigation (ported from Laravel STORY-028, 070, 071 — organisation nav + route restructure).
- AC2: Users panel — user CRUD, role assignment, MFA reset, M365 SSO binding, deactivate/reactivate (with audit).
- AC3: Offices panel — office CRUD, user-office assignment, primary office toggle. Office Import command (ported from Laravel STORY-346): one-shot import from admin_central.office_locations via V2Repo, populates public.offices + public.id_mappings, emits office_imported events, dry-run preview before commit. Admin-initiated via panel action; also available as mix finnest.import.offices --org=<slug> for ops.
- AC4: Product Access panel — per-office product matrix (Scout × Verify × office), bulk edit, validation that at least one office has each product enabled per org.
- AC5: Feature Flags panel (ported from Laravel STORY-088/089) — per-org flag CRUD, override UI with justification, CLI tooling equivalent via mix tasks for emergency flip.
- AC6: Audit log viewer (hooks to E5.6 Scout audit trails) — filter by actor, action, date; export CSV.
- AC7: Admin Console access restricted to admin + director roles (plus superadmin).
- AC8: Every admin action emits an audit event; admin panel changes require re-authentication if session is older than 15 min.
- AC9: Templates panel — CRUD for outreach templates (E2.6a AC4-5) with live preview against a sample candidate record. Versioning visible: edits create new version (old retained for audit); activation sets "current" version per channel. Rollback to any prior version with one click. Per-channel templates (SMS / email separately). Test-send to a designated test recipient before activation. Template deletion soft-deletes only (audit-retained).
- AC10: Observe Mode log panel — captured "would-send" entries when outreach.observe_mode is enabled (see E2.6a AC10) are queryable here, filterable by date + candidate + channel, exportable as CSV.
E1.7 — As an Org Admin and Privacy Officer, I want the platform to expose Terms of Service, Privacy Policy, consent collection, and Data Subject Access Request (DSAR) mechanisms, so that ASG meets Australian Privacy Act (APP) obligations.
- AC1: Terms of Service page (ported from Laravel STORY-152) — public URL /legal/terms, versioned, accessible without login, change-log maintained.
- AC2: Privacy Policy page — public URL /legal/privacy, versioned, APP-compliant wording reviewed by legal.
- AC3: Consent collection (ported from Laravel STORY-007 privacy consent + STORY-151 consent updates) — explicit consent at: candidate assessment entry (covers E2.5), document upload (Verify), outreach opt-in for new contact channels. Consent records immutable, stored in public.consent_records with ToS/Privacy version, timestamp, IP, user agent.
- AC4: DSAR mechanism (ported from Laravel STORY-153) — authenticated user can request: (i) export of all their data (JSON + PDF summary), (ii) correction of incorrect data, (iii) deletion where legally permitted. Non-authenticated request path (email verification) also supported.
- AC5: DSAR request SLA: acknowledged within 3 business days; completed within 30 calendar days (APP 12 / APP 13).
- AC6: DSAR requests logged as events; access to the handler restricted to admin + director.
- AC7: Consent withdrawal: user can revoke outreach consent at any time (updates consent_records with a revocation event). Downstream channels check consent before delivery.
- AC8: User deletion cascade mechanics (ported from Laravel STORY-154). "Deletion" here is anonymisation, not hard-delete, because the event store is immutable by design (AR-17) and some records are legally retention-bound (payroll, tax, ATO STP):
  - PII purged from user-owned records (name, email, phone, address, DOB, TFN, bank, passport, licence numbers) — replaced with [redacted-<hash>] sentinel so joins still work for analytics but no identifying data remains.
  - Audit + event rows kept intact — immutable event store (AR-17) cannot be modified; anonymisation is recorded as a new user_anonymised event that downstream queries honour via a view filter.
  - Session + agent memory purged — agents.sessions, agents.messages, agents.memories scoped to the user are hard-deleted (these are not immutable records).
  - Uploaded documents — S3 blobs hard-deleted; verifiable_documents row soft-deleted with PII fields cleared; face comparison data scrubbed per E5.5 AC6 retention rules.
  - Outreach artefacts — reach.deliveries + reach.messages payloads hashed+purged; metadata (timestamp, delivery status) retained for audit.
  - Legally retained records (payroll, STP submissions, WHS incident records) — NOT deleted; flagged with anonymised_at timestamp so they're excluded from user-visible views but retained for the legally required period. Clear per-record retention table maintained by ops.
  - Irreversibility — once anonymisation completes, the original values are unrecoverable even to admins. DSAR status progresses to completed with certificate (PDF) issued to the original requester.
  - Dependency cascade — downstream subscribers (Admin Central sync E6.2) receive the user_anonymised event and apply corresponding anonymisation in v2 within 24 hours.
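Illustrative sketch for E1.7 AC8: one way to produce the [redacted-<hash>] sentinel so that equal original values still map to equal sentinels (keeping joins usable). The module name, the field list, the use of a keyed HMAC-SHA256 rather than a bare hash, and the 12-character truncation are all assumptions for this example; the PRD does not specify the hashing scheme.

```elixir
defmodule AnonymiseSketch do
  # Hypothetical subset of the PII fields listed in AC8.
  @pii_fields [:name, :email, :phone, :address, :dob, :tfn]

  # Replace every present, binary-valued PII field with a sentinel.
  # Non-PII fields and absent fields are left untouched.
  def redact(record, key) when is_map(record) and is_binary(key) do
    Enum.reduce(@pii_fields, record, fn field, acc ->
      case acc do
        %{^field => value} when is_binary(value) -> Map.put(acc, field, sentinel(value, key))
        _ -> acc
      end
    end)
  end

  # Keyed digest (assumption): a bare hash of low-entropy data such as a phone
  # number could be reversed by enumeration, so a secret key is mixed in.
  def sentinel(value, key) do
    digest =
      :crypto.mac(:hmac, :sha256, key, value)
      |> Base.encode16(case: :lower)
      |> binary_part(0, 12)

    "[redacted-" <> digest <> "]"
  end
end
```

Under this sketch, destroying the key after the run is what would make the mapping unrecoverable; whether that matches the intended "irreversibility" guarantee is a design decision for the owning team.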
Epic 2 — Scout Recruitment Workflow (12 stories)
E2.1 — As a Recruitment Consultant, I want to create a job order with category-specific requirements, so that candidate scoring and outreach can proceed.
- AC1: Multi-step Job Order Wizard (replicates Laravel JobOrderWizard): basic details, category, requirements (credentials, licences, shifts), rate.
- AC2: Required credentials pre-populated from compliance.industry_profiles based on org's active profiles.
- AC3: Saving draft stores partial job order; job_order_drafted event emitted.
- AC4: Activating a job order transitions status to active and emits job_order_activated event.
- AC5: Job order ties to office; office filter applies to visibility.
E2.2 — As a Recruitment Consultant, I want to see a list of my office's job orders with filter and search, so that I can triage my workload.
- AC1: Job orders table (replicates JobOrdersTable Livewire) paginated, sortable, filterable by status, category, date.
- AC2: Default view shows active job orders; toggle for all, draft, closed, on_hold.
- AC3: Search by title, candidate name, or rate (fuzzy match).
- AC4: Bulk actions: close, put on hold, mark complete.
E2.3 — As a Recruitment Consultant, I want to trigger scoring for a job order against the available candidate pool, so that I get a ranked shortlist.
- AC1: "Run scoring" action uses finnest_recruit.ScoringEngine — deterministic weighted scoring per Laravel's ScoringService + LicenseScorer + OrderedSelectionScorer.
- AC2: Scoring is a pure function pipeline (per ADR-001-F); same input always produces same output.
- AC3: Results persisted to recruit.scoring_results with input snapshot (job order state + candidate state at time of run).
- AC4: Dry-run option available (Scout.DryRunScoringService equivalent) — estimates AI cost + predicted ranking without persisting.
- AC5: AMBER-confidence candidates (70–90%) flagged for Tier-2 agent review; GREEN (>90%) auto-shortlist; RED (<70%) excluded with reason.
- AC6: Emits scoring_completed event.
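Illustrative sketch for E2.3 AC2 and AC5: a pure weighted score plus the GREEN / AMBER / RED banding. The criteria names and weights are hypothetical placeholders; the real weighting comes from the ported ScoringService, LicenseScorer and OrderedSelectionScorer, which this example does not reproduce.

```elixir
defmodule ScoringBandSketch do
  # Hypothetical criteria and integer weights (summing to 100); placeholders only.
  @weights [credentials: 50, location: 30, availability: 20]

  # Pure weighted average of per-criterion scores (each 0..100).
  # No clock, randomness, or I/O, so equal input gives equal output (AC2).
  def score(criteria) when is_map(criteria) do
    total =
      Enum.reduce(@weights, 0, fn {name, weight}, acc ->
        acc + weight * Map.get(criteria, name, 0)
      end)

    total / 100
  end

  # AC5 bands: GREEN above 90, AMBER from 70 to 90 inclusive, RED below 70.
  def band(score) when is_number(score) and score > 90, do: :green
  def band(score) when is_number(score) and score >= 70, do: :amber
  def band(score) when is_number(score), do: :red
end
```

Treating exactly 90 as AMBER and exactly 70 as AMBER follows the ranges as written in AC5 (GREEN is ">90", RED is "<70").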
E2.4 — As a Recruitment Consultant, I want the candidate pool to stay fresh by periodically importing from admin_central and admin_atslive, so that new candidates are scoreable without manual export.
- AC1: finnest_recruit.PoolRefreshScheduler runs per Laravel's schedule (default: every 6 hours).
- AC2: Reads via Finnest.V2Repo (MyXQL, read-only; AR-05).
- AC3: id_mappings table populated for every imported candidate (public.id_mappings).
- AC4: pool_refresh_completed event emits counts (new, updated, skipped).
- AC5: Failed refresh logged + Ops alert; retry with exponential backoff (QJ-02).
- AC6: Manual "Refresh Now" trigger available from UI for consultants.
E2.5 — As a Candidate (external), I want to complete an assessment via a secure token-based link, so that I don't need to create an account.
- AC1: Assessment URL /assessment/:token accessible without auth.
- AC2: Token signed, single-use, 30-day expiry (configurable per org).
- AC3: Rate-limited per token + per IP (SE-05).
- AC4: Multi-step form (replicates Laravel AssessmentForm) supports category-aware question reveals.
- AC5: Offline-capable — form can be partially completed, resumed via same URL.
- AC6: Submission emits assessment_submitted event; scoring triggered asynchronously.
- AC7: Typeform is dropped from the stack. Laravel shipped Typeform integration (STORY-039) but Finnest will not port it. Native AssessmentForm is the sole assessment surface. Any org currently using Typeform forms migrates to native assessment at cutover — migration utility reads existing Typeform responses via admin_central for historical reference only (no ongoing Typeform API dependency).
E2.6a — As a Recruitment Consultant, I want to send SMS/email outreach to candidates with versioned, org-customisable templates, so that messaging is consistent and governed.
- AC1: Outreach action on candidate card: select template, channel (SMS/email), preview, send.
- AC2: SMS adapter: Finnest.Reach.Adapters.SMSGlobal (ported from Laravel with MAC auth).
- AC3: Email adapter: Finnest.Reach.Adapters.SES (AWS SES Sydney).
- AC4: Template system (ported from Laravel STORY-016/040/041): templates stored in reach.templates with per-org customisation layer over platform defaults. Versioning: edits create new version; old versions retained for audit.
- AC5: Template variables (e.g. {{candidate.first_name}}, {{job_order.site}}) resolved at send time with safe escaping.
- AC6: Every outreach creates a row in reach.deliveries + recruit.outreach_log and emits outreach_sent event.
- AC7: Delivery receipts from SMSGlobal + SES webhooks update delivery status.
- AC8: Rate-limited per org (budget-protective; AI-08 equivalent).
- AC9: Consent-gated — outreach to a candidate requires valid consent record for the channel (see E1.7 AC7); no-consent attempt logged and blocked.
- AC10: Observe mode (ported from Laravel STORY-038): per-org feature flag outreach.observe_mode. When enabled, outreach actions are captured but NOT delivered to external channels — the system records what WOULD have been sent to reach.observed_deliveries without incurring SMS/email costs or touching real candidates. Use cases: (a) sales demo — show outreach flow without spamming demo personas; (b) staff training — consultants practise campaigns safely; (c) compliance review — QA reviews outbound content before a sensitive campaign goes live; (d) dry-run of new templates before activation. Toggled by admin role via Admin Console (E1.6 AC5 feature flag panel). When observe_mode is active, every outreach action in the UI shows a distinct "OBSERVE" badge; captured entries surface in E1.6 AC10 Observe Mode log panel. Auto-disabled after configurable duration (default 7 days) to prevent long-term drift from real outreach.
- AC11: Drip campaigns (B12 M6) — deferred to Migration Phase 1; single-send only in go-live.
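Illustrative sketch for E2.6a AC5: resolving {{entity.field}} placeholders at send time with channel-specific escaping. The module name, the string-keyed bindings shape, rendering unknown variables as an empty string, and the specific escaping rules (HTML entities for email, control-character stripping for SMS) are assumptions for this example.

```elixir
defmodule TemplateSketch do
  # bindings example: %{"candidate" => %{"first_name" => "Sam"}}
  # channel is :email or :sms.
  def render(template, bindings, channel) when is_binary(template) and is_map(bindings) do
    Regex.replace(placeholder(), template, fn _whole, entity, field ->
      bindings
      |> Map.get(entity, %{})
      |> Map.get(field, "")
      |> to_string()
      |> escape(channel)
    end)
  end

  # Matches placeholders such as {{candidate.first_name}}.
  defp placeholder, do: ~r/\{\{\s*([a-z_]+)\.([a-z_]+)\s*\}\}/

  # Email bodies are assumed to be HTML, so markup characters are entity-encoded.
  # "&" is replaced first so already-produced entities are not double-encoded.
  defp escape(value, :email) do
    value
    |> String.replace("&", "&amp;")
    |> String.replace("<", "&lt;")
    |> String.replace(">", "&gt;")
    |> String.replace("\"", "&quot;")
    |> String.replace("'", "&#39;")
  end

  # SMS is plain text; only control characters are stripped.
  defp escape(value, :sms), do: String.replace(value, ~r/[[:cntrl:]]/, "")
end
```

For example, rendering "Hi {{candidate.first_name}}, shift at {{job_order.site}}" for email with a site named "Dock & Co" yields "Hi Sam, shift at Dock &amp; Co", while the SMS rendering leaves the ampersand as-is.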
E2.6b — As a Recruitment Consultant, I want to send a Calendly-backed interview scheduling link to a candidate, so that the candidate can self-book without back-and-forth.
- AC1: Calendly adapter (ported from Laravel CalendlyService + CalendlyApiService, STORY-014) behind the Calendar hexagonal port.
- AC2: "Schedule Interview" action on candidate card generates a Calendly link tied to the assigned consultant's Calendly account.
- AC3: Link sent via SMS/email using an outreach template (E2.6a); emits interview_link_sent event.
- AC4: Calendly webhook ingestion (/api/v1/webhooks/calendly) handles: booking created, rescheduled, cancelled, no-show. Webhook signature verified.
- AC5: Webhook events update recruit.interviews table with booking ID, start_at, timezone, duration, meeting link; emits interview_booked / interview_rescheduled / interview_cancelled events.
- AC6: Pipeline (E2.8) reflects interview-booked state.
- AC7: Native scheduling (B12 H2) deferred to Migration Phase 1 or 2; Calendly is the go-live path.
E2.8 — As a Recruitment Consultant, I want to see the full Three-Track Pipeline for a job order, so that I can see where every candidate is in the flow.
- AC1: Pipeline view (replicates /scout/pipeline/{jobOrder}) shows columns: Eligible → Outreach → Interviewed → Placed.
- AC2: Candidates drag-droppable across columns (server-side validation); triggers corresponding event.
- AC3: Each card shows compliance indicator (from compliance_mcp.check_worker) — pre-computed at scoring time.
- AC4: Real-time updates via LiveView + PubSub (no refresh needed).
E2.9 — As a Recruitment Consultant, I want to check pool availability for a given date before committing to a placement, so that I don't over-commit candidates.
- AC1: Availability check UI (replicates PoolAvailabilityCheck).
- AC2: Query scopes to org + office + date range.
- AC3: Shows conflicts with active shifts (from finnest_roster) and unavailability records (people.unavailability) for each candidate.
- AC4: Cross-domain read via events — availability denormalised to recruit schema from roster events (no direct cross-schema JOIN).
E2.10 — As a Recruitment Consultant, I want to confirm a placement and write outcomes back to admin_central, so that v2 operational workflows (onboarding, CMS) continue uninterrupted during the transition.
- AC1: Placement confirmation (replicates ConfirmPlacement Livewire).
- AC2: Creates people.employees record (lifecycle state = onboarding) + emits placement_confirmed event.
- AC3: Finnest.ScoutAdminCentralSync writes back to admin_central candidates table + candidates_additional metadata.
- AC4: Write-back failure retries via Oban; admin alert if 3 retries fail.
- AC5: Placement status visible on candidate card and job order view.
E2.11 — As a Recruitment Consultant, I want to flag license/credential renewals coming due, so that I can proactively re-engage candidates before their credentials expire.
- AC1: Credential expiry widget (replicates FlagLsqRenewal) on pool view.
- AC2: Shows credentials expiring within 30 / 14 / 7 days for the org's candidate pool.
- AC3: One-click outreach template for renewal notification.
- AC4: Tier-3 autonomous Compliance Monitor (finnest_pulse) feeds this view.
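Illustrative sketch for E2.11 AC2: bucketing a credential by days until expiry into the 30 / 14 / 7 day windows. The module name, bucket atoms, and the choice to treat the boundary day as inside the window are assumptions for this example.

```elixir
defmodule ExpiryBucketSketch do
  # Classify a credential by how many days remain until `expires_on`.
  # Both arguments are Date structs; `today` is passed in so the function stays pure.
  def bucket(%Date{} = expires_on, %Date{} = today) do
    case Date.diff(expires_on, today) do
      days when days < 0 -> :expired
      days when days <= 7 -> :within_7_days
      days when days <= 14 -> :within_14_days
      days when days <= 30 -> :within_30_days
      _ -> :not_due
    end
  end
end
```

A credential always lands in the tightest window it qualifies for, so the widget can show each credential once rather than repeating it under all three headings.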
E2.12 — As a Recruitment Consultant and Operations Manager, I want pool analytics that show how a candidate list was constructed and filtered, so that I can understand why candidates appeared or didn't.
- AC1: On-hold filter (ported from Laravel STORY-078): exclude candidates with on_hold = true by default; optional toggle to show on-hold with reason badge.
- AC2: Pool construction waterfall (ported from STORY-079): visual chart showing total-in-pool → filtered-by-location → filtered-by-credentials → filtered-by-availability → final-shortlist counts per stage.
- AC3: Selection criteria badges (ported from STORY-080): each candidate card shows badges indicating which criteria matched (credential ✓, location ✓, availability ✓, score-tier, etc.).
- AC4: Bucket breakdown (ported from STORY-081): pool view shows distribution across B1 (Ready) / B2 (LSQ Review) / B3 (Placed) buckets with drill-through.
- AC5: Analytics exported as CSV for offline analysis.
- AC6: Pool analytics view real-time via PubSub on scoring/placement events.
Epic 3 — Scout Billing & Analytics (4 stories)
E3.1 — As an Operations Manager, I want per-office billing snapshots, so that cost-of-use is attributable by business unit.
- AC1: BillingSnapshot equivalent — per-office monthly snapshot of assessments sent, outreach sent, candidates placed, AI spend.
- AC2: Snapshot generated by Oban cron on last day of month; emits billing_snapshot_created event.
- AC3: Per-category breakdown (BillingSnapshotCategory).
- AC4: Exportable as CSV for finance reconciliation.
E3.2 — As a Recruitment Consultant, I want a Scout dashboard with KPIs, so that I can monitor my placement velocity and pipeline health.
- AC1: Dashboard (replicates DashboardCharts + /scout/dashboard) shows: active job orders, candidates in pipeline by stage, placements this week/month, average time-to-place, outreach delivery rate.
- AC2: Real-time updates via PubSub on every relevant domain event.
- AC3: Office filter and date range selector.
E3.3 — As an Operations Manager, I want pool analytics at the office level, so that I can see source effectiveness and pool health.
- AC1: Pool analytics view (replicates PoolAnalyticsCharts + InsightsCharts) shows candidate counts by source, outreach conversion, pool freshness distribution.
- AC2: Export as CSV.
- AC3: Source effectiveness (B12 H10) tracks which channel (inbound vs outreach vs direct application) produced best-scored candidates.
E3.4 — As an Operations Manager, I want a go-live readiness check to validate an org is ready to enable Scout, so that activation doesn't fail due to missing configuration.
- AC1: Finnest.Scout.GoLiveCheck equivalent — runs per-org readiness checks: org has offices, office has users, industry profile set, credential registry seeded, at least one candidate in pool.
- AC2: Check exposed as admin action and as scheduled nightly (Tier-3 Data Quality Agent).
- AC3: Readiness failure generates a task list for the org admin.
- AC4: Demo-mode seeders ported from Laravel: Finnest.Scout.DemoSeeder (equivalent of STORY-132 — job orders, pipeline, billing, users, candidates), Finnest.Verify.DemoSeeder (equivalent of STORY-133 — all pipeline stages + review queue). Demo data lives under demo_* schema isolation so production-mode queries never see it. Reset command available to ops.
Epic 4 — Verify Document Verification Pipeline (8 stories)
E4.1 — As an Onboarding Coordinator or Candidate, I want to upload identity documents (passport, driver's licence, Medicare, etc.) so that identity can be verified automatically.
- AC1: Upload UI for staff: candidate profile → Upload Document (select type, browse file, submit).
- AC2: Upload via public token link for candidate self-service (demo link flow per VerifyDemoToken).
- AC3: File types accepted: JPG, PNG, PDF (≤10 MB).
- AC4: Upload enqueues classification stage job; emits document_uploaded event.
- AC5: File stored in S3 via FileStorage port (ap-southeast-2).
- AC6: Sensitive document numbers encrypted with Cloak (see data.md).
E4.2 — As an Onboarding Coordinator, I want the 5-stage verification pipeline to run automatically after upload, so that most documents are verified without my intervention.
- AC1: Pipeline stages: classification → extraction → validation → face_verification → cross_reference (replicates Laravel VerificationStage state machine).
- AC2: Each stage runs as an Oban job; state transitions emit events.
- AC3: Stage failure puts verification in needs_review state; adds to manual review queue.
- AC4: AI stages (classification, extraction, face verification, cross-reference) use AiProvider port — Bedrock Sydney primary (ADR-0010), Vertex AU fallback, per-stage failover (AI-07).
- AC5: Validation stage is deterministic (business rules per document type, per B05 credential_types).
- AC6: Pipeline duration p95 <60 s for auto-completion path.
- AC7: Concurrency locking (ported from Laravel STORY-147): Oban unique: [fields: [:aggregate_id, :stage]] prevents the same (verification_id, stage) pair from running twice concurrently. Retries are idempotent (QJ-01) — a retry that sees stage-already-completed short-circuits to the next stage without re-running the AI call.
- AC8: Concurrent document uploads for the same candidate queue sequentially by candidate_id to prevent identity-checklist race conditions.
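Illustrative sketch for E4.2 AC1, AC3 and AC7: the five-stage ordering, the failure path to needs_review, and an idempotent retry that skips an already-completed stage. This is a plain-function model, not the Oban worker; the module name, the map shape for a verification, and the `run` callback are hypothetical.

```elixir
defmodule VerifyPipelineSketch do
  @stages [:classification, :extraction, :validation, :face_verification, :cross_reference]

  # Stage that follows `stage`, or :completed after cross_reference.
  def next_stage(stage) do
    case Enum.drop_while(@stages, &(&1 != stage)) do
      [_current, next | _rest] -> next
      [_last] -> :completed
      [] -> raise ArgumentError, "unknown stage: #{inspect(stage)}"
    end
  end

  # `verification` is a plain map with :state and :completed_stages.
  # `run` is a hypothetical callback returning :ok or {:error, reason}.
  def run_stage(%{completed_stages: done} = verification, stage, run) do
    if stage in done do
      # Idempotent retry (AC7): the stage already finished, so the AI call is not repeated.
      {:skipped, verification}
    else
      case run.(stage) do
        :ok ->
          {:ok, %{verification | completed_stages: [stage | done], state: next_stage(stage)}}

        {:error, reason} ->
          # AC3: any stage failure routes the verification to manual review.
          {:needs_review, Map.merge(verification, %{state: :needs_review, review_reason: reason})}
      end
    end
  end
end
```

The uniqueness constraint in AC7 guards against two jobs running the same stage at once; the completed-stage check above covers the separate case of a retry arriving after the stage has already succeeded.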
E4.3 — As an Onboarding Coordinator, I want real-time status updates on verifications I own, so that I don't have to poll.
- AC1: Verification detail view uses LiveView + PubSub; updates without refresh.
- AC2: Status badge (replicates VerificationStatusBadge): Classifying / Extracting / Validating / Face Verifying / Cross-Referencing / Completed / Failed / Needs Review.
- AC3: Each status change emits verification_stage_transitioned event.
E4.4 — As an Onboarding Coordinator, I want to see an aggregated Identity Checklist for a candidate, so that I know at a glance whether they meet identity requirements.
- AC1: Identity Checklist widget (replicates IdentityChecklistWidget).
- AC2: Auto-populated from verified documents per role mapping: commencement (birth cert, passport, citizenship, VEVO, ImmiCard), primary (driver's licence, overseas passport, proof-of-age), secondary (Medicare, bank, credit card, student ID, security licence), selfie.
- AC3: Completion percentage shown; incomplete roles highlighted.
- AC4: Listener Finnest.Verify.UpdateIdentityChecklist runs on each stage completion.
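Illustrative sketch for E4.4 AC2 and AC3: deriving completion percentage and missing roles from the set of verified document types. The atom names for document types and the rule that one verified document satisfies a role are assumptions for this example; the actual per-role requirements are defined by the ported IdentityChecklistWidget logic.

```elixir
defmodule IdentityChecklistSketch do
  # Role-to-document mapping from AC2; the atom spellings are hypothetical.
  @roles %{
    commencement: [:birth_certificate, :passport, :citizenship, :vevo, :immicard],
    primary: [:drivers_licence, :overseas_passport, :proof_of_age],
    secondary: [:medicare, :bank, :credit_card, :student_id, :security_licence],
    selfie: [:selfie]
  }

  # `verified_types` is a list of document-type atoms that passed verification.
  # Assumption: a role is satisfied by at least one verified document of that role.
  def summarise(verified_types) when is_list(verified_types) do
    satisfied =
      for {role, types} <- @roles, Enum.any?(types, &(&1 in verified_types)), do: role

    missing = Enum.sort(Map.keys(@roles) -- satisfied)

    %{percent: div(length(satisfied) * 100, map_size(@roles)), missing: missing}
  end
end
```

The `missing` list is what the widget would highlight as incomplete roles (AC3).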
E4.5 — As a Candidate, I want to receive an OTP via SMS or email when required, so that I can confirm my identity for sensitive operations.
- AC1: OTP generation via finnest_verify.OtpVerification; 6-digit code, 5-min expiry.
- AC2: Delivery via Finnest.Reach.SMS (SMSGlobal) or email (SES).
- AC3: Rate limits (ported from Laravel STORY-103):
  - Per candidate: max 3 attempts before 10-min lockout
  - Per IP: max 10 OTP requests per hour (SE-05)
  - Resend cooldown: 60 seconds between resends
  - Per-candidate send cap: 3 OTPs per hour (prevents enumeration spam)
- AC4: Lockout state emitted as otp_locked_out event; unlock after cooldown window elapses.
- AC5: All OTP events emit otp_sent / otp_verified / otp_failed / otp_locked_out.
- AC6: Dedicated OtpCodeEntry LiveView component (ported from the Laravel Livewire component) with countdown timer, paste-detection for 6-digit codes, and clear error states.
- AC7: OTP content logged by hash only (Commandment #24); code value never appears in logs.
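The per-candidate rules in AC1, AC3 and AC7 interact in ways that are easy to get wrong, so here is a hedged sketch in Python (a language-neutral illustration, not the Elixir module). The class and method names are hypothetical, the per-IP limit is left out, and one design choice is an assumption: failed attempts are not reset by a resend, so resending cannot be used to get extra guesses.

```python
import hashlib
import hmac
import secrets

CODE_TTL = 300         # 5-min expiry (AC1)
MAX_ATTEMPTS = 3       # failed attempts before lockout (AC3)
LOCKOUT = 600          # 10-min lockout (AC3)
RESEND_COOLDOWN = 60   # seconds between resends (AC3)
SEND_CAP = 3           # OTPs per candidate per hour (AC3)
SEND_WINDOW = 3600


def _hash(code):
    # Only the hash is stored or logged, never the code itself (AC7).
    return hashlib.sha256(code.encode()).hexdigest()


class OtpState:
    """Per-candidate OTP state; timestamps are seconds supplied by the caller."""

    def __init__(self):
        self.code_hash = None
        self.issued_at = 0.0
        self.failed = 0
        self.locked_until = 0.0
        self.sent_at = []

    def send(self, now):
        if now < self.locked_until:
            return ("error", "locked_out")
        recent = [t for t in self.sent_at if now - t < SEND_WINDOW]
        if recent and now - recent[-1] < RESEND_COOLDOWN:
            return ("error", "cooldown")
        if len(recent) >= SEND_CAP:
            return ("error", "send_cap")
        code = f"{secrets.randbelow(10**6):06d}"  # 6-digit code
        self.code_hash, self.issued_at = _hash(code), now
        self.sent_at = recent + [now]
        return ("ok", code)

    def verify(self, code, now):
        if now < self.locked_until:
            return "otp_locked_out"
        if self.code_hash is None or now - self.issued_at > CODE_TTL:
            return "otp_failed"
        if hmac.compare_digest(_hash(code), self.code_hash):
            self.code_hash, self.failed = None, 0
            return "otp_verified"
        self.failed += 1
        if self.failed >= MAX_ATTEMPTS:
            self.locked_until = now + LOCKOUT
            self.code_hash, self.failed = None, 0
            return "otp_locked_out"
        return "otp_failed"
```

The return values of `verify` reuse the event names from AC5, so each result maps directly onto the event to emit.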
E4.6 — As an Ops / Sales Enablement user and Candidate (demo), I want a complete demo/sandbox mode that covers Scout + Verify end-to-end with realistic data, so that the platform can be demonstrated to prospects and used for training without touching production data.
- AC1: Demo seeders (ported from Laravel STORY-131 through STORY-136): Finnest.Scout.DemoSeeder + Finnest.Verify.DemoSeeder populate a realistic ~5,000-candidate pool, 20–50 job orders across categories, pipeline in mixed states, pre-run Verify stages, billing snapshots, users with @demo.local email domain.
- AC2: Schema isolation — demo data lives under demo_* schema prefixes (or a public.demo_orgs flagged boundary); production-mode queries filter demo orgs out automatically.
- AC3: Verify public demo link (replicates VerifyDemoToken, ported as STORY-133): token-based public route /verify/demo/:token, single-use, 48-hour expiry, throttled per org per day, feature-flagged via verify_demo_enabled.
- AC4: Demo user provisioning — @demo.local users bypass M365 SSO, use simplified auth for presenter workflows; demo users cannot escalate to admin.
- AC5: Reset command (ported from STORY-134 equivalent): mix finnest.demo.reset wipes demo schema and re-seeds from deterministic seeds; safe to run in production because of schema isolation.
- AC6: Demo outreach sends go to a controlled inbox (not real candidates); controlled via outreach.observe_mode per E2.6a AC10.
- AC7: Demo mode access tagged in audit log — every action in demo space carries a demo: true marker so analysts can filter demo activity out of real metrics.
E4.7 — As an Ops SRE, I want per-org AI budget limits per stage with explicit alert thresholds, so that runaway costs are contained and operators are informed in time to act.
- AC1: AiBudgetLimit per (org, stage, provider) — daily, weekly, monthly (mirrors Laravel schema).
- AC2: Finnest.Verify.BudgetGuard GenServer checks before every AI call; rejects with :budget_exceeded if over limit.
- AC3: Alert thresholds (ported from Laravel STORY-148):
  - 50% of monthly limit → info log, no alert
  - 80% → warning log + email to admin + UI banner for org users
  - 95% → critical alert to Ops on-call (PagerDuty) + admin push
  - 100% → circuit breaker opens; stage switches to manual_review fallback; admin + Ops notified
- AC4: Budget resets per period boundary (daily at 00:00 AEST; weekly Monday 00:00; monthly 1st 00:00).
- AC5: All AI calls logged to agents.tool_audit with cost + tokens + provider + correlation_id.
- AC6: Budget change (admin-initiated) emits budget_updated event with old/new/justification; circuit breaker re-evaluates immediately.
- AC7: Circuit-open state visible as a warning on the AdminConsole → Product Access page until resolved.
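The threshold ladder in AC3 and the pre-call check in AC2 can be sketched as below, in Python as a language-neutral illustration rather than the BudgetGuard GenServer itself. The function names and level labels are hypothetical, and amounts are assumed to be integer cents so that the boundary comparisons are exact.

```python
def budget_level(spent_cents, limit_cents):
    """Map spend against a limit to the alert level from AC3.

    Integer cross-multiplication avoids float rounding at the
    50 / 80 / 95 / 100 percent boundaries.
    """
    if limit_cents <= 0:
        raise ValueError("limit must be positive")
    if spent_cents >= limit_cents:
        return "circuit_open"   # 100%: fall back to manual_review
    if spent_cents * 100 >= limit_cents * 95:
        return "critical"       # 95%: page Ops on-call
    if spent_cents * 100 >= limit_cents * 80:
        return "warning"        # 80%: email admin + UI banner
    if spent_cents * 100 >= limit_cents * 50:
        return "info"           # 50%: info log only
    return "ok"


def guard(spent_cents, limit_cents):
    """Check run before every AI call (AC2)."""
    level = budget_level(spent_cents, limit_cents)
    if level == "circuit_open":
        return ("error", "budget_exceeded")
    return ("ok", level)
```

With a monthly limit of $100.00, `guard(9_500, 10_000)` still allows the call but reports the critical level, while `guard(10_000, 10_000)` rejects it.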
E4.8 — As an Ops SRE, I want provider health monitoring with automatic failover along an explicit chain, so that one provider outage doesn't halt verification.
- AC1: Finnest.Verify.ProviderHealthCheck runs periodic pings (every 60 s) to each configured AI provider.
- AC2: Health status per provider persisted in AiProvider schema; failure streak ≥3 in 5 min triggers failover.
- AC3: Failover chain per stage (ported from Laravel STORY-149, matches AI-07):
  - Classification: Bedrock Sydney (Claude Haiku) → Vertex AU (Gemini Flash-Lite) → fail-to-manual-review
  - Extraction: Bedrock Sydney (Claude Sonnet) → Vertex AU (Gemini Flash) → fail-to-manual-review
  - Face verification: Bedrock Sydney (Claude Sonnet) → no automatic fallback → fail-to-manual-review (accuracy critical per ADR-0010)
  - Cross-reference: Bedrock Sydney (Claude Sonnet) → Vertex AU (Gemini Flash) → fail-to-manual-review
- AC4: Failover event emits provider_failover with reason + recovery plan; Ops alerted.
- AC5: After primary recovers and passes 5 sustained minutes of healthy pings, traffic shifts back gradually (50% for 5 min, then 100%).
- AC6: Provider health dashboard (in Admin Console) shows current/historical health per provider with failover events overlaid.
- AC7: Manual circuit-reset action available to Ops via admin CLI (mix finnest.verify.reset_circuit --provider=bedrock --stage=extraction).
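A compact sketch of the failover trigger (AC2) and the per-stage chain (AC3), in Python as a language-neutral illustration. The provider and stage keys are hypothetical identifiers, and "failure streak ≥3 in 5 min" is interpreted here as three or more consecutive failed pings, all within the last five minutes.

```python
MANUAL_REVIEW = "manual_review"
STREAK = 3     # consecutive failed pings that trigger failover
WINDOW = 300   # seconds (5 min)

# Chain per stage from AC3; face verification has no automatic fallback.
FAILOVER_CHAIN = {
    "classification": ["bedrock_sydney", "vertex_au"],
    "extraction": ["bedrock_sydney", "vertex_au"],
    "face_verification": ["bedrock_sydney"],
    "cross_reference": ["bedrock_sydney", "vertex_au"],
}


def should_fail_over(pings, now):
    """pings: list of (timestamp, ok) in chronological order.

    Counts the trailing run of failures that fall inside the window.
    """
    streak = 0
    for ts, ok in reversed(pings):
        if ok or now - ts > WINDOW:
            break
        streak += 1
    return streak >= STREAK


def pick_provider(stage, healthy):
    """Return the first healthy provider in the stage's chain,
    or manual review when the chain is exhausted."""
    for provider in FAILOVER_CHAIN[stage]:
        if healthy.get(provider, False):
            return provider
    return MANUAL_REVIEW
```

Because face verification lists only the primary provider, an unhealthy primary sends that stage straight to manual review even when the fallback provider is up.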
Epic 5 — Manual Review, Overrides & Audit Trails (6 stories)
Scope note: this epic now covers Scout audit trails in addition to Verify manual review, to give symmetric audit coverage across both domains.
E5.1 — As an Onboarding Coordinator, I want a manual review queue of escalated verifications, so that I can efficiently triage edge cases.
- AC1: Review queue view (replicates ReviewQueue) — list of verifications in needs_review state.
- AC2: Filterable by: age of escalation, stage that escalated, candidate name, org, reason.
- AC3: Sorted by SLA age by default (oldest first).
- AC4: Badge counter (replicates ReviewCounter) shows unread in navigation.
- AC5: Assignment: claim-based — clicking "Open" assigns to current user; visible to other staff as claimed.
E5.2 — As an Onboarding Coordinator, I want to approve, reject, or request re-upload of an escalated document, so that the candidate can progress or remediate.
- AC1: Detail view shows document, extracted fields, face comparison result, cross-reference findings, confidence scores.
- AC2: Actions: Approve / Reject / Request Re-upload / Ignore-for-now.
- AC3: Approval transitions verification to completed; emits verification_approved.
- AC4: Rejection requires reason (enum); emits verification_rejected; notifies candidate (if staff-driven reason warrants).
- AC5: Re-upload request emits reupload_requested; sends SMS/email with new secure upload link.
- AC6: Every action logged to verify_audit_log (replicates Laravel behaviour).
E5.3 — As an Onboarding Coordinator, I want to override a single extracted field with the correct value, so that OCR errors don't force an entire re-upload.
- AC1: Field override (replicates VerifiedFieldOverrideController + override-field action, ported from Laravel STORY-106).
- AC2: Override stores old + new value; writes audit row to verify_audit_log.
- AC3: Override requires justification note (free text ≥10 chars).
- AC4: Downstream stages (validation, cross-reference) re-run with overridden value.
- AC5: Overrides never modify events.domain_events rows (AR-17); recorded as separate field_overridden event.
- AC6: Non-repudiable audit trail: every override row captures — old_value (Cloak-encrypted), new_value (Cloak-encrypted), field_name, actor_user_id, actor_role, timestamp, justification, correlation_id, IP, user_agent. All fields mandatory.
- AC7: Field confidence badges (ported from STORY-106): verified fields display a confidence badge (high / medium / low) derived from extraction confidence score. Low-confidence fields auto-flagged for review queue. Badges visible to Coordinators but not Candidates.
- AC8: Overriding a high-confidence field requires a second-person confirmation (4-eyes principle) when value delta > configured threshold.
E5.4 — As an Onboarding Coordinator, I want to bulk-approve a list of verifications that share a common pattern (e.g. same office batch), so that repetitive queue work is fast.
- AC1: Bulk-select in review queue; max 50 per batch.
- AC2: "Bulk Approve" requires confirmation; justification note required (per-batch).
- AC3: Each verification still individually audited (one event per verification).
- AC4: Partial failure in batch reported per-row.
E5.5 — As an Ops Manager or auditor, I want an append-only audit log for every Verify action with explicit retention mechanics, so that IRAP compliance + disputes are traceable and retention policy is enforced not just documented.
- AC1: verify_audit_log schema (replicates Laravel VerifyAuditLog) captures: action, actor, target, old value, new value, correlation_id, timestamp.
- AC2: Append-only (no UPDATE/DELETE).
- AC3: Viewable by Ops Manager with filter by org, candidate, actor, action type.
- AC4: Exportable as CSV for compliance review.
- AC5: Retained per data retention matrix (90 days commercial, 7 years IRAP — see data.md).
- AC6: Document retention mechanics (ported from Laravel STORY-105):
  - verifiable_documents.retention_expires_at column set on upload based on org's retention policy
  - Nightly Oban worker deletes S3 blob and soft-deletes DB row after retention_expires_at passes
  - Face comparison data scrubbed separately (tighter retention than document — 30 days commercial / 1 year IRAP)
  - HMAC-SHA256 integrity hash stored on every document upload; verified on retrieval
  - Retention expiry events emitted so downstream systems (identity checklist) can re-evaluate
- AC7: Retention policy override per org (admin action, audit-logged, requires justification).
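The retention mechanics in AC6 combine a keyed integrity hash with an expiry timestamp. The sketch below shows both in Python as a language-neutral illustration; the function names are hypothetical, the retention period is passed in because the source leaves it to each org's policy, and key management is out of scope.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone


def integrity_hash(key: bytes, blob: bytes) -> str:
    """HMAC-SHA256 over the document bytes, stored at upload time."""
    return hmac.new(key, blob, hashlib.sha256).hexdigest()


def verify_integrity(key: bytes, blob: bytes, stored: str) -> bool:
    """Recompute on retrieval and compare in constant time."""
    return hmac.compare_digest(integrity_hash(key, blob), stored)


def retention_expires_at(uploaded_at: datetime, retention_days: int) -> datetime:
    """Value for the retention_expires_at column, set on upload."""
    return uploaded_at + timedelta(days=retention_days)


def is_expired(expires_at: datetime, now: datetime) -> bool:
    """Predicate the nightly worker would use to select rows for deletion."""
    return now >= expires_at
```

A keyed HMAC rather than a bare SHA-256 means someone who can alter the stored blob cannot also forge a matching hash without the key.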
E5.6 — As an Ops Manager or auditor, I want a corresponding audit log for Scout actions, so that placements, scoring overrides, job order transitions, and candidate profile edits are traceable with the same guarantees as Verify.
- AC1: scout_audit_log schema (symmetric with verify_audit_log, ported from Laravel STORY-130 governance work).
- AC2: Actions audited: placement confirmation, placement cancellation, scoring run, scoring override (manual band adjustment), job order transitions (draft/active/on_hold/closed/complete), candidate profile edits (name, email, phone, employment status), pool membership changes (add/remove/on-hold), outreach sends.
- AC3: Append-only (no UPDATE/DELETE); trigger-enforced.
- AC4: Each entry: action, actor_user_id, actor_role, target (job_order_id or candidate_id), old value (where applicable), new value, justification (for overrides), correlation_id, timestamp.
- AC5: Admin Console audit viewer (E1.6 AC6) queries both scout_audit_log and verify_audit_log; unified filter UI.
- AC6: Retention same as Verify audit (E5.5 AC5).
- AC7: Scoring override requires justification (≥10 chars); non-override scoring runs recorded with system actor.
Epic 6 — Scout & Verify Admin Central Sync (3 stories)
E6.1 — As an Ops SRE, I want Scout placements written back to v2 admin_central, so that v2 continues to function during Strangler Fig migration (ADR-010-F Phase 1).
- AC1: Finnest.Scout.AsgCentralSync writes to admin_central.candidates + candidates_additional metadata.
- AC2: Runs async via Oban worker; idempotent (QJ-01).
- AC3: Failure retries with exponential backoff; DLQ alert after 5 retries.
- AC4: Sync emits admin_central_sync_completed events.
- AC5: Write permissions to v2 admin_central are held by this sync service only; other v2 access is read-only (AR-05).
E6.2 — As an Ops SRE, I want Verify completed verifications written back to v2 admin_central, so that v2 onboarding workflows can proceed using Finnest-verified identity data.
- AC1: Finnest.Verify.SyncVerifiedDataToAdminCentral listener (replicates Laravel orchestration).
- AC2: Triggered on verification_completed events.
- AC3: Writes verified fields back to admin_central.candidates_qualification + related tables per existing Laravel mapping.
- AC4: Same retry + DLQ pattern as E6.1.
E6.3 — As a developer or auditor, I want the sync to tolerate v2 schema drift without silent data loss, so that unexpected v2 changes are caught before they cause downstream issues.
- AC1: Sync target columns validated against admin_central schema at startup; mismatch aborts sync with Ops alert.
- AC2: Unknown or extra fields logged with severity; sync continues with known fields.
- AC3: Every sync attempt recorded with input snapshot + result; queryable for forensic review.
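The two drift behaviours in E6.3 differ: a missing target column aborts the sync (AC1), while an unexpected payload field is logged and dropped (AC2). A minimal Python sketch of that split follows, as a language-neutral illustration with hypothetical function names.

```python
def validate_sync_target(expected_columns, actual_columns):
    """Startup check (AC1): every column the sync writes must exist in v2.

    Extra columns on the v2 side are not a mismatch for this check.
    """
    missing = sorted(set(expected_columns) - set(actual_columns))
    if missing:
        return ("abort", missing)
    return ("ok", [])


def split_payload(payload, known_columns):
    """Per-record handling (AC2): keep known fields, report the rest."""
    known_set = set(known_columns)
    known = {k: v for k, v in payload.items() if k in known_set}
    unknown = sorted(k for k in payload if k not in known_set)
    return known, unknown
```

The caller would alert Ops on an `"abort"` result and log the `unknown` list with a severity before writing the `known` fields.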
Epic 7 — AI Cost, Budget & Prompt Caching (4 stories)
E7.1 — As an Operations Manager, I want a real-time AI spend dashboard per org, so that I can see budget burn before it becomes a problem.
- AC1: Dashboard (Finnest-lean version of the Grafana panel planned in main architecture) shows: daily/weekly/monthly spend, 80% warning state, 100% circuit-breaker-open state, top callers by stage.
- AC2: Updates in real-time via PubSub on agents.tool_audit inserts.
- AC3: Export cost report (replicates Laravel CostReportService) as CSV.
- AC4: Per-provider and per-stage breakdown available.
E7.2 — As an Operations Manager, I want per-org AI budget limits that I can adjust, so that budget reflects the org's subscription tier and usage pattern.
- AC1: Budget CRUD per (org, stage, provider) — daily, weekly, monthly.
- AC2: Budget change emits budget_updated event; circuit breaker re-evaluates immediately.
- AC3: Changes require admin role + justification note (audit).
- AC4: Budget breach history visible as a separate timeline view.
E7.3 — As a developer, I want prompt caching correctly structured on every Anthropic call, so that we capture the 90% cache discount.
- AC1: All Claude API calls structure the request per AI-09 (new guardrail, locked in architecture Part 3 + invariant #15): system prompt (permanent cache) → MCP tool schemas (permanent cache) → org context (per-org cache) → session history + user query (uncached).
- AC2: Finnest.Agents.PromptCache GenServer aggregates cache_creation_input_tokens, cache_read_input_tokens metrics.
- AC3: Cache hit rate dashboard panel; alert if <50% for 1 hour (likely means someone broke prompt structure).
- AC4: Target hit rate: ≥70% on cacheable content.
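One way to turn the two token counters from AC2 into the hit rate used by AC3 and AC4 is sketched below in Python, as a language-neutral illustration. The formula is an assumption, since the source does not define "hit rate": cached reads divided by all cacheable tokens (reads plus cache writes). The function names and status labels are hypothetical.

```python
ALERT_BELOW = 0.50   # AC3: alert if the rate stays under this for 1 hour
TARGET = 0.70        # AC4: target on cacheable content


def cache_hit_rate(cache_creation_input_tokens, cache_read_input_tokens):
    """Share of cacheable tokens served from cache; None when there is no data."""
    total = cache_creation_input_tokens + cache_read_input_tokens
    if total == 0:
        return None
    return cache_read_input_tokens / total


def cache_status(cache_creation_input_tokens, cache_read_input_tokens):
    rate = cache_hit_rate(cache_creation_input_tokens, cache_read_input_tokens)
    if rate is None:
        return "no_data"
    if rate < ALERT_BELOW:
        return "alert"
    if rate < TARGET:
        return "below_target"
    return "on_target"
```

The one-hour condition in AC3 would be applied by the alerting layer over a rolling window of these counters; this sketch only classifies a single aggregate.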
E7.4 — As an Ops Manager, I want provider failover to happen transparently when Bedrock Sydney has an incident, so that verification doesn't halt.
- AC1: AiProvider behaviour (ADR-006-F) used for every AI call; circuit breaker per-provider.
- AC2: Failover chain for Verify: Bedrock Sydney → Vertex AU → manual review (AI-07).
- AC3: Failover is logged + emits event; Ops alerted.
- AC4: After primary recovers and passes health check sustained 5 min, traffic shifts back gradually.
Epic 8 — Agent Chat & Command Bar (6 stories — NEW features not in Laravel)
E8.1 — As any authenticated user, I want an agent chat on my home screen, so that I can ask natural-language questions about my work instead of navigating screens.
- AC1: Home screen has persistent chat input; chat history scroll.
- AC2: Chat session backed by agents.sessions; messages in agents.messages.
- AC3: Session persists on reconnect (<10 min); new session after that, with L2 tenant memory available.
- AC4: Streaming responses via Phoenix Channel (Part 6 decision) — agent:<session_id> topic.
- AC5: org_id injected by MCP framework (AI-03); user cannot supply it.
E8.2 — As a Recruitment Consultant, I want to ask "fill this job order" via agent chat, so that the agent proposes candidates with AMBER/GREEN confidence labels.
- AC1: Intent "fill / score / candidates for [job order]" routed to Tier-2 workflow agent.
- AC2: Agent invokes recruit_mcp.score_candidates + compliance_mcp.check_worker per candidate.
- AC3: Proposal returned with confidence bands (GREEN/AMBER/RED) + reasoning per candidate.
- AC4: User can "Accept all GREEN" with single click → creates placement proposals (still requires manual placement_confirmed per E2.10).
E8.3 — As an Onboarding Coordinator, I want to ask "status of John Smith's verification" via agent chat, so that I don't navigate to Verify queue for simple queries.
- AC1: Tier-1 pattern match routes to verify_mcp.get_verification_status (no LLM, $0 cost).
- AC2: Response includes current stage, age, next action, any blocking issues.
- AC3: Works across all verifications the user's role permits viewing.
E8.4 — As any authenticated user, I want a command bar (Cmd+K / Ctrl+K) accessible on every screen, so that I can navigate or act without using the sidebar.
- AC1: Command bar overlay on every authenticated screen.
- AC2: Supports: navigation ("go to pool"), queries ("show my expiring credentials"), actions ("approve all"), search ("John Smith").
- AC3: Keyboard navigable; Enter executes; Esc closes.
- AC4: Recent actions cached for quick access.
- AC5: Touch-accessible (44px targets) — FE-05.
E8.5 — As a Recruitment Consultant, I want conversational forms where the agent pre-fills and I review, so that bulk tasks are faster than typing every field.
- AC1: "Create job order for 3 forklift ops at Woolworths Minchinbury Monday 6am" → agent pre-fills JobOrderWizard.
- AC2: Pre-filled form shown for review; user edits or submits.
- AC3: Agent tool recruit_prepare_job_order (PROPOSE category) returns proposed fields.
- AC4: Submission goes through normal recruit.create_job_order context (gates, validation, events).
E8.6 — As any authenticated user, I want my agent interactions logged (tool calls, not message content), so that compliance + debugging are possible.
- AC1: Every tool call writes to agents.tool_audit with input_hash + output_hash (not plaintext — PII safe, per Commandment #24).
- AC2: Session messages stored in agents.messages — retention per data retention matrix.
- AC3: LLM calls log prompt_hash + response_hash + model + tokens + cache stats + cost.
- AC4: Correlation ID links every chain.
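AC1 requires that tool inputs and outputs reach agents.tool_audit only as hashes. A hedged Python sketch of such a row follows (a language-neutral illustration with hypothetical function and tool names). It assumes SHA-256 over canonical JSON so that key order does not change the hash.

```python
import hashlib
import json


def payload_hash(payload):
    """SHA-256 of a canonical JSON encoding (sorted keys, no extra whitespace)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_row(tool, tool_input, tool_output, correlation_id):
    """Row shape for the tool audit log: hashes only, never plaintext."""
    return {
        "tool": tool,
        "input_hash": payload_hash(tool_input),
        "output_hash": payload_hash(tool_output),
        "correlation_id": correlation_id,
    }
```

An unkeyed hash of a low-entropy value (a phone number, for instance) can be brute-forced, so a keyed HMAC may be the safer choice if inputs are that small; the source does not say which is used.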
Epic 9 — Security, Privacy & Compliance Hardening (5 stories)
Scope note: system-level hardening that applies across Scout, Verify, and shared infrastructure. Complements Epic 1.7 (user-facing consent/DSAR) with the infrastructure enforcement. Ports the security remediation work completed in Laravel (STORY-140 through STORY-144) plus privacy audit (STORY-007) and governance (STORY-130).
E9.1 — As a developer, I want IDOR (Insecure Direct Object Reference) protections enforced at the Phoenix controller / LiveView layer, so that a user cannot access another user's or org's resources by URL manipulation.
- AC1: Auth scoping plug (FinnestWeb.Plugs.ResourceScope) runs after Tenant plug and verifies that the :id param in the route resolves to a resource in the current org_id scope.
- AC2: Any resource lookup that would cross tenant boundary returns 404 (not 403) — don't leak existence of other-tenant resources (ported from Laravel STORY-140).
- AC3: Architecture test: every route with :id or :uuid parameter uses ResourceScope or an equivalent guard.
- AC4: Repo layer tenant enforcement (custom prepare_query per data.md) is the defence-in-depth second layer; if plug is bypassed, queries still scope by org_id.
- AC5: IDOR attempt (cross-tenant access) logged as a security event; >3 attempts by one user in 10 min triggers account lockout.
E9.2 — As a developer, I want rate limiting with explicit boundaries configured per route category, so that abuse and credential-stuffing are contained.
- AC1: Hammer-backed rate limiter (ported from Laravel STORY-142) configured per route category:
  - Public (no auth) — e.g. assessment form submission, demo link, webhook: 10/min per IP
  - Auth (login, OTP request, MFA verify): 5/min per IP + 10/hour per account
  - Authenticated API: 100/min per user
  - Agent chat endpoints: 30/min per session
  - Admin actions: 60/min per user
- AC2: Limit breach → 429 Retry-After with structured error body.
- AC3: Rate limit state per tenant — one org's usage doesn't throttle another.
- AC4: Admin override (emergency bypass for a single user/org) with 1-hour max duration, requires justification, audit-logged.
- AC5: Rate-limit metrics exported to Grafana dashboard.
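The per-category limits in AC1, the 429 behaviour in AC2, and the tenant isolation in AC3 can be sketched with a fixed-window counter. This is a Python illustration of one common approach and does not reproduce Hammer's API; the category keys and class name are hypothetical.

```python
import math

# (limit, window_seconds) per route category, from AC1.
LIMITS = {
    "public": (10, 60),
    "auth_ip": (5, 60),
    "auth_account": (10, 3600),
    "authenticated_api": (100, 60),
    "agent_chat": (30, 60),
    "admin": (60, 60),
}


class FixedWindowLimiter:
    def __init__(self):
        self.counts = {}

    def check(self, category, org_id, subject, now):
        """Return (status, retry_after_seconds).

        org_id is part of the key, so one org's traffic never
        consumes another org's allowance (AC3).
        """
        limit, window = LIMITS[category]
        bucket = int(now // window)
        key = (category, org_id, subject, bucket)
        count = self.counts.get(key, 0)
        if count >= limit:
            retry_after = math.ceil((bucket + 1) * window - now)
            return (429, retry_after)
        self.counts[key] = count + 1
        return (200, 0)
```

The second element of a 429 result is the value for the Retry-After header. A production limiter would also expire old buckets; this sketch keeps them for simplicity.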
E9.3 — As a developer, I want sensitive hidden form fields and request parameters encrypted or signed, so that hidden state cannot be tampered with client-side.
- AC1: Phoenix.Token signed payloads for any hidden state round-tripped through client forms (ported from Laravel STORY-144 hidden-field encryption work).
- AC2: Candidate assessment token + job-order context signed and verified on submission; tamper attempts rejected with 400.
- AC3: Verify demo tokens signed; altered tokens fail signature check.
- AC4: Any LiveView mount that carries sensitive server state uses signed assigns, not raw params.
E9.4 — As an Ops SRE, I want logs PII-scrubbed and structured, so that production debugging doesn't create a PII exposure surface.
- AC1: Logger filter (ported from Laravel STORY-143 PII remediation) strips: email addresses, phone numbers, full names, TFN, bank account numbers, passport/licence numbers, DOB, street addresses from log lines before they reach the logging backend.
- AC2: Log lines reference entities by UUID only (candidate_id=uuid-x, never candidate=John Smith). Commandment #24.
- AC3: Structured logs in JSON (logger_json) — every line has: timestamp, level, correlation_id, org_id (UUID), user_id (UUID or null), module, message, context (map, also scrubbed).
- AC4: Gitleaks check in CI (SE-12) extended with a regex ruleset that flags PII-looking values in log statements (email pattern, phone pattern, etc.).
- AC5: Sentry client configured with before_send hook that re-scrubs PII even if it leaked into an exception message.
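A minimal sketch of the scrubbing step in AC1 and the context-map scrubbing in AC3, in Python as a language-neutral illustration. It covers only email addresses and Australian-style phone numbers; both regular expressions are illustrative assumptions rather than the ported ruleset, and names, TFNs, and the other listed categories would need their own rules.

```python
import re

# Illustrative patterns only; a production ruleset needs far more coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"(?<![\w-])(?:\+?61|0)[2-478](?:[ -]?\d){8}(?![\w-])")


def scrub(text):
    """Replace PII-looking substrings before the line reaches the log backend."""
    text = EMAIL.sub("[email]", text)
    return PHONE.sub("[phone]", text)


def scrub_context(value):
    """Scrub every string inside a (possibly nested) context map."""
    if isinstance(value, str):
        return scrub(value)
    if isinstance(value, dict):
        return {k: scrub_context(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub_context(v) for v in value]
    return value
```

The lookarounds on the phone pattern stop it from matching digits inside a UUID, so the "log by ID only" identifiers from AC2 pass through untouched.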
E9.5 — As an Org Admin and Candidate, I want privacy consent and DSAR enforcement at the system level, so that the system structurally cannot process data without valid consent.
- AC1: Consent gate (system-level enforcement of E1.7 AC3): before any processing that touches personally-identifying data (outreach send, Verify stage kick-off, scoring that uses PII), a middleware checks consent_records for a valid non-revoked consent of the relevant purpose.
- AC2: Missing consent → processing short-circuits with a specific error; event emitted (consent_missing); ops alert if the pattern recurs.
- AC3: Consent revocation (E1.7 AC7) triggers a background job that: (a) stops any in-flight processing that relied on the revoked consent, (b) purges outreach queues for the candidate, (c) emits consent_revoked event.
- AC4: DSAR handler (E1.7 AC4) pulls data across all relevant schemas (people, recruit, onboard, roster, timekeep, reach — even if some are out-of-scope for go-live, the handler is designed to extend).
- AC5: DSAR deletion path enforces retention rules — data legally required to be retained (e.g. payroll records per ATO) is NOT deleted, only the user-visible personal data. Redaction approach documented with legal.
- AC6: Hash-chain verification (per architecture.md event-store hash chain) runs monthly and reports integrity state to Ops.
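The monthly check in AC6 amounts to walking the chain and recomputing each link. The sketch below shows the idea in Python as a language-neutral illustration; the actual chain format is defined in architecture.md and is not reproduced here, so the row shape, the genesis value, and the function names are all assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed previous-hash value for the first event


def event_hash(prev_hash, payload):
    """Hash of the previous link concatenated with the canonical payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain, payload):
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev_hash": prev, "hash": event_hash(prev, payload)})


def first_broken_index(chain):
    """Return the index of the first event that fails verification, or None."""
    prev = GENESIS
    for i, event in enumerate(chain):
        if event["prev_hash"] != prev or event["hash"] != event_hash(prev, event["payload"]):
            return i
        prev = event["hash"]
    return None
```

Reporting the first broken index rather than a plain pass/fail gives Ops a starting point for forensic review.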
Epic 10 — Operational Readiness & Observability (4 stories)
E10.1 — As an Ops SRE, I want /health (shallow liveness) and /ready (deep readiness) endpoints, so that deployment health checks and load-balancer probes are reliable.
- AC1: /api/v1/health — returns 200 if Phoenix is up. No DB check. <50 ms.
- AC2: /api/v1/ready — returns 200 only if DB connection + Oban + primary AI provider reachable. <200 ms.
- AC3: /ready returns 503 with structured error if any downstream is down.
- AC4: Load balancer uses /ready for deployment rollover; rolls back on failure (IN-08).
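The split between shallow liveness and deep readiness in AC1 to AC3 can be sketched as two handlers, shown in Python as a language-neutral illustration. The check names and response bodies are hypothetical, and a raised exception in a check is treated as that dependency being down.

```python
def health():
    """Shallow liveness: no downstream calls, so it stays fast (AC1)."""
    return 200, {"status": "ok"}


def ready(checks):
    """Deep readiness: every dependency check must pass (AC2).

    checks maps a dependency name to a zero-argument callable returning
    a boolean. Any failure yields 503 with a structured body (AC3).
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    if failed:
        return 503, {"status": "unavailable", "failed": failed}
    return 200, {"status": "ready"}
```

In practice each check would need its own short timeout so that one slow dependency cannot push the endpoint past the 200 ms target.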
E10.2 — As an Ops SRE, I want structured logs with correlation IDs and PII-free content, so that production debugging is possible without creating PII exposure.
- AC1: logger_json backend outputs JSON with: timestamp, level, module, correlation_id, org_id (not user name), log message.
- AC2: Log lines follow the "log by ID only" rule: a grep of Logger calls reveals no email/phone/name variables (OP-01; Commandment #24).
- AC3: Debug-level logs disabled in production (SE-06).
- AC4: Log retention per environment (90 days commercial; 7 years IRAP).
E10.3 — As an Ops SRE, I want OpenTelemetry spans + Sentry error tracking configured for Scout + Verify paths, so that incidents are detectable and traceable.
- AC1: Every Phoenix controller + LiveView mount + domain context function produces an OTel span (OP-03).
- AC2: Spans include org_id, user_id (hashed), correlation_id attributes.
- AC3: Sentry captures all uncaught exceptions with breadcrumbs; production-only (OP-04).
- AC4: Dashboards (Grafana) show: request rate, error rate, p95 latency per endpoint, per module.
E10.4 — As an Ops SRE, I want Oban queues visible + alertable, so that job backlogs don't go unnoticed.
- AC1: Oban Web dashboard deployed + accessible to Ops role (QJ-05).
- AC2: Per-queue alerts on queue depth + DLQ accumulation (QJ-03).
- AC3: Job failure rate > 5% triggers warning; > 20% triggers critical.
- AC4: Starting queues in go-live: scout_queue, verify_queue, reach_queue, default_queue.
6. Technical Constraints & Dependencies
- Delivered on Finnest/Elixir stack (ADR-001-F); all architectural invariants apply (see architecture.md § Architectural Invariants).
- External data reads via Finnest.V2Repo (MyXQL, read-only; AR-05, DA-01).
- Sensitive PII columns (TFN, bank, passport, MFA secret) encrypted with Cloak AES-256-GCM (SE-13).
- Every write goes through Repo.transaction/1 with event emission (architectural invariant #1).
- All cross-domain interaction via events (ADR-005-F); no direct imports across OTP apps (AR-07, enforced by Boundary library — AR-08).
- Tests: test-first (CQ-03); full suite passes before merge (CQ-02); ≥700 tests by end of Week 8 for combined Scout + Verify + infrastructure.
- Deployment: Kamal (commercial path; ADR-007-F's IRAP path not in this PRD scope).
- Secrets: Bitwarden Secrets Manager via bws CLI (carry-forward from AgenticAI-app); all credentials pulled at deploy time.
External dependencies (new or carried-over)
| Dependency | Purpose | Status at PRD time |
|---|---|---|
| Bedrock Sydney (AiProvider) | AI calls for Verify stages | Active; validated in Phase 0 |
| Vertex AU (AiProvider fallback) | Verify fallback only | Credentials provisioned Phase 0 |
| SMSGlobal API | Outreach SMS + OTPs | Active; credentials carry-over |
| AWS SES Sydney (CommEmail) | Outreach email + OTPs | Active; carry-over |
| Calendly API (Calendar port adapter) | Interview scheduling — port from Laravel CalendlyService + CalendlyApiService (STORY-014 completed). Laravel integration wraps booking creation, webhook ingestion (funnel monitoring), and link generation. Elixir adapter sits behind Finnest.Integration.Calendar.CalendlyAdapter per Calendar port. Deferred replacement to native scheduling (B12 H2) in Migration Phase 1 or 2. | Credentials carry-over; Elixir adapter built Week 7 of go-live |
| KeyPay API | Award interpretation (not in this PRD, but prerequisite for roster/payroll) | Validated in Phase 0 |
| AWS S3 ap-southeast-2 | Document storage | Active |
| admin_central MySQL (v2) | Read pool + write placements/verifications | Connection validated in Week 1 |
| admin_atslive MySQL (v2) | Candidate pool refresh | Connection validated in Week 1 |
| Bitwarden Secrets Manager | Deploy-time secret retrieval | Active in AgenticAI-app; carry-over |
| CrimeCheck API | Police / background check stub (Verify cross-reference stage) — Laravel StubCrimeCheck + live integration (STORY-107 completed). Port pattern: hexagonal adapter under Government port family. | Active; carry-over |
| VEVO (Visa Entitlement Verification Online) | Right-to-work verification (Verify cross-reference stage) — Laravel StubVevoCheck + live integration. Port pattern: hexagonal adapter under Government port family. | Active; carry-over |
7. Success & Cutover Criteria
Pre-cutover gate (end of Week 7)
All must be true before promoting to production:
- All user stories have passing acceptance criteria tests (≥700 tests green)
- mix credo --strict, mix dialyzer, mix sobelow all green (CQ-09, CQ-20, SE-11)
- mix boundary passes — no cross-domain violations (AR-08)
- Playwright E2E covers 5 critical journeys: login → dashboard, create job order → score, outreach campaign, document upload → verification completed, manual review approve
- Accessibility audit passed — axe-core CI run reports zero violations on WCAG 2.1 AA (FE-06, CQM-10); manual screen-reader pass (VoiceOver + TalkBack) against login → assessment → clock-in flow completes without blockers; keyboard-only navigation verified on critical journeys
- Load test (k6): 200 concurrent users, p95 <300 ms LiveView mount, <200 ms API (PF-01)
- AI budget circuit breaker verified under simulated cost spike
- Admin Central write-back verified against staging v2 copy
- Security review: pen test light (internal), no critical findings
- Backup + restore drill: full RDS snapshot → restore to staging — completes <4h
- Runbook for "Bedrock outage" + "admin_central connection failure" drafted
- Sentry + Grafana + Oban Web all wired and showing data from staging
Cutover procedure (Week 8)
- Freeze v2 for recruitment + verification workflows for the cutover window (48 hours).
- Final data sync — one-shot pull from admin_central + admin_atslive → Finnest.
- Validation — row-count parity check; statistical diff on monetary columns; sample 100 records for field-level equality.
- DNS cutover — app.agentic-ai.au now resolves to Finnest.
- Smoke tests — 10 user journeys run by team; every journey green before public release.
- Enable write-back — sync starts one-way: Finnest → admin_central.
- Monitor for 72 h — on-call coverage 24/7; Sentry alerts reviewed hourly.
- Decision gate (Week 8 + 72 h): Go / Rollback.
Rollback (Plan B — safety net)
- DNS reverts to Laravel AgenticAI-app (kept operational through Week 8 + 30 days).
- Finnest event store preserved for forensic review.
- Data diff generated (Finnest events vs v2 state) for reconciliation.
Go-live success declared when:
- All cutover gates passed.
- 30 days in production with no P0/P1 incidents.
- ASG signs off Plan A P1 Build milestone ($100K).
8. Out of Scope
Explicitly not in this PRD / go-live:
- All other Finnest modules — finnest_people (CMS), finnest_roster, finnest_timekeep, finnest_reach (beyond outreach send-only), finnest_payroll, finnest_clients (beyond basic sites), finnest_safety, finnest_assets, finnest_quotes, finnest_learn, finnest_benefits, finnest_fatigue, finnest_clearance, finnest_performance. Covered in later migration-phase PRDs.
- Flutter mobile app — Phase 2 (Migration Phase X). No mobile surface in this go-live.
- IRAP deployment — Phase 3 (Weeks 21+). Commercial deployment only.
- Native award engine — Phase 2 of KeyPay strategy (ADR-009-F). KeyPay handles all awards for now.
- Full drip campaigns (B12 M6) — single-send outreach only in go-live; drip deferred to Migration Phase 1.
- Bias detection / anonymised screening (B12 M4) — deferred.
- Candidate rediscovery (B12 M5) — deferred.
- Job board multi-posting (B12 H3) — deferred to Migration Phase 1 (post-go-live).
- Interview scheduling with calendar sync (B12 H2) — Calendly carry-over for go-live; native scheduling Phase 2.
- DVS integration (B12 C3) — deferred to Migration Phase 2 (Onboarding).
- Liveness detection (B12 C4) — deferred to Migration Phase 2.
- PEP / sanctions screening (B12 M8) — deferred.
- Ongoing monitoring / re-screening (B12 H9) — deferred.
- Custom form builder (B12 M14) — deferred.
- Typeform migration from candidates forms — Typeform dependency dropped entirely; native assessment system is the replacement (no sync-from-Typeform scripts ported).
- Calendly replacement — Calendly remains integrated for go-live; native scheduling replaces it in Migration Phase 1 or 2.
- Guided tours / welcome walkthroughs / embedded HTML user guide (Laravel STORY-022/023/024/025/026 completed but not ported). Per brainstorm-09 UI vision, the AI agent acts as contextual tutor (agent-as-tutor pattern) — the UI IS the training. Maintaining both separate tours and an agent chat is redundant (Commandment #4 YAGNI). If post-launch data shows agent chat isn't sufficient for onboarding new users, tours can be re-introduced in Migration Phase 1 or 2.
9. Risks & Mitigations
| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Elixir production deployment hits unknown-unknowns | High | Medium | Week 2 decision gate; Laravel as Plan B safety net |
| External MySQL (admin_central) connection unstable | High | Low | Validate Week 1; dual-connection pool with Postgres primary; read-only fallback to cached snapshot if outage |
| AI provider (Bedrock) outage during go-live | Medium | Medium | Failover to Vertex AU (ADR-0010, AI-07); manual review fallback for critical stages |
| AI costs exceed budget during go-live bake | Medium | Low | Per-org budget circuit breaker (AI-08); prompt caching enforced (AI-09); dry-run scoring on scale tests |
| Admin Central write-back corrupts v2 data | Very High | Low | Write-back is additive only (never UPDATE, never DELETE v2 rows Finnest didn't create); dry-run on staging; audit all writes |
| Scout ported features lose behavioural parity with Laravel | High | Medium | Use Laravel's 84 feature tests as functional spec; write Elixir equivalent tests; side-by-side comparison during staging |
| Agent chat is too unreliable for production use | Medium | Medium | Fallback to navigation on every screen (D13 three-paths); don't gate any workflow behind the agent |
| Candidate assessment token abuse / enumeration | Medium | Low | Rate limit per IP + per token (SE-05); signed tokens; single-use enforcement |
| IRAP scope creep into go-live (someone adds IRAP features now) | High | Medium | Out-of-scope explicit; feature flags default off; IR-* guardrails only active in :irap env |
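
The "signed tokens; single-use enforcement" mitigation for candidate assessment tokens can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the Finnest implementation. Every name here (`SECRET_KEY`, `issue_token`, `redeem_token`, the in-memory `_redeemed` set) is hypothetical, and the per-IP and per-token rate limiting required by SE-05 is deliberately left out.

```python
import hashlib
import hmac
import secrets

# Hypothetical signing key; a real deployment would load this from a secrets store.
SECRET_KEY = secrets.token_bytes(32)

# In-memory stand-in for a persistent "already redeemed" store (assumption).
_redeemed: set[str] = set()


def _sign(payload: str) -> str:
    """Return the hex HMAC-SHA256 signature of the payload."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(assessment_id: str) -> str:
    """Build '<assessment_id>.<nonce>.<signature>' for a candidate link."""
    nonce = secrets.token_urlsafe(16)  # random, so tokens cannot be enumerated
    payload = f"{assessment_id}.{nonce}"
    return f"{payload}.{_sign(payload)}"


def redeem_token(token: str) -> bool:
    """Accept a token once; reject malformed, forged, or reused tokens."""
    try:
        payload, signature = token.rsplit(".", 1)
    except ValueError:
        return False  # no '.' separator, so not a token we issued
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return False
    if token in _redeemed:
        return False  # single-use: a second redemption is refused
    _redeemed.add(token)
    return True
```

In this sketch a forged or guessed token fails the signature check before any lookup happens, and a valid token is refused on its second use. A production version would need a shared, durable store for redeemed tokens and an expiry on each token, neither of which is specified in this section.
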
10. Open Questions
| # | Question | Owner | Decision needed by |
|---|---|---|---|
| PRD-SV-01 | Typeform SyncTypeformQuestions / SyncTypeformResponses: do any ASG orgs still require Typeform compatibility, or can we cleanly drop? | Product + ASG | Week 2 |
| PRD-SV-02 | DVS gateway provider (greenID / Equifax / Sterling) — is it ordered for go-live, or deferred? | Product | Week 3 (Phase 0 end) |
| PRD-SV-03 | Calendly carry-over — keep exact current integration or simplify for go-live? | Product + Ops | Week 3 |
| PRD-SV-04 | Admin Central sync: do we need to backfill placements made during Weeks 1–8 back to v2, or is sync only for go-forward? | Product + v2 Ops | Week 5 |
| PRD-SV-05 | FIDO2 hardware-key procurement for Ops and Operations Manager personas before go-live? | Ops | Week 6 |
| PRD-SV-06 | Who signs off on the Week 8+72h go/no-go decision? (probably Gautham + ASG stakeholder — confirm) | Commercial | Week 7 |
| PRD-SV-07 | Sentry SaaS vs self-hosted for commercial? (main architecture says Sentry SaaS for commercial; confirm) | Ops | Week 4 |
11. References
- ../architecture/architecture.md: main architecture
- ../architecture/agents.md: agent infrastructure (informs Epics 8, 7)
- ../architecture/data.md: data model (all schemas referenced in epics)
- ../42-COMMANDMENTS.md: philosophy
- ../10-GUARDRAILS.md: enforcement rules
- ../brainstorms/brainstorm-03-ai-agent-design.md: Verify 4-agent pipeline origin
- ../brainstorms/brainstorm-11-traffio-laravel-migration-naming.md, Topic 2: go-live approach decision
- ../brainstorms/brainstorm-12-competitor-feature-audit.md: gap features referenced (Scout H2/H3/M4/M5/M6, Verify C3/C4/M7/M8)
- ../adrs/adr-001-F-elixir-phoenix-primary-stack.md
- ../adrs/adr-004-F-mcp-at-every-domain-boundary.md
- ../adrs/adr-005-F-event-driven-cross-domain-communication.md
- ../adrs/adr-010-F-strangler-fig-migration-from-v2.md
- Laravel source (de-facto spec): AgenticAI-app/app/Scout/*, app/Verify/*, tests/Feature/Scout/*, tests/Feature/Verify/*
Revision History
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2026-04-16 | Gautham Chellappa | Initial PRD for Scout + Verify go-live; Finnest-lean format; generated via BMAD /prd (informed by /architecture output) |
| 1.1 | 2026-04-16 | Gautham Chellappa | Post cross-check corrections: (a) Typeform explicitly dropped from stack (native AssessmentForm is sole assessment; Typeform migration utility for historical reference only), E2.5 AC7; (b) Calendly adapter added to external dependencies table (§6), ported from Laravel CalendlyService + CalendlyApiService, STORY-014; (c) Demo-mode seeders added to E3.4 AC4, ported from STORY-132 / STORY-133; (d) CrimeCheck + VEVO integrations added to dependencies table, ported from STORY-107. Laravel AgenticAI-app confirmed ~91% complete (154/169 scheduled stories); all Scout + Verify functional work shipped; only ops/infra stories (APM, CSP hardening, log aggregation, SAST, uptime monitoring, etc.) remain not_started. |
| 1.2 | 2026-04-17 | Gautham Chellappa | Deep cross-check additions (46 omissions from 154 completed Laravel stories identified, 18 applied). Must-add: M365 SSO added to E1.3 AC6 (STORY-022); per-office product access expanded in E1.4 + E1.6 (STORY-109); new E1.6 Admin Console unifying org/user/office/product/flags/audit UI (STORY-028/070/071/088/089/130); new E1.7 Legal & Privacy covering ToS, Privacy Policy, consent collection, DSAR (STORY-007/151/152/153); Verify pipeline concurrency locking added to E4.2 AC7-8 (STORY-147); OTP rate-limit specifics expanded in E4.5 AC3-7 (STORY-103); document retention mechanics added to E5.5 AC6-7 (STORY-105); new E5.6 Scout audit trails, symmetric with Verify (STORY-130); new Epic 9 Security/Privacy hardening covering IDOR, rate limiting, hidden-field encryption, log PII scrubbing, consent gate (STORY-140/142/143/144 + systemic privacy enforcement). Should-add: E2.6 split into E2.6a (SMS/email with templating + observe mode, STORY-016/038/040/041) and E2.6b (Calendly-backed interview scheduling, STORY-014); new E2.12 Pool Analytics with on-hold filter, waterfall, selection badges, bucket breakdown (STORY-078/079/080/081); E4.6 expanded from "Verify demo link only" to full Demo & Sandbox Mode across Scout + Verify (STORY-131–136); E4.7 AI budget alert thresholds fully specified (STORY-148); E4.8 provider failover chain fully specified per stage (STORY-149); E5.3 field override audit detail + confidence badges expanded (STORY-106). Deferred: Guided tours (STORY-022/023/024/025/026) NOT ported; agent-as-tutor per brainstorm-09 UI vision replaces them (Commandment #4). Epic renumbering: new Security/Privacy epic slots as Epic 9; existing Operational Readiness renumbered to Epic 10. |
| 1.3 | 2026-04-17 | Gautham Chellappa | Open-issue re-evaluation. Applied 5 of 8 items left open after the deep cross-check: (a) Office Import command added to E1.6 AC3 Admin Console (ported from Laravel STORY-346); (b) Accessibility audit added as explicit cutover gate in §7: axe-core CI run + manual screen-reader pass + keyboard-only navigation verified; (c) Observe mode explained fully in E2.6a AC10: use cases, admin toggle, observe log surface; (d) Template admin UI added as E1.6 AC9 (Templates panel: CRUD, versioning, live preview, rollback, test-send) and AC10 (Observe Mode log panel); (e) User deletion cascade mechanics fully specified in E1.7 AC8: anonymisation not hard-delete, PII purge rules per schema family, event-store immutability preserved, legally-retained records flagged, downstream cascade to Admin Central within 24h. Declined (tracked as future-evaluation): unified Ops Manager dashboard (covered by agent chat + E7.1 + domain dashboards); data quality CLI (covered by Tier-3 Data Quality Agent); new Epic 11 Data Import (concerns distributed across E1.6, E2.4, E3.4, E10 rather than grouped). |