Brainstorm Session 4: Data Model & Migration Strategy¶
Date: 2026-04-15 Objective: Design data model for 18 modules on PostgreSQL + migration from ASG Central v2 Depends on: Sessions 1-3 (Elixir, 20 OTP apps, agent system)
Techniques Used¶
- Starbursting — Who/what/where/when/why/how for every data decision
- SWOT Analysis — Assessment of migration strategies
- Reverse Brainstorming — "How would migration guarantee data loss?"
Data Model Overview¶
Schema Structure (19 domain schemas + infrastructure)¶
| Schema | Domain | Est. Tables | Key Entities |
|---|---|---|---|
| public | Core foundation | ~10 | organisations, users, offices, feature_flags, audit_log, notifications, tags |
| people | HR Core | ~10 | employees, employee_qualifications, leave_requests, leave_balances, contracts, unavailability |
| recruit | Recruitment | ~12 | job_orders, candidate_pools, candidate_assessments, scoring_results, outreach_log, talent_pools, drip_campaigns |
| onboard | Onboarding | ~14 | onboarding_profiles, document_verifications, verifiable_documents, verification_stages, extracted_fields, identity_checklists, tfn_declarations, super_choices, bank_details |
| roster | Rostering | ~12 | shifts, shift_templates, shift_assignments, shift_codes, roster_rules, availability, demand_forecasts, shift_swaps, shift_bids |
| timekeep | Time & Attendance | ~10 | timecards, timecard_entries, timecard_allowances, clock_events, terminals, geo_locations, breaks, journals |
| reach | Communication | ~8 | conversations, messages, message_deliveries, channels, templates, handoff_records, news_feed, announcements |
| pulse | Operations Automation | ~7 | automation_runs, automation_results, optimisation_runs, anomaly_detections, predictive_models, dashboards |
| payroll | Payroll & Invoicing | ~15 | pay_runs, pay_lines, tax_calculations, super_calculations, deductions, stp_submissions, invoices, payments, rate_cards |
| clients | Client Management | ~10 | clients, contacts, sites, agencies, rate_cards, rate_configurations, job_costings, client_hierarchies, opportunities |
| safety | Safety Management | ~12 | incidents, investigations, hazards, risk_assessments, inspections, swms_documents, corrective_actions, drug_alcohol_tests, custom_forms |
| assets | Asset Management | ~10 | equipment, equipment_assignments, maintenance_schedules, work_orders, fleet_vehicles, prestart_checklists |
| quotes | Quote Management | ~8 | leads, quotes, jobs, tasks, milestones, agreements, signatures |
| learn | LMS | ~8 | courses, tutors, enrollments, progress, certifications, learning_paths, reinduction_schedules |
| benefits | eBenefits | ~6 | programs, enrollments, perks, recognitions, rewards_points, ewa_transactions |
| compliance | Compliance Engine | ~12 | credential_types, industry_profiles, awards, award_classifications, award_rates, compliance_rules, ongoing_monitors, compliance_scores |
| fatigue | Fatigue Management | ~6 | fatigue_rules, driver_hours, rest_periods, ewd_entries, fitness_assessments |
| clearance | Security Clearance | ~5 | clearance_applications, clearance_statuses, vetting_records, program_separations |
| performance | Performance Management (NEW) | ~10 | goals, reviews, feedback, surveys, enps_scores, calibrations, development_plans, skills_matrix |
| events | Event Store | 1 | domain_events (append-only, partitioned by month) |
| agents | Agent Infrastructure | ~4 | agent_sessions, agent_messages, agent_memories, ai_budget_limits |
| Total | ~190 |
Design Principles¶
- Every table has org_id — multi-tenancy enforced at data level
- Every table has inserted_at, updated_at — standard timestamps
- Soft deletes via deleted_at — compliance requirement (never truly delete)
- UUIDs for all primary keys — no auto-increment (distributed-ready)
- JSONB for extensible fields — industry-specific data as metadata columns
- Typed Ecto embeds for structured JSON — validated at application level
- No cross-schema JOINs for operational queries — events for cross-domain
- Reference data readable cross-schema — credential_types, awards, industry_profiles
Core Entity: public.organisations¶
CREATE TABLE public.organisations (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR NOT NULL,
slug VARCHAR UNIQUE NOT NULL,
abn VARCHAR, -- Australian Business Number
industry_profile_id UUID, -- → compliance.industry_profiles
subscription_tier VARCHAR DEFAULT 'tier1', -- pricing tier
settings JSONB DEFAULT '{}', -- org-specific config
irap_classified BOOLEAN DEFAULT false, -- IRAP deployment flag
inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
deleted_at TIMESTAMPTZ -- soft delete
);
Core Entity: people.employees¶
CREATE TABLE people.employees (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
org_id UUID NOT NULL REFERENCES public.organisations(id),
user_id UUID REFERENCES public.users(id), -- nullable (not all employees are users)
employee_number VARCHAR,
first_name VARCHAR NOT NULL,
last_name VARCHAR NOT NULL,
preferred_name VARCHAR,
date_of_birth DATE,
gender VARCHAR,
email VARCHAR,
phone VARCHAR,
address JSONB,
emergency_contact JSONB,
tax_file_number_encrypted BYTEA, -- AES-256 encrypted
employment_type VARCHAR NOT NULL, -- casual/part_time/full_time/contractor
employment_status VARCHAR NOT NULL DEFAULT 'onboarding',
start_date DATE,
termination_date DATE,
primary_office_id UUID REFERENCES public.offices(id),
primary_client_id UUID, -- → clients.clients (cross-schema ref)
metadata JSONB DEFAULT '{}', -- industry-specific extended fields
inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
deleted_at TIMESTAMPTZ
);
CREATE INDEX idx_employees_org ON people.employees(org_id);
CREATE INDEX idx_employees_status ON people.employees(org_id, employment_status);
CREATE INDEX idx_employees_deleted ON people.employees(deleted_at) WHERE deleted_at IS NULL;
Event Store¶
CREATE TABLE events.domain_events (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
domain VARCHAR(50) NOT NULL,
event_type VARCHAR(100) NOT NULL,
aggregate_id UUID,
org_id UUID NOT NULL,
payload JSONB NOT NULL,
metadata JSONB DEFAULT '{}', -- correlation_id, causation_id, actor, agent_id
inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (inserted_at);
-- Monthly partitions
CREATE TABLE events.domain_events_2026_04 PARTITION OF events.domain_events
FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');
-- Indexes
CREATE INDEX idx_events_org_domain ON events.domain_events(org_id, domain, event_type);
CREATE INDEX idx_events_aggregate ON events.domain_events(aggregate_id);
CREATE INDEX idx_events_correlation ON events.domain_events((metadata->>'correlation_id'));
-- Immutability trigger
CREATE OR REPLACE FUNCTION events.prevent_modification()
RETURNS TRIGGER AS $$
BEGIN
RAISE EXCEPTION 'Events are immutable. UPDATE and DELETE are not allowed.';
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER no_modify_events
BEFORE UPDATE OR DELETE ON events.domain_events
FOR EACH ROW EXECUTE FUNCTION events.prevent_modification();
v2 → Finnest Table Mapping¶
Complete Cross-Reference¶
| v2 Table | v2 DB | Finnest Schema.Table | Transform Notes |
|---|---|---|---|
| candidates | admin_central | people.employees | INT→UUID, split address to JSONB, map status codes |
| candidates_additional | admin_central | people.employees.metadata | Merge into JSONB metadata column |
| candidates_comments | admin_central | people.employee_notes | New table |
| candidates_referencecheck | admin_central | recruit.reference_checks | |
| candidates_qualification | admin_central | people.employee_qualifications | Map to credential_types |
| admin_users | admin_central | public.users (role: admin) | Merge 3 user tables into one |
| consultants | admin_central | public.users (role: staff) | Merge |
| payroll_staffs | admin_central | public.users (role: payroll) | Merge |
| roles | admin_central | public.roles | |
| office_locations | admin_central | public.offices | |
| clients | admin_central | clients.clients | |
| client_contacts | admin_central | clients.contacts | |
| site | admin_central | clients.sites | |
| agency | admin_central | clients.agencies | |
| joborders | admin_central | roster.shifts | Significant restructure |
| roster_codes | admin_central | roster.shift_codes | |
| roster_rule_set | admin_central | roster.roster_rules | |
| timesheets | admin_central | timekeep.timecards | |
| timesheets_allowances | admin_central | timekeep.timecard_allowances | |
| pay_details | admin_central | payroll.pay_details | |
| pay_levels | admin_central | payroll.pay_levels | |
| allowances | admin_central | payroll.allowances | |
| super_fund_details | admin_central | payroll.super_funds | |
| leave_request | admin_central | people.leave_requests | |
| unavailability | admin_central | people.unavailability | |
| onboarding_profile | admin_central | onboard.onboarding_profiles | |
| candidate_compliance | admin_central | compliance.compliance_status | |
| functional_assessment | admin_central | safety.assessments | |
| clients_appraisal | admin_central | people.appraisals | |
| contract_details | admin_central | people.contracts | |
| clientmessaging_* | admin_central | reach.conversations + messages | |
| message_log | admin_central | reach.message_log | |
| Clock data | actatek | timekeep.clock_events | Timezone conversion critical |
| Applicant data | admin_atslive | recruit.candidate_pools |
Migration Strategy: Strangler Fig (Domain-by-Domain)¶
Phase 1: Foundation + Reference Data¶
- Migrate: organisations, offices, users (merge 3 tables), tags, skills
- Create: compliance.credential_types, compliance.industry_profiles
- Create: id_mappings table
- Duration: 1 week
- Risk: Low
Phase 2: People Domain¶
- Migrate: employees, qualifications, leave, contracts, appraisals
- Validate: employee count matches, qualification dates correct
- v2 remains source of truth for people — Finnest is read-copy
- Duration: 2 weeks (including data quality audit)
- Risk: Low-Medium
Phase 3: Clients Domain¶
- Migrate: clients, contacts, sites, agencies
- Create: rate_cards, job_costings (new)
- Duration: 1 week
- Risk: Low
Phase 4: Roster Domain¶
- Migrate: joborders → shifts, roster_codes, roster_rules
- Significant transformation (v2 structure → Finnest structure)
- Duration: 2 weeks
- Risk: Medium
Phase 5: Timekeep Domain¶
- Migrate: timesheets → timecards, clock events from actatek
- Timezone conversion validation critical
- Tiered: hot (6 months), warm (6mo-2yr), cold (2yr+)
- Duration: 2 weeks
- Risk: Medium
Phase 6: Messaging & Historical¶
- Migrate: clientmessaging → reach.conversations
- Archive: old message history, reports
- Duration: 1 week
- Risk: Low
Phase 7: Payroll Domain (LAST)¶
- Migrate: pay_details, pay_levels, allowances, super_funds
- Must align with pay cycle boundary
- Human validation required
- Duration: 2 weeks + validation period
- Risk: High (financial data)
Phase 8: Cutover + Decommission¶
- v2 set to read-only
- 3-month parallel access period
- Decommission after no v2 reads for 30 days
Total Timeline: ~12-16 weeks (parallel with platform development)¶
Data Quality Audit (Pre-Migration)¶
Before any migration, audit v2 data for:
| Check | Query | Action if Found |
|---|---|---|
| Orphaned employees | Employees with no org | Assign to default org or skip |
| Duplicate employees | Same name + DOB + email | Merge with human review |
| Invalid status | Active + terminated_date set | Fix status or clear date |
| Missing required fields | Employee with no name | Flag for human completion |
| Orphaned timesheets | Timesheets with no employee | Log and skip |
| Stale data | Employees not accessed in 3+ years | Migrate to cold archive |
| Encoding issues | Non-UTF8 characters | Convert or sanitize |
Budget: 1-2 weeks before migration starts.
Key Insights¶
Insight 1: Strangler Fig Aligns with Commercial Phasing¶
Plan A phases (P1→P2→P3) and migration phases can be the same. Each delivery migrates its domains. No separate "migration project." Impact: High | Effort: Medium
Insight 2: ID Mapping Table Is the Migration Backbone¶
Every v2 INT ID → Finnest UUID. Every script resolves through this mapping. Build first, decommission last. Impact: High | Effort: Low
Insight 3: Pre-Migration Data Quality Audit Prevents 80% of Bugs¶
v2 has no FK constraints, no tests, 10+ years of data. Orphans and inconsistencies are guaranteed. Impact: High | Effort: Medium (1-2 weeks)
Insight 4: Tiered Migration (Hot/Warm/Cold) Solves Volume¶
Last 6 months migrated in cutover window. 6mo-2yr in background. 2yr+ to archive. Multi-day becomes multi-hour. Impact: High | Effort: Low
Insight 5: Compliance Schema Is the Multi-Industry Foundation¶
credential_types, industry_profiles, awards are referenced by 8+ other schemas. Design first. Impact: High | Effort: Medium
Insight 6: New Clients Need No Migration — Just Import¶
Data import wizard (CSV/Excel) is both a migration tool and a sales tool. Design from day one. Impact: Medium | Effort: Low
Statistics¶
- Total ideas: 30+
- Categories: 4 (Data Model, Migration Strategy, Data Quality, Cross-Reference)
- Key insights: 6
- Techniques applied: 3
- Estimated tables: ~150 across 18 schemas
- Migration phases: 8
- Migration timeline: 12-16 weeks
Recommended Next Steps¶
→ Session 5: Multi-Industry Design (compliance schema design, industry profiles, credential registry) → Design compliance.credential_types and compliance.industry_profiles schemas in detail → Run data quality audit on v2 admin_central database → Build id_mappings prototype
Generated by BMAD Method v6 - Creative Intelligence