Brainstorm Session 4: Data Model & Migration Strategy¶

Date: 2026-04-15 Objective: Design data model for 18 modules on PostgreSQL + migration from ASG Central v2 Depends on: Sessions 1-3 (Elixir, 20 OTP apps, agent system)

Techniques Used¶

Starbursting — Who/what/where/when/why/how for every data decision
SWOT Analysis — Assessment of migration strategies
Reverse Brainstorming — "How would migration guarantee data loss?"

Data Model Overview¶

Schema Structure (19 domain schemas + infrastructure)¶

Schema	Domain	Est. Tables	Key Entities
public	Core foundation	~10	organisations, users, offices, feature_flags, audit_log, notifications, tags
people	HR Core	~10	employees, employee_qualifications, leave_requests, leave_balances, contracts, unavailability
recruit	Recruitment	~12	job_orders, candidate_pools, candidate_assessments, scoring_results, outreach_log, talent_pools, drip_campaigns
onboard	Onboarding	~14	onboarding_profiles, document_verifications, verifiable_documents, verification_stages, extracted_fields, identity_checklists, tfn_declarations, super_choices, bank_details
roster	Rostering	~12	shifts, shift_templates, shift_assignments, shift_codes, roster_rules, availability, demand_forecasts, shift_swaps, shift_bids
timekeep	Time & Attendance	~10	timecards, timecard_entries, timecard_allowances, clock_events, terminals, geo_locations, breaks, journals
reach	Communication	~8	conversations, messages, message_deliveries, channels, templates, handoff_records, news_feed, announcements
pulse	Operations Automation	~7	automation_runs, automation_results, optimisation_runs, anomaly_detections, predictive_models, dashboards
payroll	Payroll & Invoicing	~15	pay_runs, pay_lines, tax_calculations, super_calculations, deductions, stp_submissions, invoices, payments, rate_cards
clients	Client Management	~10	clients, contacts, sites, agencies, rate_cards, rate_configurations, job_costings, client_hierarchies, opportunities
safety	Safety Management	~12	incidents, investigations, hazards, risk_assessments, inspections, swms_documents, corrective_actions, drug_alcohol_tests, custom_forms
assets	Asset Management	~10	equipment, equipment_assignments, maintenance_schedules, work_orders, fleet_vehicles, prestart_checklists
quotes	Quote Management	~8	leads, quotes, jobs, tasks, milestones, agreements, signatures
learn	LMS	~8	courses, tutors, enrollments, progress, certifications, learning_paths, reinduction_schedules
benefits	eBenefits	~6	programs, enrollments, perks, recognitions, rewards_points, ewa_transactions
compliance	Compliance Engine	~12	credential_types, industry_profiles, awards, award_classifications, award_rates, compliance_rules, ongoing_monitors, compliance_scores
fatigue	Fatigue Management	~6	fatigue_rules, driver_hours, rest_periods, ewd_entries, fitness_assessments
clearance	Security Clearance	~5	clearance_applications, clearance_statuses, vetting_records, program_separations
performance	Performance Management (NEW)	~10	goals, reviews, feedback, surveys, enps_scores, calibrations, development_plans, skills_matrix
events	Event Store	1	domain_events (append-only, partitioned by month)
agents	Agent Infrastructure	~4	agent_sessions, agent_messages, agent_memories, ai_budget_limits
Total		~190

Design Principles¶

Every table has org_id — multi-tenancy enforced at data level
Every table has inserted_at, updated_at — standard timestamps
Soft deletes via deleted_at — compliance requirement (never truly delete)
UUIDs for all primary keys — no auto-increment (distributed-ready)
JSONB for extensible fields — industry-specific data as metadata columns
Typed Ecto embeds for structured JSON — validated at application level
No cross-schema JOINs for operational queries — events for cross-domain
Reference data readable cross-schema — credential_types, awards, industry_profiles

Core Entity: public.organisations¶

CREATE TABLE public.organisations (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name VARCHAR NOT NULL,
  slug VARCHAR UNIQUE NOT NULL,
  abn VARCHAR,                              -- Australian Business Number
  industry_profile_id UUID,                 -- → compliance.industry_profiles
  subscription_tier VARCHAR DEFAULT 'tier1', -- pricing tier
  settings JSONB DEFAULT '{}',              -- org-specific config
  irap_classified BOOLEAN DEFAULT false,    -- IRAP deployment flag
  inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  deleted_at TIMESTAMPTZ                    -- soft delete
);

Core Entity: people.employees¶

CREATE TABLE people.employees (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  org_id UUID NOT NULL REFERENCES public.organisations(id),
  user_id UUID REFERENCES public.users(id), -- nullable (not all employees are users)
  employee_number VARCHAR,
  first_name VARCHAR NOT NULL,
  last_name VARCHAR NOT NULL,
  preferred_name VARCHAR,
  date_of_birth DATE,
  gender VARCHAR,
  email VARCHAR,
  phone VARCHAR,
  address JSONB,
  emergency_contact JSONB,
  tax_file_number_encrypted BYTEA,          -- AES-256 encrypted
  employment_type VARCHAR NOT NULL,          -- casual/part_time/full_time/contractor
  employment_status VARCHAR NOT NULL DEFAULT 'onboarding',
  start_date DATE,
  termination_date DATE,
  primary_office_id UUID REFERENCES public.offices(id),
  primary_client_id UUID,                   -- → clients.clients (cross-schema ref)
  metadata JSONB DEFAULT '{}',              -- industry-specific extended fields
  inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  deleted_at TIMESTAMPTZ
);

CREATE INDEX idx_employees_org ON people.employees(org_id);
CREATE INDEX idx_employees_status ON people.employees(org_id, employment_status);
CREATE INDEX idx_employees_deleted ON people.employees(deleted_at) WHERE deleted_at IS NULL;

Event Store¶

CREATE TABLE events.domain_events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  domain VARCHAR(50) NOT NULL,
  event_type VARCHAR(100) NOT NULL,
  aggregate_id UUID,
  org_id UUID NOT NULL,
  payload JSONB NOT NULL,
  metadata JSONB DEFAULT '{}',   -- correlation_id, causation_id, actor, agent_id
  inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (inserted_at);

-- Monthly partitions
CREATE TABLE events.domain_events_2026_04 PARTITION OF events.domain_events
  FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');

-- Indexes
CREATE INDEX idx_events_org_domain ON events.domain_events(org_id, domain, event_type);
CREATE INDEX idx_events_aggregate ON events.domain_events(aggregate_id);
CREATE INDEX idx_events_correlation ON events.domain_events((metadata->>'correlation_id'));

-- Immutability trigger
CREATE OR REPLACE FUNCTION events.prevent_modification()
RETURNS TRIGGER AS $$
BEGIN
  RAISE EXCEPTION 'Events are immutable. UPDATE and DELETE are not allowed.';
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER no_modify_events
  BEFORE UPDATE OR DELETE ON events.domain_events
  FOR EACH ROW EXECUTE FUNCTION events.prevent_modification();

v2 → Finnest Table Mapping¶

Complete Cross-Reference¶

v2 Table	v2 DB	Finnest Schema.Table	Transform Notes
candidates	admin_central	people.employees	INT→UUID, split address to JSONB, map status codes
candidates_additional	admin_central	people.employees.metadata	Merge into JSONB metadata column
candidates_comments	admin_central	people.employee_notes	New table
candidates_referencecheck	admin_central	recruit.reference_checks
candidates_qualification	admin_central	people.employee_qualifications	Map to credential_types
admin_users	admin_central	public.users (role: admin)	Merge 3 user tables into one
consultants	admin_central	public.users (role: staff)	Merge
payroll_staffs	admin_central	public.users (role: payroll)	Merge
roles	admin_central	public.roles
office_locations	admin_central	public.offices
clients	admin_central	clients.clients
client_contacts	admin_central	clients.contacts
site	admin_central	clients.sites
agency	admin_central	clients.agencies
joborders	admin_central	roster.shifts	Significant restructure
roster_codes	admin_central	roster.shift_codes
roster_rule_set	admin_central	roster.roster_rules
timesheets	admin_central	timekeep.timecards
timesheets_allowances	admin_central	timekeep.timecard_allowances
pay_details	admin_central	payroll.pay_details
pay_levels	admin_central	payroll.pay_levels
allowances	admin_central	payroll.allowances
super_fund_details	admin_central	payroll.super_funds
leave_request	admin_central	people.leave_requests
unavailability	admin_central	people.unavailability
onboarding_profile	admin_central	onboard.onboarding_profiles
candidate_compliance	admin_central	compliance.compliance_status
functional_assessment	admin_central	safety.assessments
clients_appraisal	admin_central	people.appraisals
contract_details	admin_central	people.contracts
clientmessaging_*	admin_central	reach.conversations + messages
message_log	admin_central	reach.message_log
Clock data	actatek	timekeep.clock_events	Timezone conversion critical
Applicant data	admin_atslive	recruit.candidate_pools

Migration Strategy: Strangler Fig (Domain-by-Domain)¶

Phase 1: Foundation + Reference Data¶

Migrate: organisations, offices, users (merge 3 tables), tags, skills
Create: compliance.credential_types, compliance.industry_profiles
Create: id_mappings table
Duration: 1 week
Risk: Low

Phase 2: People Domain¶

Migrate: employees, qualifications, leave, contracts, appraisals
Validate: employee count matches, qualification dates correct
v2 remains source of truth for people — Finnest is read-copy
Duration: 2 weeks (including data quality audit)
Risk: Low-Medium

Phase 3: Clients Domain¶

Migrate: clients, contacts, sites, agencies
Create: rate_cards, job_costings (new)
Duration: 1 week
Risk: Low

Phase 4: Roster Domain¶

Migrate: joborders → shifts, roster_codes, roster_rules
Significant transformation (v2 structure → Finnest structure)
Duration: 2 weeks
Risk: Medium

Phase 5: Timekeep Domain¶

Migrate: timesheets → timecards, clock events from actatek
Timezone conversion validation critical
Tiered: hot (6 months), warm (6mo-2yr), cold (2yr+)
Duration: 2 weeks
Risk: Medium

Phase 6: Messaging & Historical¶

Migrate: clientmessaging → reach.conversations
Archive: old message history, reports
Duration: 1 week
Risk: Low

Phase 7: Payroll Domain (LAST)¶

Migrate: pay_details, pay_levels, allowances, super_funds
Must align with pay cycle boundary
Human validation required
Duration: 2 weeks + validation period
Risk: High (financial data)

Phase 8: Cutover + Decommission¶

v2 set to read-only
3-month parallel access period
Decommission after no v2 reads for 30 days

Total Timeline: ~12-16 weeks (parallel with platform development)¶

Data Quality Audit (Pre-Migration)¶

Before any migration, audit v2 data for:

Check	Query	Action if Found
Orphaned employees	Employees with no org	Assign to default org or skip
Duplicate employees	Same name + DOB + email	Merge with human review
Invalid status	Active + terminated_date set	Fix status or clear date
Missing required fields	Employee with no name	Flag for human completion
Orphaned timesheets	Timesheets with no employee	Log and skip
Stale data	Employees not accessed in 3+ years	Migrate to cold archive
Encoding issues	Non-UTF8 characters	Convert or sanitize

Budget: 1-2 weeks before migration starts.

Key Insights¶

Insight 1: Strangler Fig Aligns with Commercial Phasing¶

Plan A phases (P1→P2→P3) and migration phases can be the same. Each delivery migrates its domains. No separate "migration project." Impact: High | Effort: Medium

Insight 2: ID Mapping Table Is the Migration Backbone¶

Every v2 INT ID → Finnest UUID. Every script resolves through this mapping. Build first, decommission last. Impact: High | Effort: Low

Insight 3: Pre-Migration Data Quality Audit Prevents 80% of Bugs¶

v2 has no FK constraints, no tests, 10+ years of data. Orphans and inconsistencies are guaranteed. Impact: High | Effort: Medium (1-2 weeks)

Insight 4: Tiered Migration (Hot/Warm/Cold) Solves Volume¶

Last 6 months migrated in cutover window. 6mo-2yr in background. 2yr+ to archive. Multi-day becomes multi-hour. Impact: High | Effort: Low

Insight 5: Compliance Schema Is the Multi-Industry Foundation¶

credential_types, industry_profiles, awards are referenced by 8+ other schemas. Design first. Impact: High | Effort: Medium

Insight 6: New Clients Need No Migration — Just Import¶

Data import wizard (CSV/Excel) is both a migration tool and a sales tool. Design from day one. Impact: Medium | Effort: Low

Statistics¶

Total ideas: 30+
Categories: 4 (Data Model, Migration Strategy, Data Quality, Cross-Reference)
Key insights: 6
Techniques applied: 3
Estimated tables: ~150 across 18 schemas
Migration phases: 8
Migration timeline: 12-16 weeks

Recommended Next Steps¶

→ Session 5: Multi-Industry Design (compliance schema design, industry profiles, credential registry) → Design compliance.credential_types and compliance.industry_profiles schemas in detail → Run data quality audit on v2 admin_central database → Build id_mappings prototype

Generated by BMAD Method v6 - Creative Intelligence