Skip to content

Brainstorm Session 4: Data Model & Migration Strategy

Date: 2026-04-15 Objective: Design data model for 18 modules on PostgreSQL + migration from ASG Central v2 Depends on: Sessions 1-3 (Elixir, 20 OTP apps, agent system)

Techniques Used

  1. Starbursting — Who/what/where/when/why/how for every data decision
  2. SWOT Analysis — Assessment of migration strategies
  3. Reverse Brainstorming — "How would migration guarantee data loss?"

Data Model Overview

Schema Structure (19 domain schemas + infrastructure)

Schema Domain Est. Tables Key Entities
public Core foundation ~10 organisations, users, offices, feature_flags, audit_log, notifications, tags
people HR Core ~10 employees, employee_qualifications, leave_requests, leave_balances, contracts, unavailability
recruit Recruitment ~12 job_orders, candidate_pools, candidate_assessments, scoring_results, outreach_log, talent_pools, drip_campaigns
onboard Onboarding ~14 onboarding_profiles, document_verifications, verifiable_documents, verification_stages, extracted_fields, identity_checklists, tfn_declarations, super_choices, bank_details
roster Rostering ~12 shifts, shift_templates, shift_assignments, shift_codes, roster_rules, availability, demand_forecasts, shift_swaps, shift_bids
timekeep Time & Attendance ~10 timecards, timecard_entries, timecard_allowances, clock_events, terminals, geo_locations, breaks, journals
reach Communication ~8 conversations, messages, message_deliveries, channels, templates, handoff_records, news_feed, announcements
pulse Operations Automation ~7 automation_runs, automation_results, optimisation_runs, anomaly_detections, predictive_models, dashboards
payroll Payroll & Invoicing ~15 pay_runs, pay_lines, tax_calculations, super_calculations, deductions, stp_submissions, invoices, payments, rate_cards
clients Client Management ~10 clients, contacts, sites, agencies, rate_cards, rate_configurations, job_costings, client_hierarchies, opportunities
safety Safety Management ~12 incidents, investigations, hazards, risk_assessments, inspections, swms_documents, corrective_actions, drug_alcohol_tests, custom_forms
assets Asset Management ~10 equipment, equipment_assignments, maintenance_schedules, work_orders, fleet_vehicles, prestart_checklists
quotes Quote Management ~8 leads, quotes, jobs, tasks, milestones, agreements, signatures
learn LMS ~8 courses, tutors, enrollments, progress, certifications, learning_paths, reinduction_schedules
benefits eBenefits ~6 programs, enrollments, perks, recognitions, rewards_points, ewa_transactions
compliance Compliance Engine ~12 credential_types, industry_profiles, awards, award_classifications, award_rates, compliance_rules, ongoing_monitors, compliance_scores
fatigue Fatigue Management ~6 fatigue_rules, driver_hours, rest_periods, ewd_entries, fitness_assessments
clearance Security Clearance ~5 clearance_applications, clearance_statuses, vetting_records, program_separations
performance Performance Management (NEW) ~10 goals, reviews, feedback, surveys, enps_scores, calibrations, development_plans, skills_matrix
events Event Store 1 domain_events (append-only, partitioned by month)
agents Agent Infrastructure ~4 agent_sessions, agent_messages, agent_memories, ai_budget_limits
Total ~190

Design Principles

  1. Every table has org_id — multi-tenancy enforced at data level
  2. Every table has inserted_at, updated_at — standard timestamps
  3. Soft deletes via deleted_at — compliance requirement (never truly delete)
  4. UUIDs for all primary keys — no auto-increment (distributed-ready)
  5. JSONB for extensible fields — industry-specific data as metadata columns
  6. Typed Ecto embeds for structured JSON — validated at application level
  7. No cross-schema JOINs for operational queries — events for cross-domain
  8. Reference data readable cross-schema — credential_types, awards, industry_profiles

Core Entity: public.organisations

CREATE TABLE public.organisations (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name VARCHAR NOT NULL,
  slug VARCHAR UNIQUE NOT NULL,
  abn VARCHAR,                              -- Australian Business Number
  industry_profile_id UUID,                 -- → compliance.industry_profiles
  subscription_tier VARCHAR DEFAULT 'tier1', -- pricing tier
  settings JSONB DEFAULT '{}',              -- org-specific config
  irap_classified BOOLEAN DEFAULT false,    -- IRAP deployment flag
  inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  deleted_at TIMESTAMPTZ                    -- soft delete
);

Core Entity: people.employees

CREATE TABLE people.employees (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  org_id UUID NOT NULL REFERENCES public.organisations(id),
  user_id UUID REFERENCES public.users(id), -- nullable (not all employees are users)
  employee_number VARCHAR,
  first_name VARCHAR NOT NULL,
  last_name VARCHAR NOT NULL,
  preferred_name VARCHAR,
  date_of_birth DATE,
  gender VARCHAR,
  email VARCHAR,
  phone VARCHAR,
  address JSONB,
  emergency_contact JSONB,
  tax_file_number_encrypted BYTEA,          -- AES-256 encrypted
  employment_type VARCHAR NOT NULL,          -- casual/part_time/full_time/contractor
  employment_status VARCHAR NOT NULL DEFAULT 'onboarding',
  start_date DATE,
  termination_date DATE,
  primary_office_id UUID REFERENCES public.offices(id),
  primary_client_id UUID,                   -- → clients.clients (cross-schema ref)
  metadata JSONB DEFAULT '{}',              -- industry-specific extended fields
  inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  deleted_at TIMESTAMPTZ
);

CREATE INDEX idx_employees_org ON people.employees(org_id);
CREATE INDEX idx_employees_status ON people.employees(org_id, employment_status);
CREATE INDEX idx_employees_deleted ON people.employees(deleted_at) WHERE deleted_at IS NULL;

Event Store

CREATE TABLE events.domain_events (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  domain VARCHAR(50) NOT NULL,
  event_type VARCHAR(100) NOT NULL,
  aggregate_id UUID,
  org_id UUID NOT NULL,
  payload JSONB NOT NULL,
  metadata JSONB DEFAULT '{}',   -- correlation_id, causation_id, actor, agent_id
  inserted_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (inserted_at);

-- Monthly partitions
CREATE TABLE events.domain_events_2026_04 PARTITION OF events.domain_events
  FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');

-- Indexes
CREATE INDEX idx_events_org_domain ON events.domain_events(org_id, domain, event_type);
CREATE INDEX idx_events_aggregate ON events.domain_events(aggregate_id);
CREATE INDEX idx_events_correlation ON events.domain_events((metadata->>'correlation_id'));

-- Immutability trigger
CREATE OR REPLACE FUNCTION events.prevent_modification()
RETURNS TRIGGER AS $$
BEGIN
  RAISE EXCEPTION 'Events are immutable. UPDATE and DELETE are not allowed.';
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER no_modify_events
  BEFORE UPDATE OR DELETE ON events.domain_events
  FOR EACH ROW EXECUTE FUNCTION events.prevent_modification();

v2 → Finnest Table Mapping

Complete Cross-Reference

v2 Table v2 DB Finnest Schema.Table Transform Notes
candidates admin_central people.employees INT→UUID, split address to JSONB, map status codes
candidates_additional admin_central people.employees.metadata Merge into JSONB metadata column
candidates_comments admin_central people.employee_notes New table
candidates_referencecheck admin_central recruit.reference_checks
candidates_qualification admin_central people.employee_qualifications Map to credential_types
admin_users admin_central public.users (role: admin) Merge 3 user tables into one
consultants admin_central public.users (role: staff) Merge
payroll_staffs admin_central public.users (role: payroll) Merge
roles admin_central public.roles
office_locations admin_central public.offices
clients admin_central clients.clients
client_contacts admin_central clients.contacts
site admin_central clients.sites
agency admin_central clients.agencies
joborders admin_central roster.shifts Significant restructure
roster_codes admin_central roster.shift_codes
roster_rule_set admin_central roster.roster_rules
timesheets admin_central timekeep.timecards
timesheets_allowances admin_central timekeep.timecard_allowances
pay_details admin_central payroll.pay_details
pay_levels admin_central payroll.pay_levels
allowances admin_central payroll.allowances
super_fund_details admin_central payroll.super_funds
leave_request admin_central people.leave_requests
unavailability admin_central people.unavailability
onboarding_profile admin_central onboard.onboarding_profiles
candidate_compliance admin_central compliance.compliance_status
functional_assessment admin_central safety.assessments
clients_appraisal admin_central people.appraisals
contract_details admin_central people.contracts
clientmessaging_* admin_central reach.conversations + messages
message_log admin_central reach.message_log
Clock data actatek timekeep.clock_events Timezone conversion critical
Applicant data admin_atslive recruit.candidate_pools

Migration Strategy: Strangler Fig (Domain-by-Domain)

Phase 1: Foundation + Reference Data

  • Migrate: organisations, offices, users (merge 3 tables), tags, skills
  • Create: compliance.credential_types, compliance.industry_profiles
  • Create: id_mappings table
  • Duration: 1 week
  • Risk: Low

Phase 2: People Domain

  • Migrate: employees, qualifications, leave, contracts, appraisals
  • Validate: employee count matches, qualification dates correct
  • v2 remains source of truth for people — Finnest is read-copy
  • Duration: 2 weeks (including data quality audit)
  • Risk: Low-Medium

Phase 3: Clients Domain

  • Migrate: clients, contacts, sites, agencies
  • Create: rate_cards, job_costings (new)
  • Duration: 1 week
  • Risk: Low

Phase 4: Roster Domain

  • Migrate: joborders → shifts, roster_codes, roster_rules
  • Significant transformation (v2 structure → Finnest structure)
  • Duration: 2 weeks
  • Risk: Medium

Phase 5: Timekeep Domain

  • Migrate: timesheets → timecards, clock events from actatek
  • Timezone conversion validation critical
  • Tiered: hot (6 months), warm (6mo-2yr), cold (2yr+)
  • Duration: 2 weeks
  • Risk: Medium

Phase 6: Messaging & Historical

  • Migrate: clientmessaging → reach.conversations
  • Archive: old message history, reports
  • Duration: 1 week
  • Risk: Low

Phase 7: Payroll Domain (LAST)

  • Migrate: pay_details, pay_levels, allowances, super_funds
  • Must align with pay cycle boundary
  • Human validation required
  • Duration: 2 weeks + validation period
  • Risk: High (financial data)

Phase 8: Cutover + Decommission

  • v2 set to read-only
  • 3-month parallel access period
  • Decommission after no v2 reads for 30 days

Total Timeline: ~12-16 weeks (parallel with platform development)


Data Quality Audit (Pre-Migration)

Before any migration, audit v2 data for:

Check Query Action if Found
Orphaned employees Employees with no org Assign to default org or skip
Duplicate employees Same name + DOB + email Merge with human review
Invalid status Active + terminated_date set Fix status or clear date
Missing required fields Employee with no name Flag for human completion
Orphaned timesheets Timesheets with no employee Log and skip
Stale data Employees not accessed in 3+ years Migrate to cold archive
Encoding issues Non-UTF8 characters Convert or sanitize

Budget: 1-2 weeks before migration starts.


Key Insights

Insight 1: Strangler Fig Aligns with Commercial Phasing

Plan A phases (P1→P2→P3) and migration phases can be the same. Each delivery migrates its domains. No separate "migration project." Impact: High | Effort: Medium

Insight 2: ID Mapping Table Is the Migration Backbone

Every v2 INT ID → Finnest UUID. Every script resolves through this mapping. Build first, decommission last. Impact: High | Effort: Low

Insight 3: Pre-Migration Data Quality Audit Prevents 80% of Bugs

v2 has no FK constraints, no tests, 10+ years of data. Orphans and inconsistencies are guaranteed. Impact: High | Effort: Medium (1-2 weeks)

Insight 4: Tiered Migration (Hot/Warm/Cold) Solves Volume

Last 6 months migrated in cutover window. 6mo-2yr in background. 2yr+ to archive. Multi-day becomes multi-hour. Impact: High | Effort: Low

Insight 5: Compliance Schema Is the Multi-Industry Foundation

credential_types, industry_profiles, awards are referenced by 8+ other schemas. Design first. Impact: High | Effort: Medium

Insight 6: New Clients Need No Migration — Just Import

Data import wizard (CSV/Excel) is both a migration tool and a sales tool. Design from day one. Impact: Medium | Effort: Low


Statistics

  • Total ideas: 30+
  • Categories: 4 (Data Model, Migration Strategy, Data Quality, Cross-Reference)
  • Key insights: 6
  • Techniques applied: 3
  • Estimated tables: ~150 across 18 schemas
  • Migration phases: 8
  • Migration timeline: 12-16 weeks

→ Session 5: Multi-Industry Design (compliance schema design, industry profiles, credential registry) → Design compliance.credential_types and compliance.industry_profiles schemas in detail → Run data quality audit on v2 admin_central database → Build id_mappings prototype


Generated by BMAD Method v6 - Creative Intelligence