ADR-005-F: Event-Driven Cross-Domain Communication

Status: Accepted
Date: 2026-04-16
Decision Makers: Gautham Chellappa
Depends on: ADR-002-F (Modular Monolith)
Related: ADR-011-F (Compliance auto-blocking: the one synchronous exception)

Context

ADR-002-F established 21 OTP applications that must not call each other directly. That rule needs a communication mechanism — something every domain can publish to and any interested domain can subscribe to, without creating direct dependencies.

The options are: shared database tables (tight coupling to schema shape), direct RPC between BEAM apps (re-creates direct dependencies), an external message bus such as Kafka (operational overhead too high for a 3-person team), or an events pattern using an append-only table plus Phoenix.PubSub.

Decision

Cross-domain communication is events-only (AR-07).

Event storage

All domain events are appended to events.domain_events, a single table in a dedicated events schema. The table is append-only, enforced by a DB trigger (AR-17); hash-chained for tamper evidence (SE-21, IR-10); and partitioned by month (DA-14). See data.md for the full schema.
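To make the hash-chaining idea concrete, here is a minimal in-memory sketch. It is not the schema from data.md: the module name, the field names (payload, prev_hash, hash), the all-zero genesis value, and the choice of SHA-256 over "previous hash + payload" are all assumptions made for illustration. The payload is assumed to be an already-serialised binary.

```elixir
defmodule HashChainSketch do
  # Illustrative only; the real column names and hash inputs live in data.md.
  @genesis String.duplicate("0", 64)

  # Each hash covers the previous hash plus the serialised payload,
  # so changing any earlier event invalidates every later hash.
  def link(prev_hash, payload) when is_binary(prev_hash) and is_binary(payload) do
    :crypto.hash(:sha256, prev_hash <> payload) |> Base.encode16(case: :lower)
  end

  # Prepends a new event to the chain (newest event first).
  def append(chain, payload) do
    prev =
      case chain do
        [] -> @genesis
        [%{hash: hash} | _] -> hash
      end

    [%{payload: payload, prev_hash: prev, hash: link(prev, payload)} | chain]
  end

  # Walks the chain oldest-first and recomputes every hash.
  def valid?(chain) do
    chain
    |> Enum.reverse()
    |> Enum.reduce_while(@genesis, fn event, prev ->
      if event.prev_hash == prev and event.hash == link(prev, event.payload) do
        {:cont, event.hash}
      else
        {:halt, :invalid}
      end
    end)
    |> Kernel.!=(:invalid)
  end
end
```

Editing the payload of any stored event makes valid?/1 return false, which is the tamper evidence the hash chain is meant to provide.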

Event flow

Domain A's command
  └─ Repo.transaction fn ->
       Repo.insert(domain_entity)              -- domain write
       EventStore.append(%Event{...})          -- event write (trigger fires)
     end
  └─ Phoenix.PubSub.broadcast("org:#{org_id}:#{domain}", event)
      └─ subscribers in other domains react
          └─ Oban.insert(ReactionWorker, ...)  -- idempotent, retryable
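
The ordering in the flow above (persist first, broadcast only after the transaction succeeds) can be sketched without the real dependencies. In the sketch below, Elixir's built-in Registry stands in for Phoenix.PubSub, and the persist function stands in for the Repo.transaction that writes the entity and the event. The module and function names are hypothetical; only the topic format comes from the diagram.

```elixir
defmodule EventFlowSketch do
  # Registry (Elixir stdlib) is a stand-in for Phoenix.PubSub in this sketch.
  @registry EventFlowSketch.Bus

  def start, do: Registry.start_link(keys: :duplicate, name: @registry)

  # Topic format taken from the flow diagram.
  def topic(org_id, domain), do: "org:#{org_id}:#{domain}"

  # Subscribes the calling process to a domain's events for one org.
  def subscribe(org_id, domain) do
    Registry.register(@registry, topic(org_id, domain), [])
  end

  # `persist` is a placeholder for the transaction that writes the domain
  # entity and appends the event. Nothing is broadcast if it fails.
  def publish(org_id, domain, event, persist) when is_function(persist, 1) do
    case persist.(event) do
      {:ok, stored} ->
        Registry.dispatch(@registry, topic(org_id, domain), fn entries ->
          for {pid, _value} <- entries, do: send(pid, {:domain_event, stored})
        end)

        {:ok, stored}

      {:error, _reason} = error ->
        error
    end
  end
end
```

In the real flow, the subscriber side would enqueue an Oban job on receipt rather than do the work inline, so that the reaction is retryable.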

Subscription pattern

Each domain has an event_handlers.ex file that declares its subscriptions and routes events to Oban workers:

defmodule Finnest.Pulse.EventHandlers do
  use Finnest.Core.EventSubscriber

  subscribe :recruit, :job_order_created, via: Finnest.Pulse.Workers.DraftRosterTemplate
  subscribe :onboard, :verification_completed, via: Finnest.Pulse.Workers.UpdateComplianceScore
  # ... etc
end
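
This ADR does not show the internals of Finnest.Core.EventSubscriber. As a rough sketch of how a subscribe macro of this shape could work, the hypothetical SubscriberSketch module below accumulates each declaration in a module attribute at compile time and exposes the result as a lookup function. The function names __subscriptions__/0 and workers_for/2, and the shortened worker names, are assumptions for illustration.

```elixir
defmodule SubscriberSketch do
  # Hypothetical stand-in for Finnest.Core.EventSubscriber; the real module may differ.
  defmacro __using__(_opts) do
    quote do
      import SubscriberSketch, only: [subscribe: 3]
      Module.register_attribute(__MODULE__, :subscriptions, accumulate: true)
      @before_compile SubscriberSketch
    end
  end

  # Records {domain, event_type, worker} at compile time.
  defmacro subscribe(domain, event_type, via: worker) do
    quote do
      @subscriptions {unquote(domain), unquote(event_type), unquote(worker)}
    end
  end

  defmacro __before_compile__(env) do
    # Accumulated attributes are stored newest-first; restore declaration order.
    subscriptions = env.module |> Module.get_attribute(:subscriptions) |> Enum.reverse()

    quote do
      def __subscriptions__, do: unquote(Macro.escape(subscriptions))

      # Returns the workers registered for a given domain and event type.
      def workers_for(domain, event_type) do
        for {^domain, ^event_type, worker} <- __subscriptions__(), do: worker
      end
    end
  end
end

defmodule PulseHandlersSketch do
  use SubscriberSketch

  subscribe :recruit, :job_order_created, via: DraftRosterTemplate
  subscribe :onboard, :verification_completed, via: UpdateComplianceScore
end
```

A dispatcher could then call workers_for/2 on each incoming event and insert one Oban job per returned worker, which keeps the publisher unaware of who is listening.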

The one exception — Compliance.check/2

Events are asynchronous — they cannot block a write. When a write must be gated (e.g. "don't roster a non-credentialled worker"), a synchronous read is required. Compliance.check/2 is the single documented exception (ADR-011-F).
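
To illustrate what "gating a write" means here, the sketch below puts a synchronous check in front of a roster write. The real signature and return values of Compliance.check/2 are defined in ADR-011-F, not here; the arguments (a worker and a required credential), the :ok / {:blocked, _} results, and the data shapes are all assumptions.

```elixir
defmodule ComplianceGateSketch do
  # Hypothetical stand-in for Compliance.check/2; arguments and results are assumed.
  def check(worker, requirement) do
    if requirement in worker.credentials, do: :ok, else: {:blocked, requirement}
  end

  # The roster write only runs when the synchronous check passes.
  # `write_fun` is a placeholder for the actual domain write.
  def assign_shift(worker, shift, write_fun) when is_function(write_fun, 2) do
    case check(worker, shift.requires) do
      :ok -> write_fun.(worker, shift)
      {:blocked, _requirement} = blocked -> {:error, blocked}
    end
  end
end
```

An event could not provide this guarantee, because the subscriber would only learn about the shift after it had already been written.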

Rules

  1. No direct function calls between domain apps (enforced by Boundary library, AR-08)
  2. No cross-schema database queries for operational data (reference data exception documented in data.md)
  3. All cross-domain data flows through typed events
  4. Events are persisted (audit trail + replay)
  5. Event handlers are idempotent (QJ-01, Commandment #38)
  6. Correlation + causation IDs track chains — max 10 events per chain (AI-05)
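
Rules 5 and 6 can be sketched together. In the hypothetical module below, a follow-up event inherits its parent's correlation ID, records the parent's ID as its causation ID, and is refused once the chain already holds 10 events. How the real system counts chain length is not specified in this ADR, so the chain_position field is an assumption, as are the event type names. Idempotency is shown with an in-memory set of processed event IDs; a production system would more plausibly rely on a database uniqueness constraint or Oban's unique-job option.

```elixir
defmodule ChainSketch do
  @max_chain_length 10

  # A root event starts a chain: it has no cause and sits at position 1.
  def root(id, type) do
    %{id: id, type: type, correlation_id: id, causation_id: nil, chain_position: 1}
  end

  # Refuse to extend a chain that already contains the maximum number of events.
  def caused_by(%{chain_position: n}, _id, _type) when n >= @max_chain_length do
    {:error, :chain_limit_reached}
  end

  # A follow-up event keeps the correlation ID and points at its direct cause.
  def caused_by(parent, id, type) do
    {:ok,
     %{
       id: id,
       type: type,
       correlation_id: parent.correlation_id,
       causation_id: parent.id,
       chain_position: parent.chain_position + 1
     }}
  end

  # Idempotent handling: an event ID that was already processed is a no-op.
  def handle(event, processed, fun) when is_function(fun, 1) do
    if MapSet.member?(processed, event.id) do
      {:skipped, processed}
    else
      fun.(event)
      {:ok, MapSet.put(processed, event.id)}
    end
  end
end
```

Filtering stored events by correlation_id returns a whole chain, and following causation_id links backwards reconstructs the order in which one event triggered the next.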

Alternatives Considered

| Alternative | Rejected because |
| --- | --- |
| Direct function calls between domains | Creates tight coupling; any refactor in domain A can break domain B; violates AR-07 |
| Shared DB tables between domains | Ties domains to each other's schema evolution; cross-schema JOINs in operational queries |
| Kafka / RabbitMQ external bus | Operational overhead (cluster to run, maintain, back up); at our scale PubSub + Postgres is sufficient until 100K+ employees (Phase 3 trigger) |
| RESTful HTTP between BEAM apps in one node | Serialisation overhead for zero benefit within one node; reinvents MCP transport |
| No cross-domain communication (make every domain standalone) | Impossible: a job order → roster template → compliance check → billing line is inherently cross-domain |

Consequences

Positive:

  • Loose coupling — domain A's internal changes don't break domain B's subscribers
  • Audit trail for free (event store is the audit log, B02 Insight 3)
  • Replayable — reconstruct any aggregate state by replaying events
  • Data flywheel — historical events drive anomaly detection, Tier-3 agents, L3 memory
  • Event subscribers can evolve independently — new subscriber added without touching publisher
  • Idempotent workers are naturally retry-safe

Negative:

  • Eventual consistency — subscribers lag the publisher
  • Debugging cross-domain bugs needs correlation-ID tracing across event chains
  • Developers must learn to think in events, not in method calls
  • Idempotency discipline required on every subscriber (QJ-01)

Tipping points for re-evaluation:

  • Event rate exceeds PostgreSQL's comfortable write throughput (~5M events/day per B02) → introduce Kafka/NATS as the bus, keep event store as audit (D12 phase 4)
  • A second synchronous exception emerges — revisit this ADR rather than quietly add one. The single exception is load-bearing discipline.

Relationship to Guardrails

Enforces: AR-07 (events-only cross-domain), AR-17 (event store immutability), AR-08 (Boundary library enforcement), AI-05 (correlation + causation chain limit), QJ-01 (idempotent workers).