Runbook — Disaster Recovery

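Disaster recovery restores the platform in a separate region or cluster after a regional or cluster-level incident. It is broader than an application rollback: it covers infrastructure, data restore, and traffic routing, not only the deployed application version.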

Preconditions

DR region or cluster is provisioned.
Backup artifacts are reachable from the DR environment.
DNS, certificate, and customer connectivity changes are approved.
Incident commander owns the cutover decision.
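
The preconditions above can be treated as a gate that is checked before any recovery step starts. The following is a minimal sketch, not part of the platform tooling: the Preconditions class, its field names, and unmet_preconditions are hypothetical names introduced for illustration, and how each flag is actually determined is left to the operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preconditions:
    """Hypothetical flags mirroring the four preconditions in this runbook."""
    dr_environment_provisioned: bool
    backups_reachable_from_dr: bool
    connectivity_changes_approved: bool   # DNS, certificate, customer connectivity
    incident_commander_assigned: bool     # someone owns the cutover decision


def unmet_preconditions(p: Preconditions) -> list[str]:
    """Return the names of preconditions that are not yet satisfied."""
    return [name for name, ok in vars(p).items() if not ok]


if __name__ == "__main__":
    state = Preconditions(True, True, False, True)
    missing = unmet_preconditions(state)
    if missing:
        print("Do not start recovery; unmet preconditions:", ", ".join(missing))
    else:
        print("All preconditions met.")
```

An empty result means recovery can proceed; anything else names what still has to be resolved first.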

Steps

Provision or validate the DR namespace and infrastructure dependencies.
Restore PostgreSQL from the selected recovery point.
Deploy Finnest Power with ingress disabled.
Run internal smoke tests and tenant sanity checks.
Repoint DNS or gateway routing after approval.
Monitor traffic, errors, consent flows, and payment readiness.
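
The steps above run strictly in order, and the routing change is the only one that waits for explicit approval. A rough sketch of that control flow follows, assuming each step can be wrapped in a function that returns True on success. Step, run_steps, and the approved callback are hypothetical names; the real actions (restore, deploy, smoke tests) are not shown.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    action: Callable[[], bool]     # returns True on success
    needs_approval: bool = False   # set only for the DNS or gateway repoint


def run_steps(steps: list[Step], approved: Callable[[], bool]) -> list[str]:
    """Run steps in order; stop at the first failure or unapproved cutover."""
    log: list[str] = []
    for step in steps:
        if step.needs_approval and not approved():
            log.append(f"held: {step.name}")
            break
        if not step.action():
            log.append(f"failed: {step.name}")
            break
        log.append(f"ok: {step.name}")
    return log


if __name__ == "__main__":
    # Placeholder actions; each lambda stands in for a real operation.
    steps = [
        Step("validate DR namespace", lambda: True),
        Step("restore PostgreSQL", lambda: True),
        Step("deploy with ingress disabled", lambda: True),
        Step("smoke tests and tenant checks", lambda: True),
        Step("repoint DNS or gateway", lambda: True, needs_approval=True),
        Step("monitor traffic", lambda: True),
    ]
    for line in run_steps(steps, approved=lambda: False):
        print(line)
```

Stopping at the first failure keeps traffic away from a deployment that has not passed the earlier checks.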

Verification

The achieved recovery point and recovery time are recorded against the recovery point objective (RPO) and recovery time objective (RTO).
Core APIs, consent lifecycle, and payment initiation readiness checks pass.
Observability backend receives logs and traces from the DR deployment.
A follow-up task captures reconciliation work for NATS and outbox events produced after the recovery point.
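
The recovery figures and the reconciliation scope can both be derived from three timestamps. The sketch below assumes a common convention: the data-loss window runs from the restored recovery point to the start of the incident, and downtime runs from the start of the incident to the moment service is restored. The function names and sample timestamps are illustrative only, and event timestamps are assumed to be available from the NATS or outbox records.

```python
from datetime import datetime, timedelta


def recovery_metrics(recovery_point: datetime,
                     incident_start: datetime,
                     service_restored: datetime) -> tuple[timedelta, timedelta]:
    """Return (data_loss_window, downtime) for comparison with RPO and RTO."""
    data_loss_window = incident_start - recovery_point
    downtime = service_restored - incident_start
    return data_loss_window, downtime


def events_to_reconcile(event_times: list[datetime],
                        recovery_point: datetime) -> list[datetime]:
    """Select events newer than the recovery point; the restored database
    has no record of these, so they need reconciliation."""
    return [t for t in event_times if t > recovery_point]


if __name__ == "__main__":
    # Illustrative timestamps, not taken from a real incident.
    recovery_point = datetime(2024, 1, 1, 11, 45)
    incident_start = datetime(2024, 1, 1, 12, 0)
    service_restored = datetime(2024, 1, 1, 13, 30)
    loss, downtime = recovery_metrics(recovery_point, incident_start, service_restored)
    print("data-loss window:", loss)
    print("downtime:", downtime)
```

Comparing the two durations with the agreed RPO and RTO shows whether the recovery met its targets.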