
STORY-F-011: Kamal deploy.finnest.integration.yml + first integration deploy

Epic: Infrastructure
Priority: Must Have
Story Points: 2
Status: Not Started
Assigned To: Unassigned
Created: 2026-04-17
Sprint: 3


User Story

As an Ops SRE, I want Kamal to deploy Finnest alongside the existing Laravel AgenticAI-app on the integration host via the co-deploy pattern (ADR-014-F), with Postgres 17 as a Kamal accessory and Finnest reachable at integration-finnest.agentic-ai.au, so that every merge to main auto-deploys to integration, validating the full build→deploy pipeline before Scout + Verify go-live.


Description

Background

ADR-014-F commits to co-deploy: Laravel and Finnest share the integration host. This story provisions the integration host with a Postgres accessory (alongside existing MySQL + Redis), the Finnest app container, a new Caddy virtualhost with WebSocket upgrade config, and a Route 53 subdomain record.

This is the first real Finnest deployment. All prior sprints ran locally or in CI. Week 2 decision gate will have already validated V2Repo MyXQL; this story validates the end-to-end deploy pipeline. (Award interpretation is no longer a decision-gate concern — see ADR-016-F for the shift off KeyPay to v2-port-plus-native-engine.)

Scope

In scope:

  • Terraform update to infra/modules/app-host: add finnest_enabled: bool variable; when true, allocate Postgres security group rules (5432 localhost only), extra EBS volume space (+20 GB for Postgres data), Finnest S3 bucket + IAM role
  • Terraform apply to integration env: provisions Postgres SG + bucket + Route 53 A record integration-finnest.agentic-ai.au → 52.63.151.102
  • Kamal config config/deploy.finnest.integration.yml:
    • Service name: finnest-app (distinct from agenticai-app)
    • Registry: same ECR or Docker Hub used by Laravel; tag pattern finnest:<git-sha>
    • Servers: same integration host (52.63.151.102), but new container network namespace
    • Proxy: Caddy shared with Laravel; add virtualhost block for integration-finnest.agentic-ai.au with WebSocket upgrade for /live/websocket + /socket/*
    • Accessories: postgres (Postgres 17 image, port 5432 exposed to finnest-app container only, data volume mounted)
    • Env clear: FINNEST_ENV=commercial, PHX_HOST=integration-finnest.agentic-ai.au, PORT=4000
    • Env secrets (pulled via bws at deploy time; names match the finnest-integration Bitwarden project seeded in F-005 + F-007): FINNEST_DB_PASSWORD, FINNEST_SECRET_KEY_BASE, FINNEST_CLOAK_KEY_V1, ANTHROPIC_API_KEY, FINNEST_V2_CENTRAL_URL, FINNEST_V2_ATSLIVE_URL, FINNEST_V2_SSL, FINNEST_V2_CACERTFILE, SMSGLOBAL_API_KEY, SMSGLOBAL_API_SECRET
      • KeyPay secret intentionally absent: ADR-016-F (2026-04-18) supersedes ADR-009-F; award interpretation is v2 port (Phase 1) + native engine over FWC MAPD (Phase 2), no commercial SaaS in the path.
    • Healthcheck: path /ready, interval 10s, timeout 5s (IN-08)
  • scripts/deploy-finnest.sh (skeleton from F-005) fleshed out: ./scripts/deploy-finnest.sh integration uses bws to fetch secrets, runs kamal deploy -d integration -c config/deploy.finnest.integration.yml
  • Caddy virtualhost update to integration host's Caddyfile: add integration-finnest.agentic-ai.au block with reverse_proxy to localhost:4000, WebSocket upgrade headers, SSL via Let's Encrypt
  • GitHub Actions deploy job in ci-finnest.yml: after all checks pass on merge to main, invoke scripts/deploy-finnest.sh integration (using bastion deploy runner)
  • First deploy: merge a trivial PR to main after everything above is in place; confirm curl https://integration-finnest.agentic-ai.au/health returns 200
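
The deploy script's modes could be sketched roughly as below. This is a minimal illustration, not the contents of scripts/deploy-finnest.sh: the function names kamal_command and run_or_print and the DRY_RUN variable are hypothetical, and the rollback version argument is an assumption about how the script would be called.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build the Kamal command line for one invocation of the deploy script.
# Arguments: <environment> <mode> [version]
#   mode is "deploy" or "rollback"; rollback needs the version to return to.
kamal_command() {
  local env="$1" mode="$2" version="${3:-}"
  local config="config/deploy.finnest.${env}.yml"
  case "$mode" in
    deploy)
      printf 'kamal deploy -d %s -c %s\n' "$env" "$config"
      ;;
    rollback)
      if [ -z "$version" ]; then
        echo "rollback needs a version" >&2
        return 1
      fi
      printf 'kamal rollback %s -d %s -c %s\n' "$version" "$env" "$config"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac
}

# Print the command when DRY_RUN=1 (hypothetical switch), otherwise execute it.
run_or_print() {
  local cmd
  cmd="$(kamal_command "$@")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "[dry-run] $cmd"
  else
    # Word splitting is intentional: the command contains no quoted arguments.
    $cmd
  fi
}
```

With this shape, `DRY_RUN=1 run_or_print integration deploy` prints the planned Kamal invocation without running it, which is the behaviour the dry-run acceptance criterion asks for.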

Out of scope:

  • Staging + production deployment (F-020)
  • IRAP environment (Phase 3)
  • PgBouncer connection pooler (Phase 3 per ADR-014-F)
  • Postgres read replica (Phase 2)

Technical Notes

  • Per ADR-014-F co-deploy: existing Laravel containers on host keep running; Finnest is additive. Use distinct Docker network namespace to prevent accidental cross-talk (finnest-net vs agenticai-app-net)
  • Postgres accessory name finnest-postgres (NOT agenticai-app-mysql). Port 5432 (MySQL is 3306 — no conflict)
  • Caddy Let's Encrypt: new subdomain needs cert issuance on first deploy. Caddy handles automatically via ACME challenge.
  • Route 53 update: use Terraform aws_route53_record resource; DNS propagation typically <2 min for a new record
  • bws deploy-time secret fetch: bws secret get $ID --access-token "$BWS_ACCESS_TOKEN" --output json per secret, write to tmpfile, pass to Kamal via env: file, cleanup after deploy (carry-forward AgenticAI-app pattern)
  • First-deploy should include a trivial migration (ecto.migrate) so Postgres schema exists — coordinate with F-007 migrations completing first
  • RAM budget per ADR-014-F: ~1 GB for Finnest BEAM. Monitor via CloudWatch; t3.medium integration host has ~1.5 GB free after Laravel + Postgres. TIGHT. Upgrade trigger documented in ADR.
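
The bws fetch, tmpfile, and cleanup steps in the notes above might look something like the following. It is a sketch under assumptions: it presumes `bws secret get ... --output json` emits an object with a "value" field and that jq is available on the deploy runner; the function names, the SECRET_SPECS array, and the placeholder secret IDs are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical NAME=SECRET_ID pairs; real IDs come from the Bitwarden project.
SECRET_SPECS=(
  "FINNEST_DB_PASSWORD=<secret-id>"
  "FINNEST_SECRET_KEY_BASE=<secret-id>"
)

# Print one secret's value.
# Assumption: the JSON output carries the secret in a "value" field.
fetch_secret() {
  bws secret get "$1" --access-token "$BWS_ACCESS_TOKEN" --output json | jq -r '.value'
}

# Write NAME=value lines to a file that only the current user can read.
# Arguments: <output-file> NAME=SECRET_ID [NAME=SECRET_ID ...]
write_env_file() {
  local out="$1"
  shift
  local pair name id value
  : > "$out"
  chmod 600 "$out"
  for pair in "$@"; do
    name="${pair%%=*}"
    id="${pair#*=}"
    value="$(fetch_secret "$id")"
    printf '%s=%s\n' "$name" "$value" >> "$out"
  done
}

# Run a command with the secrets file path appended as its last argument,
# then delete the file whether or not the command succeeded.
with_secrets() {
  local env_file status=0
  env_file="$(mktemp)"
  write_env_file "$env_file" "${SECRET_SPECS[@]}"
  "$@" "$env_file" || status=$?
  rm -f "$env_file"
  return "$status"
}
```

Deleting the file inside with_secrets, rather than at the end of the calling script, means a failed deploy does not leave secrets on the runner's disk.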

Dependencies

  • Blocked by: STORY-F-005 (Bitwarden + CI workflow), STORY-F-007 (migrations to run), STORY-F-008 (tenant enforcement ready for live data), STORY-F-009 (V2Repo creds in secrets)

Acceptance Criteria

  • Terraform finnest_enabled: true applied to integration env; Postgres SG + S3 bucket + Route 53 record provisioned
  • integration-finnest.agentic-ai.au resolves to 52.63.151.102
  • config/deploy.finnest.integration.yml exists and is syntactically valid (kamal config -d integration -c config/deploy.finnest.integration.yml parses)
  • scripts/deploy-finnest.sh integration dry-run (--dry-run) prints planned action without invoking deploy
  • scripts/deploy-finnest.sh integration actual run succeeds: Finnest container starts, migrations run, /ready returns 200
  • curl https://integration-finnest.agentic-ai.au/health returns 200 + JSON body
  • curl https://integration-finnest.agentic-ai.au/ returns the F-004 landing page (pure Tailwind v4; DaisyUI wiring deferred to F-017)
  • SSL cert valid (Let's Encrypt issued on first deploy; curl -I shows 200, not cert error)
  • Laravel AgenticAI-app still serving at integration.agentic-ai.au — co-deploy not regressed
  • GitHub Actions deploy job triggers on merge to main; deploys integration; notifies in #finnest-deploys Slack (if configured) or GitHub Actions summary
  • Host RAM utilisation after co-deploy: <85% (ADR-014-F threshold). If >85%, open upgrade ticket to t3.large.
  • Rollback works: ./scripts/deploy-finnest.sh integration --rollback (or equivalent kamal rollback) returns to previous container
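
The 85% RAM criterion above could be spot-checked on the host with something like the sketch below. This is an illustration only: the story names CloudWatch as the monitoring source, so a /proc/meminfo check is a swapped-in technique, the function names are hypothetical, and treating "used" as MemTotal minus MemAvailable is one reasonable reading rather than the ADR's definition.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the used-memory percentage (integer) from a /proc/meminfo-style file.
# Assumption: "used" means MemTotal - MemAvailable.
ram_used_percent() {
  local meminfo="${1:-/proc/meminfo}"
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END {
      if (total == 0) exit 1
      printf "%d\n", (total - avail) * 100 / total
    }
  ' "$meminfo"
}

# Report utilisation and succeed only when it is below the threshold.
# Arguments: [meminfo-file] [threshold-percent, default 85]
ram_within_budget() {
  local meminfo="${1:-/proc/meminfo}" threshold="${2:-85}" used
  used="$(ram_used_percent "$meminfo")"
  echo "RAM used: ${used}% (threshold ${threshold}%)"
  [ "$used" -lt "$threshold" ]
}
```

Run on the integration host after a deploy, a non-zero exit from `ram_within_budget` would be the signal to open the upgrade ticket.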

Testing Requirements

  • Integration: live deploy to integration host; automated smoke test hits /health + / + a LiveView route
  • Rollback test: deliberately break a deploy (e.g. bad migration), rollback, confirm previous version serving
  • Co-deploy regression: before + after Finnest deploy, hit Laravel integration.agentic-ai.au/up and confirm Laravel still responds
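
A smoke and co-deploy regression check along these lines could be sketched as follows. The function names are hypothetical and this is not the project's smoke script; the LiveView route is omitted because the story does not name one.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the HTTP status code for a URL ("000" when the request fails outright).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1" || true
}

# Compare one URL's status with the expected code and report the result.
# Arguments: <url> [expected-status, default 200]
check_url() {
  local url="$1" expected="${2:-200}" actual
  actual="$(http_status "$url")"
  if [ "$actual" = "$expected" ]; then
    echo "ok   $url -> $actual"
  else
    echo "FAIL $url -> $actual (expected $expected)"
    return 1
  fi
}

# Check every URL, keep going after a failure, and fail if any check failed.
smoke() {
  local failed=0 url
  for url in "$@"; do
    check_url "$url" || failed=1
  done
  return "$failed"
}
```

For example, `smoke https://integration-finnest.agentic-ai.au/health https://integration-finnest.agentic-ai.au/ https://integration.agentic-ai.au/up` covers the Finnest endpoints and the Laravel regression check in one pass; running it before and after the Finnest deploy gives the before/after comparison.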

References


Post-sprint close-out (2026-04-18)

Sprint 3 closed F-011 at 2/2 pts with Phase A merged (PR #31) — deploy scripts, CI job, Kamal configs, Caddy vhost template landed on main. Sprint 3 retrospective flagged Phase B (Terraform + host vhost) + Phase C (first real deploy + smoke + auto-on-main) as "parked" for follow-up session. That follow-up happened 2026-04-18:

Phase B — DONE. Material drift found vs original plan:

  • Terraform module is modules/agenticai (not modules/app-host). Resources: aws_instance.app, aws_security_group.app, aws_eip.app with aws_eip_association.app. Integration env at infrastructure/integration/main.tf declares environment = "dev" (directory name vs environment var mismatch is pre-existing; full dev → integration rename is separate work).
  • DNS is on Cloudflare, not Route 53. The integration-finnest.agentic-ai.au A record pre-existed, pointing to the shared EIP. No Route 53 resources added.
  • Integration host runs kamal-proxy 2.x, not Caddy. F-011's config/deploy/caddy/integration-finnest.conf template never applied (host had no Caddy install); instead Finnest's Kamal config switched to proxy.ssl: true + proxy.host: integration-finnest.agentic-ai.au so the existing shared kamal-proxy container routes Let's Encrypt TLS automatically. Caddy template file deleted. See finnest repo commits b08d24d + 1f9c3c5.
  • AgenticAI-planning infra commit f43be02: module updated with finnest_enabled + finnest_env_label vars; aws_s3_bucket.finnest_backups (named finnest-integration-backups) + IAM user + access key provisioned via terraform apply to AWS 145770591531. No SSH ingress rules for Postgres (Docker network handles that). No EBS data volume (30 GB root is sufficient at integration scale).
  • Bitwarden finnest-integration project populated to 17 keys: added DATABASE_URL, POSTGRES_PASSWORD, KAMAL_REGISTRY_USER, KAMAL_REGISTRY_PASSWORD (both copied from agenticai-Integration project), FINNEST_V2_SSL = "false", FINNEST_V2_CACERTFILE (placeholder until SSL enabled), FINNEST_S3_BACKUPS_BUCKET/ACCESS_KEY_ID/SECRET_ACCESS_KEY (from terraform output). Stale KEYPAY_API_KEY deleted (ADR-016-F). BWS_FINNEST_TOKEN repo secret granted read access to the finnest-integration project.
  • Kamal 2.x compat: two schema errors (top-level network: removed in 2.x; <%= ENV[...] %> ERB expansion runs before .kamal/secrets is sourced) fixed in PR #38 (merged as 1f9c3c5).

Phase C — DEFERRED to F-020. The deploy-integration workflow now runs end-to-end through the Bitwarden fetch, Docker registry login, kamal-proxy registration, AND hits the builder step — then fails with Missing Dockerfile. The Dockerfile was always a F-020 deliverable (per this story's own deploy.yml comment: "Dockerfile.prod lands with F-020 (staging + production build pipeline)"). Writing it now without proper planning/testing is scope creep; deferring the first real deploy (plus smoke + auto-on-main flip) to F-020 where the release image pipeline lives.

Follow-up work captured in F-020:

  • Multi-stage Elixir 1.18 / OTP 27 release Dockerfile (umbrella + Phoenix 1.8 asset pipeline, i.e. esbuild + tailwind, + ex_release config)
  • First integration deploy end-to-end
  • Smoke suite (scripts/deploy-finnest-smoke.sh integration)
  • Flip ci-finnest.yml deploy-integration gate to auto-on-main
  • Post-deploy host RAM / capacity check (ADR-014-F §Capacity)