new-site/infra
justin 4276adab80 infra(mail): fix warmed sending IPs dropping off ens18 on reboot (Jun 24 outage)
Unattended kernel-upgrade reboot (Jun 24 04:04) left only .71 bound because
classic ifupdown applies just the first 'address' line. Postfix then failed to
bind .94/.107 ('Cannot assign requested address') and silently egressed from
.71 -- which is NOT in SPF (every fallback msg failed SPF) and is on RLR621 +
Trend ERS-QIL. ~37h of bypassed IP-warming + a near-zero sales day.

Fixes:
- /etc/network/interfaces: explicit up/down ip-addr hooks for .72/.94/.107
- pw-mail-ips.service: systemd oneshot re-binds IPs + flushes queue on boot
- pw-mail-ip-watchdog: */5 cron re-binds missing IPs + flushes, also catches
  'Cannot assign' bind failures
- runbook: full incident writeup + reboot-test lesson
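
Illustrative sketch (not the committed pw-mail-ip-watchdog script): the check the watchdog performs can be approximated as below. It assumes the watchdog compares the output of `ip -o -4 addr show dev ens18` with a list of expected sending IPs and re-adds whichever are missing. The 192.0.2.x addresses, the /24 prefix length, and every function name are hypothetical placeholders, not values from this repo.

```python
import re

IFACE = "ens18"  # interface name taken from the commit message

# Hypothetical placeholders (RFC 5737 documentation range), NOT the real sending IPs.
EXPECTED_IPS = ["192.0.2.71", "192.0.2.94", "192.0.2.107"]


def bound_ipv4(ip_output: str) -> set[str]:
    """Extract the IPv4 addresses from `ip -o -4 addr show dev <iface>` output."""
    return set(re.findall(r"\binet (\d+\.\d+\.\d+\.\d+)/\d+", ip_output))


def rebind_commands(ip_output: str, expected: list[str],
                    prefix_len: int = 24, iface: str = IFACE) -> list[list[str]]:
    """Build one `ip addr add` command per expected address that is not bound.

    prefix_len=24 is an assumption; use the prefix the host actually has.
    The commands are returned rather than executed so the logic can be tested.
    """
    bound = bound_ipv4(ip_output)
    return [["ip", "addr", "add", f"{ip}/{prefix_len}", "dev", iface]
            for ip in expected if ip not in bound]


def has_bind_failure(log_text: str) -> bool:
    """Detect the Postfix bind error quoted in the commit message."""
    return "Cannot assign requested address" in log_text


if __name__ == "__main__":
    # Simulated post-reboot state: only the first address survived.
    sample = ("2: ens18    inet 192.0.2.71/24 brd 192.0.2.255 scope global ens18"
              "\\       valid_lft forever preferred_lft forever\n")
    for cmd in rebind_commands(sample, EXPECTED_IPS):
        print(" ".join(cmd))
    # A real watchdog would run these commands as root (e.g. via subprocess.run)
    # and then flush the mail queue; `postqueue -f` is one way to do that.
```

Run against the simulated state, the sketch proposes re-adding the two missing addresses and leaves the one already bound untouched, which is what makes the check safe to repeat every five minutes.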

Host already remediated live; this commits the host artifacts + docs.
2026-06-25 17:28:33 -05:00
ansible     docs+infra(deliverability): document bulk subdomain; ansible signs send.performancewest.net  2026-06-18 23:12:05 -05:00
cron        fix(monitoring): repair both dead mail-alert crons + de-noise DMARC digest  2026-06-24 06:28:50 -05:00
fail2ban    Initial commit — Performance West telecom compliance platform  2026-04-27 06:54:22 -05:00
firewall    firewall: allow ezstorehost (207.174.124.51) to reach Forgejo SSH  2026-06-10 22:45:43 -05:00
k8s         infra/k8s: shkeeper liveness+readiness probes (fix recurring crypto.performancewest.net downtime)  2026-06-09 04:57:50 -05:00
mail        infra(mail): fix warmed sending IPs dropping off ens18 on reboot (Jun 24 outage)  2026-06-25 17:28:33 -05:00
monitoring  fix(monitoring): repair both dead mail-alert crons + de-noise DMARC digest  2026-06-24 06:28:50 -05:00
mta-sts     infra: MTA-STS HTTPS vhost (cert issued, policy live)  2026-06-06 21:03:30 -05:00
network     infra(mail): remove 18 dormant snowshoe IPs from postfix + host  2026-06-23 23:45:41 -05:00
nginx       fix(nginx): unblock public API routes powering lead tools/flows (HC sales killer)  2026-06-23 15:51:30 -05:00
postfix     infra(mail): remove 18 dormant snowshoe IPs from postfix + host  2026-06-23 23:45:41 -05:00
systemd     infra: codify the email-campaign pipeline in Ansible (new mail-pipeline role)  2026-06-17 20:26:01 -05:00