new-site/infra
justin ae68edbc58 fix(monitoring): repair both dead mail-alert crons + de-noise DMARC digest
Three bugs the owner hit:
1. Per-operator reputation alert (06:10 cron, mail_reputation_monitor --alert)
   silently never ran: it redirected to /var/log/pw-mail-reputation.log but
   /var/log is root-only and that file was never pre-created, so the deploy
   user's >> redirect failed and the shell aborted before running the command.
   Repointed both mail-alert crons to the deploy-writable /opt/performancewest/logs/.
2. IP reputation alert (20:00 cron) still referenced the removed rehab pool
   (.91-.93) and used 8.8.8.8 for Spamhaus (which returns the open-resolver
   error, not a real answer). Dropped the rehab section, relabeled to the two
   live IPs (.94/.107), and switched the DNSBL check to Control D (76.76.2.0)
   which returns real Spamhaus ZEN data. (It was correctly SILENT lately
   because delivery is healthy -- silent-on-healthy is by design.)
3. DMARC daily digest was pure noise: it alerted on ANY external IP with >=20
   failing msgs, but those are legit recipient-side forwarders/security
   gateways (inkyphishfence, cloud-sec-av, Proofpoint, Mimecast, ...) that
   re-send our mail and naturally break SPF/DKIM alignment -- benign under
   p=reject. Added PTR-based forwarder detection (FORWARDER_PTR_HINTS) so the
   digest tags them [fwd] and only alerts on (a) OUR IP with <95% pass or (b) an
   UNKNOWN non-forwarder external IP with >=100 failing msgs, i.e. real spoofing.
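
The alert rule in item 3 can be sketched in Python as follows. This is a minimal illustration, not the monitor's actual code: the contents of FORWARDER_PTR_HINTS are assumed from the gateway names listed above (the real list is longer), and is_forwarder, should_alert, and the two threshold constants are hypothetical names.

```python
from typing import Optional

# Assumed subset of hints; the commit message elides the rest of the list.
FORWARDER_PTR_HINTS = ("inkyphishfence", "cloud-sec-av", "proofpoint", "mimecast")

OUR_IP_MIN_PASS_RATE = 0.95   # alert when one of our own IPs drops below this
UNKNOWN_IP_MIN_FAILS = 100    # alert when an unknown external IP reaches this


def is_forwarder(ptr: Optional[str]) -> bool:
    """Return True if the PTR name contains a known forwarder/gateway hint."""
    if not ptr:
        return False
    name = ptr.lower()
    return any(hint in name for hint in FORWARDER_PTR_HINTS)


def should_alert(is_ours: bool, ptr: Optional[str], passed: int, failed: int) -> bool:
    """Decide whether one source IP in the DMARC digest deserves an alert."""
    if is_ours:
        total = passed + failed
        # (a) our own IP: alert only on a degraded pass rate.
        return total > 0 and passed / total < OUR_IP_MIN_PASS_RATE
    if is_forwarder(ptr):
        # Recipient-side forwarders break alignment benignly; tag, never alert.
        return False
    # (b) unknown external IP: alert only on a large failure volume.
    return failed >= UNKNOWN_IP_MIN_FAILS
```

Matching on substrings of the PTR name keeps the hint list short, at the cost of trusting reverse DNS that the sending host controls; the high failure threshold for unknown IPs is what limits the damage if a spoofer picks a forwarder-like PTR.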

Verified: all 4 currently-flagged external IPs now classify as forwarder=True.
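
For item 2, the resolver problem can be illustrated with a short Python sketch. It assumes the usual DNSBL convention (reversed IPv4 octets prepended to the zone name) and Spamhaus's documented convention that answers inside 127.255.255.0/24 are error codes, such as the refusal returned to public resolvers, rather than listings. The function names and the 192.0.2.x documentation addresses are hypothetical, and the DNS lookup itself is left out.

```python
import ipaddress
from typing import Iterable

ZEN_ZONE = "zen.spamhaus.org"
# Assumption: Spamhaus reserves this range for error answers, not listings.
ERROR_RANGE = ipaddress.IPv4Network("127.255.255.0/24")


def zen_query_name(ip: str) -> str:
    """Build the DNSBL query name: reversed octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + ZEN_ZONE


def classify_zen_answer(answers: Iterable[str]) -> str:
    """Interpret the A records returned for a ZEN query.

    An empty answer (NXDOMAIN) means the IP is not listed; an answer in the
    error range means the resolver was refused and the result is unusable.
    """
    parsed = [ipaddress.IPv4Address(a) for a in answers]
    if not parsed:
        return "clean"
    if any(a in ERROR_RANGE for a in parsed):
        return "resolver-error"
    return "listed"
```

Treating the error range as its own state matters here: a check that only tests "any answer means listed" would report a false listing, and one that ignores unknown codes would report a false clean, whenever the resolver is refused.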
2026-06-24 06:28:50 -05:00
ansible docs+infra(deliverability): document bulk subdomain; ansible signs send.performancewest.net 2026-06-18 23:12:05 -05:00
cron fix(monitoring): repair both dead mail-alert crons + de-noise DMARC digest 2026-06-24 06:28:50 -05:00
fail2ban Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
firewall firewall: allow ezstorehost (207.174.124.51) to reach Forgejo SSH 2026-06-10 22:45:43 -05:00
k8s infra/k8s: shkeeper liveness+readiness probes (fix recurring crypto.performancewest.net downtime) 2026-06-09 04:57:50 -05:00
monitoring fix(monitoring): repair both dead mail-alert crons + de-noise DMARC digest 2026-06-24 06:28:50 -05:00
mta-sts infra: MTA-STS HTTPS vhost (cert issued, policy live) 2026-06-06 21:03:30 -05:00
network infra(mail): remove 18 dormant snowshoe IPs from postfix + host 2026-06-23 23:45:41 -05:00
nginx fix(nginx): unblock public API routes powering lead tools/flows (HC sales killer) 2026-06-23 15:51:30 -05:00
postfix infra(mail): remove 18 dormant snowshoe IPs from postfix + host 2026-06-23 23:45:41 -05:00
systemd infra: codify the email-campaign pipeline in Ansible (new mail-pipeline role) 2026-06-17 20:26:01 -05:00