new-site

Author	SHA1	Message	Date
justin	4276adab80	infra(mail): fix warmed sending IPs dropping off ens18 on reboot (Jun 24 outage) Unattended kernel-upgrade reboot (Jun 24 04:04) left only .71 bound because classic ifupdown applies just the first 'address' line. Postfix then failed to bind .94/.107 ('Cannot assign requested address') and silently egressed from .71 -- which is NOT in SPF (every fallback msg failed SPF) and is on RLR621 + Trend ERS-QIL. ~37h of bypassed IP-warming + a near-zero sales day. Fixes: - /etc/network/interfaces: explicit up/down ip-addr hooks for .72/.94/.107 - pw-mail-ips.service: systemd oneshot re-binds IPs + flushes queue on boot - pw-mail-ip-watchdog: */5 cron re-binds missing IPs + flushes, also catches 'Cannot assign' bind failures - runbook: full incident writeup + reboot-test lesson Host already remediated live; this commits the host artifacts + docs.	2026-06-25 17:28:33 -05:00
justin	3325259af7	fix(email): drop @TrackLink from per-subscriber CTAs (404 + collapse bug) Listmonk @TrackLink registers ONE static URL per tracked link and points every recipient's /link/<uuid> redirect at it. On per-subscriber hrefs ({{ lp_link }}, ?dot=, ?npi=, ?clia=) this is doubly broken: - the registered links.url was captured before the {{ lp_link }} token rendered, yielding /order/slug&utm_source=... (first &, no ?) -> 404 - even when valid it collapses every carrier/provider onto the first subscriber's dot/npi/clia value Real human clicks are already tracked via Umami campaign-click (bot filtered), so Listmonk link tracking here is redundant and destructive. Stripped @TrackLink from per-subscriber CTAs: - scripts/create_deficiency_source_campaigns.py (_cta, _dot_check_cta) - data/trucking_campaigns/{ucr,ifta}_*.html - data/hc_campaigns/*.html (10 templates) Static CTAs (e.g. CRTC ?code= order link) keep @TrackLink (safe). Live fix to the 10 broken registered links.url rows applied separately (first & -> ?), backup in listmonk.pw_links_dkim_fix_bak_20260622. Docs: new runbook incident section + corrected the disproven 'use @TrackLink on all CTAs' guidance in fmcsa/hc plans.	2026-06-22 17:01:39 -05:00
justin	62292b96af	docs(deliverability): document Jun 22 re-send of never-delivered DKIM-window audience Records the MAIN_EXCLUDE_OPERATORS=google override, the resend_dkim_backup_20260622 rollback table, the past-send_at HTTP 400 gotcha (use --send-hour for same-day re-runs), and the exact revert SQL. 6461-row backup; ~2999 re-sent Jun 22, rest drain on subsequent daily runs (Gmail excluded, Microsoft/Hotmail included).	2026-06-22 11:59:29 -05:00
justin	eba525f83f	docs: runbook fix #8 — telecom/transactional HTML-only plaintext fix + campaign 407 finding	2026-06-17 21:17:06 -05:00
justin	4171f48736	docs: record post-incident email hardening (7 fixes) in runbook	2026-06-17 20:30:59 -05:00
justin	4d5901921e	mail: fix OpenDKIM not signing campaign mail (Docker-injected) + codify in Ansible Root cause of the Jun 2026 deliverability collapse / 'no new sales': opendkim.conf was in single-key mode with no InternalHosts, so it signed only 127.0.0.1. Transactional/cron mail (injected locally) was signed, but ALL campaign mail -- injected over the Docker bridge from the Listmonk containers (172.18.0.5 trucking, 172.18.0.25 healthcare) -- went out UNSIGNED. Gmail/Yahoo require DKIM on bulk mail since Feb 2024, so cold campaigns were junked/blocked (~23% delivery, 550-5.7.1). Proof: 2,620 campaign msgs that day, 0 DKIM sigs. The correct table files already existed on the server but were never wired into opendkim.conf. Fix points the daemon at key.table/signing.table and sets InternalHosts/ExternalIgnoreList to trusted.hosts (which includes 172.16.0.0/12, the Docker subnet). Fixes BOTH streams: HC submission ports 2526-2528 inherit the global smtpd_milters and *@performancewest.net covers compliance@. Verified by injecting from a Docker IP through port 25 and port 2526 -- both now get 'DKIM-Signature field added'. Codified as new Ansible role 'mail' so it can't silently regress (OpenDKIM was previously not in IaC at all).	2026-06-17 19:31:19 -05:00
justin	8090fe0589	docs: ramp schedule + pw-listmonk-rampcap, fresh-IP day-0 send started	2026-06-02 12:32:42 -05:00
justin	98bcf0bbb0	docs: email deliverability + IP warmup runbook Document the self-hosted MTA layout, the May 30-31 reputation collapse, the Jun 02 remediation (retired burned IPs .91/.92/.93, swapped rotation to fresh .94/.95/.96, full Yahoo-family hold map, Listmonk sliding-window cap, paused the 13k-recipient blast scheduled for Jun 03), and the fresh-IP warmup rules + monitoring commands.	2026-06-02 12:25:33 -05:00
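
Note on 3325259af7: the "first & -> ?" repair applied to the broken links.url rows can be sketched as a small string fix. This is only an illustration of the idea, not the statement that was actually run against the database. The function name and the sample utm values are hypothetical.

```python
def repair_first_separator(url: str) -> str:
    """Turn the first '&' into '?' when a URL carries parameters but has no '?'.

    Hypothetical helper: illustrates the 'first & -> ?' repair described in the
    commit message; it is not the code or SQL used for the live fix.
    """
    if "?" in url or "&" not in url:
        # Already has a query separator, or has no parameters to repair.
        return url
    head, _, tail = url.partition("&")  # split at the first '&' only
    return f"{head}?{tail}"


# Sample values are made up; the real utm parameters are elided in the log.
broken = "/order/slug&utm_source=example&utm_medium=email"
print(repair_first_separator(broken))
```

URLs that already contain a `?` pass through unchanged, so running the repair twice should be harmless.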

8 commits
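
Note on 4d5901921e: the reason the two Listmonk container addresses fall under the 172.16.0.0/12 entry in trusted.hosts, while a loopback-only setup leaves them unsigned, can be checked with Python's standard `ipaddress` module. This is a minimal sketch of the subnet membership test only; it does not reproduce how OpenDKIM evaluates InternalHosts, and the network lists below are assumptions built from the addresses quoted in the commit message.

```python
import ipaddress

# Assumed for illustration: the pre-fix behaviour (only 127.0.0.1 signed) and a
# post-fix list containing loopback plus the Docker range named in the commit.
LOOPBACK_ONLY = [ipaddress.ip_network("127.0.0.1/32")]
WITH_DOCKER = LOOPBACK_ONLY + [ipaddress.ip_network("172.16.0.0/12")]


def is_internal(addr: str, networks) -> bool:
    """Return True if addr falls inside any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)


# Container addresses quoted in the commit message.
for addr in ("127.0.0.1", "172.18.0.5", "172.18.0.25"):
    print(addr, is_internal(addr, LOOPBACK_ONLY), is_internal(addr, WITH_DOCKER))
```

With the loopback-only list, both container addresses test as not internal, which matches the unsigned campaign mail described above; with the /12 added, both test as internal.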