diff --git a/docs/deliverability.md b/docs/deliverability.md
new file mode 100644
index 0000000..98cad2a
--- /dev/null
+++ b/docs/deliverability.md
@@ -0,0 +1,102 @@
+# Email Deliverability Runbook
+
+**Owner action items are marked 🔴 MANUAL. Everything else is already done/automated.**
+
+Last updated: 2026-06-18 (IP consolidation + monitoring-tools setup).
+
+---
+
+## TL;DR of the 2026-06-18 deliverability incident
+
+- **Symptom:** ~30% "open" rates but **0 human clicks, 0 sales** across both trucking
+  and healthcare streams.
+- **Root cause:** NOT a blocklist. Swept all 21 sending IPs against ~40 RBLs
+  (Spamhaus via authoritative NS, Barracuda, SpamCop, SORBS, UCEPROTECT L1/2/3,
+  Mailspike, SpamRATS, etc.) -> **every IP clean.** The real problem was
+  **domain reputation**: Gmail rejected ~150 msgs/day with
+  `550-5.7.1 ... very low reputation of the sending domain`. We were
+  **snowshoeing** ~3k trucking msgs/day across 12 IPs + ~1.2k healthcare across
+  3 IPs, so no single IP sent enough per-receiver volume to build reputation.
+  This rotation was a band-aid for the **broken DKIM** (fixed 2026-06-17) and the
+  May 30-31 over-volume blast.
+- **Fix applied:** consolidated to ONE IP per stream (below) so each accrues real
+  reputation now that DKIM signs correctly.
+
+---
+
+## Sending architecture (after 2026-06-18 consolidation)
+
+| Stream | IP | PTR / HELO | Path |
+|--------|----|-----------|----|
+| **Trucking** (listmonk) | **207.174.124.94** | mta05.performancewest.net | listmonk -> :25 -> `randmap:{out05:}` |
+| **Healthcare** (listmonk-hc) | **207.174.124.107** | hcmta01.performancewest.net | listmonk-hc SMTP server 1 -> :2526 -> hcout1 |
+| Yahoo/AOL trickle | 207.174.124.90 | mta01 | `yahooslow` transport (hash:transport) |
+| Transactional | 207.174.124.71 | perfwest | default `smtp_bind_address` |
+| Retired (torched May 30-31) | .91 / .92 / .93 | mta02-04 | rehab02-04 (reputation rebuild only) |
+| Dormant (re-expand later) | .95-.105, .108-.109 | mta06-17, hcmta02-03 | disabled |
+
+**To re-expand after reputation is established:** add transports back to `ALL=()`
+in `infra/postfix/pw-mta-warmup.sh` and re-enable the HC SMTP servers (ports
+2527/2528) in the `listmonk_hc` DB `settings.smtp`. Re-expand SLOWLY (one IP at a
+time, days apart) and only after Postmaster Tools shows a green/medium reputation.
+
+SPF authorizes the whole `.71/.90-.109` set already — harmless, gives flexibility.
+
+---
+
+## Monitoring tools (set these up to SEE reputation directly)
+
+These all require a provider account login + (for Google) a DNS TXT record on
+HE.net, so they can't be fully automated. Steps are pre-filled below.
+
+### 🔴 MANUAL 1 — Google Postmaster Tools (Gmail is our biggest blocker)
+Gmail's verbatim rejection names "the sending **domain**", so this is priority #1.
+1. Go to and sign in with any Google account.
+2. Click **+ (Add domain)** -> enter `performancewest.net`.
+3. Google shows a **TXT record** like `google-site-verification=XXXXXXXX`.
+4. Add it at **HE.net DNS** (dns.he.net -> performancewest.net zone):
+   - Type: `TXT`, Name: `@` (apex), Value: the full `google-site-verification=...`
+     string. (This coexists with the existing SPF TXT — multiple TXT records on
+     the apex are fine.)
+5. Wait ~15 min for propagation, then click **Verify** in Postmaster Tools.
+6. Data (Domain Reputation, IP Reputation, Spam Rate, Auth pass %, Feedback Loop)
+   starts populating in 24-48h once volume flows from the consolidated IP.
+
+### 🔴 MANUAL 2 — Microsoft SNDS + JMRP (Outlook/Hotmail/Live)
+SNDS is **IP-based** (register the sending IPs), JMRP is the complaint feedback loop.
+1. **SNDS:** -> "Request
+   access" -> register IPs: **207.174.124.94** and **207.174.124.107** (the two
+   live stream IPs; add .90 and .71 if you want full coverage). Verification goes
+   to a role address on the IP's domain — use `postmaster@performancewest.net` or
+   `abuse@performancewest.net` (ensure one of those receives mail via carrierone).
+2. **JMRP:** -> sign in with
+   a Microsoft account -> register the same IPs + a complaint-destination mailbox
+   (e.g. `fbl@performancewest.net`). Complaints then arrive as ARF emails.
+
+### 🔴 MANUAL 3 — Yahoo Complaint Feedback Loop (Yahoo/AOL + att/sbcglobal/verizon)
+1. -> sign in -> register
+   the domain `performancewest.net` (CFL is DKIM-d= based, so it covers all our
+   IPs automatically since they all sign with the same `mail._domainkey`).
+2. Set the complaint destination to `fbl@performancewest.net`.
+
+### ✅ AUTOMATABLE LATER — DMARC aggregate reports (all providers, free)
+Gmail/Yahoo/Microsoft already send daily per-IP auth+disposition XML to
+`dmarc@performancewest.net` (our DMARC record has `rua=mailto:dmarc@...`). Nobody
+parses them yet. If we add IMAP creds for that mailbox (it's on carrierone MX) we
+can build a small collector/parser worker to chart per-IP pass/fail without any
+provider login. Deferred — provider dashboards above are faster to stand up.
+
+---
+
+## Ongoing hygiene (reduce reputation damage)
+
+- **Dead-address scrub:** ~110 genuine `5.1.1 user unknown` bounces/day. listmonk
+  already blocklists hard bounces after 1 (`bounce.actions hard->blocklist`), so
+  these self-clean, but pre-scrubbing the dirtiest segments before send avoids the
+  reputation hit. See `data/` segment exports.
+- **Don't re-expand IPs** until Postmaster Tools shows recovered reputation.
+- **Volume discipline:** keep the global 200/hr sliding window until reputation is
+  green; concentrated low volume on one warm IP beats bursts.
+- **Watch the rejection mix:** `5.7.1 reputation/spam/blocked` should fall over the
+  next 1-2 weeks as the single-IP reputation builds. Track via:
+  `ssh ... 'sudo grep status=bounced /var/log/mail.log | grep -c 5.7.1'`
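
Note on the "Watch the rejection mix" bullet: the `grep -c 5.7.1` pipeline gives one total, which makes it hard to tell Gmail reputation rejections apart from the `5.1.1 user unknown` bounces mentioned under the dead-address scrub. The sketch below is one possible way to break the count down by recipient domain and DSN code. It is not part of this change; it assumes the usual Postfix delivery-line fields (`to=<...>`, `dsn=`, `status=`), and the function name `tally_bounces` and the sample lines are hypothetical, made up for illustration.

```python
import re
from collections import Counter

# Assumes standard Postfix delivery log lines, which carry
# to=<addr>, dsn=X.Y.Z and status=... fields on one line.
BOUNCE_RE = re.compile(
    r"to=<[^>@]+@(?P<domain>[^>]+)>.*?\bdsn=(?P<dsn>\d\.\d+\.\d+),\s*status=bounced"
)


def tally_bounces(lines):
    """Count bounced deliveries per (recipient domain, DSN code)."""
    counts = Counter()
    for line in lines:
        match = BOUNCE_RE.search(line)
        if match:
            counts[(match["domain"].lower(), match["dsn"])] += 1
    return counts


# Hypothetical sample lines, shaped like Postfix smtp log entries.
SAMPLE = [
    "Jun 18 10:00:01 mta05 postfix/smtp[1111]: A1B2C3: to=<a@gmail.com>, "
    "relay=mx.example.net[192.0.2.10]:25, delay=1.2, dsn=5.7.1, "
    "status=bounced (550-5.7.1 very low reputation of the sending domain)",
    "Jun 18 10:00:02 mta05 postfix/smtp[1112]: A1B2C4: to=<b@gmail.com>, "
    "relay=mx.example.net[192.0.2.10]:25, delay=1.1, dsn=5.7.1, "
    "status=bounced (550-5.7.1 very low reputation of the sending domain)",
    "Jun 18 10:00:03 mta05 postfix/smtp[1113]: A1B2C5: to=<c@example.org>, "
    "relay=mx.example.org[192.0.2.20]:25, delay=0.9, dsn=5.1.1, "
    "status=bounced (550 5.1.1 user unknown)",
    "Jun 18 10:00:04 mta05 postfix/smtp[1114]: A1B2C6: to=<d@example.org>, "
    "relay=mx.example.org[192.0.2.20]:25, delay=0.8, dsn=2.0.0, "
    "status=sent (250 ok)",
]

if __name__ == "__main__":
    for (domain, dsn), n in tally_bounces(SAMPLE).most_common():
        print(f"{n:5d}  {dsn}  {domain}")
```

To run it against real data, pass an open file handle for `/var/log/mail.log` instead of `SAMPLE`. Grouping by domain keeps the Gmail `5.7.1` trend visible on its own, which is the number the runbook expects to fall over the next 1-2 weeks.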