new-site/docs/deliverability.md
justin 5253f16675 docs: deliverability runbook (incident, IP consolidation, monitoring setup)
Documents the 2026-06-18 reputation incident (snowshoe -> Gmail domain-rep
blocks, RBLs all clean), the single-IP-per-stream consolidation, and
fill-in-the-blanks setup steps for Google Postmaster Tools, Microsoft SNDS/JMRP,
and Yahoo CFL (all require owner account login + HE.net DNS). Plus ongoing
hygiene + how to re-expand IPs once reputation recovers.
2026-06-18 17:46:28 -05:00


Email Deliverability Runbook

Owner action items are marked 🔴 MANUAL. Everything else is already done/automated.

Last updated: 2026-06-18 (IP consolidation + monitoring-tools setup).


TL;DR of the 2026-06-18 deliverability incident

  • Symptom: ~30% "open" rates but 0 human clicks, 0 sales across both trucking and healthcare streams.
  • Root cause: NOT a blocklist. We swept all 21 sending IPs against ~40 RBLs (Spamhaus via authoritative NS, Barracuda, SpamCop, SORBS, UCEPROTECT L1/2/3, Mailspike, SpamRATS, etc.) and every IP came back clean. The real problem was domain reputation: Gmail rejected ~150 msgs/day with "550-5.7.1 ... very low reputation of the sending domain". We were snowshoeing ~3k trucking msgs/day across 12 IPs (~250/day per IP) plus ~1.2k healthcare msgs/day across 3 IPs (~400/day per IP), so no single IP sent enough per-receiver volume to build reputation. The rotation itself was a band-aid for the broken DKIM (fixed 2026-06-17) and the May 30-31 over-volume blast.
  • Fix applied: consolidated to ONE IP per stream (below) so each accrues real reputation now that DKIM signs correctly.
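
For reference, an RBL sweep like the one described above boils down to one DNS lookup per IP per list: reverse the IPv4 octets, append the list's zone, and see whether an A record comes back. The following is a minimal sketch, not the script that was actually run. The three zones are only examples, and the function names are made up here.

```python
import ipaddress
import socket

# Illustrative zones only; the real sweep covered ~40 lists.
EXAMPLE_ZONES = ["zen.spamhaus.org", "b.barracudacentral.org", "bl.spamcop.net"]


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets + the list's zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the list answers with an A record for this IP."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        return False  # NXDOMAIN or lookup failure: treated as "not listed"
    # Spamhaus uses 127.255.255.x answers to signal a query error
    # (for example a blocked public resolver), not a real listing.
    return not answer.startswith("127.255.255.")


if __name__ == "__main__":
    for zone in EXAMPLE_ZONES:
        print(zone, is_listed("207.174.124.94", zone))
```

Run it from a resolver the lists will actually answer. Spamhaus in particular refuses queries from large public resolvers, which is why the sweep above queried its authoritative NS.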

Sending architecture (after 2026-06-18 consolidation)

| Stream | IP | PTR / HELO | Path |
|---|---|---|---|
| Trucking (listmonk) | 207.174.124.94 | mta05.performancewest.net | listmonk -> :25 -> randmap:{out05:} |
| Healthcare (listmonk-hc) | 207.174.124.107 | hcmta01.performancewest.net | listmonk-hc SMTP server 1 -> :2526 -> hcout1 |
| Yahoo/AOL trickle | 207.174.124.90 | mta01 | yahooslow transport (hash:transport) |
| Transactional | 207.174.124.71 | perfwest | default smtp_bind_address |
| Retired (torched May 30-31) | .91 / .92 / .93 | mta02-04 | rehab02-04 (reputation rebuild only) |
| Dormant (re-expand later) | .95-.105, .108-.109 | mta06-17, hcmta02-03 | disabled |

To re-expand after reputation is established: add transports back to ALL=() in infra/postfix/pw-mta-warmup.sh and re-enable the HC SMTP servers (ports 2527/2528) in the listmonk_hc DB settings.smtp. Re-expand SLOWLY (one IP at a time, days apart) and only after Postmaster Tools shows a green/medium reputation.
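
To make "one IP at a time, days apart" concrete, here is a small sketch that turns a list of dormant transports into a dated plan. The transport names (out06 and so on), the start date, and the 4-day gap are all placeholders, not values from pw-mta-warmup.sh. Pick the real spacing from what Postmaster Tools shows after each step.

```python
from datetime import date, timedelta


def reexpand_schedule(transports, start, gap_days):
    """Return (date, transport) pairs, enabling one transport every gap_days."""
    if gap_days < 1:
        raise ValueError("gap_days must be at least 1 (never enable two IPs on one day)")
    return [(start + timedelta(days=i * gap_days), name)
            for i, name in enumerate(transports)]


if __name__ == "__main__":
    # Hypothetical names and dates, for illustration only.
    for day, name in reexpand_schedule(["out06", "out07", "out08"], date(2026, 7, 1), 4):
        print(day.isoformat(), "enable", name)
```

Treat the output as a ceiling, not a commitment: if reputation dips after a step, hold the next one.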

SPF authorizes the whole .71/.90-.109 set already — harmless, gives flexibility.
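
A quick way to confirm that claim before re-enabling an IP is to check every address against the ip4: mechanisms in the published record. This is a sketch under assumptions: the record string below is invented for illustration (the real one lives in the HE.net zone), and only ip4: terms are evaluated, so include:, a, and mx mechanisms are ignored.

```python
import ipaddress


def ip4_networks(spf_record: str):
    """Extract the networks named by ip4: mechanisms in an SPF record."""
    nets = []
    for term in spf_record.split():
        term = term.lstrip("+")  # "+ip4:" is the same as "ip4:"
        if term.lower().startswith("ip4:"):
            nets.append(ipaddress.ip_network(term[4:], strict=False))
    return nets


def uncovered(ips, spf_record: str):
    """Return the IPs that no ip4: mechanism in the record covers."""
    nets = ip4_networks(spf_record)
    return [ip for ip in ips
            if not any(ipaddress.ip_address(ip) in net for net in nets)]


# The .71 / .90-.109 set from the table above.
SENDING_IPS = ["207.174.124.71"] + [f"207.174.124.{n}" for n in range(90, 110)]

# Hypothetical record, NOT copied from DNS.
EXAMPLE_SPF = "v=spf1 ip4:207.174.124.71 ip4:207.174.124.88/29 ip4:207.174.124.96/28 ~all"

if __name__ == "__main__":
    print("uncovered:", uncovered(SENDING_IPS, EXAMPLE_SPF))
```

An empty list means every IP is authorized by an ip4: term. Anything printed needs an SPF update before that IP sends.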


Monitoring tools (set these up to SEE reputation directly)

These all require a provider account login + (for Google) a DNS TXT record on HE.net, so they can't be fully automated. Steps are pre-filled below.

🔴 MANUAL 1 — Google Postmaster Tools (Gmail is our biggest blocker)

Gmail's verbatim rejection names "the sending domain", so this is priority #1.

  1. Go to https://postmaster.google.com and sign in with any Google account.
  2. Click + (Add domain) -> enter performancewest.net.
  3. Google shows a TXT record like google-site-verification=XXXXXXXX.
  4. Add it at HE.net DNS (dns.he.net -> performancewest.net zone):
    • Type: TXT, Name: @ (apex), Value: the full google-site-verification=... string. (This coexists with the existing SPF TXT — multiple TXT records on the apex are fine.)
  5. Wait ~15 min for propagation, then click Verify in Postmaster Tools.
  6. Data (Domain Reputation, IP Reputation, Spam Rate, Auth pass %, Feedback Loop) starts populating in 24-48h once volume flows from the consolidated IP.
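
Before clicking Verify, it is worth checking that the apex now carries the Google token and still has exactly one SPF record (two "v=spf1" TXT records make SPF evaluation fail outright). Here is a minimal sketch that takes the TXT strings as input, for example pasted from `dig +short TXT performancewest.net` with the quotes stripped. The function name and the sample values are placeholders.

```python
def check_apex_txt(txt_records):
    """Summarize SPF and Google verification state from apex TXT strings."""
    def is_spf(value):
        lowered = value.strip().lower()
        return lowered == "v=spf1" or lowered.startswith("v=spf1 ")

    spf_count = sum(1 for r in txt_records if is_spf(r))
    has_google = any(r.strip().startswith("google-site-verification=")
                     for r in txt_records)
    return {
        "spf_records": spf_count,
        "google_verification": has_google,
        "ok": spf_count == 1 and has_google,
    }


if __name__ == "__main__":
    # Placeholder values; substitute the real TXT strings from DNS.
    sample = [
        "v=spf1 ip4:207.174.124.71 ~all",
        "google-site-verification=XXXXXXXX",
    ]
    print(check_apex_txt(sample))
```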

🔴 MANUAL 2 — Microsoft SNDS + JMRP (Outlook/Hotmail/Live)

SNDS is IP-based (register the sending IPs), JMRP is the complaint feedback loop.

  1. SNDS: https://sendersupport.olc.protection.outlook.com/snds/ -> "Request access" -> register IPs: 207.174.124.94 and 207.174.124.107 (the two live stream IPs; add .90 and .71 if you want full coverage). Verification goes to a role address on the IP's domain — use postmaster@performancewest.net or abuse@performancewest.net (ensure one of those receives mail via carrierone).
  2. JMRP: https://sendersupport.olc.protection.outlook.com/pm/ -> sign in with a Microsoft account -> register the same IPs + a complaint-destination mailbox (e.g. fbl@performancewest.net). Complaints then arrive as ARF emails.
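
Once complaints land in the fbl@ mailbox, they can be read with the standard library alone. The sketch below assumes the standard ARF layout (multipart/report with a message/feedback-report part and a copy of the original headers). The sample message is invented for illustration, and real reports vary by provider and often redact the recipient.

```python
import email

# Made-up ARF report for illustration; not a real complaint.
SAMPLE_ARF = """\
From: fbl-noreply@example.com
To: fbl@performancewest.net
Subject: Complaint report
MIME-Version: 1.0
Content-Type: multipart/report; report-type=feedback-report; boundary="b1"

--b1
Content-Type: text/plain

This is an abuse report for a message from 207.174.124.94.
--b1
Content-Type: message/feedback-report

Feedback-Type: abuse
User-Agent: ExampleFBL/1.0
Version: 1
Source-IP: 207.174.124.94

--b1
Content-Type: text/rfc822-headers

From: sender@example.com
To: recipient@example.com
Subject: Original subject

--b1--
"""


def parse_arf(raw: str) -> dict:
    """Pull the complaint type, source IP, and original recipient from an ARF email."""
    result = {}
    for part in email.message_from_string(raw).walk():
        ctype = part.get_content_type()
        payload = part.get_payload()
        if ctype == "message/feedback-report":
            # The parser exposes the report fields as a nested header-only message.
            fields = payload[0] if isinstance(payload, list) else email.message_from_string(payload)
            result["feedback_type"] = fields.get("Feedback-Type")
            result["source_ip"] = fields.get("Source-IP")
        elif ctype == "text/rfc822-headers":
            result["complainant"] = email.message_from_string(payload).get("To")
        elif ctype == "message/rfc822" and isinstance(payload, list):
            result["complainant"] = payload[0].get("To")
    return result


if __name__ == "__main__":
    print(parse_arf(SAMPLE_ARF))
```

The complainant address is what should be unsubscribed or blocklisted in listmonk. The source IP shows which stream drew the complaint.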

🔴 MANUAL 3 — Yahoo Complaint Feedback Loop (Yahoo/AOL + att/sbcglobal/verizon)

  1. https://senders.yahooinc.com/complaint-feedback-loop/ -> sign in -> register the domain performancewest.net (CFL is DKIM-d= based, so it covers all our IPs automatically since they all sign with the same mail._domainkey).
  2. Set the complaint destination to fbl@performancewest.net.

AUTOMATABLE LATER — DMARC aggregate reports (all providers, free)

Gmail/Yahoo/Microsoft already send daily per-IP auth+disposition XML to dmarc@performancewest.net (our DMARC record has rua=mailto:dmarc@...). Nobody parses them yet. If we add IMAP creds for that mailbox (it's on carrierone MX) we can build a small collector/parser worker to chart per-IP pass/fail without any provider login. Deferred — provider dashboards above are faster to stand up.
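
The parser half of that worker is small. Here is a sketch of the per-IP pass/fail tally, assuming the report XML has already been pulled from the mailbox and decompressed (reports usually arrive as zip or gzip attachments) and that the file uses no XML namespace. The sample report and its counts are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Made-up aggregate report fragment; counts are illustrative, not real data.
SAMPLE_REPORT = """
<feedback>
  <record><row>
    <source_ip>207.174.124.94</source_ip><count>120</count>
    <policy_evaluated><disposition>none</disposition><dkim>pass</dkim><spf>pass</spf></policy_evaluated>
  </row></record>
  <record><row>
    <source_ip>207.174.124.94</source_ip><count>3</count>
    <policy_evaluated><disposition>none</disposition><dkim>fail</dkim><spf>fail</spf></policy_evaluated>
  </row></record>
  <record><row>
    <source_ip>207.174.124.107</source_ip><count>40</count>
    <policy_evaluated><disposition>none</disposition><dkim>pass</dkim><spf>fail</spf></policy_evaluated>
  </row></record>
</feedback>
"""


def summarize(xml_text: str) -> dict:
    """Tally DMARC pass/fail message counts per source IP."""
    totals = defaultdict(lambda: {"pass": 0, "fail": 0})
    for record in ET.fromstring(xml_text).iter("record"):
        row = record.find("row")
        ip = row.findtext("source_ip")
        count = int(row.findtext("count", default="0"))
        evaluated = row.find("policy_evaluated")
        # DMARC passes when either aligned DKIM or aligned SPF passes.
        passed = "pass" in (evaluated.findtext("dkim"), evaluated.findtext("spf"))
        totals[ip]["pass" if passed else "fail"] += count
    return dict(totals)


if __name__ == "__main__":
    for ip, counts in summarize(SAMPLE_REPORT).items():
        print(ip, counts)
```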


Ongoing hygiene (reduce reputation damage)

  • Dead-address scrub: ~110 genuine 5.1.1 user unknown bounces/day. listmonk already blocklists an address after its first hard bounce (bounce.actions hard->blocklist), so these self-clean, but pre-scrubbing the dirtiest segments before sending avoids the reputation hit in the first place. See data/ segment exports.
  • Don't re-expand IPs until Postmaster Tools shows recovered reputation.
  • Volume discipline: keep the global 200/hr sliding window until reputation is green; concentrated low volume on one warm IP beats bursts.
  • Watch the rejection mix: 5.7.1 rejections (reputation/spam/blocked) should fall over the next 1-2 weeks as the single-IP reputation builds. Track via: ssh ... 'sudo grep status=bounced /var/log/mail.log | grep -cF 5.7.1'
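
To see the whole rejection mix rather than one code, the same log can be tallied by DSN. This is a minimal sketch that assumes the stock Postfix delivery log format, where each attempt logs "dsn=X.Y.Z, status=bounced". The sample lines are invented and shortened for illustration.

```python
import re
import sys
from collections import Counter

# Matches the DSN code on bounced deliveries in a Postfix log line.
BOUNCE_RE = re.compile(r"dsn=(\d\.\d{1,3}\.\d{1,3}), status=bounced")


def bounce_mix(lines) -> Counter:
    """Count bounced deliveries per DSN code (e.g. 5.7.1, 5.1.1)."""
    counts = Counter()
    for line in lines:
        match = BOUNCE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up log lines, shortened for illustration.
SAMPLE_LINES = [
    "postfix/smtp[101]: A1: to=<a@example.com>, dsn=5.7.1, status=bounced (host said: 550-5.7.1 ...)",
    "postfix/smtp[102]: A2: to=<b@example.com>, dsn=5.1.1, status=bounced (host said: 550 5.1.1 user unknown)",
    "postfix/smtp[103]: A3: to=<c@example.com>, dsn=2.0.0, status=sent (250 2.0.0 OK)",
    "postfix/smtp[104]: A4: to=<d@example.com>, dsn=5.7.1, status=bounced (host said: 550-5.7.1 ...)",
]

if __name__ == "__main__":
    # Usage: pipe the log in, e.g. `sudo cat /var/log/mail.log | python3 bounce_mix.py`
    source = sys.stdin if not sys.stdin.isatty() else SAMPLE_LINES
    for code, n in bounce_mix(source).most_common():
        print(code, n)
```

Run daily, this shows whether 5.7.1 is shrinking relative to 5.1.1, which is the signal that reputation is recovering rather than volume simply dropping.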