# Email Deliverability Runbook

**Owner action items are marked 🔴 MANUAL. Everything else is already done/automated.**

Last updated: 2026-06-18 (IP consolidation + monitoring-tools setup).

---

## TL;DR of the 2026-06-18 deliverability incident

- **Symptom:** ~30% "open" rates but **0 human clicks, 0 sales** across both trucking
  and healthcare streams.
- **Root cause:** NOT a blocklist. Swept all 21 sending IPs against ~40 RBLs
  (Spamhaus via authoritative NS, Barracuda, SpamCop, SORBS, UCEPROTECT L1/2/3,
  Mailspike, SpamRATS, etc.) -> **every IP clean.** The real problem was
  **domain reputation**: Gmail rejected ~150 msgs/day with
  `550-5.7.1 ... very low reputation of the sending domain`. We were
  **snowshoeing** ~3k trucking msgs/day across 12 IPs + ~1.2k healthcare across
  3 IPs, so no single IP sent enough per-receiver volume to build reputation.
  This rotation was a band-aid for the **broken DKIM** (fixed 2026-06-17) and the
  May 30-31 over-volume blast.
- **Fix applied:** consolidated to ONE IP per stream (below) so each accrues real
  reputation now that DKIM signs correctly.

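
The RBL sweep from the root-cause bullet can be repeated with plain DNS lookups. The sketch below is illustrative only: it assumes `dig` is installed, the function names are hypothetical, and it checks just three of the ~40 lists for the two live stream IPs. A DNSBL is queried by reversing the IPv4 octets and appending the list's zone; a `127.0.0.x` answer means listed, while Spamhaus answers in `127.255.255.x` when it refuses a query (typically from a public resolver), which is not a listing. Run `dnsbl_sweep` to print one line per IP and zone.

```sh
# Build the DNSBL query name: reversed IPv4 octets followed by the zone.
dnsbl_query_name() {
    printf '%s\n' "$1" | awk -F. -v zone="$2" '{ print $4 "." $3 "." $2 "." $1 "." zone }'
}

# Exit status: 0 = listed, 1 = not listed, 2 = lookup refused (error code).
dnsbl_listed() {
    answer=$(dig +short A "$(dnsbl_query_name "$1" "$2")" | head -n 1)
    case $answer in
        127.255.255.*) return 2 ;;  # Spamhaus error code, not a listing
        127.*)         return 0 ;;
        *)             return 1 ;;
    esac
}

# Sweep the two live stream IPs against a few example zones.
dnsbl_sweep() {
    for ip in 207.174.124.94 207.174.124.107; do
        for zone in zen.spamhaus.org b.barracudacentral.org bl.spamcop.net; do
            rc=0
            dnsbl_listed "$ip" "$zone" || rc=$?
            case $rc in
                0) echo "LISTED $ip $zone" ;;
                1) echo "clean $ip $zone" ;;
                *) echo "error $ip $zone (lookup refused, retry via another resolver)" ;;
            esac
        done
    done
}
```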
---

## Sending architecture (after 2026-06-18 consolidation)

| Stream | IP | PTR / HELO | Path |
|--------|----|------------|------|
| **Trucking** (listmonk) | **207.174.124.94** | mta05.performancewest.net | listmonk -> :25 -> `randmap:{out05:}` |
| **Healthcare** (listmonk-hc) | **207.174.124.107** | hcmta01.performancewest.net | listmonk-hc SMTP server 1 -> :2526 -> hcout1 |
| Yahoo/AOL trickle | 207.174.124.90 | mta01 | `yahooslow` transport (hash:transport) |
| Transactional | 207.174.124.71 | perfwest | default `smtp_bind_address` |
| Retired (torched May 30-31) | .91 / .92 / .93 | mta02-04 | rehab02-04 (reputation rebuild only) |
| Dormant (re-expand later) | .95-.105, .108-.109 | mta06-17, hcmta02-03 | disabled |

**To re-expand after reputation is established:** add transports back to `ALL=()`
in `infra/postfix/pw-mta-warmup.sh` and re-enable the HC SMTP servers (ports
2527/2528) in the `listmonk_hc` DB `settings.smtp`. Re-expand SLOWLY (one IP at a
time, days apart) and only after Postmaster Tools shows a green/medium reputation.

SPF already authorizes the whole `.71/.90-.109` set. This is harmless and keeps
re-expansion flexible.

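
The PTR / HELO column in the table above can be spot-checked against live reverse DNS. This is a minimal sketch, assuming `dig` is installed; `check_ptr` and `check_live_ptrs` are hypothetical helper names, and only the two live stream IPs (whose full hostnames appear in the table) are checked. Run `check_live_ptrs` to print one line per IP.

```sh
# Compare an IP's PTR record with the hostname the table expects.
check_ptr() {
    ip=$1
    expected=$2
    # dig -x prints the PTR target with a trailing dot; strip it before comparing.
    actual=$(dig +short -x "$ip" | head -n 1 | sed 's/\.$//')
    if [ "$actual" = "$expected" ]; then
        echo "ok $ip -> $actual"
    else
        echo "MISMATCH $ip -> ${actual:-none} (expected $expected)"
        return 1
    fi
}

# Check both live stream IPs; the exit status reflects the last check only.
check_live_ptrs() {
    check_ptr 207.174.124.94 mta05.performancewest.net
    check_ptr 207.174.124.107 hcmta01.performancewest.net
}
```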

---

## Monitoring tools (set these up to SEE reputation directly)

These all require a provider account login, so they can't be fully automated. The
one DNS step (Google's TXT verification record) can be scripted through Hestia.
Steps are pre-filled below.

### 🔴 MANUAL 1: Google Postmaster Tools (Gmail is our biggest blocker)
Gmail's verbatim rejection names "the sending **domain**", so this is priority #1.

**DNS is fully automatable:** Hestia (cp.carrierone.com) is the DNS master and the
HE.net servers are slaves. Add records as root: `ssh -p 22022 root@cp.carrierone.com`
then `v-add-dns-record justin performancewest.net "@" TXT '"<value>"'`
(the single quotes keep the literal double quotes around the TXT value). The zone
owner is the `justin` Hestia user. The zone rebuild takes ~30s, and the slaves
usually sync within a minute via NOTIFY, with the 2h SOA refresh as the fallback.

Status 2026-06-18: **TXT added + verified live** (record id 14464,
`google-site-verification=p8s3RaN5wi81350wToMpdPMho5Gcel4RGT1Q1SXj7vg`),
resolving on 8.8.8.8/1.1.1.1/9.9.9.9 and 4/5 HE.net slaves. Owner just needs to
click **Verify** in the Postmaster console once. Data populates 24-48h after
volume flows from the consolidated IP.

To set up from scratch next time: postmaster.google.com -> +Add domain ->
performancewest.net -> copy the `google-site-verification=...` token -> add via
the Hestia command above -> Verify.

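
The propagation check from the status note (public resolvers plus the HE.net slaves) can be repeated with a small loop. This is a sketch under assumptions: it needs `dig`, the function names are hypothetical, and the list of servers to query is passed in by the caller rather than hard-coded.

```sh
# Return 0 if server $1 serves a TXT record on domain $2 that contains token $3.
txt_visible() {
    dig +short TXT "$2" "@$1" | grep -qF -- "$3"
}

# Usage: check_txt_propagation <domain> <token> <server>...
check_txt_propagation() {
    domain=$1
    token=$2
    shift 2
    for server in "$@"; do
        if txt_visible "$server" "$domain" "$token"; then
            echo "ok $server"
        else
            echo "MISSING $server"
        fi
    done
}
```

For example, `check_txt_propagation performancewest.net google-site-verification= 8.8.8.8 1.1.1.1 9.9.9.9` checks the three public resolvers; append the HE.net slave hostnames to cover them too.
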
### 🔴 MANUAL 2: Microsoft SNDS + JMRP (Outlook/Hotmail/Live)
SNDS is **IP-based** (register the sending IPs); JMRP is the complaint feedback loop.

1. **SNDS:** <https://sendersupport.olc.protection.outlook.com/snds/> -> "Request
   access" -> register IPs: **207.174.124.94** and **207.174.124.107** (the two
   live stream IPs; add .90 and .71 if you want full coverage). Verification goes
   to a role address on the IP's domain: use `postmaster@performancewest.net` or
   `abuse@performancewest.net` (ensure one of those receives mail via carrierone).
2. **JMRP:** <https://sendersupport.olc.protection.outlook.com/pm/> -> sign in with
   a Microsoft account -> register the same IPs + a complaint-destination mailbox
   (e.g. `fbl@performancewest.net`). Complaints then arrive as ARF emails.

### 🔴 MANUAL 3: Yahoo Complaint Feedback Loop (Yahoo/AOL + att/sbcglobal/verizon)
1. <https://senders.yahooinc.com/complaint-feedback-loop/> -> sign in -> register
   the domain `performancewest.net` (CFL is keyed on the DKIM `d=` domain, so it
   covers all our IPs automatically since they all sign with the same
   `mail._domainkey` key).
2. Set the complaint destination to `fbl@performancewest.net`.

### ✅ AUTOMATABLE LATER: DMARC aggregate reports (all providers, free)
Gmail/Yahoo/Microsoft already send daily per-IP auth+disposition XML to
`dmarc@performancewest.net` (our DMARC record has `rua=mailto:dmarc@...`). Nobody
parses them yet. If we add IMAP creds for that mailbox (it's on carrierone MX) we
can build a small collector/parser worker to chart per-IP pass/fail without any
provider login. Deferred for now: the provider dashboards above are faster to stand up.

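
As a rough idea of what the parser half of that worker could look like, the sketch below summarizes aggregate reports that have already been saved and decompressed to plain XML files (fetching them over IMAP is not covered). It is an assumption-heavy illustration: the function names are hypothetical, and it uses text matching on the standard `<record>` / `<row>` / `<policy_evaluated>` elements rather than a real XML parser, so it is only suitable for a quick look. `dmarc_rows report.xml` prints `source_ip count dkim spf` per record, and `dmarc_totals *.xml` prints `source_ip total_messages dmarc_pass_messages` per IP, counting a message as passing when either aligned DKIM or aligned SPF passed.

```sh
# Print "source_ip count dkim spf" for every <record> in one aggregate report.
dmarc_rows() {
    # Join the file into one line, then start a new line after each </record>.
    tr -d '\r\n' < "$1" |
    awk '{ gsub(/<\/record>/, "&\n"); print }' |
    awk '
        # Text that follows the first <tag> on the line, or "-" if the tag is absent.
        function first(tag,    start) {
            start = "<" tag ">"
            if (match($0, start "[^<]*"))
                return substr($0, RSTART + length(start), RLENGTH - length(start))
            return "-"
        }
        /<source_ip>/ { print first("source_ip"), first("count"), first("dkim"), first("spf") }
    '
}

# Sum message counts per source IP across one or more report files.
dmarc_totals() {
    for report in "$@"; do
        dmarc_rows "$report"
    done |
    awk '
        { total[$1] += $2; if ($3 == "pass" || $4 == "pass") pass[$1] += $2 }
        END { for (ip in total) print ip, total[ip], pass[ip] + 0 }
    ' | sort
}
```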

---

## Ongoing hygiene (reduce reputation damage)

- **Dead-address scrub:** ~110 genuine `5.1.1 user unknown` bounces/day. listmonk
  already blocklists an address after its first hard bounce
  (`bounce.actions hard->blocklist`), so these self-clean, but pre-scrubbing the
  dirtiest segments before send avoids the reputation hit. See `data/` segment exports.
- **Don't re-expand IPs** until Postmaster Tools shows recovered reputation.
- **Volume discipline:** keep the global 200/hr sliding window until reputation is
  green; concentrated low volume on one warm IP beats bursts.
- **Watch the rejection mix:** `5.7.1 reputation/spam/blocked` should fall over the
  next 1-2 weeks as the single-IP reputation builds. Track via:
  `ssh ... 'sudo grep status=bounced /var/log/mail.log | grep -cF 5.7.1'`

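
The single count above says how many `5.7.1` bounces there were, not who sent them. A per-code, per-receiver breakdown makes the rejection mix easier to watch. The sketch below assumes the usual Postfix `smtp` delivery log format (`to=<...>, ..., dsn=X.Y.Z, status=bounced ...`); `bounce_mix` is a hypothetical helper name.

```sh
# Read Postfix log lines on stdin; print "count dsn recipient-domain", busiest first.
bounce_mix() {
    awk '
        /status=bounced/ {
            dsn = "-"; dom = "-"
            if (match($0, /dsn=[0-9.]+/))
                dsn = substr($0, RSTART + 4, RLENGTH - 4)
            if (match($0, /to=<[^>]*@[^>]*>/)) {
                dom = substr($0, RSTART, RLENGTH)
                sub(/^.*@/, "", dom)   # drop everything up to the @
                sub(/>$/, "", dom)     # drop the closing bracket
            }
            n[dsn " " tolower(dom)]++
        }
        END { for (k in n) print n[k], k }
    ' | sort -rn
}
```

Usage follows the same pattern as the tracking command: `ssh ... 'sudo cat /var/log/mail.log' | bounce_mix | head`.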