Consolidate the outbound mail footprint to match the SPF intent (already
trimmed to .94/.107 on 2026-06-19). A 20-IP sending footprint reads as
snowshoe spam to receivers and was contributing to domain-reputation
throttling (Microsoft 451 4.7.500, Gmail low-reputation).
Removed from /etc/postfix/master.cf: transports yahooslow, out02-04,
out06-20, rehab02-04, HC submission ports 2527/2528, hcout2/hcout3.
Removed from /etc/network/interfaces (+ live ip addr del): host bindings
.90-.93, .95-.106, .108-.109. Kept: .94 (trucking/out05), .107 (HC/hcout1),
.71/.72 (infra).
Verified live: postfix check OK, both streams still status=sent post-change,
SSH session on .71 unaffected, transport_maps still routes via out05.
Snapshots: infra/postfix/live-snapshots/master.cf, infra/network/interfaces.
Live backups on server: /root/{master.cf,interfaces}.bak_snowshoe_*.
Email Deliverability Runbook
Owner action items are marked 🔴 MANUAL. Everything else is already done/automated.
Last updated: 2026-06-19 (bulk subdomain + SPF trim + Microsoft/audience analysis).
TL;DR of the 2026-06-18/19 deliverability incident
- Symptom: ~30% "open" rates but 0 human clicks, 0 sales across both trucking and healthcare streams.
- Root cause: NOT a blocklist, NOT the IPs. Proven by a controlled A/B test (2026-06-19): from the same mail server / same IPs, a message From justin@carrierone.com landed in the Inbox while From justin@performancewest.net went to Junk. The variable is the From domain's reputation. carrierone.com (reg. 2006, years of steady low-volume mail, tight 2-IP SPF) is trusted; performancewest.net (only started bulk in ~May 2026, broken DKIM until 2026-06-17, 21-IP snowshoe SPF, May 30-31 over-volume blast) is cold/damaged.
- Where the audience actually is (24h receiver mix): ~85% Microsoft (M365/Outlook/Hotmail), ~14% Google, <1% Yahoo. Our list is B2B, so Microsoft is the game, not Gmail. Microsoft is NOT reputation-blocking us (only ~1.6% 5.7.x/S3150 rejects; it accepts ~2,138 msgs/24h), but acceptance != inbox, so the engagement problem there is likely Junk-foldering, same domain-reputation cause. Gmail rejects ~95% of its (smaller) slice on 550-5.7.1 ... very low reputation of the sending domain. The single biggest bounce bucket is actually list hygiene: ~1,012/24h Microsoft 451 4.4.4 no mail-enabled subscriptions (dead tenant domains) + dead recipients.
- Fixes applied (2026-06-18/19):
  - Consolidated to ONE IP per stream (snowshoe was a band-aid for broken DKIM).
  - Dedicated bulk subdomain send.performancewest.net so bulk reputation is isolated from the root domain (which stays clean for transactional mail).
  - Trimmed root SPF from 21 IPs to the real 3 (the bloated record was itself a snowshoe signal).
  - Disabled the pointless pw-ip-rehab cron (we have no IP reputation problem).
Bulk subdomain: send.performancewest.net (2026-06-19)
Why: isolate bulk/cold-campaign sending reputation from the root domain. The root domain carries transactional/verification/receipt mail (via co.carrierone.com relay + the .71 default egress) and must stay clean; cold campaigns are inherently reputation-risky. Industry-standard (SendGrid/Mailchimp/etc.) split.
Customer experience is unchanged: From is the subdomain, but Reply-To stays
info@performancewest.net, so replies land in the real inbox and look normal.
| Piece | Value |
|---|---|
| Trucking From | Performance West <noreply@send.performancewest.net> |
| Healthcare From | Performance West Compliance <compliance@send.performancewest.net> |
| Reply-To (both) | info@performancewest.net |
| DKIM selector | send (send._domainkey.send.performancewest.net), 2048-bit |
| SPF | v=spf1 ip4:207.174.124.94 ip4:207.174.124.107 -all |
| DMARC | inherits root p=reject (explicit _dmarc.send also published) |
| MX / Return-Path | co.carrierone.com (bounces) |
| Egress IPs | .94 (trucking) / .107 (HC) — unchanged |
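A minimal sketch of the From / Reply-To split in the table above, using Python's standard email library. The addresses come from the table; the helper name build_campaign_message, the recipient, and the subject/body are hypothetical and are not taken from the campaign builders.

```python
from email.message import EmailMessage
from email.utils import formataddr, parseaddr

# Values copied from the table above.
TRUCKING_FROM = ("Performance West", "noreply@send.performancewest.net")
REPLY_TO = "info@performancewest.net"


def build_campaign_message(to_addr: str, subject: str, body: str) -> EmailMessage:
    """Hypothetical helper: From uses the bulk subdomain, Reply-To the root inbox."""
    msg = EmailMessage()
    msg["From"] = formataddr(TRUCKING_FROM)
    msg["Reply-To"] = REPLY_TO
    msg["To"] = to_addr
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


msg = build_campaign_message("someone@example.com", "Test", "Hello")
# The From domain carries the bulk reputation; replies go to the root domain.
from_domain = parseaddr(msg["From"])[1].rsplit("@", 1)[1]
reply_domain = parseaddr(msg["Reply-To"])[1].rsplit("@", 1)[1]
print(from_domain, reply_domain)
```

Receivers score the From domain (and the DKIM d= that aligns with it), so the root domain's reputation is not touched by bulk sends even though replies still reach it.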
Code: from_email is set in scripts/build_trucking_campaigns.py (FROM_EMAIL,
env CAMPAIGN_FROM) and scripts/build_healthcare_campaigns_cron.py (FROM_EMAIL,
env HC_CAMPAIGN_FROM). Bounce-watchers (scripts/bounce-watcher.sh,
scripts/hc-bounce-watcher.sh) track the new subdomain sender (and keep the legacy
root sender so the pre-cutover queue drains).
Infra: OpenDKIM signs both domains — see infra/ansible/roles/mail
(opendkim_signing_domains list generates per-domain keys + KeyTable/SigningTable).
DNS published on the Hestia master (see DNS automation note below). Verified
end-to-end 2026-06-19: a test send signs d=send.performancewest.net; s=send; and
egresses out05/.94.
Listmonk global app.from_email was also updated in both DBs as a fallback for
any UI/test send that doesn't set From explicitly.
⚠️ The subdomain starts at NEUTRAL reputation (not negative, not warm). It still needs the same warm-up discipline: steady low volume to engaged recipients. It is NOT a magic reset — but it protects the root domain and starts cleaner than the damaged root.
Sending architecture (after 2026-06-18/19 consolidation)
| Stream | IP | PTR / HELO | Path |
|---|---|---|---|
| Trucking (listmonk) | 207.174.124.94 | mta05.performancewest.net | listmonk -> :25 -> randmap:{out05:} |
| Healthcare (listmonk-hc) | 207.174.124.107 | hcmta01.performancewest.net | listmonk-hc SMTP server 1 -> :2526 -> hcout1 |
| Transactional / verification | 207.174.124.71 + co.carrierone.com (.15) | perfwest | default smtp_bind_address (.71) + :587 relay (.15) |
| Removed 2026-06-23 (snowshoe cleanup) | .90-.93, .95-.106, .108-.109 | mta01-04/06-17, hcmta02-03 | transports + host IP bindings DELETED |
Snowshoe IP cleanup (2026-06-23): the 18 dormant sending IPs (.90-.93,
.95-.106, .108-.109) were fully removed from BOTH postfix (master.cf
transports yahooslow/out02-04/out06-20/rehab02-04/2527/2528/
hcout2/hcout3) AND the host (/etc/network/interfaces + live ip addr del).
Only the two warm sending IPs (.94 trucking, .107 HC) plus infra (.71/.72)
remain bound. A 20-IP footprint reads as snowshoe spam and was hurting domain
reputation; the SPF was already trimmed to .94/.107 on 2026-06-19, so this just
makes the host/postfix match the SPF intent. Verified live: postfix check OK,
both streams still status=sent post-change, SSH unaffected. Reference snapshots
committed at infra/postfix/live-snapshots/master.cf + infra/network/interfaces
(live backups /root/master.cf.bak_snowshoe_* + /root/interfaces.bak_snowshoe_*).
Root SPF (trimmed 2026-06-19): v=spf1 a mx ip4:207.174.124.15 ip4:207.174.124.94 ip4:207.174.124.107 -all — a=.71, mx=co.carrierone.com(.15),
plus the two bulk IPs. The old 21-IP record was a snowshoe signal; this matches
carrierone.com's tight style.
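As an offline sanity check, the two SPF strings quoted in this runbook can be parsed to confirm which IPs they explicitly authorize. This is a sketch only: the helper names are hypothetical, it looks only at ip4: mechanisms without a qualifier prefix, and it does NOT resolve a, mx, or include terms, so it is not a full SPF evaluator.

```python
import ipaddress

# SPF strings as quoted in this runbook.
ROOT_SPF = "v=spf1 a mx ip4:207.174.124.15 ip4:207.174.124.94 ip4:207.174.124.107 -all"
SEND_SPF = "v=spf1 ip4:207.174.124.94 ip4:207.174.124.107 -all"


def spf_ip4_networks(record: str) -> list:
    """Return the ip4: mechanisms of an SPF record as IPv4Network objects."""
    networks = []
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        if term.lower().startswith("ip4:"):
            networks.append(ipaddress.ip_network(term[4:], strict=False))
    return networks


def ip4_authorized(record: str, ip: str) -> bool:
    """True if ip matches an explicit ip4: mechanism (a/mx/include are ignored)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in spf_ip4_networks(record))


for ip in ("207.174.124.94", "207.174.124.107", "207.174.124.90"):
    print(ip, ip4_authorized(ROOT_SPF, ip), ip4_authorized(SEND_SPF, ip))
```

Running the same check after any re-expansion is a cheap way to confirm that a newly bound IP was added to both records.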
To re-expand after reputation is established: add transports back to ALL=()
in infra/postfix/pw-mta-warmup.sh and re-enable the HC SMTP servers (ports
2527/2528) in the listmonk_hc DB settings.smtp. Re-expand SLOWLY (one IP at a
time, days apart) and only after Postmaster Tools shows a green/medium reputation.
If you re-expand, also add the IPs back to BOTH the root SPF and the send
subdomain SPF.
DNS automation (Hestia is the master)
DNS is fully automatable — Hestia (cp.carrierone.com, 207.174.124.22) is the
DNS master; HE.net are slaves. Access: ssh -p 22022 root@cp.carrierone.com using
the local workstation's ~/.ssh/id_ed25519 (NOT the app server, NOT justin@
which is SFTP-only). The justin Hestia user owns the performancewest.net zone.
# add (note: Hestia appends the base domain to the RECORD name, so a record at
# send._domainkey.send.performancewest.net needs RECORD = "send._domainkey.send")
v-add-dns-record justin performancewest.net "<record>" <TYPE> "<value>" [prio]
# change / delete (find the numeric id with v-list-dns-records ... plain)
v-change-dns-record justin performancewest.net <id> "<record>" <TYPE> "<value>" "" yes <ttl>
v-delete-dns-record justin performancewest.net <id>
# list
v-list-dns-records justin performancewest.net plain
Each write triggers a ~30s zone rebuild + DNSSEC re-sign; slaves sync via NOTIFY /
SOA refresh, usually within a minute. Verify on @8.8.8.8 AND the master
@207.174.124.22 (the master is authoritative; public resolvers may lag).
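Because Hestia appends the base domain to the RECORD argument, it is easy to publish a record one label too deep. The sketch below derives the relative RECORD name from a fully qualified name; the function name hestia_record_name is hypothetical, and using "@" for the zone apex follows the apex TXT example used elsewhere in this runbook.

```python
def hestia_record_name(fqdn: str, zone: str) -> str:
    """Convert a fully qualified name into the zone-relative RECORD argument."""
    fqdn = fqdn.rstrip(".").lower()
    zone = zone.rstrip(".").lower()
    if fqdn == zone:
        return "@"  # zone apex
    suffix = "." + zone
    if not fqdn.endswith(suffix):
        raise ValueError(f"{fqdn} is not inside zone {zone}")
    return fqdn[: -len(suffix)]


print(hestia_record_name("send._domainkey.send.performancewest.net", "performancewest.net"))
print(hestia_record_name("_dmarc.send.performancewest.net", "performancewest.net"))
print(hestia_record_name("performancewest.net", "performancewest.net"))
```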
Monitoring tools (set these up to SEE reputation directly)
These all require a provider account login + (for Google) a DNS TXT record (added on the Hestia master and served by the HE.net slaves), so they can't be fully automated. Steps are pre-filled below.
🔴 MANUAL 1 — Google Postmaster Tools (Gmail is our biggest blocker)
Gmail's verbatim rejection names "the sending domain", so this is priority #1.
DNS is fully automatable — Hestia (cp.carrierone.com) is the DNS master,
HE.net are slaves. Add records as root: ssh -p 22022 root@cp.carrierone.com
then v-add-dns-record justin performancewest.net "@" TXT '"'"'"<value>"'"'"'
(zone owner is the justin Hestia user; ~30s zone rebuild + slaves sync via the
2h SOA refresh / NOTIFY, usually within a minute).
Status 2026-06-18: TXT added + verified live (record id 14464,
google-site-verification=p8s3RaN5wi81350wToMpdPMho5Gcel4RGT1Q1SXj7vg),
resolving on 8.8.8.8/1.1.1.1/9.9.9.9 and 4/5 HE.net slaves. Owner just needs to
click Verify in the Postmaster console once. Data populates 24-48h after
volume flows from the consolidated IP.
To set up from scratch next time: postmaster.google.com -> +Add domain ->
performancewest.net -> copy the google-site-verification=... token -> add via
the Hestia command above -> Verify.
✅ MANUAL 2 — Microsoft SNDS + JMRP (Outlook/Hotmail/Live) — DONE 2026-06-19
85% of our audience is Microsoft-hosted (M365/Outlook/Hotmail), so this is the single most important monitoring tool. Microsoft already accepts our mail (~1.6% reputation rejects), so this tells us inbox-vs-junk + complaint rates. SNDS is IP-based (register the sending IPs), JMRP is the complaint feedback loop. Both SNDS access and JMRP are now registered for 207.174.124.94 + .107.
2026 URL MIGRATION: Microsoft moved SNDS off sendersupport.olc.protection.outlook.com. The old /snds/ and /pm/ links now 308-redirect to the new app at substrate.office.com/ip-domain-management-snds/. The footer/help links on that page ("contact sender support", "Privacy", "Microsoft Services Agreement") go to generic microsoft.com pages; that is normal, they are boilerplate, NOT the broken task. You must click "Log in" (top-right) with a personal Microsoft account FIRST; until you authenticate, the "Request Access" / "Junk Mail Reporting Program" links just bounce to login.microsoftonline.com, which looks like a dead redirect but is the expected auth step. After login the real forms render.
- SNDS (Request Access): open the SNDS app, either the legacy entry https://sendersupport.olc.protection.outlook.com/snds/ (it 308-redirects to the new app) or directly https://substrate.office.com/ip-domain-management-snds/SNDS, then Log in -> left-nav "Request Access" (direct: https://substrate.office.com/ip-domain-management-snds/SNDS/AddNetwork) -> register IPs 207.174.124.94 and 207.174.124.107 (the two live stream IPs; add .90 and .71 if you want full coverage). Verification goes to a role address on the IP's domain (use postmaster@ or abuse@performancewest.net, now live). (NOTE: snds.microsoft.com does NOT resolve; do not use it.) ✅ DONE 2026-06-19: access requested/granted for .94 + .107. Data populates over ~24-48h; then check the dashboard for the per-IP RED/YELLOW/GREEN status, spam-trap hits, and complaint rate.
- JMRP: same site, left-nav "Junk Mail Reporting Program" (direct: https://substrate.office.com/ip-domain-management-snds/SNDS/Jmrp) -> register the same IPs + complaint-destination mailbox fbl@performancewest.net. Complaints then arrive as ARF emails. ✅ DONE 2026-06-19: both IPs registered as feeds (pw1 = 207.174.124.94, pw2 = 207.174.124.107), complaint destination set to fbl@performancewest.net (live, routes to ops@). ARF complaint reports now land there automatically.
✅ PREREQ DONE (2026-06-19): the role mailboxes Microsoft needs now exist and
deliver. Created as Carbonio distribution lists routing to ops@performancewest.net:
postmaster@, abuse@, fbl@, dmarc@ — all verified ACCEPT at the MX +
delivered end-to-end. (They previously REJECTED with 5.1.1, which would have blocked
SNDS verification.) Use postmaster@ or abuse@ for SNDS verification and
fbl@performancewest.net as the JMRP complaint destination.
Carbonio mail admin: ssh -p 22022 justin@207.174.124.15 (the co.carrierone.com mail host; local workstation key, justin has NOPASSWD sudo). Run prov as zextras: sudo -u zextras /opt/zextras/bin/carbonio prov <cmd> (e.g. gaa, gadl, cdl <addr>, adlm <dl> <member>, gdlm <dl>).
✅ MANUAL 3 — Yahoo Complaint Feedback Loop — keys added 2026-06-19
Lowest priority (<1% of audience), but cheap. CFL is DKIM-d= based.
- https://senders.yahooinc.com/complaint-feedback-loop/ -> sign in -> register the domains performancewest.net and send.performancewest.net (CFL keys off the DKIM d= value; bulk mail now signs d=send.performancewest.net).
- Set the complaint destination to fbl@performancewest.net (now live, see above).
✅ ENROLLED 2026-06-19 — both domains show Enrolled in the Yahoo Sender Hub
CFL with reporting email fbl@performancewest.net:
- performancewest.net: Enrolled, reporting fbl@performancewest.net
- send.performancewest.net: Enrolled, reporting fbl@performancewest.net

(Reporting-email code was delivered to fbl@ → ops@ and verified; the Selector column is intentionally blank = match any DKIM selector on the verified domain.)
✅ DNS verification keys added + propagated 2026-06-19 (Hestia TXT, verified on all HE.net slaves + 8.8.8.8/1.1.1.1/9.9.9.9):
- performancewest.net TXT yahoo-verification-key=IMx+OO5aKUE1nu9JwP6eSBMfSYZu8VcXjpkvEVXS84w=
- send.performancewest.net TXT yahoo-verification-key=Ps5hGjVxXgeQcLcxr671YG0/RxzjjL0eqh6vfULubEo= (added alongside the existing send SPF record; both TXT coexist).
✅ DMARC aggregate reports — DONE 2026-06-19 (dedicated mailbox + parser)
Gmail/Yahoo/Microsoft + dozens of operators (Comcast, Cox, Bell, Mimecast, Cisco
ESA, GMX, mail.com, gosecure, ...) send daily per-IP auth+disposition XML to
dmarc@performancewest.net (DMARC record: p=reject; rua=mailto:dmarc@; ruf=mailto:dmarc@; fo=1).
That mailbox was REJECTING (5.1.1) until 2026-06-19 — we silently lost every
report. Now fully wired:
- Dedicated mailbox. dmarc@performancewest.net is its own Carbonio account (was a DL -> ops@, which buried ops@ under report XML). Isolated IMAP credential in the server .env (DMARC_IMAP_{HOST,PORT,USER,PASS}), surfaced to the workers container in docker-compose.yml (mirrors the OPS_IMAP_* pattern). The 29 historical reports that had landed in ops@ were moved over via IMAP.
- Parser worker. scripts/dmarc_report_parser.py IMAP-fetches unseen messages, decompresses the .gz/.zip/.xml attachment (namespace-agnostic: handles both the classic and the urn:ietf:params:xml:ns:dmarc-2.0 GMX/mail.com schema), parses the aggregate XML, and upserts one dmarc_report row (keyed (org_name, report_id), so re-parsing is a no-op) + one dmarc_record row per source IP into the schema from api/migrations/102_dmarc_aggregate.sql. dmarc_pass = dkim_aligned=pass OR spf_aligned=pass. Marks each message \Seen so each run only handles new reports. Flags: --dry-run, --all (backfill seen), --alert (7-day per-IP summary + Telegram if one of OUR IPs drops below 95% pass, or an EXTERNAL IP sends >=20 failing msgs as us = spoofing under p=reject).
- Cron. /etc/cron.d/pw-dmarc-parser (tracked at infra/cron/pw-dmarc-parser) runs ... workers python3 -m scripts.dmarc_report_parser --alert daily at 06:20 UTC.
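The namespace-agnostic parse step described above can be sketched as follows. This is an illustration, not the contents of scripts/dmarc_report_parser.py: the sample XML is made up, the helper names are hypothetical, and reading the aligned results from policy_evaluated/dkim and policy_evaluated/spf is an assumption about how dkim_aligned and spf_aligned are derived.

```python
import xml.etree.ElementTree as ET

# Made-up sample report using the namespaced (dmarc-2.0) flavour.
SAMPLE = """<feedback xmlns="urn:ietf:params:xml:ns:dmarc-2.0">
  <report_metadata>
    <org_name>example-receiver</org_name>
    <report_id>abc123</report_id>
  </report_metadata>
  <record><row>
    <source_ip>207.174.124.94</source_ip><count>40</count>
    <policy_evaluated><dkim>pass</dkim><spf>fail</spf></policy_evaluated>
  </row></record>
  <record><row>
    <source_ip>198.51.100.7</source_ip><count>3</count>
    <policy_evaluated><dkim>fail</dkim><spf>fail</spf></policy_evaluated>
  </row></record>
</feedback>"""


def _local(tag: str) -> str:
    """Strip any '{namespace}' prefix so both schemas look the same."""
    return tag.rsplit("}", 1)[-1]


def _text(elem, *path) -> str:
    """Follow a path of direct children by local name; return stripped text or ''."""
    for name in path:
        elem = next((c for c in elem if _local(c.tag) == name), None)
        if elem is None:
            return ""
    return (elem.text or "").strip()


def parse_aggregate(xml_text: str):
    """Return ((org_name, report_id), [per-source-IP record dicts])."""
    root = ET.fromstring(xml_text)
    key = (_text(root, "report_metadata", "org_name"),
           _text(root, "report_metadata", "report_id"))
    records = []
    for rec in root:
        if _local(rec.tag) != "record":
            continue
        dkim = _text(rec, "row", "policy_evaluated", "dkim")
        spf = _text(rec, "row", "policy_evaluated", "spf")
        records.append({
            "source_ip": _text(rec, "row", "source_ip"),
            "msg_count": int(_text(rec, "row", "count") or 0),
            "dmarc_pass": dkim == "pass" or spf == "pass",
        })
    return key, records


report_key, report_records = parse_aggregate(SAMPLE)
print(report_key, report_records)
```

Keying on (org_name, report_id) is what makes a re-parse of the same report a no-op when the rows are upserted.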
Query examples once populated:
-- who sends as us, and are they aligning? (the payoff of the DKIM/subdomain fixes)
SELECT source_ip, sum(msg_count) total,
sum(msg_count) FILTER (WHERE dmarc_pass) pass,
round(100.0*sum(msg_count) FILTER (WHERE dmarc_pass)/sum(msg_count)) pass_pct
FROM dmarc_record r JOIN dmarc_report rep ON rep.id=r.report_id
WHERE rep.date_begin >= now()-interval '7 days'
GROUP BY source_ip ORDER BY total DESC;
-- any UNKNOWN IP failing alignment = spoofing/forgotten relay (reputation poison)
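The --alert thresholds (own IP below 95% pass, external IP with >=20 failing messages) can be applied to rows shaped like the query output above. This is a hedged sketch: the function name, the reason strings, and the choice of which IPs count as "ours" (here only the two bulk egress IPs) are assumptions, not the parser's actual logic.

```python
# Assumption: only the two bulk egress IPs are treated as "ours" in this sketch.
OUR_IPS = frozenset({"207.174.124.94", "207.174.124.107"})


def alert_reasons(rows, our_ips=OUR_IPS, min_pass_pct=95.0, spoof_min_failing=20):
    """rows: iterable of (source_ip, total, passed) over the 7-day window."""
    alerts = []
    for source_ip, total, passed in rows:
        failing = total - passed
        if source_ip in our_ips:
            # One of our own IPs is failing alignment too often.
            if total > 0 and 100.0 * passed / total < min_pass_pct:
                alerts.append((source_ip, "own IP below pass threshold"))
        elif failing >= spoof_min_failing:
            # An unknown IP is sending failing mail as us in volume.
            alerts.append((source_ip, "external IP failing as us"))
    return alerts


sample_rows = [
    ("207.174.124.94", 100, 90),   # 90% pass
    ("207.174.124.107", 200, 199), # 99.5% pass
    ("198.51.100.7", 25, 0),       # external, 25 failing
    ("203.0.113.9", 5, 0),         # external, 5 failing
]
print(alert_reasons(sample_rows))
```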
Ongoing hygiene (reduce reputation damage)
- Dead-address scrub: ~110 genuine 5.1.1 user unknown bounces/day. listmonk already blocklists hard bounces after 1 (bounce.actions hard->blocklist), so these self-clean, but pre-scrubbing the dirtiest segments before send avoids the reputation hit. See data/ segment exports.
- Consumer-domain exclusion (two layers). The authoritative list lives in scripts/_email_exclusions.py (BLOCKED_EMAIL_DOMAINS): gmail/google, the full Yahoo/Verizon-Media family, Microsoft consumer, Apple/iCloud (added 2026-06-19), dead/legacy ISPs, and the legal do-not-contact list.
  - NEW selections: the per-vertical builders filter it out of audience SQL and listmonk_import.py refuses to import a blocked address.
  - Already-imported subs: LIST-BASED campaigns (FCC Direct Contacts list 3, CRTC/USF blasts) can still hit consumer subs imported BEFORE a domain joined the list. scripts/scrub_listmonk_consumer.py reconciles the live subscriber table against the exclusion list and blocklists any ENABLED match (idempotent; --dry-run supported; both listmonk + listmonk_hc). Runs daily 06:30 UTC via /etc/cron.d/pw-listmonk-scrub (tracked at infra/cron/pw-listmonk-scrub). First run 2026-06-19 blocklisted 7,943 trucking + 21 HC stale consumer subs (1,321 iCloud, 267 gmail, etc.) that were leaking via the running CRTC campaign. Re-run the scrub whenever you add a domain to the exclusion list.
- Don't re-expand IPs until Postmaster Tools shows recovered reputation.
- Volume discipline: keep the global 200/hr sliding window until reputation is green; concentrated low volume on one warm IP beats bursts.
- Watch the rejection mix: 5.7.1 reputation/spam/blocked should fall over the next 1-2 weeks as the single-IP reputation builds. Track via: ssh ... 'sudo grep status=bounced /var/log/mail.log | grep -c 5.7.1'
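To watch the whole rejection mix rather than a single code, the bounced lines can be tallied by DSN. The sketch below assumes the standard Postfix delivery log layout ("dsn=X.Y.Z, status=bounced"); the sample log lines are invented for illustration and the function name bounce_mix is hypothetical.

```python
import re
from collections import Counter

# Matches the DSN code on Postfix delivery lines that ended in a bounce.
DSN_RE = re.compile(r"\bdsn=(\d\.\d{1,3}\.\d{1,3}),\s*status=bounced\b")


def bounce_mix(lines) -> Counter:
    """Count bounced deliveries per enhanced status code."""
    counts = Counter()
    for line in lines:
        match = DSN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines in Postfix's usual format.
SAMPLE_LOG = [
    "Jun 19 06:00:01 mx postfix/smtp[1001]: A1: to=<a@example.com>, relay=example.com[192.0.2.1]:25, delay=1.2, dsn=5.7.1, status=bounced (host said: 550 5.7.1 blocked)",
    "Jun 19 06:00:02 mx postfix/smtp[1002]: A2: to=<b@example.org>, relay=example.org[192.0.2.2]:25, delay=0.9, dsn=5.1.1, status=bounced (host said: 550 5.1.1 user unknown)",
    "Jun 19 06:00:03 mx postfix/smtp[1003]: A3: to=<c@example.net>, relay=example.net[192.0.2.3]:25, delay=2.0, dsn=4.7.500, status=deferred (host said: 451 4.7.500 busy)",
    "Jun 19 06:00:04 mx postfix/smtp[1004]: A4: to=<d@example.com>, relay=example.com[192.0.2.1]:25, delay=1.1, dsn=5.7.1, status=bounced (host said: 550 5.7.1 blocked)",
]

mix = bounce_mix(SAMPLE_LOG)
print(mix.most_common())
```

On the server, the same function can be fed an open handle on /var/log/mail.log; comparing the 5.7.1 count against 5.1.1 day over day separates reputation rejects from list-hygiene bounces.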