Plan — Dual-Stream Outbound Email (Healthcare hot + Trucking trickle)
Why this exists
Today one global throttle governs all outbound mail: the Listmonk sliding
window (app.message_sliding_window_rate, currently 150/h ramping to a 300/h
hard ceiling ≈ 4k/day) plus a shared Postfix rotation pool (.94/.95/.96).
That ceiling exists to protect consumer-ISP reputation (Gmail / Microsoft /
Yahoo), which is what the FMCSA trucking campaigns mail. The May 30-31 collapse
(29k blast → Gmail 550-5.7.1, Yahoo 421 TSS04, delivery fell to ~13%) is why
the whole warmup/cap machinery exists.
Healthcare's reachable audience is different in kind, so it should NOT be constrained by the same ceiling:
- The cold-emailable NPPES-endpoint slice is "tens of thousands"; a large part is consumer webmail (gmail ~12.4k) but a meaningful tail is practice/clinic domains (their own MX, Google Workspace / Microsoft 365 tenants).
- Practice-domain (institutional) mail does not share the consumer-ISP
snowshoe heuristics that torch the trucking IPs. Its deliverability is
largely independent of the reputation we're protecting on
.94-.96.
Verified audience size (May 2026 NPPES endpoint_pfile, measured)
Classifying every email-formatted endpoint (deduped) with the tightened
Direct/HISP filter (direct, medicity.net, surescripts, updox, maxmd, …)
and the consumer-webmail set:
| segment | rows | NPIs | routing |
|---|---|---|---|
| Direct / HISP | 242,441 | — | parked (DirectTrust-only routing, won't cold-deliver) |
| Consumer webmail | 19,366 | ~19,072 | rides the trucking consumer-discipline stream |
| Institutional (practice domains) | 94,348 | ~92,592 | HEALTHCARE HOT stream |
Institutional spread: 38,873 distinct domains, 76% of which have exactly 1
provider (small practices = our $399 PECOS-revalidation buyer). Top-100 domains
are only 23% of volume → healthy long tail, no single MX gets hammered. (Excludes
a handful of non-prospect giants — va.gov, mail.mil, cvshealth.com,
walgreens.com, wal-mart.com — that we drop in the audience build.)
This sizes the hot stream: at ~92k deliverable institutional addresses a 10k/day ceiling drains the list in ~2 weeks; stuck behind the 4k trucking cap it would take ~23 days AND poison the trucking IPs. Hence the split.
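The sizing above can be checked with a couple of divisions. This is only a flat-rate sketch: it uses the ~92,592 institutional count from the table and the two daily ceilings named in the text, and it ignores any warmup ramp (which is presumably why the plan quotes ~2 weeks rather than the flat figure for the hot stream). The function name is illustrative.

```python
INSTITUTIONAL_ADDRESSES = 92_592  # institutional NPIs, from the audience table


def flat_days(addresses: int, per_day: int) -> float:
    """Days needed to mail every address once at a constant daily rate."""
    return round(addresses / per_day, 1)


hot = flat_days(INSTITUTIONAL_ADDRESSES, 10_000)    # dedicated healthcare ceiling
shared = flat_days(INSTITUTIONAL_ADDRESSES, 4_000)  # stuck behind the trucking cap
print(f"hot stream: {hot} days, shared cap: {shared} days")
```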
So the goal is stream isolation: let healthcare-institutional mail run hot on its own IPs/cap while trucking keeps trickling on the warmed consumer-facing IPs, with neither able to damage the other.
Honesty caveat (do not skip): the consumer-webmail portion of the healthcare list (gmail/outlook/icloud addresses) is NOT institutional and MUST ride the same cautious consumer-ISP discipline as trucking. "Run healthcare hot" applies ONLY to the practice-domain (non-consumer, non-DirectTrust) segment. We split the healthcare list itself into healthcare-institutional vs healthcare-consumer and route each to the matching stream.
Architecture: two independent streams, one Postfix, one Listmonk
flowchart TD
LM[Listmonk] -->|SMTP server A: 172.18.0.1:25\nhello perfwest...| PFA[Postfix submission]
LM -->|SMTP server B: 172.18.0.1:2526\nhello hc-mta...| PFB[Postfix submission hc]
PFA --> TR{transport map}
PFB --> TRH{transport_maps hc}
TR -->|yahoo family| HOLD[hold:]
TR -->|consumer + everything else| ROT[randmap rotation\nout05..out17\n.94-.106]
TRH -->|practice domains| HCROT[randmap hc pool\nhcout1..hcout4\n.107-.109 + spare]
ROT --> NET1[(consumer ISPs:\nGmail / MS, capped low)]
HCROT --> NET2[(practice MX /\nWorkspace / M365, hot)]
Two coordinated changes:
1. Postfix: a dedicated healthcare submission service + IP sub-pool
- Carve 2-3 IPs out of the existing 20 (.107/.108/.109 = out18/19/20, currently unused at the warmup tail) into a healthcare-only rotation pool. They get their own HELO (hcmtaNN.performancewest.net; confirm/lay down PTR + SPF first) so healthcare reputation is built and judged separately from trucking. They are removed from the trucking ALL=(...) array so the trucking warmup never reclaims them.
- Add a second Postfix submission entry in master.cf listening on a distinct port (e.g. 2526) whose injected mail is tagged to the healthcare pool. Two clean ways to bind the pool:
  - (preferred) sender-dependent / class-based transport: route by the submission port via a dedicated cleanup/smtpd service that sets a header or uses a separate transport_maps so healthcare recipients hit randmap:{hcout1:,hcout2:,hcout3:}.
  - Simpler alternative: a separate Postfix instance (postmulti) listening on 2526, with its own main.cf bound to the hc IPs. More isolation, more moving parts. Decide in step 0 (recommend the single-instance class-based route unless isolation is required).
- Keep the Yahoo-family hold: backstop in BOTH transports. Healthcare list is pre-filtered, but defense in depth.
2. Listmonk: a second SMTP server, used only by healthcare campaigns
Listmonk's settings.smtp is a JSON array and already supports multiple SMTP
servers. Add a second entry:
{ "host":"172.18.0.1", "port":2526, "uuid":"healthcare",
"enabled":true, "hello_hostname":"hcmta.performancewest.net",
"max_conns":4, "tls_type":"none", "auth_protocol":"none" }
Listmonk round-robins across enabled SMTP servers, so to keep streams isolated we do NOT rely on per-campaign SMTP selection (Listmonk lacks native per-campaign SMTP pinning). Instead we isolate by separate Listmonk instances OR by the cleaner operational split below. Decide in step 0:
- Option A: second Listmonk instance (listmonk-hc) on the same Postgres, separate app.message_sliding_window_rate, pointed only at port 2526. Cleanest isolation of caps; ~zero risk of cross-stream throttle coupling. This is the recommended option because the whole point is independent caps.
- Option B: one Listmonk, single SMTP server B for healthcare, and we accept Listmonk's single global cap by running trucking and healthcare in non-overlapping send windows. Cheaper but couples the caps (defeats the goal).
→ Recommend Option A (second listmonk-hc service in compose). It gets its
own app.message_sliding_window_rate (the healthcare cap), its own SMTP server
(port 2526 → hc IPs), and shares the contacts DB only if we want (probably
separate DB to keep bounce/complaint reputation accounting clean per stream).
Healthcare-stream cap (institutional segment)
Institutional B2B mail tolerates much higher volume than consumer cold mail, but
we still warm the new hc IPs (they're fresh) and we still respect per-domain
practice MX limits. Proposed hc warmup (separate stamp /etc/postfix/hc-warmup-start):
| hc warmup day | hourly cap | ~daily | notes |
|---|---|---|---|
| 0-1 | 100/h | ~1,000 | brand-new hc IPs, prove clean |
| 2-4 | 300/h | ~3,000 | |
| 5-9 | 600/h | ~6,000 | |
| 10+ | 1,000/h | ~10,000 | institutional ceiling; revisit with data |
These are separate from and additive to the trucking ~4k/day ceiling, because they hit a disjoint set of receiving systems on disjoint sending IPs.
Per-domain politeness still applies (smtp_destination_concurrency_limit,
smtp_destination_rate_delay) so we never hammer one clinic's MX.
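To make the ramp concrete, here is a minimal sketch that maps an hc warmup day to the caps in the table and then counts how many sending days the ~92,592-address institutional list would need under that ramp. The table rows are copied from above; the function names are illustrative and this is not the actual ramp-cap script.

```python
# (first warmup day of the tier, hourly cap, approximate daily volume),
# copied from the hc warmup table.
HC_WARMUP = [
    (0, 100, 1_000),
    (2, 300, 3_000),
    (5, 600, 6_000),
    (10, 1_000, 10_000),
]


def hc_caps(warmup_day: int) -> tuple[int, int]:
    """Return (hourly_cap, approx_daily) for a warmup day; day 0 is the stamp date."""
    if warmup_day < 0:
        raise ValueError("warmup day cannot be negative")
    hourly, daily = HC_WARMUP[0][1], HC_WARMUP[0][2]
    for first_day, tier_hourly, tier_daily in HC_WARMUP:
        if warmup_day >= first_day:
            hourly, daily = tier_hourly, tier_daily
    return hourly, daily


def days_to_drain(addresses: int) -> int:
    """Sending days needed to mail every address once while following the ramp."""
    sent, day = 0, 0
    while sent < addresses:
        sent += hc_caps(day)[1]
        day += 1
    return day


print(hc_caps(0), hc_caps(4), hc_caps(12))
print(days_to_drain(92_592))  # 16 sending days, in line with the "~2 weeks" estimate
```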
Audience split (must happen before any send)
Extend scripts/build_npi_outreach_lists.py (or a thin post-processor) to emit
THREE files instead of lumping cold together:
- npi_healthcare_institutional.csv: cold, non-Direct, non-consumer-webmail (practice/clinic domains). → healthcare HOT stream.
- npi_healthcare_consumer.csv: cold consumer webmail (gmail/outlook/icloud…). → rides the TRUCKING consumer-discipline stream (low cap), NOT the hot one.
- npi_direct_secure.csv: DirectTrust/HISP. → parked until DirectTrust signup.
Classification rule: institutional = cold channel AND domain NOT in
CONSUMER_WEBMAIL AND not Direct. (We already compute cold/direct and a
cold_consumer count; just split on the consumer set.)
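The rule can be sketched as a small classifier. Everything here is an illustration under assumptions: the marker and webmail sets are partial (only the entries this plan names), matching Direct markers as substrings of the domain is a guess at how the tightened filter works, and classify() is a hypothetical name rather than the project's real function.

```python
# Partial, illustrative sets: only the entries named in this plan.
DIRECT_MARKERS = ("direct", "medicity.net", "surescripts", "updox", "maxmd")
CONSUMER_WEBMAIL = {"gmail.com", "outlook.com", "icloud.com"}
NON_PROSPECT = {"va.gov", "mail.mil", "cvshealth.com", "walgreens.com", "wal-mart.com"}

# Output file per segment, as listed above.
STREAM_FILES = {
    "institutional": "npi_healthcare_institutional.csv",
    "consumer": "npi_healthcare_consumer.csv",
    "direct": "npi_direct_secure.csv",
}


def classify(email: str) -> str:
    """Return 'direct', 'consumer', 'excluded', 'institutional', or 'invalid'."""
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or "." not in domain:
        return "invalid"
    # Assumption: a Direct/HISP address is recognised by a marker in its domain.
    if any(marker in domain for marker in DIRECT_MARKERS):
        return "direct"
    if domain in CONSUMER_WEBMAIL:
        return "consumer"
    if domain in NON_PROSPECT:
        return "excluded"
    return "institutional"


for address in ("dr.a@gmail.com", "front@smithfamilypractice.example"):
    segment = classify(address)
    print(address, segment, STREAM_FILES.get(segment))
```

Checking Direct first matters: a Direct address must stay parked even if its domain would otherwise look like a practice domain.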
Always run the existing free MX + SMTP RCPT verification on a NON-sending IP
(doc sec 8.2) over the institutional list before importing, so we never mail
dead practice mailboxes (550 5.1.1 from a clinic MX still hurts the hc IPs).
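A minimal sketch of the RCPT half of that check, using only the standard library. It assumes the MX host has already been resolved elsewhere (Python's standard library has no MX lookup), that the probe runs from a non-sending IP, and that the caller supplies its own HELO name and envelope sender. The names probe_rcpt and rcpt_verdict are hypothetical; this is not the project's verifier.

```python
import smtplib


def rcpt_verdict(code: int) -> str:
    """Map an SMTP RCPT reply code to a list-hygiene decision."""
    if 200 <= code < 300:
        return "accept"
    if 500 <= code < 600:
        return "dead"   # e.g. 550 5.1.1: drop before import
    return "retry"      # 4xx greylisting/tempfail, or anything unexpected


def probe_rcpt(mx_host, address, helo_name, mail_from,
               smtp_factory=smtplib.SMTP, timeout=20):
    """Ask mx_host whether it would accept `address`; no DATA is ever sent."""
    try:
        smtp = smtp_factory(mx_host, 25, timeout=timeout)
    except (OSError, smtplib.SMTPException):
        return "retry"
    try:
        smtp.helo(helo_name)
        smtp.mail(mail_from)
        code, _ = smtp.rcpt(address)
        return rcpt_verdict(code)
    except (OSError, smtplib.SMTPException):
        return "retry"
    finally:
        try:
            smtp.quit()
        except (OSError, smtplib.SMTPException):
            pass
```

An "accept" is weaker evidence than a "dead": catch-all domains accept every recipient at RCPT time, so a 250 does not prove the mailbox exists.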
Reputation hygiene (per stream, independent)
- Separate PTR/FCrDNS (hcmtaNN.performancewest.net) + separate SPF authorization for the hc IPs (still under the same domain so DKIM/DMARC pass).
- DKIM/DMARC unchanged (domain-level): healthcare mail still signs as performancewest.net, which is fine and desirable.
- Separate bounce/complaint monitoring per pool (grep by hc IP / by hc syslog_name). The existing monitoring commands extend trivially with the hc IPs.
- A healthcare ramp-cap script (pw-hc-rampcap) mirroring pw-listmonk-rampcap but driving the listmonk-hc cap off /etc/postfix/hc-warmup-start.
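The PTR/FCrDNS item in the list above can be checked with a forward-confirmed reverse DNS test: the IP's PTR must name the expected host, and that host's A records must include the IP. The sketch below uses only the standard socket module; the IP-to-hostname pairs are the ones this plan assigns to the hc pool, and fcrdns_ok is an illustrative name.

```python
import socket

# hc pool identities as assigned in this plan.
HC_IDENTITIES = {
    "207.174.124.107": "hcmta01.performancewest.net",
    "207.174.124.108": "hcmta02.performancewest.net",
    "207.174.124.109": "hcmta03.performancewest.net",
}


def fcrdns_ok(ip, expected_host,
              reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """True when ip -> PTR -> expected_host and expected_host -> A includes ip."""
    try:
        ptr_name = reverse(ip)[0]                 # (hostname, aliases, addresses)
        forward_ips = forward(expected_host)[2]
    except OSError:                               # covers herror and gaierror
        return False
    return ptr_name.rstrip(".").lower() == expected_host.lower() and ip in forward_ips


if __name__ == "__main__":
    for ip, host in HC_IDENTITIES.items():
        print(ip, host, "OK" if fcrdns_ok(ip, host) else "MISMATCH")
```

The resolver functions are parameters so the check can be exercised without live DNS. Note that it uses the local resolver, so it says nothing about whether secondaries are in sync.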
Concrete ordered steps
0. Decide: single Postfix instance + class-based hc transport vs postmulti; and Listmonk Option A (2nd instance) vs B. (Recommend: single instance + class transport, and Listmonk Option A.)
1. DNS/identity: add PTR hcmtaNN for .107/.108/.109, extend SPF, confirm DKIM/DMARC still pass for those IPs. (No send until green.)
2. Postfix: new submission service on :2526; carve out18/19/20 into an hc rotation pool; remove them from the trucking ALL array; add the hc-warmup-start stamp + pw-hc-mta-warmup. Keep Yahoo hold: backstop.
3. Listmonk-hc: add listmonk-hc compose service (same image, own LISTMONK_app__* cap env / settings, SMTP server = 172.18.0.1:2526), behind nginx at a separate vhost or path. Wire pw-hc-rampcap.
4. Audience: extend the list builder to emit the 3 split files; run free MX + SMTP verification (non-sending IP) on the institutional file.
5. Campaign: build a healthcare-institutional campaign (revalidation-overdue first → free NPI tool link → $399 PECOS Revalidation product), import the verified institutional list into listmonk-hc, send small focused batches.
6. Deploy wiring: add the new services/scripts to deploy.sh / deploy-dev.sh and ansible templates, mirroring the proxy-relay pattern just landed.
Validation
- Isolation proof: send a trucking batch and an hc batch simultaneously; confirm via mail.log that trucking mail egresses ONLY from .94-.96 and hc mail ONLY from .107-.109, and that each respects its own cap independently.
- Identity proof: an hc test send to a mail-tester/aboutmy.email account shows PTR hcmtaNN, SPF pass, DKIM pass, DMARC pass.
- Deliverability proof: hc test sends to a Google Workspace test domain + an M365 test domain land in inbox (not spam); record per-domain disposition.
- Cap proof: pw-hc-rampcap sets the listmonk-hc cap from the hc warmup day and does NOT touch the trucking Listmonk cap (and vice-versa).
- No regression: trucking delivery mix unchanged after the split (same monitoring commands, same .94-.96 volumes).
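The isolation proof could be automated by scanning mail.log for deliveries that left through the wrong pool. The sketch below assumes each hc transport logs under a syslog_name of the form postfix/hcoutN, so Postfix writes tags like postfix/hcout2/smtp[pid]; the trucking tag postfix/out05 and the sample lines are hypothetical, and misrouted() is an illustrative name.

```python
import re

# Delivery-agent log line: "<tag>/smtp[pid]: QUEUEID: to=<rcpt>, ..."
DELIVERY = re.compile(r"\s(?P<tag>\S+)/smtp\[\d+\]: \w+: to=<(?P<rcpt>[^>]+)>")
HC_TAG_PREFIX = "postfix/hcout"  # assumption: syslog_name of the hc transports


def misrouted(lines, hc_recipients):
    """Return log lines where an hc recipient left via a non-hc transport, or vice versa."""
    bad = []
    for line in lines:
        match = DELIVERY.search(line)
        if not match:
            continue  # not a delivery line (smtpd, qmgr, cleanup, ...)
        via_hc = match["tag"].startswith(HC_TAG_PREFIX)
        is_hc_rcpt = match["rcpt"].lower() in hc_recipients
        if via_hc != is_hc_rcpt:
            bad.append(line)
    return bad


# Hypothetical sample lines in standard Postfix log shape.
SAMPLE = [
    "Jun  6 07:21:03 app postfix/hcout2/smtp[4242]: 4F1A2B3C4D: to=<front@clinic.example>, "
    "relay=mx.clinic.example[192.0.2.10]:25, dsn=2.0.0, status=sent (250 2.0.0 OK)",
    "Jun  6 07:21:04 app postfix/out05/smtp[4243]: 5A1B2C3D4E: to=<driver@gmail.com>, "
    "relay=mx.mail.example[192.0.2.20]:25, dsn=2.0.0, status=sent (250 2.0.0 OK)",
    "Jun  6 07:21:05 app postfix/out06/smtp[4244]: 6B1C2D3E4F: to=<nurse@clinic.example>, "
    "relay=mx.clinic.example[192.0.2.10]:25, dsn=2.0.0, status=sent (250 2.0.0 OK)",
]

leaks = misrouted(SAMPLE, {"front@clinic.example", "nurse@clinic.example"})
print(len(leaks), "misrouted deliveries")
```

An empty result for a simultaneous trucking + hc test batch would support the isolation claim. It does not replace checking the bind addresses themselves, since the egress IP is not part of these log lines.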
Decisions (locked)
- Postfix: single instance + class-based hc transport (port :2526 → hc rotation pool). No postmulti.
- Listmonk: a second instance (listmonk-hc) with its own sliding-window cap → true cap isolation.
- Institutional ceiling: 10k/day (warm up to it).
- Contacts DB: separate (listmonk_hc database). Cleaner per-stream bounce/complaint accounting, and the hc instance needs its own DB anyway.
- Audience count: measured, ~92,592 institutional NPIs / 38,873 domains (see table above).
Open / for-later
- How aggressive on the institutional ceiling beyond 10k/day — raise only with clean delivery data.
- DirectTrust signup to unlock the 242k Direct/HISP segment (separate effort).
Implementation status (built + validated)
Committed and validated on dev:
- Audience split: scripts/healthcare_email_streams.py (shared classifier) + reworked scripts/build_npi_outreach_lists.py emit npi_healthcare_institutional/consumer.csv + npi_direct_secure.csv. Verified on May 2026 NPPES: 89,557 institutional rows.
- Postfix hc stream: infra/postfix/hc_stream_setup.sh applied on the app server: ports 2526/2527/2528 -> hcout1/2/3 -> IPs .107/.108/.109 (HELO hcmta01-03). Proven: a send on :2527 egressed via hcout2 (.108) to the real gmail MX; trucking transport_maps (.94-.96) untouched.
- listmonk-hc: second instance (own listmonk_hc DB, own cap), 3 SMTP servers = the 3 hc ports. Proven on dev: listmonk-hc container -> host :2526 (hcsubmit107) -> hcout1 (.107) -> real gmail MX.
- Ramp-cap: infra/postfix/pw-hc-rampcap.sh (100->1000/h off /etc/postfix/hc-warmup-start), independent of the trucking ramp.
- Deploy wiring: deploy.sh / deploy-dev.sh bring up listmonk-hc; docker-compose.dev.override.yml keeps dev (shared host) from clashing on prod host ports / postgres volume.
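For orientation, the port -> transport -> IP mapping described above corresponds to master.cf stanzas of roughly the following shape. The sketch renders them from a table; it is not the contents of hc_stream_setup.sh. The hcsubmit108/109 and postfix/hcoutN syslog names are extrapolated from the one name this plan mentions (hcsubmit107), and the column layout is the stock master.cf format.

```python
# (submission port, transport name, bind IP, HELO name), per this plan.
HC_STREAMS = [
    (2526, "hcout1", "207.174.124.107", "hcmta01.performancewest.net"),
    (2527, "hcout2", "207.174.124.108", "hcmta02.performancewest.net"),
    (2528, "hcout3", "207.174.124.109", "hcmta03.performancewest.net"),
]


def render_master_cf(streams) -> str:
    """Render one smtpd listener plus one bound smtp transport per hc stream."""
    out = []
    for port, transport, ip, helo in streams:
        last_octet = ip.rsplit(".", 1)[1]
        out += [
            f"{port}      inet  n  -  n  -  -  smtpd",
            f"  -o syslog_name=postfix/hcsubmit{last_octet}",   # assumed naming
            f"  -o content_filter={transport}:",
            f"{transport}    unix  -  -  n  -  -  smtp",
            f"  -o syslog_name=postfix/{transport}",            # assumed naming
            f"  -o smtp_bind_address={ip}",
            f"  -o smtp_helo_name={helo}",
        ]
    return "\n".join(out) + "\n"


print(render_master_cf(HC_STREAMS))
```

Because each port pins mail to exactly one transport, and each transport binds exactly one source IP, a message submitted on an hc port cannot fall through to the trucking rotation.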
REMAINING before any healthcare send (manual, needs Justin/DNS)
- PTR / FCrDNS for the hc IPs: ✅ DONE 2026-06-06. .107->hcmta01, .108->hcmta02, .109->hcmta03 (.performancewest.net), plus matching forward A records, verified resolving on the authoritative NS AND HE.net secondaries (SOA serial in sync). FCrDNS confirmed both ways.

  How (for future reference): HestiaCP box cp.carrierone.com = 207.174.124.22, SSH port 22022 (not 22). admin@ is sftp-only, but root@.22:22022 accepts our default ~/.ssh/id_ed25519 → full shell + Hestia CLI. Forward zone performancewest.net and reverse zone 124.174.207.in-addr.arpa are both owned by Hestia user justin; HE.net auto-zone-transfers (secondaries). Commands used:

      export PATH=$PATH:/usr/local/hestia/bin
      # forward A: USER DOMAIN RECORD TYPE VALUE
      v-add-dns-record justin performancewest.net hcmta01 A 207.174.124.107
      # reverse PTR: USER REVZONE OCTET PTR FQDN. "" "" <restart yes/no>
      v-add-dns-record justin 124.174.207.in-addr.arpa 107 PTR hcmta01.performancewest.net. "" "" yes
      v-delete-dns-record justin 124.174.207.in-addr.arpa <ID> no   # remove stale
      v-rebuild-dns-domain justin 124.174.207.in-addr.arpa          # bump serial

  (Also removed pre-existing duplicate mta18-20 PTRs in the reverse zone.) NOTE: the workers' hestia_provisioner.py path (admin@:22 + mounted key) remains unfinished/unused; the working path is root@:22022 with our key.
- SPF/DKIM/DMARC: ✅ VERIFIED 2026-06-06. SPF already authorizes .107/.108/.109 explicitly and ends -all (only 2 DNS-lookup mechanisms, a mx; safe under the 10 limit). DKIM selector mail published (2048-bit). DMARC p=quarantine; pct=100; rua=dmarc@. All domain-level, no change needed.
- Install on prod: ✅ DONE 2026-06-06.
  - Postfix hc stream already live on the app host (Postfix is co-located): ports 2526/2527/2528 → content_filter=hcout1/2/3: → smtp_bind_address .107/.108/.109 + HELO hcmta01/02/03. Verified in master.cf.
  - listmonk_hc DB existed (owner pw, was empty); ran docker compose run --rm --entrypoint /bin/sh listmonk-hc -c './listmonk --install --idempotent --yes --config /listmonk/config.toml' → 16 tables, superadmin api created.
  - docker compose up -d listmonk-hc → container Up, :9101 → 200.
  - 3 SMTP servers configured directly in the listmonk_hc.settings table (the env-installed admin is a UI user, not an API-token user, so the REST API rejects basic-auth; DB update is the clean path). Each points at 172.18.0.1:2526/2527/2528 (docker bridge gateway → host Postfix hc ports), auth_protocol=none, tls_type=none, max_conns=2, hello_hostname=hcmta0N. Restart loaded "3 SMTP messengers".
  - End-to-end validated: submitted one probe through each of 2526/2527/2528; maillog shows each routed via its own hcout1/2/3, established a Trusted TLS connection to gmail-smtp-in.l.google.com:25, and got a genuine Gmail 550-5.1.1 NoSuchUser (expected for the dummy recipient), i.e. no PTR/SPF/reputation rejection, FCrDNS accepted from all 3 hc IPs.
  - ✅ pw-hc-rampcap installed at /usr/local/bin/ + /etc/cron.d/pw-hc-rampcap (daily 07:20, mirrors the trucking rampcap). The hc warmup stamp /etc/postfix/hc-warmup-start exists (created by hc_stream_setup.sh), so the ramp is on day 0 → cap 100/h (sliding window, 1h). Ramps to 1000/h by day 10. Nothing sends until a list is imported.
- Verify identity: ⚠️ PARTIAL. The live-send probes already prove Gmail accepts mail from .107/.108/.109 with no PTR/SPF/reputation rejection (only the dummy-recipient 550 NoSuchUser). Still worth a mail-tester.com / aboutmy.email run from an hc IP (send to their probe address through listmonk-hc) to confirm the numeric score (DKIM-signed, DMARC aligned, content spamassassin score) BEFORE the first real batch. Not started.
- Free MX+SMTP verify the institutional CSV on a non-sending IP, import the verified file into listmonk-hc, send small focused batches (overdue-first).
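The "safe under the 10 limit" note in the SPF item above refers to the cap on DNS-querying terms per SPF evaluation. A minimal counter for the top level of a record is sketched below. The record string is a hypothetical stand-in shaped like the description (a, mx, explicit ip4 entries for the hc IPs, -all), not the published record, and the function does not recurse into include: targets.

```python
# Terms that cost a DNS lookup during SPF evaluation: these five mechanisms
# plus the redirect= modifier. ip4, ip6 and all cost nothing.
LOOKUP_MECHANISMS = {"include", "a", "mx", "ptr", "exists"}
SPF_LOOKUP_LIMIT = 10


def spf_dns_lookups(record: str) -> int:
    """Count top-level DNS-querying terms in an SPF record string."""
    count = 0
    for term in record.split()[1:]:              # skip the "v=spf1" version tag
        lowered = term.lower()
        if lowered.startswith("redirect="):
            count += 1
            continue
        # Strip the qualifier, then any ":domain" or "/cidr" suffix.
        name = lowered.lstrip("+-~?").split(":", 1)[0].split("/", 1)[0]
        if name in LOOKUP_MECHANISMS:
            count += 1
    return count


# Hypothetical record for illustration only.
record = "v=spf1 a mx ip4:207.174.124.107 ip4:207.174.124.108 ip4:207.174.124.109 -all"
used = spf_dns_lookups(record)
print(used, "of", SPF_LOOKUP_LIMIT, "lookups used")
```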