The 3 IPs (mta02-04 / .91-.93) retired after the May 30-31 over-volume blast are
NOT on any DNSBL (Spamhaus/Barracuda/SpamCop/SORBS all clean) and have clean PTRs
+ SPF/DKIM/DMARC -- the damage was provider-internal reputation, which recovers
with slow clean sending. scripts/ip_rehab.py sends a tiny ramping trickle
(10/IP/day -> cap 60) of genuine CAN-SPAM-compliant compliance check-in mail to
clean business-domain, never-bounced recipients via dedicated heavily-throttled
postfix transports rehab02/03/04 (30s/msg, bound to .91/.92/.93). Routing uses an
X-PW-Rehab-IP header + header_checks FILTER to override the transport_maps randmap
warmup rotation (verified: mail routes via rehab transports, status=sent). Daily
cron pw-ip-rehab. After ~2-3 weeks of clean sending the IPs can be reallocated.
The HC warmup builder ran from cron at 07:00 but the >> /var/log/pw-hc-campaign.log
redirect failed (deploy user cannot write /var/log), and a failed output redirect
makes cron abort the command BEFORE it runs -> HC sent 0/day since the log file was
removed. Route HC cron logs to /opt/performancewest/logs/ (deploy-owned) so the
redirect always succeeds. Builder itself was fine (verified: imports + sends work,
0 bounces). Also removed the stale 'campaign-warmup.sh 122' root-cron line that
pointed at a finished campaign + no longer existed.
End-of-day (20:00 Central) check of campaign deliverability across both sending
pools (main out05-09 + healthcare hcout). Sends a Telegram alert ONLY when there
is a reputation problem -- delivery below 65% or a spam/policy-block (550-5.7.1)
spike above 150/day -- so healthy days stay silent. Reuses the existing
TELEGRAM_BOT_TOKEN/CHAT_ID from /opt/performancewest/.env. Logs every run to
/var/log/pw-warmup-healthcheck.log for history. Excludes internal/probe noise so
the delivery figure reflects real external recipients.
The 'already revalidated' replies come from the CMS data-lag window (a provider
completes their revalidation but CMS's public Due Date List still shows them
overdue for weeks). Running the refresh 3x/week instead of weekly shrinks that
window from up to 7 days to ~2-3, so a provider who just completed stops being
targeted faster. No change to the overdue window or audience size -- this is the
lever that reduces stale-data complaints without losing prospects.
Belt-and-suspenders for the edge you flagged: a domain already in a warmup list
could flip its MX to Google Workspace between weekly refreshes, after which it
would hard-bounce from the cold IP. The import-time guard only catches NEW adds.
- prune_holdouts(): enumerates each warmup list's subscribers, matches them
against the FRESH master CSV (re-classified weekly), and removes any whose
domain is now Google-hosted. DELIVERABILITY-ONLY -- it never evicts for
audience reasons (an overdue provider drifting out of the 1-90 day window was
a valid target when warmed; re-litigating that just wastes warmup progress).
- --prune (run alongside warming) and --prune-only (prune then exit).
- Wired into the weekly refresh cron as a --prune-only chained step, so MX is
re-checked and holdouts removed every Monday before the weekday sends.
Verified end-to-end: with no Google domains in lists it's a 0-op; injecting a
simulated Google-flipped domain into the master, the prune correctly detects and
(in a real run) would remove it from every list it's on.
Mailing heavily-overdue NPIs (months/years past due) risks hitting practices
that have closed, merged, or abandoned the inbox -> hard bounces, which are the
fastest way to wreck a warming IP's reputation. The warmup now restricts the
reval_overdue selector to an inclusive [HC_OVERDUE_MIN, HC_OVERDUE_MAX] window
(default 1-90 days) and the OIG 'any' selector likewise excludes heavily-overdue
and dropped-off-list rows. On the current cohort this trims the overdue audience
178->96 and the OIG audience 399->317, holding out the stale long tail
(181-365d + 366d+). upcoming/active providers are unaffected.
Tracks the deployed cron.d files in the repo:
- pw-hc-campaign: updated comment to reflect the now multi-segment warmup
(revalidation + OIG + NPPES + reactivation + bundle); command unchanged.
- pw-hc-refresh (NEW): Mon 06:00 Central weekly data refresh, ~1h before the
07:00 weekday send, so every send uses fresh CMS/OIG status.