mail: close MX-exclusion gaps — exclude consumer mx: operators + add mx-tag cron
Fix 1 (build_trucking_campaigns.py): the warmup big-MX exclusion only covered the clean-label operators (google/microsoft/proofpoint/...). Consumer mailbox operators that mx_tag_carriers.py labels with an "mx:" prefix slipped BOTH the exclusion and the per-MX throttle -- notably mx:yahoodns.net (283k sendable carriers = Yahoo Small Business/AOL custom domains) and mx:icloud.com (25k), plus comcast/charter/centurylink/windstream/tds/earthlink. These are custom domains whose MX points at a consumer provider, invisible to the literal-domain blocklist.

Added CONSUMER_MX_OPERATORS, folded into WARMUP_EXCLUDE_OPERATORS used by both the fetch_carriers() exclusion SQL and mx_daily_caps() (same day-30 ramp). Behind the existing MAIN_SKIP_BIG_MX switch.

Validated read-only: after the fix the warmup-eligible pool is 353,909 carriers (315,892 untagged + ~38k genuinely small/self-hosted operators), so the long tail still sustains the daily quota -- not starved -- while 0 consumer-MX carriers are selected during warmup.

Fix 3 (infra/cron/pw-mx-tag): mx_tag_carriers.py was on no cron, so the untagged (NULL) backlog (~316k) never drained and new FMCSA imports stayed untagged, slowly re-opening the gap. Added a daily 05:45 UTC cron (--only-unsent --limit-domains 20000), before the 08:00 builder. Idempotent/bounded (only tags mx_provider IS NULL). Verified live: a 200-domain test run tagged 216 domains.

(Fix 2 -- bounding the NULL bucket cap -- deferred; the cron will drain it.)
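The exclusion described in Fix 1 can be sketched roughly as follows. This is a minimal illustration, not the code in build_trucking_campaigns.py: only the constant names CONSUMER_MX_OPERATORS and WARMUP_EXCLUDE_OPERATORS and the mx_provider column come from the commit message. BIG_MX_OPERATORS, warmup_exclusion_clause(), is_warmup_eligible(), and the "%s" placeholder style are assumptions made for the example, and both operator sets are deliberately partial.

```python
# Hypothetical sketch; the operator lists are partial and the helper names
# are invented for illustration.
BIG_MX_OPERATORS = frozenset({"google", "microsoft", "proofpoint"})
CONSUMER_MX_OPERATORS = frozenset({"mx:yahoodns.net", "mx:icloud.com"})

# One combined set, so the exclusion and the throttle cannot drift apart.
WARMUP_EXCLUDE_OPERATORS = BIG_MX_OPERATORS | CONSUMER_MX_OPERATORS


def warmup_exclusion_clause(skip_big_mx=True):
    """Return (sql_fragment, params) for use inside a WHERE clause.

    Untagged rows (mx_provider IS NULL) are kept on purpose so the
    warmup pool is not starved. Assumes a driver with %s placeholders.
    """
    if not skip_big_mx:
        return "TRUE", []
    operators = sorted(WARMUP_EXCLUDE_OPERATORS)
    placeholders = ", ".join(["%s"] * len(operators))
    clause = f"(mx_provider IS NULL OR mx_provider NOT IN ({placeholders}))"
    return clause, operators


def is_warmup_eligible(mx_provider, skip_big_mx=True):
    """Pure-Python mirror of the SQL clause, handy for unit tests."""
    if not skip_big_mx or mx_provider is None:
        return True
    return mx_provider not in WARMUP_EXCLUDE_OPERATORS
```

The explicit IS NULL branch matters: in SQL, `NULL NOT IN (...)` evaluates to unknown rather than true, so without it every untagged carrier would silently drop out of the pool.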
parent 285a4a087c
commit 9eeed47c4b
2 changed files with 48 additions and 9 deletions
infra/cron/pw-mx-tag (new file, 19 lines)
@@ -0,0 +1,19 @@
# Daily MX tagging for the FMCSA carrier audience. Resolves the MX records of
# carrier email domains that don't yet have an mx_provider and stores the
# receiving operator (google/microsoft/proofpoint/... or an "mx:<root>" label for
# everything else). The trucking campaign builder uses mx_provider to EXCLUDE the
# big + consumer mailbox operators during warmup (Google/MS/Yahoo/iCloud/... all
# throttle or complaint-block a cold IP) and to per-operator-throttle the rest.
#
# WHY a cron: mx_provider was previously only tagged by hand, so the untagged
# (NULL) backlog never drained (~316k sendable carriers on 2026-06-20) and every
# new FMCSA census import landed untagged. Untagged carriers are KEPT in the warmup
# pool (anti-starvation), so an untagged Google/Yahoo domain can slip the exclusion
# until it's tagged. Running daily keeps the audience classified so the warmup
# exclusion stays effective. Idempotent + bounded: only resolves domains where
# mx_provider IS NULL, capped at --limit-domains/run.
#
# Runs 05:45 UTC, before the 06:10 reputation monitor / 06:20 DMARC / 06:30 scrub
# and well before the 08:00 trucking builder, so the day's send sees fresh tags.
# --only-unsent prioritizes carriers the builder will actually mail.
45 5 * * * deploy cd /opt/performancewest && docker compose exec -T workers python3 -m scripts.mx_tag_carriers --only-unsent --limit-domains 20000 >> /var/log/pw-mx-tag.log 2>&1
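For readers unfamiliar with the labelling scheme the cron comment refers to, here is a rough sketch of how an MX hostname might be mapped to either a clean operator label or an "mx:<root>" label. It is not the logic in mx_tag_carriers.py. The suffix table, mx_root(), and classify_mx() are hypothetical, and the "last two DNS labels" rule is a simplification that would mis-handle multi-part public suffixes such as co.uk.

```python
# Hypothetical suffix table: MX root domain -> clean operator label.
KNOWN_OPERATOR_SUFFIXES = {
    "google.com": "google",
    "googlemail.com": "google",
    "outlook.com": "microsoft",
    "pphosted.com": "proofpoint",
}


def mx_root(host):
    """Naive root: the last two DNS labels, lowercased, trailing dot removed."""
    labels = host.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def classify_mx(host):
    """Return a clean label for known operators, else an 'mx:<root>' label."""
    root = mx_root(host)
    return KNOWN_OPERATOR_SUFFIXES.get(root, f"mx:{root}")
```

Under this scheme a custom domain hosted at a consumer provider ends up with a label such as mx:yahoodns.net, which is why a blocklist of clean labels alone cannot catch it.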