feat(deliverability): burner-domain list verification + plan doc
The smtp_valid pool is only ~3k unsent, too small to sustain campaigns. SMTP
probing can't confirm catch-all/mx_unreachable deliverability; only a REAL send
can.

burner_list_verify.py reconciles a verification send from a DISPOSABLE burner
domain (isolated from PW/carrierone reputation):
- hard bounce -> fmcsa_carriers.email_verify_result='hard_bounced' (excluded)
- delivered -> 'send_confirmed' (proven deliverable; PW campaigns send to it)

It tails the burner MTA mail.log (reuses bounce-watcher's status= pattern) and
writes back idempotently. The PW trucking filter now treats smtp_valid +
send_confirmed as sendable.

docs/campaign-deliverability-plan.md captures the full diagnosis, the burner
design, and CAN-SPAM guardrails.

Remaining (needs a domain + isolated MTA identity; operator/infra decision):
stand up the burner domain, the verification-send worker, and a writeback cron.
parent 1652a3b8bc
commit c2737f2001
3 changed files with 258 additions and 5 deletions
94
docs/campaign-deliverability-plan.md
Normal file
@@ -0,0 +1,94 @@
# Campaign Deliverability: Diagnosis & List-Verification Plan

_Created 2026-06-17 after trucking conversions went to zero._

## TL;DR

Trucking conversions stopped on **June 9** not because campaigns stopped sending
(they send ~2,400/day with ~1,800 opens/3 days) but because a **filter bug was
blasting ~438k dead `mx_unreachable` domains**, producing a **~47% hard-bounce
rate (~1,100/day)** that blocklisted **half the 120k subscriber base** and
torched sender reputation, so real prospects never saw the offer.

- **Fixed** (`build_trucking_campaigns.py`): send filter now keys only off
  `email_verify_result` (never the broken `email_verified` boolean), and defaults
  to **recovery mode = `smtp_valid` only** until reputation recovers. Set
  `CAMPAIGN_INCLUDE_CATCH_ALL=1` to re-add catch-all domains afterward.
- **Healthcare is fine**: separate instance (`listmonk-hc` / DB `listmonk_hc`),
  cleaned list (`clean_hc_warmup_list.py` already drops `mx_unreachable`), bounce
  rate ~2-3%. No change needed; its low bounce rate on a list without
  `mx_unreachable` supports the fix.

## Why the SMTP-probe verification under-counts deliverable addresses

`email_verifier.py` does syntax → MX → SMTP `RCPT TO`. Results:

| result | count | sendable? | why |
|---|---|---|---|
| `catch_all_domain` | 1,082,817 | risky | domain accepts ALL rcpts at SMTP time, then may bounce later |
| `mx_unreachable` | 438,163 | **NO** | MX exists but never answered the probe; **hard-bounces on real send** |
| `smtp_valid` | 11,774 | **YES** | an MX explicitly accepted this exact mailbox |
| `no_mx_records` / `invalid_syntax` / `smtp_rejected_550` | ~46k | no | dead |

The probe can only *confirm* a mailbox on non-catch-all domains that answer the
RCPT handshake, which is a small slice. Only **~3,042 `smtp_valid` are still
unsent**, so recovery mode will exhaust the clean pool in ~1 day. **We need a way
to grow the verified-deliverable list without burning PW's reputation.**

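The "~1 day" estimate can be checked with a quick calculation. This is only a sketch: it assumes recovery mode keeps sending at the ~2,400/day volume quoted in the TL;DR, which this plan does not state explicitly.

```python
# Figures taken from this document; the constant daily volume is an assumption.
unsent_smtp_valid = 3_042   # smtp_valid addresses not yet sent to
daily_send_volume = 2_400   # approximate current sends per day

runway_days = unsent_smtp_valid / daily_send_volume
print(f"clean pool lasts about {runway_days:.1f} days")  # about 1.3 days
```
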
## The real fix: burner-domain bounce verification

SMTP-probe verification is unreliable (catch-alls mask validity; many MTAs refuse
probes but accept real mail). The only ground truth is to **actually send a
message and see if it bounces.** But doing that from PW's domain is what got us
here. So:

### Design

1. **Dedicated throwaway verification domain** (NOT performancewest.net and NOT
   carrierone.com; both are reputation assets we must protect). Register a cheap
   neutral `.com` via Porkbun (we already have the Porkbun integration). Give it
   its own SPF/DKIM/DMARC and a dedicated sending IP/identity (separate postfix
   instance or a transactional provider sub-account that isolates reputation).

2. **Send a low-key, CAN-SPAM-compliant, non-commercial verification email** to
   the unverified pool (e.g. a plain "is this the right contact for <DOT#>?" or a
   bland newsletter-style note with a working unsubscribe). It must be a real,
   legitimate message, never deceptive, but its ONLY purpose is to elicit a
   delivered-vs-bounced signal. Throttled and warmed like any send.

3. **Catch bounces from that domain's own MTA log** (reuse `bounce-watcher.sh`'s
   `status=bounced` tail pattern) and **write the result back to
   `fmcsa_carriers.email_verify_result`**:
   - delivered (no bounce within N hours) → upgrade to a new `send_confirmed`
     result that the PW campaign filter treats as sendable.
   - hard-bounced → mark `hard_bounced`, permanently excluded from PW sends.

4. **PW campaigns then send only to `smtp_valid` + `send_confirmed`** (addresses
   proven deliverable by a real send), keeping PW's bounce rate near zero.

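The "no bounce within N hours" rule in step 3 can be sketched as follows. This is an illustration under stated assumptions, not the logic of `burner_list_verify.py`: the `classify` helper, the `(timestamp, recipient, status)` event tuples, the `pending` label, and the 48-hour default for N are all hypothetical.

```python
from datetime import datetime, timedelta


def classify(events, now, grace_hours=48):
    """Map each recipient to 'hard_bounced', 'send_confirmed', or 'pending'.

    events: iterable of (timestamp, recipient, status) tuples already parsed
    from the burner MTA log (hypothetical shape). grace_hours is the "N" from
    the design; 48 is an assumed placeholder.
    """
    grace = timedelta(hours=grace_hours)
    last_sent = {}
    bounced = set()
    for ts, rcpt, status in events:
        rcpt = rcpt.lower()
        if status == "bounced":
            bounced.add(rcpt)
        elif status == "sent":
            # Keep the most recent successful hand-off per recipient.
            last_sent[rcpt] = max(ts, last_sent.get(rcpt, ts))

    # A bounce always wins over a "sent" for the same recipient.
    results = {rcpt: "hard_bounced" for rcpt in bounced}
    for rcpt, ts in last_sent.items():
        if rcpt in bounced:
            continue
        # Only confirm once the grace window has passed without a bounce.
        results[rcpt] = "send_confirmed" if now - ts >= grace else "pending"
    return results


now = datetime(2026, 6, 20, 12, 0)
events = [
    (datetime(2026, 6, 17, 9, 0), "a@example.com", "sent"),
    (datetime(2026, 6, 20, 11, 0), "b@example.com", "sent"),
    (datetime(2026, 6, 17, 9, 5), "c@example.com", "sent"),
    (datetime(2026, 6, 17, 9, 6), "C@example.com", "bounced"),
]
print(classify(events, now))
```

Recipients still inside the window stay `pending` and would simply be re-evaluated on the next run, so nothing is written back for them.
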
### Why a separate domain/IP

Reputation is per sending-domain + per-IP. If the burner domain gets blocklisted
from the inevitable bounces during scrubbing, **PW and carrierone are untouched.**
The burner is disposable: if it burns, rotate to a new one. PW only ever sends to
the cleaned output.

### Compliance guardrails (must-haves)

- Real **CAN-SPAM** compliance: truthful from/subject, physical address, working
  one-click unsubscribe, honor opt-outs immediately (sync opt-outs back to PW's
  suppression list too).
- **Not deceptive**: the email is a genuine message (these are public FMCSA
  business contacts for B2B outreach), not a fake/pretext. The bounce signal is a
  byproduct, not a trick.
- Suppress anyone who ever bounced or opted out from ALL future sends (burner and
  PW).

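The last guardrail amounts to one shared suppression set applied to every send. A minimal sketch, assuming the bounce and opt-out lists are available as plain collections of addresses; `build_suppression` and `filter_sendable` are hypothetical helper names, not part of the existing scripts:

```python
def build_suppression(bounced, burner_opt_outs, pw_opt_outs):
    """Union of everyone who bounced or opted out, normalized to lowercase."""
    groups = (bounced, burner_opt_outs, pw_opt_outs)
    return {email.strip().lower() for group in groups for email in group}


def filter_sendable(candidates, suppression):
    """Drop any candidate address that appears in the suppression set."""
    return [e for e in candidates if e.strip().lower() not in suppression]


suppression = build_suppression(
    bounced=["dead@example.com"],
    burner_opt_outs=["Optout@Example.com"],
    pw_opt_outs=["stop@example.com"],
)
sendable = filter_sendable(
    ["ok@example.com", "optout@example.com", "DEAD@example.com"],
    suppression,
)
print(sendable)  # ['ok@example.com']
```

Applying the same set to both the burner and PW sends keeps an opt-out on one domain from being mailed again by the other.
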
## Status / next steps

- [x] Fix the PW trucking send filter (drop `mx_unreachable`; recovery mode).
- [x] Confirm healthcare unaffected.
- [ ] Add `send_confirmed` / `hard_bounced` result handling to the campaign
      filter + a writeback path from bounce processing.
- [ ] Stand up the burner verification domain + isolated MTA identity.
- [ ] Build the verification-send + bounce-writeback worker.
- [ ] Re-verify the `catch_all_domain` + `mx_unreachable` pools through the burner
      to grow the PW-sendable list.
@@ -339,11 +339,13 @@ REPLY_TO_HEADERS = [{"name": "Reply-To", "value": REPLY_TO_EMAIL}]
 # blocklisted). So 'mx_unreachable' and all error/reject results are excluded.
 #
 # Recovery mode (default ON while reputation is damaged): send ONLY 'smtp_valid'
-# (addresses an MX explicitly accepted at RCPT time) to drive the bounce rate
-# to near-zero and rebuild sender reputation. Once recovered, set
-# CAMPAIGN_INCLUDE_CATCH_ALL=1 to re-add catch-all domains (which accept at SMTP
-# time but can still bounce later, so they stay out during recovery).
-_SENDABLE_RESULTS = ["smtp_valid"]
+# (addresses an MX explicitly accepted at RCPT time) plus 'send_confirmed'
+# (addresses proven deliverable by a real burner-domain verification send; see
+# docs/campaign-deliverability-plan.md). This drives the bounce rate to near-zero
+# and rebuilds sender reputation. Once recovered, set CAMPAIGN_INCLUDE_CATCH_ALL=1
+# to re-add catch-all domains (which accept at SMTP time but can still bounce
+# later, so they stay out during recovery). 'hard_bounced' is NEVER sendable.
+_SENDABLE_RESULTS = ["smtp_valid", "send_confirmed"]
 if os.getenv("CAMPAIGN_INCLUDE_CATCH_ALL", "0") not in ("0", "false", ""):
     _SENDABLE_RESULTS += ["catch_all_domain", "catch_all_detected"]
 USABLE_FILTER = (
157
scripts/burner_list_verify.py
Normal file
@@ -0,0 +1,157 @@
#!/usr/bin/env python3
"""Burner-domain list verification: write deliverability back to fmcsa_carriers.

The SMTP-probe verifier (email_verifier.py) can't tell which catch-all /
mx_unreachable addresses actually deliver. The only ground truth is a REAL send.
We do that from a disposable burner sending domain (NOT performancewest.net /
carrierone.com; see docs/campaign-deliverability-plan.md) so the inevitable
bounces never touch PW's reputation. This script reconciles that send:

  1. Scan the burner MTA's mail.log for messages FROM the burner sender.
  2. Any recipient that hard-bounced -> fmcsa_carriers.email_verify_result =
     'hard_bounced' (permanently excluded from PW campaigns).
  3. Any recipient that was DELIVERED (status=sent, no later bounce) and is not
     already smtp_valid -> 'send_confirmed' (proven deliverable; the PW
     campaign filter treats smtp_valid + send_confirmed as sendable).

Idempotent: only upgrades rows whose result is listed in UPGRADABLE
('catch_all_*', 'mx_unreachable', temporary SMTP errors) or NULL to
'send_confirmed', and only sets 'hard_bounced' on a real bounce. Never downgrades
an already-confirmed address except to mark a genuine bounce.

Usage:
    python3 -m scripts.burner_list_verify --log /var/log/burner-mail.log
    python3 -m scripts.burner_list_verify --log mail.log --dry-run
"""
from __future__ import annotations

import argparse
import os
import re
import sys

import psycopg2

DATABASE_URL = os.getenv("DATABASE_URL", "")

# Sender(s) used by the burner verification campaign. Override via env when the
# burner domain is provisioned (e.g. BURNER_SENDERS="verify@listcheck-xyz.com").
BURNER_SENDERS = {
    s.strip().lower()
    for s in os.getenv("BURNER_SENDERS", "").split(",")
    if s.strip()
}

QID_RE = re.compile(r"postfix/\w+\[\d+\]: ([A-Z0-9]+):")
FROM_RE = re.compile(r"from=<([^>]*)>")
TO_RE = re.compile(r"to=<([^>]*)>")
STATUS_RE = re.compile(r"status=(\w+)")

# Results we are allowed to UPGRADE to 'send_confirmed'. We never overwrite an
# explicit smtp_valid (already best) or a hard_bounced (worse signal wins).
UPGRADABLE = ("catch_all_domain", "catch_all_detected", "mx_unreachable",
              "smtp_temp_error", "smtp_unknown_451", "smtp_unknown_450")


def scan_log(log_path: str) -> tuple[set[str], set[str]]:
    """Return (delivered_emails, bounced_emails) for burner-sender messages."""
    if not BURNER_SENDERS:
        print("ERROR: set BURNER_SENDERS (e.g. verify@your-burner-domain.com)",
              file=sys.stderr)
        return set(), set()

    burner_qids: set[str] = set()
    delivered: set[str] = set()
    bounced: set[str] = set()

    with open(log_path, errors="ignore") as f:
        for line in f:
            qm = QID_RE.search(line)
            if not qm:
                continue
            qid = qm.group(1)

            # Remember queue IDs whose envelope sender is a burner address.
            fm = FROM_RE.search(line)
            if fm and fm.group(1).lower() in BURNER_SENDERS:
                burner_qids.add(qid)

            # Delivery lines carry both to=<...> and status=...; count only
            # those that belong to a burner-sender queue ID.
            tm = TO_RE.search(line)
            sm = STATUS_RE.search(line)
            if tm and sm and qid in burner_qids:
                rcpt = tm.group(1).lower()
                status = sm.group(1).lower()
                if status == "bounced":
                    bounced.add(rcpt)
                elif status == "sent":
                    delivered.add(rcpt)

    # A bounce anywhere wins over a "sent" (deferred-then-bounced).
    delivered -= bounced
    return delivered, bounced


def writeback(delivered: set[str], bounced: set[str], dry_run: bool = False) -> dict:
    """Apply send_confirmed / hard_bounced to fmcsa_carriers."""
    stats = {"confirmed": 0, "bounced": 0}
    if not (delivered or bounced):
        return stats
    if dry_run:
        # Dry run: report candidate counts without opening a DB connection.
        stats["confirmed"] = len(delivered)
        stats["bounced"] = len(bounced)
        return stats
    conn = psycopg2.connect(DATABASE_URL)
    try:
        with conn.cursor() as cur:
            # Hard bounces: always mark (worst signal wins), excludes from PW sends.
            for email in bounced:
                cur.execute(
                    """UPDATE fmcsa_carriers
                       SET email_verify_result = 'hard_bounced',
                           email_verified = FALSE
                       WHERE lower(email_address) = %s
                         AND email_verify_result IS DISTINCT FROM 'hard_bounced'""",
                    (email,),
                )
                stats["bounced"] += cur.rowcount
            # Delivered: upgrade soft/unknown results to send_confirmed.
            for email in delivered:
                cur.execute(
                    """UPDATE fmcsa_carriers
                       SET email_verify_result = 'send_confirmed',
                           email_verified = TRUE
                       WHERE lower(email_address) = %s
                         AND (email_verify_result IN %s OR email_verify_result IS NULL)""",
                    (email, UPGRADABLE),
                )
                stats["confirmed"] += cur.rowcount
        conn.commit()
    finally:
        conn.close()
    return stats


def main() -> int:
    ap = argparse.ArgumentParser()
    ap.add_argument("--log", default="/var/log/burner-mail.log",
                    help="burner MTA mail.log to scan")
    ap.add_argument("--dry-run", action="store_true")
    args = ap.parse_args()

    # Fail loudly on a missing sender config so a cron run does not exit 0
    # after silently scanning nothing.
    if not BURNER_SENDERS:
        print("ERROR: set BURNER_SENDERS (e.g. verify@your-burner-domain.com)",
              file=sys.stderr)
        return 1
    if not os.path.exists(args.log):
        print(f"log not found: {args.log}", file=sys.stderr)
        return 1
    delivered, bounced = scan_log(args.log)
    print(f"burner scan: {len(delivered)} delivered, {len(bounced)} bounced")
    stats = writeback(delivered, bounced, dry_run=args.dry_run)
    tag = "[dry-run] " if args.dry_run else ""
    print(f"{tag}writeback: send_confirmed +{stats['confirmed']}, "
          f"hard_bounced +{stats['bounced']}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())