new-site/docs/campaign-deliverability-plan.md

# Campaign Deliverability — Diagnosis & List-Verification Plan

_Created 2026-06-17 after trucking conversions went to zero._

## TL;DR

Trucking conversions stopped on **June 9**. Campaigns never stopped sending
(they still send ~2,400/day, with ~1,800 opens per 3 days). The cause was a
**filter bug that blasted ~438k addresses on dead `mx_unreachable` domains**,
producing a **~47% hard-bounce rate (~1,100/day)**. That blocklisted **half the
120k subscriber base** and torched sender reputation, so real prospects never
saw the offer.

- **Fixed** (`build_trucking_campaigns.py`): send filter now keys only off
  `email_verify_result` (never the broken `email_verified` boolean), and defaults
  to **recovery mode = `smtp_valid` only** until reputation recovers. Set
  `CAMPAIGN_INCLUDE_CATCH_ALL=1` to re-add catch-all domains afterward.
- **Healthcare is fine**: separate instance (`listmonk-hc` / DB `listmonk_hc`),
  cleaned list (`clean_hc_warmup_list.py` already drops `mx_unreachable`), bounce
  rate ~2-3%. No change needed; a list that already excludes `mx_unreachable`
  bouncing at only ~2-3% is consistent with the fix being correct.
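
As a rough illustration of the fixed filter described above, the sketch below picks the sendable `email_verify_result` values from the environment. The helper names `sendable_results` and `is_sendable`, and the exact parsing of `CAMPAIGN_INCLUDE_CATCH_ALL`, are assumptions for illustration; this is not the code in `build_trucking_campaigns.py`.

```python
import os


def sendable_results(env=None):
    """Return the set of email_verify_result values that may be sent to.

    Hypothetical helper: recovery mode (the default) allows smtp_valid only;
    CAMPAIGN_INCLUDE_CATCH_ALL=1 re-adds catch_all_domain.
    """
    env = os.environ if env is None else env
    allowed = {"smtp_valid"}
    if env.get("CAMPAIGN_INCLUDE_CATCH_ALL") == "1":
        allowed.add("catch_all_domain")
    return allowed


def is_sendable(row, env=None):
    # Deliberately ignores row["email_verified"]: that boolean was the broken signal.
    return row.get("email_verify_result") in sendable_results(env)
```

With no override set, a row marked `email_verified = True` but `email_verify_result = "mx_unreachable"` is rejected, which is the behavior the fix is meant to guarantee.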

## Why the SMTP-probe verification under-counts deliverable addresses

`email_verifier.py` checks each address in order: syntax, then MX lookup, then an
SMTP `RCPT TO` probe. Result counts:

| result | count | sendable? | why |
|---|---|---|---|
| `catch_all_domain` | 1,082,817 | risky | domain accepts ALL rcpts at SMTP time, then may bounce later |
| `mx_unreachable` | 438,163 | **NO** | MX exists but never answered the probe — **hard-bounces on real send** |
| `smtp_valid` | 11,774 | **YES** | an MX explicitly accepted this exact mailbox |
| `no_mx_records` / `invalid_syntax` / `smtp_rejected_550` | ~46k | no | dead |

The probe can only *confirm* a mailbox on non-catch-all domains that answer the
RCPT handshake, which is a small slice of the list. Only **~3,042 `smtp_valid`
addresses are still unsent**, so at ~2,400 sends/day recovery mode will exhaust
the clean pool in a little over a day. **We need a way to grow the
verified-deliverable list without burning PW's reputation.**
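
To make the "small slice" and runway figures concrete, here is a back-of-the-envelope calculation using the counts from the table and the ~2,400/day volume from the TL;DR. The `dead_other` key is just a label for the combined last table row, and its 46,000 value is the table's approximation.

```python
# Counts copied from the results table; the last row is approximate (~46k).
counts = {
    "catch_all_domain": 1_082_817,
    "mx_unreachable": 438_163,
    "smtp_valid": 11_774,
    "dead_other": 46_000,  # no_mx_records / invalid_syntax / smtp_rejected_550
}
DAILY_SEND = 2_400         # approximate daily volume from the TL;DR
UNSENT_SMTP_VALID = 3_042  # smtp_valid addresses not yet sent to

total = sum(counts.values())
smtp_valid_share = counts["smtp_valid"] / total
days_of_runway = UNSENT_SMTP_VALID / DAILY_SEND

print(f"smtp_valid share of probed list: {smtp_valid_share:.2%}")
print(f"days until the clean pool is exhausted: {days_of_runway:.1f}")
```

Under these numbers the confirmed pool is well under 1% of the probed list, and the unsent remainder lasts a little over one day of sending.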

## The real fix: burner-domain bounce verification

SMTP-probe verification is unreliable (catch-alls mask validity; many MTAs refuse
probes but accept real mail). The only ground truth is to **actually send a
message and see if it bounces.** But doing that from PW's domain is what got us
here. So:

### Design

1. **Dedicated throwaway verification domain** (NOT performancewest.net and NOT
   carrierone.com — both are reputation assets we must protect). Register a cheap
   neutral `.com` via Porkbun (we already have the Porkbun integration). Give it
   its own SPF/DKIM/DMARC and a dedicated sending IP/identity (separate postfix
   instance or a transactional provider sub-account that isolates reputation).

2. **Send a low-key, CAN-SPAM-compliant, non-commercial verification email** to
   the unverified pool (e.g. a plain "is this the right contact for <DOT#>?" or a
   bland newsletter-style note with a working unsubscribe). It must be a real,
   legitimate message — never deceptive — but its ONLY purpose is to elicit a
   delivered-vs-bounced signal. Throttled and warmed like any send.

3. **Catch bounces from that domain's own MTA log** (reuse `bounce-watcher.sh`'s
   `status=bounced` tail pattern) and **write the result back to
   `fmcsa_carriers.email_verify_result`**:
   - delivered (no bounce within N hours) → upgrade to a new `send_confirmed`
     result that the PW campaign filter treats as sendable.
   - hard-bounced → mark `hard_bounced`, permanently excluded from PW sends.

4. **PW campaigns then send only to `smtp_valid` + `send_confirmed`** — addresses
   proven deliverable by a real send — keeping PW's bounce rate near zero.
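
Step 3 of the design above could be sketched roughly as follows. This is a minimal illustration, not `burner_list_verify.py`: it assumes Postfix-style delivery log lines (with `to=<...>`, `dsn=` and `status=` fields), and the function name `classify`, the `sent_at` mapping, and the 48-hour default for the plan's unspecified "N hours" are all hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches the recipient, DSN code, and status fields of a Postfix delivery log line.
LOG_RE = re.compile(
    r"to=<(?P<rcpt>[^>]+)>.*?dsn=(?P<dsn>\d\.\d+\.\d+).*?status=(?P<status>\w+)"
)


def classify(sent_at, log_lines, now, wait_hours=48):
    """Map each address to 'hard_bounced', 'send_confirmed', or None (pending).

    sent_at: dict of address -> datetime the verification email was sent.
    wait_hours is an assumed stand-in for the plan's "N hours".
    """
    hard = set()
    for line in log_lines:
        m = LOG_RE.search(line)
        # Only permanent failures (5.x.x) count as hard bounces.
        if m and m["status"] == "bounced" and m["dsn"].startswith("5."):
            hard.add(m["rcpt"].lower())

    results = {}
    for addr, when in sent_at.items():
        if addr.lower() in hard:
            results[addr] = "hard_bounced"
        elif now - when >= timedelta(hours=wait_hours):
            results[addr] = "send_confirmed"
        else:
            results[addr] = None  # too early to call
    return results
```

The returned values are what would be written back to `fmcsa_carriers.email_verify_result`. A log-based check like this only sees rejections recorded by the sending MTA itself; bounce messages that arrive later by mail would need separate handling.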

### Why a separate domain/IP

Reputation is tracked per sending domain and per IP. If the burner domain gets
blocklisted from the inevitable bounces during scrubbing, **PW and carrierone are
untouched.**
The burner is disposable: if it burns, rotate to a new one. PW only ever sends to
the cleaned output.

### Compliance guardrails (must-haves)

- Real **CAN-SPAM** compliance: truthful from/subject, physical address, working
  one-click unsubscribe, honor opt-outs immediately (sync opt-outs back to PW's
  suppression list too).
- **Not deceptive**: the email is a genuine message (these are public FMCSA
  business contacts for B2B outreach), not a fake/pretext. The bounce signal is a
  byproduct, not a trick.
- Suppress anyone who ever bounced or opted out from ALL future sends (burner and
  PW).
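
One way to meet the one-click unsubscribe guardrail is the header pair defined in RFC 8058. The sketch below builds such a message with Python's standard `email` package. The function name, the subject wording, and every address or URL passed in are placeholders for illustration, not part of any existing setup.

```python
from email.message import EmailMessage


def build_verification_email(to_addr, dot_number, from_addr,
                             unsubscribe_url, postal_address):
    """Build a plain-text message with RFC 8058 one-click unsubscribe headers.

    All values are supplied by the caller; nothing here comes from a real
    burner-domain configuration.
    """
    msg = EmailMessage()
    msg["From"] = from_addr
    msg["To"] = to_addr
    msg["Subject"] = f"Contact check for DOT {dot_number}"  # truthful, non-commercial
    # RFC 8058: an HTTPS List-Unsubscribe URI plus the one-click POST marker.
    msg["List-Unsubscribe"] = f"<{unsubscribe_url}>"
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
    msg.set_content(
        f"Hello,\n\nIs this the right contact for DOT {dot_number}?\n\n"
        f"Unsubscribe: {unsubscribe_url}\n{postal_address}\n"
    )
    return msg
```

The body repeats the unsubscribe link and the physical address so the message still satisfies the guardrails in clients that ignore the headers.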

## Status / next steps

- [x] Fix the PW trucking send filter (drop `mx_unreachable`; recovery mode).
- [x] Confirm healthcare unaffected.
- [x] Add `send_confirmed` / `hard_bounced` result handling to the campaign
      filter + a writeback path from bounce processing (`burner_list_verify.py`).
- [x] **Catch-all auto-rollout instead of the burner domain (2026-06-18).** After
      the DKIM signing fix landed, a root-cause classification of the 75k
      pre-fix bounces showed the damage was ~55% reputation/auth (which DKIM
      fixes) and only ~29% genuinely dead mailboxes. Catch-all domains accept
      every recipient at RCPT time by definition, so they do not return
      user-unknown rejections at send time (a later bounce is still possible).
      Bleeding that pool in directly, in warmed batches, is therefore far safer
      than standing up and warming a whole separate burner
      domain/IP/SPF/DKIM identity. The catch-all pool is now gated by an
      **automatic in-house rollout** in `build_trucking_campaigns.py`
      (`catch_all_enabled()`):
        - enables only when `warmup_day() >= CAMPAIGN_CATCH_ALL_MIN_DAY` (21)
          AND the **recent** (2-day) live campaign bounce rate is below
          `CAMPAIGN_CATCH_ALL_MAX_BOUNCE_PCT` (8%) on a trustworthy sample
          (>= 300 sent);
        - **auto-reverts** to the clean `smtp_valid`/`send_confirmed` pool on the
          next run if bounces spike back above the ceiling;
        - the window is deliberately SHORT, so that a past disaster (the Jun-16
          ~45% 7-day rate) cannot block the rollout forever and a fresh spike
          trips it fast;
        - `CAMPAIGN_INCLUDE_CATCH_ALL=1/0` still hard-overrides the auto decision.
      Applied uniformly to trucking + IFTA + UCR builders (`tc.usable_filter()`).
      The bounce-watcher continues to auto-suppress any individual hard bounces
      in real time, so PW's own bounce rate stays bounded during the rollout.
- [ ] ~~Stand up the burner verification domain + isolated MTA identity.~~
      **Dropped** -- superseded by the catch-all auto-rollout above (the burner
      was a panic-era design from before the DKIM fix + per-subscriber bounce
      tracking made an in-house controlled rollout safe). The `mx_probe_blocked`
      consumer-ISP pool (438k, highest dead-mailbox risk) is the only case where
      a burner would still help; revisit only if that pool is ever needed.
- [x] ~~Build the verification-send + bounce-writeback worker.~~ Not needed for
      catch-all (see above). `burner_list_verify.py` remains available if the
      `mx_probe_blocked` pool is ever scrubbed via a burner.
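
The automatic catch-all gate described in the checklist above can be sketched as a pure function. This is an illustration under stated assumptions, not the body of `catch_all_enabled()`: the name `catch_all_gate`, its signature, and the way the override and the 2-day counts are passed in are hypothetical. The defaults mirror the documented thresholds (day 21, 8%, 300 sent).

```python
def catch_all_gate(warmup_day, recent_sent, recent_bounced, override=None,
                   min_day=21, max_bounce_pct=8.0, min_sample=300):
    """Decide whether the catch-all pool may be included in this run.

    override mirrors CAMPAIGN_INCLUDE_CATCH_ALL: "1" forces on, "0" forces
    off, None (unset) defers to the automatic gate.
    recent_sent / recent_bounced cover the short (2-day) window only.
    """
    if override == "1":
        return True
    if override == "0":
        return False
    if warmup_day < min_day:
        return False  # still too early in the warmup
    if recent_sent < min_sample:
        return False  # sample too small to trust the bounce rate
    bounce_pct = 100.0 * recent_bounced / recent_sent
    return bounce_pct < max_bounce_pct
```

Because the decision is recomputed from the recent window on every run and keeps no state, the auto-revert comes for free: a spike above the ceiling turns the pool off on the next run, and it turns back on once the window is clean again.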