Compare commits

2 commits

Author SHA1 Message Date
justin
285a4a087c docs: plan to close MX-exclusion gaps in trucking warmup
Analysis-only plan (no code shipped). The trucking builder's warmup excludes
big receiving operators (Google/MS/Proofpoint/...) by mx_provider, but three
holes let throttling/consumer MX through during the day<=30 window:

1. Consumer operators tagged with the "mx:" prefix (mx:yahoodns.net = 283,113
   sendable carriers, mx:icloud.com = 24,985, comcast/charter/centurylink/...)
   are NOT in BIG_MX_OPERATORS, so they slip both the exclusion and the throttle.
   These are custom domains whose MX points at Yahoo/iCloud -- invisible to the
   literal-domain blocklist, only catchable via MX tagging. Biggest hole.
2. 315,892 untagged (NULL) sendable carriers are mailed without MX vetting (kept
   by design for anti-starvation, but uncapped).
3. mx_tag_carriers.py is on no cron, so the NULL backlog never drains and new
   FMCSA imports stay untagged -- slowly re-opening gaps 1 and 2.

Plan proposes: CONSUMER_MX_OPERATORS set folded into exclusion+throttle (behind
the existing MAIN_SKIP_BIG_MX switch), a bounded cap on the NULL bucket, and a
daily pw-mx-tag cron. Includes live numbers, validation steps (dry-run selector
diff, no sends), and open decisions (re-introduction ramp, permanent vs warmup-
only exclusion for Yahoo/iCloud custom domains).
2026-06-19 23:55:15 -05:00
justin
98364009b0 docs: international compliance expansion plan (UK/AU/IE/NZ) + vertical portability matrix
2026-06-19 10:44:11 -05:00
2 changed files with 432 additions and 0 deletions

@@ -0,0 +1,281 @@
# Plan — International Compliance-Services Expansion (UK / AU / IE / NZ)
_Drafted 2026-06-19. **Planning only — nothing implemented yet.** Builds on the
US FMCSA/DOT cold-email + filing model. Sister doc to
`docs/campaign-deliverability-plan.md`, `docs/foreign-incorporation-guide.md`,
and `docs/billing.md`._
## Goal
Replicate the US "regulatory-burden compliance services" model (sell filing /
renewal / monitoring services to small operators, acquired via legal unsolicited
B2B email) in other English-speaking markets that allow cold B2B email. Target
markets, ranked: **UK ⭐, Australia ⭐, Ireland, New Zealand.** (Canada & South
Africa excluded — opt-in-only marketing law; see deliverability doc / prior memo.)
This doc answers the **two blocking questions** before any market entry:
1. **Must we be incorporated / locally registered to legally sell this service
type in-region?** (and to send the marketing)
2. **Merchant processing**: how do we take ecommerce payments from in-region
customers, and how do we remit any **fees to the authorities** on their behalf?
Everything else (template localization, list sourcing, burner sending) is
deferred to follow-up docs once these two gates are cleared.
---
## Question 1 — Do we need a local entity?
Two separate legal tests. Don't conflate them.
- **(a) To OPERATE the business** (sell a filing/agent service to locals).
- **(b) To send the MARKETING** (cold B2B email into that country).
For (b), none of these countries requires a local entity: the anti-spam laws
(UK PECR, AU Spam Act, IE ePrivacy, NZ UEMA) regulate **conduct** (sender ID +
unsubscribe + B2B-consent basis), not where the sender is incorporated. A
foreign sender can lawfully email as long as it complies with those rules. So the
entity question is really about (a): operating, contracting, and getting paid.
| Market | Local entity legally required to operate? | Reality / why |
|---|---|---|
| **UK** | **No** (can trade as overseas entity), but **strongly advised** | A foreign company can sell services into the UK. BUT: (i) UK merchant acquirers/Stripe UK want a UK or EEA entity for GBP settlement + lower fees; (ii) **VAT registration** likely required (see Q2); (iii) credibility — UK SMEs distrust a US filing agent for their O-licence. A **UK Ltd** is cheap (~£12/yr Companies House) and removes all three frictions. **No UK-resident director required.** |
| **Australia** | **No** to sell remotely; **registration triggered** if "carrying on business in Australia" | Foreign co can sell in. Once you're "carrying on business in AU" you must register as a **foreign company with ASIC (ARBN)** OR form a local **Pty Ltd**. Pty Ltd is cleaner BUT **requires at least one director who ordinarily resides in Australia** (Corporations Act s201A) — this is the real blocker; needs a resident director / nominee service. GST registration required once turnover ≥ A$75k (see Q2). |
| **Ireland** | **No**, but EU-presence helps | Foreign (incl. UK post-Brexit, US) co can sell in. An **Irish Ltd requires at least one EEA-resident director** OR a **s137 non-resident bond** (~€25k insurance bond, ~€2k/yr). VAT registration required (Q2). If we already have a UK Ltd, can often sell into IE from UK without a separate IE entity. |
| **New Zealand** | **No**, low bar to localize | Foreign co can sell in. NZ company formation is fast/cheap BUT **requires one director living in NZ (or in Australia AND a director of an AU company)** — Companies Act 1993 s10. GST registration once turnover ≥ NZ$60k. Smallest market; defer. |
### Sharper question: does the *service itself* (acting as filing agent) require a license?
This is the one to verify per-vertical before launch — being someone's agent for
a government filing can be a regulated activity.
| Market | Filing-agent licensing for transport compliance? | Notes / open item |
|---|---|---|
| **UK** | **No license to be a paid agent**, but the **Transport Manager** role on an O-licence is statutory (must hold a CPC and be a real person of repute). We can sell **prep / monitoring / renewal admin**, and optionally broker **external Transport Manager CPC holders**, but we **cannot ourselves "be" the Transport Manager** without a qualified person. **VERIFY: don't market as providing the TM unless we contract real CPC holders.** |
| **AU** | **No agent license** for NHVR/CoR advisory or NHVAS prep. NHVAS auditors must be approved, but we'd sell prep, not audit. Low risk. |
| **IE** | Same as UK (EU-harmonized: Transport Manager CPC required on the operator licence). |
| **NZ** | Transport Service Licence has a "fit and proper person" + certificate requirements; advisory/prep is unregulated. |
**Recommendation (Q1):**
1. **UK first.** Form a **UK entity** (no resident director/member needed, cheap,
unlocks Stripe UK + GBP + VAT + credibility). Sell prep/monitoring/renewal;
partner with external CPC Transport Managers rather than claiming to be one.
**Entity choice — LLP (chosen) vs Ltd:** Going with a **UK LLP** for
**tax pass-through** (no corporation tax at entity level; profits taxed in
members' hands). Trade-offs to plan around:
- **LLP needs ≥2 members** (a Ltd can be a single person). Need a second
member/designated member.
- Pass-through is **not** zero UK tax: non-UK-resident members with **UK-source
trading profit** owe **UK self-assessment**; the LLP files a **partnership
return (SA800)**. So two layers of personal filing, not entity tax.
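- **US tax:** classification needs care. Because every member of a UK LLP has
limited liability, the US default for a foreign entity is likely **corporation**,
so **partnership** treatment probably requires a **check-the-box election
(Form 8832)** → flows to US members' returns; watch for extra US filing.
**VERIFY with a US tax adviser.**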
- **VAT obligation is identical to a Ltd** (see Q2). No saving there.
- **No UK-resident member required** for an LLP — good.
2. **AU second.** Start by **selling remotely as the existing entity** (legal) to
validate demand; only stand up a **Pty Ltd (needs resident-director nominee)**
or **ASIC ARBN** once revenue justifies it / once "carrying on business" is
triggered.
3. **IE / NZ** deferred — both need a resident-director or bond workaround and are
smaller; revisit after UK proves the playbook.
---
## Question 2 — Merchant processing & remitting fees to authorities
Two money flows, kept strictly separate (same separation we already enforce in
`docs/billing.md`: **our service fee** vs **government filing fee**):
- **Flow A — collect from customer** (ecommerce checkout, multi-currency).
- **Flow B — pay the authority** the actual government fee on the customer's behalf.
### Flow A — Collecting payment (merchant processing)
Today's stack (`api/src/routes/checkout.ts`): **Stripe Checkout (card + ACH),
PayPal Orders v2, SHKeeper crypto**; Stripe Subscriptions for recurring;
Adyen aspirational/not live. We extend this, we don't replace it.
| Market | Best acquiring approach | Currency / settlement | Notes |
|---|---|---|---|
| **UK** | **Stripe UK** under the new **UK Ltd** | Settle **GBP** to a UK/EEA business account (Wise, Airwallex, Revolut Business, or a UK high-street acct) | Lowest fees, local card success rates, supports **BACS Direct Debit** (the UK ACH analog — good for recurring monitoring subs) and local methods. PayPal UK as fallback. |
| **AU** | **Stripe AU** (needs AU entity) **or** sell via existing Stripe charging in **AUD as a foreign business** initially | AUD; settle via Airwallex/Wise AUD until Pty Ltd exists | Stripe supports AUD on a non-AU account but settlement/fees are worse; **PayID/BECS Direct Debit** need a local Stripe. Start cross-border, localize when entity lands. |
| **IE** | Stripe (UK or IE entity), **EUR**, SEPA Direct Debit | EUR | If UK Ltd exists, can run IE sales through it in EUR. |
| **NZ** | Stripe NZ (needs NZ entity) or cross-border NZD | NZD | Defer. |
**Multi-currency mechanics (low-lift path):**
- Stripe can present/settle multiple currencies on one account; quickest start is
**charge in local currency on the existing/US or new UK account**, accept FX
until per-market entities exist.
- Use **Wise Business / Airwallex** to hold GBP/AUD/EUR/NZD and avoid double FX.
- Keep **ERPNext as system of record** (multi-currency invoices already supported)
exactly as in `docs/billing.md`; add per-market price lists + tax templates.
**Surcharge note:** our card surcharge model (`docs/billing.md`) is **illegal/capped
in several of these markets** — **UK & EU cap/ban surcharges on consumer cards
(PSD2 surcharging ban); AU allows surcharge only up to actual cost of acceptance
(RBA rules).** ⚠️ **Do NOT copy the US 3% card surcharge into UK/EU/AU.** Bake
processor cost into price or absorb it there.
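A minimal sketch of how the surcharge rule above could be encoded so the US rate cannot leak into other markets. The table name, function name, and market codes are hypothetical, and the NZ entry is an assumption (the note above does not cover NZ); only the 3% US figure comes from this doc.

```python
from decimal import Decimal

# Only the US keeps a card surcharge; 3% is the rate quoted above.
CARD_SURCHARGE_BY_MARKET = {
    "US": Decimal("0.03"),
    "UK": Decimal("0"),
    "IE": Decimal("0"),
    "AU": Decimal("0"),  # absorb cost rather than compute "cost of acceptance"
    "NZ": Decimal("0"),  # assumption: NZ is not covered by the note above
}


def card_surcharge(market: str, service_fee: Decimal) -> Decimal:
    """Surcharge on OUR service fee only; unknown markets fail instead of defaulting."""
    try:
        rate = CARD_SURCHARGE_BY_MARKET[market]
    except KeyError:
        raise ValueError(f"no surcharge policy for market {market!r}") from None
    return (service_fee * rate).quantize(Decimal("0.01"))
```

Failing on an unknown market is a deliberate choice here: a silent default would be the easiest way to ship the US surcharge somewhere it is not allowed.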
### Flow A.1 — Sales tax / VAT / GST on OUR service fee
This is mandatory homework, not optional. Selling services to local businesses
generally creates a tax-collection obligation.
| Market | Tax | Registration trigger | Mechanic |
|---|---|---|---|
| **UK** | **VAT 20%** | If UK-established: register at **£90k** turnover. **If we sell from a non-UK entity into UK, threshold can be £0** (non-established taxable person) → register from first sale. A UK Ltd is simpler. B2B may use **reverse charge** (customer self-accounts) which can reduce our collection burden — **VERIFY per service**. | Register for VAT, charge 20% (or reverse-charge B2B), file quarterly (MTD). |
| **AU** | **GST 10%** | Register at **A$75k** turnover (lower/zero for some non-resident supplies) | Charge 10%, remit to ATO (BAS). B2B reverse-charge may apply for non-resident suppliers. |
| **IE** | **VAT 23%** | Non-established → effectively from first B2B sale; reverse charge common for B2B | File via Revenue. |
| **NZ** | **GST 15%** | Register at **NZ$60k** turnover | Defer. |
**Open item:** for **B2B** sales the **reverse charge** mechanism may mean the
*customer* accounts for VAT/GST, dramatically simplifying our obligation — but it
depends on whether the supply is "digital service" vs "professional service" and
our establishment status. **Get a one-off cross-border VAT opinion before launch.**
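As a rough illustration of the table above, the tax on our service fee could be computed as below. The function name and the `reverse_charge` flag are hypothetical; whether reverse charge actually applies is exactly the open item, so it is an input here, not something the code decides. Rates are the headline rates from the table.

```python
from decimal import Decimal, ROUND_HALF_UP

# Headline VAT/GST rates from the table above.
TAX_RATES = {
    "UK": Decimal("0.20"),
    "AU": Decimal("0.10"),
    "IE": Decimal("0.23"),
    "NZ": Decimal("0.15"),
}


def service_fee_tax(market: str, service_fee: Decimal, reverse_charge: bool) -> Decimal:
    """Tax WE add to the service fee; zero when the customer self-accounts."""
    if reverse_charge:
        return Decimal("0.00")
    tax = service_fee * TAX_RATES[market]
    return tax.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```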
### Flow B — Paying the government authority on the customer's behalf
This is the operationally hard part. In the US we front/relay the filing fee. The
analog per market:
| Market | Authority + typical fee | How fees are paid | Our remittance mechanism |
|---|---|---|---|
| **UK** | **Traffic Commissioner / DVSA** — O-licence app ~£257 + ~£401 grant + ~£401/5yr; **DVSA** for MOT/tacho; **Companies House** for any co. admin | Mostly **GOV.UK online card/Direct Debit**, agent can pay on behalf | Pay via a **UK business debit card** (from the UK Ltd's bank) at GOV.UK; pass-through the exact fee to customer with no surcharge. Need a funded GBP account (Wise/Revolut/UK bank). |
| **AU** | **NHVR** (registration/accreditation), state road agencies, **ASIC** | NHVR Portal card payment; state portals | Pay via **AU business card**; needs AUD float. Until Pty Ltd, may need customer to pay authority directly while we do prep-only (avoids handling AU gov payments cross-border). |
| **IE** | **RSA** / Dept of Transport, CRO (companies) | gov.ie / RSA online card | EUR business card. |
| **NZ** | **NZTA** (TSL, RUC) | NZTA online | Defer. |
**Key design decisions for Flow B:**
1. **Pass-through, never markup, the government fee** — same rule as US billing
(surcharges apply to service fees only, not filing fees — `docs/billing.md`).
Display gov fee as a separate, at-cost line item.
**Card to pay the authorities — funding rail (decided):** GOV.UK / DVSA /
Companies House all take **Visa/Mastercard**, so we need a GBP-funded card.
Options:
- **Stripe Issuing (UK/EU): yes, virtual cards exist.** Stripe Issuing offers
**virtual + physical Visa** in the **UK and EU** (not US-only), funded from
the Stripe balance, with per-card limits. Good for **programmatic per-filing
virtual cards** later. Caveat: needs **Issuing approval/eligibility**, Visa
network only, pitched for platform/expense use — an application, not
instant-on.
- **Wise Business / Revolut Business (preferred for launch):** one product gives
**real UK account details (sort code + acct no.)** that receive
**Faster Payments / BACS / CHAPS**, PLUS **virtual + physical debit cards**,
PLUS multi-currency GBP/EUR/AUD holding. Fund GBP via **Faster Payments**
(instant, free, ~£1M cap) and pay authorities with the attached virtual card.
No prepaid card and no Stripe Issuing approval needed.
- **Transfer-rail note:** you fund an **account that has a card attached**, not a
card directly. Use **Faster Payments** for top-ups (instant/free). **CHAPS**
(£25-35) only for high-value one-offs; **BACS** (3-day batch) for Direct
Debit/payroll, not ad-hoc. Use **Stripe Issuing** only if/when we want
per-filing programmatic cards.
2. **Two models for who pays the authority:**
- **(i) We pay (agent model):** we hold a funded local-currency business card,
pay GOV.UK/NHVR directly, recoup via the customer's checkout. Best UX, needs
local banking + float + reconciliation. **UK = yes (UK Ltd + Wise/Revolut).**
- **(ii) Customer pays the authority directly (prep-only model):** we charge
only our service fee; customer enters their own card at the gov portal. **No
gov-money handling, no float, no entity needed for Flow B.** Best for AU/NZ
market-validation phase and avoids money-transmission questions.
3. **Avoid looking like a money transmitter.** Fronting third-party gov fees at
scale can edge toward regulated payment activity. Keep it as **agency
disbursement of a clearly-itemized pass-through cost**, not a stored-value /
FX product. **VERIFY threshold with counsel if volume grows.**
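The pass-through rule in decision 1 and the two payer models in decision 2 can be sketched as invoice lines. All names here are hypothetical and this is not the ERPNext schema; it only shows the government fee as a separate at-cost line that disappears in the prep-only model.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceLine:
    description: str
    amount: Decimal
    pass_through: bool  # True = government fee relayed at cost


def build_invoice(service_fee: Decimal, tax_on_service: Decimal, gov_fee: Decimal) -> list[InvoiceLine]:
    """Itemize our fee, tax on our fee, and (agent model only) the government fee."""
    lines = [
        InvoiceLine("Service fee", service_fee, pass_through=False),
        InvoiceLine("Tax on service fee", tax_on_service, pass_through=False),
    ]
    if gov_fee > 0:
        # Relayed at exactly the authority's price: no markup, no surcharge.
        lines.append(InvoiceLine("Government filing fee (at cost)", gov_fee, pass_through=True))
    return lines


def invoice_total(lines: list[InvoiceLine]) -> Decimal:
    """Sum of all lines, kept as Decimal."""
    return sum((line.amount for line in lines), Decimal("0"))
```

Passing `gov_fee=Decimal("0")` models the prep-only case, where the customer pays the authority directly and no government money touches the invoice.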
**Recommendation (Q2):**
- **UK:** UK entity (the LLP chosen in Q1) → **Stripe UK (GBP, no card surcharge) + Wise/Revolut GBP
account** for both collecting (Flow A) and paying GOV.UK (Flow B, agent model).
Register for VAT. ERPNext stays system of record.
- **AU/IE/NZ:** launch **prep-only / customer-pays-authority** (Flow B model ii) on
cross-border Stripe in local currency to validate demand **before** committing to
a local entity + resident director + local acquiring.
---
## Cost / friction summary (entity + payments to launch)
| Market | Entity to operate | Hard blocker | Payments-in | Pay-authority | Verdict |
|---|---|---|---|---|---|
| **UK** ⭐ | UK Ltd (no resident dir, ~£12/yr) | VAT registration | Stripe UK / GBP, no surcharge | Agent model via GBP card | **Go first** |
| **AU** ⭐ | None to start; Pty Ltd later | Pty Ltd needs **AU-resident director** | Cross-border AUD → Stripe AU later | Prep-only first | **Go second, prep-only** |
| **IE** | UK Ltd can serve; IE Ltd needs **EEA director / €25k bond** | Director/bond | Stripe EUR | Prep-only / agent | Defer |
| **NZ** | NZ co needs **NZ/AU-resident director** | Director | Cross-border NZD | Prep-only | Defer |
---
## Open questions (need answers before build)
1. **Cross-border VAT/GST opinion** — does B2B **reverse charge** cover our service
so we don't have to collect? (UK + AU + IE). Single biggest unknown for Q2.
2. **UK LLP formation** — confirm no-resident-member is fine, **line up the
required 2nd member/designated member**, pick a registered-office/agent
provider (mirror `docs/foreign-incorporation-guide.md`). Confirm LLP
pass-through vs the extra UK SA800 + members' self-assessment + US
partnership-filing burden is acceptable vs a single-member Ltd. Banking:
**Wise vs Revolut Business vs Airwallex** for the GBP account + virtual card
(Flow B); decide whether to also apply for **Stripe Issuing** later.
3. **AU resident-director nominee** — cost/availability of a nominee director
service if/when we localize; or stick to ARBN (foreign-company) route.
4. **Money-transmission line** — confirm fronting GOV.UK fees as itemized
pass-through disbursement does not trigger payment-institution licensing at our
volumes.
5. **Transport Manager (UK/IE)** — confirm we can sell prep/monitoring without
holding the statutory TM CPC ourselves, and line up external CPC holders to
broker if we want to offer the full O-licence package.
6. **Surcharge legality** — strip the US card surcharge from all UK/EU/AU pricing;
reprice to absorb processor cost. (Confirmed needed, just needs implementation.)
7. **Vertical fit** — this doc assumes the **transport/trucking** analog (closest
to FMCSA). See the **Vertical portability matrix** below for how the rest of the
US stack ports; **healthcare does NOT port (NHS, no billing-enrollment model).**
---
## Vertical portability matrix (US stack → UK / European-English markets)
European English-speaking ≈ **UK + Ireland (+ Malta, negligible)**. Each vertical is
judged on the same two gates as the US playbook: **(1) is there a recurring
regulatory clock to sell against, and (2) can we actually get emails?** Every UK
public register lists the regulated entity but **not** its email, so all of these
collapse to the same spine: **free public register × Companies House join ×
scrape-published-emails / paid append**. Build it once, run all verticals on it.
| US vertical (ours) | UK/EU analog | Recurring clock? | Email/data path | Verdict |
|---|---|---|---|---|
| **Formation + annual report + registered agent** | **Companies House**: formation, **confirmation statement (annual)**, registered office, **ECCT identity verification** (2025) | ✅ annual | Companies House **free bulk register** (no email) → enrich | ⭐ Best 1:1 transfer; ~5M cos; **but saturated** (1stFormations/Tide) |
| **TCPA / data-privacy** | **ICO data protection fee** — every UK business processing personal data pays £40 to £2,900/yr; **PECR** is the marketing law itself | ✅ **mandatory annual**, widely missed | **ICO public register** (name+status, no email) → can flag the *unregistered* → enrich | ⭐ **Sleeper / lead UK product.** Mandatory, recurring, under-served, we already operate under this law |
| **Trucking / FMCSA** | **O-licence** (Traffic Commissioner/DVSA) | ✅ 5-yr + ongoing | O-licence register (no email) × Companies House × scrape | ⭐ Main plan; ~85k UK + ~4k IE operators |
| **EPA RCRA hazardous waste** | **Environment Agency** waste carrier/broker registration (renew **3-yr**) + hazardous waste producer | ✅ 3-yr | EA public carrier/broker register (limited contact) → enrich | ✅ Decent niche, clear clock, public register |
| **Employment / contractor classification** | **IR35 / off-payroll working**, worker status | ⚠️ event-driven, no registry | no registry; reach via contractor/accountant channels | ⚠️ Real pain but **not list/cold-email driven** → inbound/content |
| **Telecom (CRTC / FCC 499 / USAC)** | **Ofcom** comms-provider notification + annual admin charge; **CCTS→ADR (CISAS/Ombudsman)** | ✅ annual admin charge | Ofcom lists exist, **no rich email register** like FCC RMD | ⚠️ Small universe, weak data, niche |
| **FMC ocean (NVOCC/forwarders)** | **BIFA membership, AEO, CDS customs** | ⚠️ mostly one-time/voluntary | BIFA member list, no clean email feed | ⚠️ Niche, weak clock |
| **Healthcare (Medicare/PECOS/Medicaid/CLIA/DEA)** | **NHS single-payer kills the billing-enrollment model.** Only **GMC revalidation (5yr)/NMC (3yr)** + **CQC provider registration** map | ⚠️ revalidation is **personal attestation** | GMC/NMC registers (no email); CQC has provider contact | ❌ **Worst transfer — skip.** No Medicare-enrollment analog; don't spend burner infra here |
### Takeaways
1. **Two verticals beat trucking for the UK launch:**
- **Companies House corporate services** — most direct transfer of our entire
formation/RA/annual-report engine, but the most crowded market.
- **ICO data protection fee** ⭐ — the sleeper: mandatory + recurring + widely
ignored, the public register lets us target the **non-compliant**, per-deal
value is small but volume is enormous, and we already understand PECR.
2. **Healthcare does NOT port** — entire US healthcare stack assumes fee-for-service
billing the NHS doesn't have. Exclude from UK/IE.
3. **One enrichment spine serves all** — Companies House-anchored verticals
(corporate, ICO, trucking, waste) are all **Tier-2 "one hop to email"** (per
`docs/vertical-lead-source-analysis.md`); telecom/FMC/healthcare are Tier-3/4.
4. **Lead UK products:** **ICO data-protection-fee + Companies House corporate
services**, alongside the **O-licence** trucking stream.
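The enrichment spine from takeaway 3 could look roughly like the join below. This is a sketch under a strong assumption: that both sources expose a shared `company_number`. Real registers may only give a trading name and need fuzzy matching instead. All field names are hypothetical, and `email` is left empty for the later scrape/append step.

```python
def join_register_to_companies_house(register_rows, companies_house_rows):
    """Attach Companies House data to register rows that share a company number."""
    by_number = {row["company_number"]: row for row in companies_house_rows}
    joined = []
    for entry in register_rows:
        match = by_number.get(entry.get("company_number"))
        if match is None:
            continue  # e.g. sole traders: no Companies House record to join on
        joined.append({
            **entry,
            "registered_office": match["registered_office"],
            "sic_codes": match["sic_codes"],
            "email": None,  # filled by the scrape / paid-append step, not here
        })
    return joined
```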
## Next docs (after Q1/Q2 cleared)
- `plan.uk-olicence-stream.md` — UK Traffic Commissioner O-licence product,
template localization, Companies House entity-type segmentation (Ltd/LLP/PLC =
legal cold B2B; sole traders/partnerships = need soft opt-in).
- `plan.au-nhvr-stream.md` — NHVR / Chain of Responsibility, inferred-consent list
sourcing from published business addresses.
- `plan.uk-ico-fee-stream.md` — ICO data-protection-fee renewal product; target the
unregistered/lapsed from the ICO public register; PECR-compliant outreach.
- `plan.uk-companies-house-stream.md` — confirmation statement + registered office +
ECCT identity verification; the enrichment spine (Companies House bulk × SIC).

@@ -0,0 +1,151 @@
# Plan: close the MX-exclusion gaps in the trucking warmup
**Status:** PROPOSED (2026-06-20). Analysis + design only; no code shipped yet.
**Owner context:** warmup day 17; big operators (Google/Microsoft/Proofpoint/
Mimecast/Barracuda/Cisco/Broadcom) are EXCLUDED until day 30, then re-introduced
via `mx_daily_caps()`. This plan fixes three holes that let throttling/consumer
MX operators through during that window.
---
## Background: how the two MX layers work today
Sender reputation is judged by the **receiving operator (MX)**, not the recipient
domain string. There are two independent gates in `scripts/build_trucking_campaigns.py`:
1. **`fetch_carriers()` big-MX EXCLUSION** (SQL `big_mx_exclude`): during warmup
(`main_warmup_day() <= MAIN_BIG_MX_EXCLUDE_UNTIL_DAY`, currently day 30) it
drops carriers whose `mx_provider IN BIG_MX_OPERATORS`. `mx_provider IS NULL`
is deliberately KEPT (so the pool isn't starved before tagging completes).
2. **`select_sendable_carriers()` per-MX THROTTLE** (`mx_daily_caps` +
`per_op` cap): bounds how many of a run's quota go to each KNOWN operator so
we never concentrate on one. NULL is NOT capped (would collapse onto one
bucket and starve the pool).
`mx_provider` is populated by `scripts/mx_tag_carriers.py`, which resolves each
domain's MX and returns either a **clean label** (`google`, `microsoft`,
`proofpoint`, `mimecast`, `cisco`, `barracuda`, `broadcom`, `godaddy`, `zoho`,
`rackspace`) or, for everything else, an **`mx:<root-domain>` prefix** (e.g.
`mx:yahoodns.net`, `mx:icloud.com`, `mx:comcast.net`).
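To make the two label shapes concrete, here is a minimal sketch of the labeling rule described above. It is not the code in `scripts/mx_tag_carriers.py` (which this plan does not show): the `CLEAN_LABELS` map and the `label_mx_host` name are assumptions, and the root-domain logic is deliberately naive.

```python
# Hypothetical root-domain -> clean-label map; the real table lives in
# scripts/mx_tag_carriers.py and is not reproduced in this plan.
CLEAN_LABELS = {
    "google.com": "google",
    "googlemail.com": "google",
    "outlook.com": "microsoft",
    "pphosted.com": "proofpoint",
    "mimecast.com": "mimecast",
}


def label_mx_host(mx_host: str) -> str:
    """Return a clean operator label, or an "mx:<root-domain>" fallback."""
    host = mx_host.rstrip(".").lower()
    labels = host.split(".")
    # Naive root domain = last two labels. A real tagger should use the
    # Public Suffix List so hosts under e.g. "co.uk" are handled correctly.
    root = ".".join(labels[-2:])
    return CLEAN_LABELS.get(root, f"mx:{root}")
```

A custom domain whose MX is `mta5.am0.yahoodns.net` therefore comes back as `mx:yahoodns.net`, which is the label shape Gap 1 is about.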
---
## The three gaps (with live numbers, 2026-06-20)
### Gap 1 — consumer/throttling MX behind the `mx:` prefix are NOT excluded
`BIG_MX_OPERATORS` only lists the clean labels. The big consumer mailbox
operators get tagged with the `mx:` prefix and so slip BOTH gates during warmup:
| mx_provider | sendable carriers | why it's a problem |
| --- | --- | --- |
| `mx:yahoodns.net` | **283,113** | Yahoo Small Business / AOL custom domains — same aggressive consumer filtering + complaint-driven blocking as consumer Yahoo. By far the biggest hole. |
| `mx:icloud.com` | **24,985** | Apple iCloud+ Custom Domain — Apple consumer filtering; iCloud was the biggest consumer leak we already scrubbed from Listmonk. |
| `mx:comcast.net` | 12,251 | Comcast consumer infra; historically bouncy. |
| `mx:charter.net` | 5,860 | Spectrum/Charter consumer. |
| `mx:centurylink.net` / `mx:windstream.net` / `mx:tds.net` / `mx:earthlink-vadesecure.net` | ~8,100 | Legacy/satellite ISP consumer mail; many already in `DEAD_ISP_DOMAINS` as literal domains but NOT caught when a custom domain points its MX there. |
`mx:yahoodns.net` alone is **283k** carriers that look "long-tail/safe" to the
warmup but actually filter like a big operator. This is the headline fix.
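A quick tally of the table rows shows where the hole sits. The counts are copied from the table; the last row is only given as ~8,100, so the total is approximate to that extent.

```python
# Sendable-carrier counts copied from the Gap 1 table (2026-06-20 snapshot).
CONSUMER_MX_COUNTS = {
    "mx:yahoodns.net": 283_113,
    "mx:icloud.com": 24_985,
    "mx:comcast.net": 12_251,
    "mx:charter.net": 5_860,
    "legacy ISPs (approx.)": 8_100,  # the table gives this row only as ~8,100
}

total = sum(CONSUMER_MX_COUNTS.values())
yahoo_share = CONSUMER_MX_COUNTS["mx:yahoodns.net"] / total

print(f"total consumer-MX carriers: {total:,}")   # 334,309
print(f"yahoodns share of the hole: {yahoo_share:.0%}")  # 85%
```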
> NOTE: the literal-domain layer (`BLOCKED_EMAIL_DOMAINS` incl. the Yahoo family,
> Apple, dead ISPs) already blocks `someone@yahoo.com` / `@icloud.com`. The hole
> is a **custom domain whose MX points at Yahoo/iCloud** — invisible to the
> string layer, only visible via MX tagging. That's exactly what this closes.
### Gap 2 — 315,892 untagged (NULL) carriers are mailed without MX vetting
`mx_provider IS NULL` is kept by both gates by design (anti-starvation). With
**315,892** sendable NULLs vs 1,187,054 tagged, a meaningful slice of every run
goes to domains we've never MX-resolved — some of which are Google/MS/Yahoo we'd
otherwise exclude. This is acceptable as a bootstrap but should shrink over time.
### Gap 3 — `mx_tag_carriers.py` is not on a cron
There is no `infra/cron/pw-mx-tag` (confirmed: no cron references it). So the NULL
backlog only shrinks when someone runs it by hand. New carriers imported by the
FMCSA census downloader land as NULL and stay NULL. Without continuous tagging,
Gaps 1 and 2 slowly re-open.
---
## Proposed fixes
### Fix 1 — exclude consumer/throttling `mx:` operators during warmup (HIGH)
Add an explicit set of `mx:`-prefixed operators that should be treated like the
big operators during warmup, and fold them into BOTH the exclusion and the
throttle. Keep it data-driven and documented.
```python
# scripts/build_trucking_campaigns.py
# Consumer / aggressively-filtering mailbox operators that mx_tag_carriers.py
# labels with the "mx:" prefix (no clean label). They throttle/complaint-block
# like the big operators, so hold them out during warmup too. (yahoodns =
# Yahoo Small Business + AOL custom domains; icloud = Apple custom domains.)
CONSUMER_MX_OPERATORS = (
    "mx:yahoodns.net", "mx:icloud.com", "mx:comcast.net", "mx:charter.net",
    "mx:centurylink.net", "mx:windstream.net", "mx:tds.net",
    "mx:earthlink-vadesecure.net",
)
# Everything held out of the warmup pool entirely (until MAIN_BIG_MX_EXCLUDE_UNTIL_DAY).
# tuple(...) keeps the concatenation valid whether BIG_MX_OPERATORS is a tuple,
# list, or set.
WARMUP_EXCLUDE_OPERATORS = tuple(BIG_MX_OPERATORS) + CONSUMER_MX_OPERATORS
```
- In `fetch_carriers()`: build `big_mx_exclude` from `WARMUP_EXCLUDE_OPERATORS`
(not just `BIG_MX_OPERATORS`).
- In `mx_daily_caps()`: give `CONSUMER_MX_OPERATORS` the same `big` ramp as the
clean big operators after day 30 (so they re-introduce gradually, not all at
once on day 31).
- Keep it behind the existing `MAIN_SKIP_BIG_MX` switch so it's reversible.
**Effect:** removes ~330k consumer-MX carriers from the warmup-window pool; the
long tail of genuinely small/self-hosted systems carries the volume, which is the
whole point of the warmup strategy.
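One way the `fetch_carriers()` change could be wired, as a sketch only: the function name, its signature, and the `%s` placeholder style (psycopg-like driver) are assumptions, not the builder's actual code. The point it illustrates is that `NOT IN` alone would silently drop NULL rows, so the NULL branch has to be explicit to preserve the anti-starvation behaviour.

```python
def big_mx_exclude_clause(warmup_day, exclude_until_day, operators, skip_big_mx=True):
    """Return (sql_fragment, params) that drops held-out operators during warmup."""
    if not skip_big_mx or warmup_day > exclude_until_day or not operators:
        return "", []
    placeholders = ", ".join(["%s"] * len(operators))
    # In SQL, NULL NOT IN (...) evaluates to NULL (row filtered out), so
    # untagged carriers must be kept with an explicit IS NULL branch.
    sql = f"AND (mx_provider IS NULL OR mx_provider NOT IN ({placeholders}))"
    return sql, list(operators)
```

Returning an empty fragment when the switch is off or the window has passed keeps the change reversible in the same way `MAIN_SKIP_BIG_MX` is today.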
### Fix 2 — bound the NULL bucket with a small cap (MEDIUM)
Don't exclude NULL (still anti-starvation), but give it a real per-run cap in
`select_sendable_carriers()` instead of "uncapped". E.g. treat unknown/NULL like
`__default__` but at a fraction (say 40/run) so an untagged Google/Yahoo domain
can't flood a run. Pairs with Fix 3 (continuous tagging) to shrink the bucket.
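A minimal sketch of what a bounded NULL bucket might look like inside the selector. This is not `select_sendable_carriers()` itself: the function name, the `__null__` bucket key, and the flat `per_op_cap` are simplifying assumptions (the real throttle uses `mx_daily_caps()`), and 40 is the placeholder figure from above.

```python
from collections import Counter

NULL_BUCKET = "__null__"  # hypothetical key for untagged carriers


def select_with_caps(candidates, quota, per_op_cap, null_cap):
    """Pick up to `quota` carriers, bounding each operator and the NULL bucket."""
    taken = Counter()
    selected = []
    for carrier in candidates:
        if len(selected) >= quota:
            break
        bucket = carrier.get("mx_provider") or NULL_BUCKET
        cap = null_cap if bucket == NULL_BUCKET else per_op_cap
        if taken[bucket] >= cap:
            continue  # this bucket is full for the run; try the next carrier
        taken[bucket] += 1
        selected.append(carrier)
    return selected
```

With a pool that is mostly NULL, the run comes in under quota instead of filling up with unvetted domains, which is the trade-off the regression check in the validation plan is meant to catch.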
### Fix 3 — put `mx_tag_carriers.py` on a daily cron (MEDIUM)
Add `infra/cron/pw-mx-tag` (model on `pw-listmonk-scrub`) running e.g. 05:45 UTC
(before the 08:00 trucking builder), tagging the next N thousand NULL domains/day:
```
# cron has no backslash line continuation: the whole job must be on one line.
45 5 * * * deploy cd /opt/performancewest && docker compose exec -T workers python3 -m scripts.mx_tag_carriers --limit-domains 20000 >> /var/log/pw-mx-tag.log 2>&1
```
Install to `/etc/cron.d/` (deploy.sh doesn't run ansible). This continuously
shrinks the 315k NULL backlog and keeps newly-imported carriers tagged, so Fixes
1 & 2 stay effective.
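For a rough sense of how long the backlog takes to drain, a small estimate helper. It is an upper bound under stated assumptions: the backlog figure counts carriers while `--limit-domains` counts domains (carriers sharing a domain drain faster), and `new_nulls_per_day` is a hypothetical parameter for FMCSA imports, which this plan does not quantify.

```python
import math


def days_to_drain(null_backlog: int, limit_per_day: int, new_nulls_per_day: int = 0) -> int:
    """Days until the untagged backlog reaches zero at a fixed daily tagging limit."""
    net = limit_per_day - new_nulls_per_day
    if net <= 0:
        raise ValueError("tagging limit does not outpace new imports")
    return math.ceil(null_backlog / net)


print(days_to_drain(315_892, 20_000))  # 16
```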
---
## Validation plan (verify before/after, no sends triggered)
1. **Dry-run the selector** before/after Fix 1 and diff the per-MX composition of
a simulated run (the builder has `list_segments()` / quota selection paths that
can be exercised read-only). Assert 0 carriers from `CONSUMER_MX_OPERATORS`
are selected while `main_warmup_day() <= 30`.
2. **SQL sanity:** `SELECT mx_provider, count(*) ... WHERE listmonk_sent_at IS NULL
GROUP BY 1` — confirm the excluded operators drop out of the candidate pool.
3. **Cron (Fix 3):** run `mx_tag_carriers --limit-domains 1000` once by hand,
confirm the NULL count falls and no errors; then install the cron and confirm
the next-day count fell again (idempotent, bounded).
4. **Regression:** confirm the long-tail pool is still large enough to hit daily
quota at warmup caps (so we don't starve the send). If the long tail is too
small after excluding 330k consumer-MX, that's a signal to either lower the
daily quota or accept a smaller controlled slice of one consumer operator.
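Steps 1 and 4 could be checked with helpers along these lines. Both functions are hypothetical and operate on plain lists of carrier dicts from two simulated runs; how the builder exposes a read-only selection is not specified in this plan.

```python
from collections import Counter


def composition_diff(before, after):
    """Per-operator change in selected carriers between two dry runs."""
    b = Counter(c.get("mx_provider") or "NULL" for c in before)
    a = Counter(c.get("mx_provider") or "NULL" for c in after)
    return {op: a[op] - b[op] for op in sorted(set(a) | set(b)) if a[op] != b[op]}


def assert_no_consumer_mx(selected, consumer_operators, warmup_day, exclude_until_day=30):
    """Fail loudly if a held-out operator is selected inside the warmup window."""
    if warmup_day > exclude_until_day:
        return  # past the window: re-introduction is expected
    leaked = [c for c in selected if c.get("mx_provider") in consumer_operators]
    assert not leaked, f"{len(leaked)} consumer-MX carriers leaked on day {warmup_day}"
```

A negative entry for `mx:yahoodns.net` in the diff, with the total selection size unchanged, is the shape a successful Fix 1 should produce.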
---
## Open questions / decisions for owner
- **Re-introduction after day 30:** treat `CONSUMER_MX_OPERATORS` identically to
the big operators (same ramp), or keep Yahoo/iCloud custom domains excluded
*longer* (they convert worse and complain more)? Recommendation: same ramp, but
watch the reputation monitor's per-operator reject% and pull back if Yahoo
spikes.
- **NULL cap size (Fix 2):** 40/run is a guess; tune against how fast Fix 3 drains
the backlog.
- **Should `mx:` consumer exclusion be permanent (not just warmup)?** For a
B2B compliance product, a carrier reachable only at a Yahoo/iCloud custom
domain is a low-value, high-complaint segment regardless of warmup. Worth
considering a permanent down-weight, not just a warmup hold.