docs: vertical lead-source analysis ranked by email-source reliability
Synthesize this session's findings into a ranking of candidate verticals by the one thing that actually gates cold email: a reliable public bulk source of deliverable emails. Tier 1 (email native): FCC, FMCSA. Tier 2 (one free hop): healthcare ORG NPIs (already harvested 63k verified), SEC/OTC corporate. Tier 3 (domain-scrape): FMC OTI, state business entities by trigger. Tier 4 (phone/mail only, NOT email): CLIA (0.3% match proven), EPA RCRA, individual NPIs.
parent a2665c22c2
commit 792f5e948f
1 changed files with 80 additions and 0 deletions
80
docs/vertical-lead-source-analysis.md
Normal file
@@ -0,0 +1,80 @@
# Vertical Lead-Source Analysis: Ranked by Email Reliability

**Date:** 2026-06-13
**Purpose:** The proven bottleneck for every cold-email vertical is NOT the
deficiency signal or the audience size -- it is whether a reliable, public, bulk
source gives us a **deliverable email** (or a clean, high-yield path to one).
This ranks candidate verticals by that single criterion, using what we verified
this session (FCC and FMCSA work; the CLIA email match came back at 0.3%, i.e.
dead).

## The rule (learned the hard way)

A vertical is **email-viable** only if ONE of these is true:
1. The public registry **contains the email** (FCC RMD `contact_email`, FMCSA
   carrier `email_address`). -> Tier 1, just send.
2. The registry maps to a **second free public source that has email** by a clean
   key (NPI, FRN, CIK, domain). -> Tier 2, one enrichment hop.
3. The targets reliably have **websites** so a domain->scrape gets email at
   decent yield. -> Tier 3, scrape pipeline (proxy).

Otherwise it is **phone / direct-mail only** (CLIA, EPA RCRA, raw NPPES
individuals). Still real money, just not cold email.

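
The rule above is an ordered check, so it can be sketched as a small classifier. This is only an illustration: `SourceProfile`, its three flags, and `classify_tier` are hypothetical names invented here, not anything in our codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceProfile:
    """Hypothetical summary of what a vertical's public data offers."""
    registry_has_email: bool      # the registry itself ships an email field
    keyed_email_source: bool      # a second free source has email, joinable by a clean key
    targets_have_websites: bool   # targets reliably have a domain we could scrape


def classify_tier(profile: SourceProfile) -> int:
    """Return the tier (1-4) by testing the three conditions in order."""
    if profile.registry_has_email:
        return 1
    if profile.keyed_email_source:
        return 2
    if profile.targets_have_websites:
        return 3
    return 4  # phone / direct-mail only


native = SourceProfile(registry_has_email=True, keyed_email_source=False,
                       targets_have_websites=False)
nothing = SourceProfile(registry_has_email=False, keyed_email_source=False,
                        targets_have_websites=False)
print(classify_tier(native), classify_tier(nothing))  # prints "1 4"
```

The order matters: a vertical that satisfies more than one condition takes the cheapest tier it qualifies for.
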
## Tier 1 -- Email is IN the registry (send today)

| Vertical | Source | Email field | Recurring obligation | Status |
|---|---|---|---|---|
| **FCC carriers / VoIP / ISP** | FCC RMD, 499 filer, CORES | `contact_email` (native) | RMD annual, 499-A/Q, CPNI annual | LIVE (built) |
| **FMCSA trucking** | FMCSA carrier census | `email_address` (native) | MCS-150 biennial, IFTA quarterly, UCR annual | LIVE (built) |

These are the whole reason the business works. Nothing else is as clean.

## Tier 2 -- One free public hop to email (worth building)

| Vertical | Registry (no email) | Email source + key | Yield estimate | Notes |
|---|---|---|---|---|
| **Healthcare providers (org NPIs)** | NPPES | NPPES **endpoint_pfile** (Direct/email endpoints), keyed by NPI | ~88k institutional emails harvested, ~63k verified | ALREADY HARVESTED. The org/institutional slice has real emails (we filtered HISP/Direct gateways). Individual NPIs do NOT. Recurring: revalidation, NPPES update, OIG screening. |
| **Public companies (OTC/SEC filers)** | SEC EDGAR (CIK, state of incorp, phone, addr, **website**) | website domain -> scrape IR/contact email; or email-append | Medium-high (real cos w/ IR pages) | ~2,771 SEC-reporting OTC issuers; Delaware/Nevada heavy. Hook: reincorporate-to-TX, annual report, RA, franchise tax. Small but high-ticket. |

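
A Tier 2 "hop" is just a join on a clean key plus a yield count. A minimal sketch, assuming both sources have already been loaded as lists of dicts; the field names (`npi`, `email`, `org`) and the sample values are placeholders, not the real NPPES column names.

```python
def enrich_by_key(registry_rows, email_rows, key):
    """Attach emails to registry rows sharing `key`; return (enriched_rows, hit_rate)."""
    emails_by_key = {}
    for row in email_rows:
        email = (row.get("email") or "").strip().lower()
        if email:
            emails_by_key.setdefault(row[key], set()).add(email)

    enriched = []
    for row in registry_rows:
        found = emails_by_key.get(row[key])
        if found:
            enriched.append({**row, "emails": sorted(found)})

    hit_rate = len(enriched) / len(registry_rows) if registry_rows else 0.0
    return enriched, hit_rate


# Made-up sample rows, for illustration only.
registry = [{"npi": "1000000001", "org": "Example Clinic"},
            {"npi": "1000000002", "org": "Example Lab"}]
endpoints = [{"npi": "1000000001", "email": "Office@Example.org "},
             {"npi": "1000000001", "email": "office@example.org"},
             {"npi": "1000000003", "email": "unused@example.org"}]

rows, rate = enrich_by_key(registry, endpoints, key="npi")
print(rows[0]["emails"], rate)  # prints "['office@example.org'] 0.5"
```

The hit rate is the number to watch: it is what separates a real Tier 2 source from a registry that only looks joinable.
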
## Tier 3 -- Domain-scrape required (proxy pipeline; medium yield)

| Vertical | Registry | Why scrape | Yield |
|---|---|---|---|
| **FMC Ocean Transportation Intermediaries (NVOCC/forwarders)** | FMC OTI lookup | few thousand licensees, most have websites | medium-high; small universe but real businesses + bonds renew |
| **State business entities (formation/RA/foreign-qual)** | State SOS bulk (FL/CA/VA/TX free; Socrata) | millions of entities, name+addr+officers, often a website | low-medium per scrape, but HUGE universe; better to target by trigger (newly-formed, delinquent, foreign-qual) |

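
The extraction step of a domain->scrape can be sketched with the standard library alone. This assumes the page HTML has already been fetched (fetching, proxies, and rate limiting are out of scope here); `extract_emails` is a hypothetical helper, and the regex is a deliberately simple pattern, not a full RFC 5322 parser.

```python
import re

# Simple pattern: good enough for contact pages, not a complete address grammar.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(html, domain):
    """Return de-duplicated, lowercased emails on `domain` (or a subdomain), in page order."""
    domain = domain.lower()
    seen = []
    for match in EMAIL_RE.findall(html):
        email = match.lower()
        host = email.split("@", 1)[1]
        on_domain = host == domain or host.endswith("." + domain)
        if on_domain and email not in seen:
            seen.append(email)
    return seen


page = ('<a href="mailto:Info@example.com">Info@example.com</a> '
        'or sales@example.com. Hosted by x@other.org')
print(extract_emails(page, "example.com"))
# prints "['info@example.com', 'sales@example.com']"
```

Keeping only addresses on the target's own domain drops web-agency and hosting contacts, which otherwise inflate the apparent yield.
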
## Tier 4 -- Phone / direct-mail only (NOT cold email)

| Vertical | Registry | Why not email | Best channel |
|---|---|---|---|
| **CLIA labs** | CMS POS CLIA file | no NPI, no email; NPPES name+zip match = **0.3%** (verified dead) | postcard (~3,100/wk full coverage), phone |
| **EPA RCRA hazardous-waste handlers** | ECHO bulk | no email anywhere in ECHO | phone (RCRAInfo), mail, append |
| **NPPES individual providers** | NPPES | individuals have phone/fax, rarely a usable org email | phone, fax, web inbound |

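
When there is no clean key, the only join left is a fuzzy one such as name+zip, and the match rate decides whether it is worth anything. A rough sketch of measuring that rate; the normalization rules and the field names (`name`, `zip`) are assumptions for illustration, not the exact matching we ran for CLIA, and the rows are made up.

```python
import re


def match_key(name, zip_code):
    """Normalize to (lowercase alphanumeric name, 5-digit zip)."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    cleaned = " ".join(cleaned.split())  # collapse repeated whitespace
    return cleaned, zip_code.strip()[:5]


def match_rate(targets, reference):
    """Fraction of targets whose normalized name+zip appears in the reference set."""
    if not targets:
        return 0.0
    ref_keys = {match_key(r["name"], r["zip"]) for r in reference}
    hits = sum(1 for t in targets if match_key(t["name"], t["zip"]) in ref_keys)
    return hits / len(targets)


# Made-up rows, for illustration only.
labs = [{"name": "Acme Lab, Inc.", "zip": "75001-1234"},
        {"name": "Beta Clinic", "zip": "10001"}]
providers = [{"name": "ACME LAB INC", "zip": "75001"}]
print(match_rate(labs, providers))  # prints "0.5"
```
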
## Net recommendation (where to invest next, in order)

1. **Mine the healthcare ORG emails we already harvested harder** (Tier 2, zero
   new cost). 63k verified institutional emails -> diversify triggers beyond NPI
   revalidation: NPPES staleness, OIG/SAM screening, org-NPI corrections. The
   data is already on prod.
2. **SEC/OTC corporate** (Tier 2). Small universe (~2.8k) but high-ticket
   (reincorporation, RA, franchise tax, foreign-qual) and a timely TX hook.
   EDGAR is free + bulk-OK; emails via website-domain scrape (we have the
   pipeline design from CLIA). Worth a pilot because the per-deal value is high.
3. **State business entities by TRIGGER** (Tier 3, biggest universe). Do NOT
   blast all entities; target newly-formed (need RA/EIN/OA),
   delinquent/admin-dissolved (reinstatement), or foreign-qualification
   candidates. Free bulk from FL/CA/VA; email via domain-scrape. This is the
   largest TAM if the scrape yields.
4. **FMC OTI** (Tier 3, small but clean): few thousand, website-rich, bonds renew
   annually. Quick win if we want another trucking-adjacent vein.
5. **CLIA / EPA RCRA: keep as phone/postcard**, not email. Service + LP exist for
   CLIA; drive via mail to a "check your expiration" web tool that captures email.

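
Targeting state entities by trigger (item 3) amounts to a filter over the bulk file. A sketch under stated assumptions: the field names (`status`, `formed_on`), the status strings, and the 90-day window are all hypothetical, since each state's bulk format differs. Foreign-qualification candidates are left out because nothing above says which signal would identify them.

```python
from datetime import date, timedelta


def pick_trigger(entity, today, new_within_days=90):
    """Return a trigger label for an entity, or None if it should not be contacted."""
    status = (entity.get("status") or "").lower()
    if status in {"delinquent", "admin-dissolved"}:
        return "reinstatement"

    formed = entity.get("formed_on")
    if formed is not None:
        age = today - formed
        if timedelta(0) <= age <= timedelta(days=new_within_days):
            return "newly-formed"
    return None


today = date(2026, 6, 13)
# Made-up entities, for illustration only.
entities = [
    {"name": "Fresh LLC", "status": "active", "formed_on": date(2026, 5, 1)},
    {"name": "Late Corp", "status": "Delinquent", "formed_on": date(2019, 3, 2)},
    {"name": "Steady Inc", "status": "active", "formed_on": date(2015, 1, 1)},
]
print([pick_trigger(e, today) for e in entities])
# prints "['newly-formed', 'reinstatement', None]"
```

Only rows with a non-`None` trigger would go on to the domain-scrape step, which keeps the scrape volume proportional to the entities we actually have a reason to email.
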
## The honest meta-point

We have spent effort proving that **most government registries are email-poor.**
The reliable email money is: FCC + FMCSA (native), plus the **healthcare org
emails we already harvested**. Everything else is either a scrape gamble or a
phone/mail channel. Before building any new vertical, confirm its email path
falls in Tier 1-2; if it is Tier 3, pilot the scrape yield FIRST (like we should
have for CLIA); if Tier 4, don't pretend it is an email channel.

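
One way to make "pilot the scrape yield FIRST" concrete is to scrape a small random sample and gate the build on a conservative estimate of the yield rather than the raw percentage. The sketch below uses the Wilson score lower bound for a binomial proportion; the 20% threshold and the sample numbers are made up for illustration and are not figures from this session.

```python
import math


def wilson_lower_bound(hits, sample_size, z=1.96):
    """Lower end of the Wilson score interval (z=1.96 is roughly 95% confidence)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = hits / sample_size
    z2 = z * z
    centre = p + z2 / (2 * sample_size)
    margin = z * math.sqrt(p * (1 - p) / sample_size + z2 / (4 * sample_size ** 2))
    return max(0.0, (centre - margin) / (1 + z2 / sample_size))


def pilot_passes(hits, sample_size, min_yield):
    """Go/no-go: build the vertical only if even the pessimistic yield clears the bar."""
    return wilson_lower_bound(hits, sample_size) >= min_yield


# 60 usable emails from 200 sampled domains vs. 2 from 200, against a 20% bar.
print(pilot_passes(60, 200, 0.20), pilot_passes(2, 200, 0.20))  # prints "True False"
```

Using the lower bound instead of the observed rate keeps a lucky small sample from green-lighting a full pipeline build.
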