new-site/docs/vertical-lead-source-analysis.md
justin 792f5e948f docs: vertical lead-source analysis ranked by email-source reliability
Synthesize this session's findings into a ranking of candidate verticals by the
one thing that actually gates cold email: a reliable public bulk source of
deliverable emails. Tier 1 (email native): FCC, FMCSA. Tier 2 (one free hop):
healthcare ORG NPIs (already harvested 63k verified), SEC/OTC corporate. Tier 3
(domain-scrape): FMC OTI, state business entities by trigger. Tier 4 (phone/mail
only, NOT email): CLIA (0.3% match proven), EPA RCRA, individual NPIs.
2026-06-14 00:56:27 -05:00

Vertical Lead-Source Analysis: Ranked by Email Reliability

Date: 2026-06-13

Purpose: The proven bottleneck for every cold-email vertical is NOT the deficiency signal or the audience size. It is whether a reliable, public, bulk source gives us a deliverable email (or a clean, high-yield path to one). This document ranks candidate verticals by that single criterion, using what we verified this session: FCC and FMCSA work; the CLIA email match came in at 0.3%, which is dead.

The rule (learned the hard way)

A vertical is email-viable only if ONE of these is true:

  1. The public registry contains the email (FCC RMD contact_email, FMCSA carrier email_address). -> Tier 1, just send.
  2. The registry maps to a second free public source that has email by a clean key (NPI, FRN, CIK, domain). -> Tier 2, one enrichment hop.
  3. The targets reliably have websites so a domain->scrape gets email at decent yield. -> Tier 3, scrape pipeline (proxy).

Otherwise it is phone / direct-mail only (CLIA, EPA RCRA, raw NPPES individuals). Still real money, just not cold email.
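
The rule above amounts to an ordered check: the first test that passes decides the tier. A minimal Python sketch follows. The `Vertical` fields, the `Tier` names, and the example flag values are hypothetical labels chosen for illustration; they mirror the tiering in this document and are not part of any existing pipeline.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    NATIVE_EMAIL = 1     # the registry itself carries the email
    ONE_HOP = 2          # a second free public source has email by a clean key
    DOMAIN_SCRAPE = 3    # targets reliably have websites to scrape
    PHONE_MAIL_ONLY = 4  # no viable email path


@dataclass(frozen=True)
class Vertical:
    # All three flags are hypothetical names for the three tests in the rule.
    name: str
    registry_has_email: bool
    free_keyed_email_source: bool
    reliably_has_websites: bool


def classify(v: Vertical) -> Tier:
    """Apply the three tests in order; the first one that passes wins."""
    if v.registry_has_email:
        return Tier.NATIVE_EMAIL
    if v.free_keyed_email_source:
        return Tier.ONE_HOP
    if v.reliably_has_websites:
        return Tier.DOMAIN_SCRAPE
    return Tier.PHONE_MAIL_ONLY


# Illustrative inputs only; flags follow this document's own tier assignments.
fmcsa = Vertical("FMCSA trucking", registry_has_email=True,
                 free_keyed_email_source=False, reliably_has_websites=False)
fmc_oti = Vertical("FMC OTI", registry_has_email=False,
                   free_keyed_email_source=False, reliably_has_websites=True)
clia = Vertical("CLIA labs", registry_has_email=False,
                free_keyed_email_source=False, reliably_has_websites=False)

tiers = {v.name: classify(v) for v in (fmcsa, fmc_oti, clia)}
```

Because the checks run in order, a vertical that passes more than one test lands in its best tier.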

Tier 1 -- Email is IN the registry (send today)

| Vertical | Source | Email field | Recurring obligation | Status |
| --- | --- | --- | --- | --- |
| FCC carriers / VoIP / ISP | FCC RMD, 499 filer, CORES | contact_email (native) | RMD annual, 499-A/Q, CPNI annual | LIVE (built) |
| FMCSA trucking | FMCSA carrier census | email_address (native) | MCS-150 biennial, IFTA quarterly, UCR annual | LIVE (built) |

These are the whole reason the business works. Nothing else is as clean.

Tier 2 -- One free public hop to email (worth building)

| Vertical | Registry (no email) | Email source + key | Yield estimate | Notes |
| --- | --- | --- | --- | --- |
| Healthcare providers (org NPIs) | NPPES | NPPES endpoint_pfile (Direct/email endpoints), keyed by NPI | ~88k institutional emails harvested, ~63k verified | ALREADY HARVESTED. The org/institutional slice has real emails (we filtered HISP/Direct gateways). Individual NPIs do NOT. Recurring: revalidation, NPPES update, OIG screening. |
| Public companies (OTC/SEC filers) | SEC EDGAR (CIK, state of incorp, phone, addr, website) | website domain -> scrape IR/contact email; or email-append | Medium-high (real cos w/ IR pages) | ~2,771 SEC-reporting OTC issuers; Delaware/Nevada heavy. Hook: reincorporate-to-TX, annual report, RA, franchise tax. Small but high-ticket. |
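
The healthcare hop is a join on NPI followed by a gateway filter. The sketch below shows that shape on inline sample data. The column names (`npi`, `entity_type`, `endpoint`), the sample rows, and the `GATEWAY_MARKERS` heuristic are all assumptions made for illustration; the real NPPES file layouts and the filter actually used in the harvest may differ.

```python
import csv
import io

# Hypothetical sample rows; column names are assumptions, not the real layouts.
REGISTRY_CSV = """npi,entity_type,org_name
1000000001,2,Example Clinic LLC
1000000002,2,Sample Hospital Inc
1000000003,1,
"""

ENDPOINT_CSV = """npi,endpoint
1000000001,billing@exampleclinic.test
1000000002,records@direct.samplehisp.test
1000000003,dr.jones@direct.samplehisp.test
"""

# Hypothetical heuristic for spotting HISP/Direct gateway addresses.
GATEWAY_MARKERS = ("direct.", "hisp")


def is_gateway(email: str) -> bool:
    """True when the domain part looks like a Direct messaging gateway."""
    domain = email.rsplit("@", 1)[-1].lower()
    return any(marker in domain for marker in GATEWAY_MARKERS)


def org_emails(registry_csv: str, endpoint_csv: str) -> dict[str, str]:
    """Return {npi: email} for organization NPIs with a non-gateway email."""
    orgs = {
        row["npi"]
        for row in csv.DictReader(io.StringIO(registry_csv))
        if row["entity_type"] == "2"  # NPPES entity type 2 = organization
    }
    result: dict[str, str] = {}
    for row in csv.DictReader(io.StringIO(endpoint_csv)):
        email = row["endpoint"].strip()
        if row["npi"] in orgs and "@" in email and not is_gateway(email):
            result[row["npi"]] = email.lower()
    return result


harvested = org_emails(REGISTRY_CSV, ENDPOINT_CSV)
```

On the sample data only the first organization survives: the second has a gateway address and the third is an individual NPI.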

Tier 3 -- Domain-scrape required (proxy pipeline; medium yield)

| Vertical | Registry | Why scrape | Yield |
| --- | --- | --- | --- |
| FMC Ocean Transportation Intermediaries (NVOCC/forwarders) | FMC OTI lookup | few thousand licensees, most have websites | medium-high; small universe but real businesses + bonds renew |
| State business entities (formation/RA/foreign-qual) | State SOS bulk (FL/CA/VA/TX free; Socrata) | millions of entities, name+addr+officers, often a website | low-medium per scrape, but HUGE universe; better to target by trigger (newly-formed, delinquent, foreign-qual) |
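
Targeting by trigger means filtering the bulk file down before any scraping happens. Here is a rough sketch of that filter. The `Entity` fields, the status strings, the 90-day window, and the test used for "foreign-qual" (home state differs from filing state) are assumptions for illustration; real state SOS files use their own field names and status codes.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Entity:
    # Hypothetical fields; each state's bulk file names these differently.
    name: str
    formed_on: date
    status: str        # assumed values: "active", "delinquent", "admin_dissolved"
    home_state: str
    filing_state: str


def trigger_for(e: Entity, today: date, new_window_days: int = 90) -> Optional[str]:
    """Return the first matching trigger, or None to skip the entity."""
    if e.status in {"delinquent", "admin_dissolved"}:
        return "reinstatement"
    if e.home_state != e.filing_state:
        return "foreign_qualification"
    if today - e.formed_on <= timedelta(days=new_window_days):
        return "newly_formed"
    return None


today = date(2026, 6, 13)
sample = [
    Entity("New Co", date(2026, 5, 1), "active", "FL", "FL"),
    Entity("Old Co", date(2015, 1, 1), "active", "FL", "FL"),
    Entity("Late Co", date(2015, 1, 1), "delinquent", "FL", "FL"),
    Entity("Out Co", date(2015, 1, 1), "active", "DE", "FL"),
]
# Keep only entities with a trigger; everything else never reaches the scraper.
targets = {e.name: t for e in sample if (t := trigger_for(e, today)) is not None}
```

Only the entities that come out of this filter would need a domain lookup, which keeps scrape volume proportional to actionable leads instead of to the whole registry.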

Tier 4 -- Phone / direct-mail only (NOT cold email)

| Vertical | Registry | Why not email | Best channel |
| --- | --- | --- | --- |
| CLIA labs | CMS POS CLIA file | no NPI, no email; NPPES name+zip match = 0.3% (verified dead) | postcard (~3,100/wk full coverage), phone |
| EPA RCRA hazardous-waste handlers | ECHO bulk | no email anywhere in ECHO | phone (RCRAInfo), mail, append |
| NPPES individual providers | NPPES | individuals have phone/fax, rarely a usable org email | phone, fax, web inbound |
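
A name+zip match rate like the CLIA figure can be measured with a normalized join key. The sketch below shows one way to do it; the normalization rules and sample records are assumptions, and it is not necessarily how the 0.3% number was produced.

```python
import re


def match_key(name: str, zip_code: str) -> tuple[str, str]:
    """Lowercase, strip punctuation, collapse spaces; keep the 5-digit ZIP."""
    normalized = re.sub(r"[^a-z0-9 ]", "", name.lower())
    normalized = re.sub(r"\s+", " ", normalized).strip()
    return normalized, zip_code.strip()[:5]


def match_rate(targets: list[tuple[str, str]],
               reference: list[tuple[str, str]]) -> float:
    """Fraction of target (name, zip) records found in the reference set."""
    if not targets:
        return 0.0
    ref_keys = {match_key(n, z) for n, z in reference}
    hits = sum(1 for n, z in targets if match_key(n, z) in ref_keys)
    return hits / len(targets)


# Hypothetical records for illustration only.
labs = [("Acme Lab, Inc.", "75001-1234"), ("Beta Diagnostics", "75002")]
nppes = [("ACME LAB INC", "75001"), ("Gamma Health", "75002")]
rate = match_rate(labs, nppes)
```

Running this on a few hundred records before building anything gives a cheap read on whether the hop is viable.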

Net recommendation (where to invest next, in order)

  1. Mine the healthcare ORG emails we already harvested harder (Tier 2, zero new cost). 63k verified institutional emails -> diversify triggers beyond NPI revalidation: NPPES staleness, OIG/SAM screening, org-NPI corrections. The data is already on prod.
  2. SEC/OTC corporate (Tier 2). Small universe (~2,771 issuers) but high-ticket (reincorporation, RA, franchise tax, foreign-qual) and a timely TX hook. EDGAR is free + bulk-OK; emails via website-domain scrape (we have the pipeline design from CLIA). Worth a pilot because the per-deal value is high.
  3. State business entities by TRIGGER (Tier 3, biggest universe). Do NOT blast all entities; target newly-formed (need RA/EIN/OA), delinquent/admin-dissolved (reinstatement), or foreign-qualification candidates. Free bulk from FL/CA/VA; email via domain-scrape. This is the largest TAM if the scrape yields.
  4. FMC OTI (Tier 3, small but clean): few thousand, website-rich, bonds renew annually. Quick win if we want another trucking-adjacent vein.
  5. CLIA / EPA RCRA: keep as phone/postcard, not email. Service + LP exist for CLIA; drive via mail to a "check your expiration" web tool that captures email.

The honest meta-point

We have spent effort proving that most government registries are email-poor. The reliable email money is: FCC + FMCSA (native), plus the healthcare org emails we already harvested. Everything else is either a scrape gamble or a phone/mail channel. Before building any new vertical, confirm its email path falls in Tier 1-2; if it is Tier 3, pilot the scrape yield FIRST (as we should have for CLIA); if Tier 4, don't pretend it is an email channel.
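
"Pilot the scrape yield first" can be made concrete with a small sample and a confidence interval on the observed yield. The sketch below uses a Wilson score interval. The `min_yield` threshold, the verdict labels, and the sample counts are hypothetical; they are not measurements from this session.

```python
import math


def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def pilot_verdict(hits: int, n: int, min_yield: float) -> str:
    """Compare the whole interval, not just the point estimate, to the bar."""
    low, high = wilson_interval(hits, n)
    if low >= min_yield:
        return "build"
    if high < min_yield:
        return "drop"
    return "sample more"


# Illustrative counts only: emails found out of domains sampled.
verdicts = [
    pilot_verdict(hits=3, n=1000, min_yield=0.10),
    pilot_verdict(hits=300, n=1000, min_yield=0.10),
    pilot_verdict(hits=12, n=100, min_yield=0.10),
]
```

A small pilot that lands in "sample more" is still useful: it says the vertical is not obviously dead, and how many more records to check before committing build time.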