# New Compliance Sectors — Detectable Signals + Contact Channels
Companion to the FCC RMD and FMCSA/trucking playbooks. The winning pattern is:
a public government registry + a per-record recurring obligation + an automated
deficiency check + outreach to the operator. This doc covers the three best next
sectors and, critically, **how to reach the license holders besides postal mail.**
> Honesty note on email: unlike FCC RMD (`contact_email`) and FMCSA (carrier
> email), these three registries are **address/phone-rich but email-poor**. The
> deficiency engine still works; the channel is the hard part. Section 4 solves
> that.
---
## 1. NPPES / Healthcare Providers (NPI)
**Source:** CMS NPPES monthly full-replacement dissemination file (free bulk CSV,
~10M rows). Verified live against `npidata_pfile_20050523-20260510.csv`
(`download.cms.gov/nppes/`). Cross-joinable with OIG LEIE (exclusions) and the
CMS revalidation list, both free.
**Email in file:** **VERIFIED — no email field exists** (file has 104 columns;
none is email). Contact info available: **mailing + practice TELEPHONE
(cols 27, 35), mailing + practice FAX (cols 28, 36)**, full mailing + practice
addresses, and Authorized Official telephone (col 47). So channel = fax, phone,
mail, or email-append. Not email-native.
### Verified columns we care about (104-col file)
| Col # | Field (exact) |
|---|---|
| 1, 2 | NPI, Entity Type Code (1=individual, 2=org) |
| 5-11 | Org legal name / provider name + credential |
| 21-28 | Mailing address, **mailing telephone (27)**, **mailing fax (28)** |
| 29-36 | Practice location address, **practice telephone (35)**, **practice fax (36)** |
| 37 | Provider Enumeration Date |
| 38 | Last Update Date |
| 39, 40, 41 | NPI Deactivation Reason Code, Deactivation Date, Reactivation Date |
| 43-47 | Authorized Official name/title + **telephone (47)** |
| 48-103 | Up to **15× {Taxonomy Code, License Number, License State Code, Primary Taxonomy Switch}** |
> Note: the public file does NOT contain a usable "Is Sole Proprietor" or
> EIN-validated field (EIN, col 4, is usually masked). Earlier guess corrected.
### Detectable from the file (verified)
| Signal | Field(s) | Obligation | Service |
|---|---|---|---|
| Stale `Last Update Date` (>1-2 yrs) | col 38 | NPPES update within 30 days of any change | NPPES refresh/attestation |
| Deactivated NPI | cols 39-41 | Deactivated NPI can't bill | NPI reactivation |
| Old enumeration + never updated | col 37 vs 38 | Likely overdue Medicare revalidation (5-yr) | PECOS revalidation |
| Taxonomy w/ license but no license-state | taxonomy/license/state sets | License/specialty inconsistency | License/taxonomy reconcile |
| No primary taxonomy flagged (switch all N) | Primary Taxonomy Switch_n | Billing/credentialing errors | Taxonomy cleanup |
| Org (Type 2) missing Authorized Official | cols 2, 43-47 | Incomplete org NPI | Org NPI correction |
**Inferable only (not in file):** exact revalidation due date (PECOS), HIPAA
posture, active billing, sanctions (use OIG LEIE join), email.
**Best cross-join hook:** NPPES ⨝ OIG LEIE ⨝ CMS revalidation list.
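
The file-level signals in the table above reduce to a handful of per-row checks. A minimal sketch, assuming the header strings below match the dissemination file's header row, that dates arrive as `MM/DD/YYYY`, and that the staleness cutoff (`stale_years`) is chosen by the caller; the function and signal names are hypothetical:

```python
from datetime import date, datetime

# Assumed header names; confirm against the header row of the file actually downloaded.
LAST_UPDATE = "Last Update Date"
DEACTIVATION = "NPI Deactivation Date"
REACTIVATION = "NPI Reactivation Date"
ENTITY_TYPE = "Entity Type Code"
AO_LAST_NAME = "Authorized Official Last Name"
PRIMARY_SWITCH = "Healthcare Provider Primary Taxonomy Switch_{}"


def parse_date(value):
    """Parse an MM/DD/YYYY string (assumed format); return None when blank."""
    value = (value or "").strip()
    return datetime.strptime(value, "%m/%d/%Y").date() if value else None


def nppes_signals(row, today, stale_years):
    """Return the deficiency signals detected in one NPPES row (a dict)."""
    signals = []
    updated = parse_date(row.get(LAST_UPDATE))
    if updated and (today - updated).days > stale_years * 365:
        signals.append("stale_last_update")
    deactivated = parse_date(row.get(DEACTIVATION))
    reactivated = parse_date(row.get(REACTIVATION))
    # Deactivated and not reactivated afterwards.
    if deactivated and not (reactivated and reactivated >= deactivated):
        signals.append("deactivated_npi")
    switches = [row.get(PRIMARY_SWITCH.format(i), "") for i in range(1, 16)]
    # At least one taxonomy slot is filled, but none is marked primary.
    if any(switches) and "Y" not in switches:
        signals.append("no_primary_taxonomy")
    if row.get(ENTITY_TYPE) == "2" and not (row.get(AO_LAST_NAME) or "").strip():
        signals.append("org_missing_authorized_official")
    return signals


example = {LAST_UPDATE: "01/15/2010", ENTITY_TYPE: "2", PRIMARY_SWITCH.format(1): "N"}
print(nppes_signals(example, today=date(2026, 5, 10), stale_years=2))
```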
---
## 2. FMC Ocean Transportation Intermediaries (OTI: NVOCC + freight forwarders)
**Source:** FMC OTI lookup (per-record web lookup; a few thousand licensees).
Closest analog to FCC RMD in size and clock.
**Email in record:** Inconsistent — sometimes present, often not. Partial coverage.
### Detectable from the record
| Signal | Field(s) | Obligation | Service |
|---|---|---|---|
| License issue ≥ ~3 yrs ago | issue/license date | **Triennial renewal** (every 3 yrs) | OTI renewal filing |
| Bond below current minimum | financial responsibility | $75k NVOCC / $50k forwarder bond | Bond placement/review |
| Missing proof of bond | financial responsibility status | Required to operate | Bond compliance |
| QI stale/absent | qualifying individual | OTI must have a qualified QI | QI / Form FMC-18 update |
| NVOCC w/o tariff indicator | cross-ref tariff systems | NVOCCs must publish tariffs / SARs | Tariff publication setup |
| Status inactive/revoked/surrendered | license status | Operating lapsed = penalties | Reinstatement |
**Inferable only:** exact renewal due date, whether tariff actually published
(separate tariff registry), email when absent.
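
A rough sketch of the renewal-clock and bond checks from the table above. The FMC lookup is per-record, so the argument names are hypothetical stand-ins for whatever fields a scraper extracts. The dollar minimums are the ones quoted in the table, and the result is only a "likely due" flag because the exact renewal date is not in the record:

```python
from datetime import date

# Minimum bond amounts as quoted in the table above.
MIN_BOND = {"NVOCC": 75_000, "FORWARDER": 50_000}
RENEWAL_YEARS = 3  # triennial renewal


def oti_signals(license_type, issue_date, bond_amount, status, today):
    """Return deficiency signals for one OTI record (all inputs are assumed fields)."""
    signals = []
    if status.lower() in {"inactive", "revoked", "surrendered"}:
        signals.append("license_not_active")
    # Count whole anniversaries so leap years do not skew the result.
    not_yet_reached = (today.month, today.day) < (issue_date.month, issue_date.day)
    years = today.year - issue_date.year - int(not_yet_reached)
    if years >= RENEWAL_YEARS:
        signals.append("renewal_likely_due")
    minimum = MIN_BOND.get(license_type.upper())
    if minimum is not None and (bond_amount is None or bond_amount < minimum):
        signals.append("bond_missing_or_below_minimum")
    return signals


print(oti_signals("NVOCC", date(2022, 1, 10), 50_000, "Active", date(2025, 6, 1)))
```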
---
## 3. EPA RCRA Hazardous Waste Handlers (via ECHO / RCRAInfo / FRS)
**Source:** ECHO bulk files (`echo.epa.gov/files/echodownloads/`) — verified live.
Two relevant downloads:
- **`ECHO_EXPORTER`** (137 cols) — one row per facility across all programs, holds
the compliance signals. Column dict: `echo_exporter_columns_*.xlsx`.
- **`rcra_downloads.zip`** — 6 RCRA-specific CSVs: `RCRA_FACILITIES.csv` (15 cols),
`RCRA_VIOLATIONS.csv`, `RCRA_EVALUATIONS.csv`, `RCRA_ENFORCEMENTS.csv`,
`RCRA_NAICS.csv`, `RCRA_VIOSNC_HISTORY.csv`.
**Email in file:** **VERIFIED — no email anywhere in ECHO bulk.**
`RCRA_FACILITIES.csv` has only: `ID_NUMBER, FACILITY_NAME, ACTIVITY_LOCATION,
FULL_ENFORCEMENT, HREPORT_UNIVERSE_RECORD, STREET_ADDRESS, CITY_NAME, STATE_CODE,
ZIP_CODE, LATITUDE83, LONGITUDE83, FED_WASTE_GENERATOR, TRANSPORTER, ACTIVE_SITE,
OPERATING_TSDF`. **No contact name, no phone, no email** in ECHO RCRA. Owner/
operator contact NAME + PHONE (still no email) exists only in the deeper RCRAInfo
handler download (`rcrapublic.epa.gov`), where a PHONE field is present.
So channel = phone (from RCRAInfo) + mail + email-append. Not email-native.
### Verified ECHO_EXPORTER RCRA signal columns
`RCRA_FLAG`, `RCRA_IDS`, `RCRA_PERMIT_TYPES`, `RCRA_NAICS`,
`RCRA_INSPECTION_COUNT`, `RCRA_DAYS_LAST_EVALUATION`, `RCRA_INFORMAL_COUNT`,
`RCRA_FORMAL_ACTION_COUNT`, `RCRA_DATE_LAST_FORMAL_ACTION`, `RCRA_PENALTIES`,
`RCRA_LAST_PENALTY_DATE`, `RCRA_LAST_PENALTY_AMT`, `RCRA_QTRS_WITH_NC`,
`RCRA_COMPLIANCE_STATUS`, `RCRA_SNC_FLAG`, `RCRA_3YR_COMPL_QTRS_HISTORY`. Plus
facility-level: `FAC_DATE_LAST_INSPECTION`, `FAC_SNC_FLG`, `FAC_COMPLIANCE_STATUS`.
### Detectable from the data (verified)
| Signal | Field(s) | Obligation | Service |
|---|---|---|---|
| Generator status (LQG/SQG/VSQG) | `FED_WASTE_GENERATOR` (1/2/3/N), `RCRA_PERMIT_TYPES` | Biennial report + manifest + training | Generator program |
| Open/current violation | `RCRA_COMPLIANCE_STATUS`, `RCRA_QTRS_WITH_NC` | Return-to-compliance | Violation remediation |
| SNC flag | `RCRA_SNC_FLAG`, `FAC_SNC_FLG` | High enforcement priority | Audit prep + corrective |
| Old/never evaluated + LQG | `RCRA_DAYS_LAST_EVALUATION`, `FAC_DATE_LAST_INSPECTION` | Overdue inspection risk | Self-audit |
| Recent penalty / formal action | `RCRA_PENALTIES`, `RCRA_DATE_LAST_FORMAL_ACTION` | Active enforcement | Remediation/defense |
| TSDF without active permit | `OPERATING_TSDF`, `RCRA_PERMIT_TYPES` | TSDF permit renewal | Permit renewal |
| NAICS implies waste, no RCRA ID | `RCRA_NAICS` / FRS NAICS w/o `RCRA_FLAG` | Should be registered as generator | Generator registration |
| Cross-program: RCRA + TRI reporter | `RCRA_FLAG` + `TRI_FLAG` | EPCRA/Tier II overlap | Tier II / SPCC filing |
**Inferable only (not in file):** biennial-report-not-filed status (need RCRAInfo
BR module, not in ECHO bulk), SPCC plan existence, actual chemical inventory,
contact email. (Earlier "biennial flag" claim corrected — ECHO bulk does not
expose a clean biennial-filed flag.)
**Cross-join opportunity:** ECHO_EXPORTER `RCRA_FLAG` + `TRI_FLAG` + `FAC_NAICS_CODES`
to find facilities that should be reporting but aren't.
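
A minimal sketch of turning one `ECHO_EXPORTER` row into lead signals, using only column names listed above. It assumes the flag columns are coded `Y`/`N`, that the count and day columns parse as numbers, and that a blank `RCRA_DAYS_LAST_EVALUATION` means "never evaluated"; the `stale_days` cutoff and signal names are hypothetical:

```python
def _to_int(value):
    """Parse a numeric string such as '2000' or '2000.0'; return None when blank."""
    value = (value or "").strip()
    return int(float(value)) if value else None


def rcra_signals(row, stale_days):
    """Return RCRA lead signals for one ECHO_EXPORTER row (a dict of strings)."""
    if row.get("RCRA_FLAG") != "Y":
        return []  # facility is not in the RCRA program
    signals = []
    if row.get("RCRA_SNC_FLAG") == "Y":
        signals.append("significant_noncomplier")
    if _to_int(row.get("RCRA_QTRS_WITH_NC")):
        signals.append("quarters_in_noncompliance")
    days = _to_int(row.get("RCRA_DAYS_LAST_EVALUATION"))
    if days is None or days > stale_days:
        signals.append("old_or_never_evaluated")
    if row.get("TRI_FLAG") == "Y":
        signals.append("rcra_plus_tri")
    return signals
```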
---
## 4. How to Contact License Holders (Besides Postal Mail)
The registries above give us name + entity + address + phone (+ sometimes fax).
Ranked options to reach them on cheaper/faster channels:
### A. Email append (turn address/phone into email)
- **B2B email-append vendors** (e.g. data providers that match company name +
address → business email): bulk match files, pay per match. Best for NPPES org
records and EPA facilities (real businesses).
- **Domain inference + verification:** derive likely domain from business name /
website, generate `info@`, `first.last@`, etc., then run an email-verification
API (SMTP/MX validation) to keep only deliverable addresses. Cheap, scalable,
works well where the entity has a website.
- **Website-scrape enrichment:** for each entity, find the website (search by
name+city), scrape contact/`mailto:` and `/contact` pages for published
business email. High accuracy when a site exists.
- **People/B2B data APIs** keyed on the **Authorized Official / Qualifying
Individual / facility contact name** we already have from the registry.
### B. Phone (available for all three)
- **Cold call** the listed phone. NPPES carries it in the main file; for RCRA it
comes from the RCRAInfo handler download, because ECHO bulk has no phone field.
- **Ringless voicemail / voicemail drop** to the listed number.
- **SMS** to numbers that resolve to mobile (carrier-lookup the phone first;
honor TCPA/DNC — we already run DNC compliance services, so scrub against the
NDNC and keep consent records). This is the channel we must be most careful on.
### C. Fax (underrated for NPPES + EPA)
- NPPES and many EPA records include **fax**. Compliance/medical/industrial
audiences still read fax. Cheap blast, low competition, novelty cut-through.
### D. Web / digital, no contact info needed
- **Free public lookup tool** (like `/tools/dot-compliance-check`): e.g.
`/tools/npi-compliance-check`, `/tools/oti-renewal-check`,
`/tools/rcra-compliance-check`. Drives inbound; the provider searches their own
NPI/license/EPA ID and self-identifies. Pair with SEO + paid search on
"NPI revalidation", "FMC license renewal", "RCRA biennial report".
- **Retargeting / lookalike audiences:** upload the matched-email or hashed
contact list to ad platforms for display/social retargeting even without
reaching the inbox.
- **LinkedIn / Sales Navigator outreach** keyed on the Authorized Official / QI
name (especially good for FMC OTIs and EPA facility EHS managers).
### E. Channel-fit by sector
| Sector | Phone | Fax | Email-append quality | Web/SEO inbound |
|---|---|---|---|---|
| NPPES (NPI) | ✅ strong | ✅ good | Medium (org > individual) | ✅ "NPI revalidation" |
| FMC OTI | ✅ strong | ⚠️ some | Medium-high (have websites) | ✅ "FMC license renewal" |
| EPA RCRA | ✅ strong | ⚠️ some | High (real businesses + EHS contact) | ✅ "RCRA biennial report" |
### Compliance guardrails for these channels
- **TCPA/DNC:** scrub all phone/SMS against DNC, prefer manual-dial or established
business relationship, keep consent/records. (We already sell DNC compliance —
practice what we preach.)
- **CAN-SPAM:** appended emails must carry unsubscribe + physical address (our
Listmonk templates already do).
- **State telemarketing & fax (TCPA/JFPA):** fax blasting has its own rules; treat
as opt-out-respecting and B2B-only.
---
## Recommendation / Sequencing
1. **FMC OTI first** — cleanest RMD analog (small set, 3-yr clock, bond math),
some email already present, businesses with websites = easy email-append.
2. **EPA RCRA** — best deficiency richness + highest fine fear = best conversion;
reach via email-append + phone + free lookup tool.
3. **NPPES** — biggest volume, but email-poor and individual-heavy; lead with a
free NPI revalidation lookup tool + fax + org-targeted email-append.
> If email-native outreach (like FCC RMD) is the hard requirement, the better
> targets are state license boards (contractors/CSLB, insurance producers, NMLS,
> cannabis/ABC) that publish licensee email directly. Worth a separate survey.
---
## 5. Postcard Print-and-Mail Vendor Pricing (Lob vs PostGrid vs Click2Mail)
For our use case (targeted list of named license holders with mailing addresses,
personalized + QR to the free compliance-check tool, API-driven like Listmonk),
a print-and-mail API is the right tool — no USPS permit, no presort, no BMEU
drop-off. Per-piece is all-in (printing + postage + address verification).
Rates verified from live pricing pages on the session date. Confirm before
committing volume; these vendors change rates and gate some behind sales.
### Lob — VERIFIED from lob.com/pricing/print-mail
4x6 postcard, "starting at" per-piece by plan tier (volume + automation lower it):
| Plan | Monthly fee | 4x6 postcard from | Notes |
|---|---|---|---|
| Developer (free) | $0 | **$0.872 / postcard** | up to 5 templates, low limits, good for testing |
| Growth | **$260/mo** | **$0.612 / postcard** | 10 HTML templates, address verif. included tiers |
| Enterprise | **$550/mo** | **$0.582 / postcard** | 25 templates, custom volume, lowest per-piece |
- Letters from ~$0.806 / check ~$1.159 (for reference).
- Address verification (US) ~$0.05/lookup pay-go, cheaper bundled.
- Strong API + webhooks (in-transit tracking), best docs of the three.
- Best fit if we want to wire postcards into the existing pipeline cleanly.
### PostGrid — quote/login gated (not publicly itemized)
- Pricing page does not publish per-piece; requires signup/sales for the rate
card. Historically positioned **slightly under Lob per-piece** (~$0.50-$0.80
range for 4x6 depending on volume) with pay-as-you-go + no mandatory monthly
minimum on the entry tier.
- US + Canada print/mail, address verification (CASS/NCOA, SERP for Canada),
good API. Often pitched as the cheaper Lob alternative.
- **Action:** get an actual quote keyed to our expected monthly volume before
choosing — the public "slightly cheaper" claim is unverified here.
### Click2Mail — login/quote gated, but the cheapest at low volume historically
- Public pages (postcards / cost-calculator) gate exact rates behind an account;
rates depend on size (4x6, 4x9, 5x8, 6x9, 6x11), class (First-Class vs
Standard/Marketing Mail), and quantity.
- Historically the **lowest all-in for small/medium runs** (often well under
$0.50/4x6 First-Class at modest volume) because it's a self-serve mail house,
not a developer-API-first platform. Has an API but less polished than Lob.
- **Action:** run their cost calculator logged in for a real number on a sample
4x6 First-Class run of ~1,000 and ~5,000 pieces.
### Practical comparison
| | Lob | PostGrid | Click2Mail |
|---|---|---|---|
| Per-piece (4x6) | $0.58-$0.87 (verified) | ~$0.50-$0.80 (quote) | typically lowest at low vol (quote) |
| Monthly fee | $0 / $260 / $550 | none on entry (quote) | none (self-serve) |
| API quality | Best | Good | Basic |
| Address verif (CASS/NCOA) | Yes | Yes | Yes |
| Best for | API-wired campaigns at scale | cheaper Lob-style API | cheapest small/medium runs |
### Recommendation
- **Start on Click2Mail (or Lob Developer free tier) for the first test batch** to
validate conversion at the lowest fixed cost — no $260/$550 monthly commitment.
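- **Move to Lob Growth/Enterprise (or a PostGrid quote) once volume justifies the
per-piece drop.** On the Lob rates above, Growth matches the free Developer tier
at about 1,000 postcards/month ($260 fee / $0.26 saved per piece), and Enterprise
only beats Growth above roughly 9,700/month ($290 extra fee / $0.03 saved per
piece). Lob's API is the cleanest to wire into our existing pipeline.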
- Always personalize with the detected deficiency + a **QR code / short URL** to
the relevant free lookup tool — that QR is our tracking + conversion bridge,
the same role the email CTA plays today.
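
The tier decision is plain break-even arithmetic on the Lob rates quoted above. A small sketch, with prices held in tenths of a cent so the division stays exact; the function names are hypothetical, and PostGrid/Click2Mail are left out because their rates are quote-gated:

```python
# Lob 4x6 postcard rates from the table above, per piece in tenths of a cent.
PLANS = {
    "Developer": {"monthly_fee_usd": 0, "per_piece_mils": 872},
    "Growth": {"monthly_fee_usd": 260, "per_piece_mils": 612},
    "Enterprise": {"monthly_fee_usd": 550, "per_piece_mils": 582},
}


def monthly_cost_usd(plan, pieces):
    """All-in monthly cost: plan fee plus per-piece charges."""
    p = PLANS[plan]
    return p["monthly_fee_usd"] + pieces * p["per_piece_mils"] / 1000


def break_even_pieces(lower_fee_plan, higher_fee_plan):
    """Smallest monthly volume at which the higher-fee plan costs no more."""
    low, high = PLANS[lower_fee_plan], PLANS[higher_fee_plan]
    extra_fee_mils = (high["monthly_fee_usd"] - low["monthly_fee_usd"]) * 1000
    saving_mils = low["per_piece_mils"] - high["per_piece_mils"]
    return -(-extra_fee_mils // saving_mils)  # integer ceiling division


print(break_even_pieces("Developer", "Growth"))   # 1000
print(break_even_pieces("Growth", "Enterprise"))  # 9667
```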
---
## 6. NPI Compliance Programs, "Expired" Signals & Suggested Rates
### What we actually know is expired/dead (honest breakdown)
NPPES alone has **no license/cert/revalidation expiry date**. The only hard
"dead" status in the file is **NPI deactivation**. Real dateable "expired"
signals come from FREE companion databases joined to NPPES by NPI or name.
| Source | What it proves is expired/wrong | Hook |
|---|---|---|
| NPPES (deactivation cols 39-41) | NPI deactivated — cannot bill | NPI reactivation (HARD) |
| NPPES (Last Update col 38) | Record stale (not "expired", a nudge) | NPPES update |
| **CMS PECOS Revalidation list** | **Medicare revalidation due/overdue date (5-yr)** — the real dateable hook | Revalidation filing (flagship) |
| **OIG LEIE** | Provider EXCLUDED from federal programs | Exclusion remediation (urgent) |
| **SAM.gov exclusions** | Debarred / additional exclusions | Screening + remediation |
| State medical board lookups | License itself expired (not in NPPES) | License renewal |
> Flagship analog to FCC RMD recertification = **CMS Medicare revalidation due
> date**, joined to NPPES by NPI. That is the genuine "your X expired" signal.
### Programs to sell (ranked by trigger defensibility)
1. Medicare PECOS revalidation filing (flagship)
2. NPI reactivation (hard NPPES signal)
3. NPPES data update / attestation
4. State license renewal monitoring / filing
5. OIG/SAM exclusion screening + remediation
6. CAQH profile attestation (re-attest ~every 120 days)
7. HIPAA compliance package (universal, not detectable)
8. Credentialing / re-credentialing with payers
9. Taxonomy / enrollment cleanup
### Suggested rates
| Service | Price | Cadence | Notes |
|---|---|---|---|
| NPPES data update / attestation | **$149** | one-time | low-friction entry product |
| NPI reactivation | **$249** | one-time | hard trigger |
| Medicare PECOS revalidation filing | **$399** | every 5 yrs | flagship, high stakes |
| State license renewal (per license) | **$149/license** | annual/biennial | recurring |
| OIG/SAM exclusion screening | **$99/yr** ($19/mo) | recurring | sticky subscription |
| CAQH attestation/maintenance | **$249/yr** | recurring | high-churn pain |
| HIPAA compliance package | **$799-$1,499** | one-time + annual | biggest ticket |
| Credentialing (per payer) | **$199/payer** | as needed | volume add-on |
| **Provider Compliance Bundle** | **$599-$899/yr** | annual subscription | revalidation watch + exclusion screening + NPPES upkeep |
Pricing logic: solo/small providers are price-sensitive, but fear of losing
Medicare billing privileges (revalidation, exclusion) supports premium pricing on
those two. Data-update products stay cheap as door-openers. The **annual bundle**
is the goal — mirrors the trucking compliance-bundle model for recurring revenue.
### Recommendation
Lead with **Medicare revalidation** (real dateable expiry from the free CMS list,
like FCC RMD recert), use **NPI-deactivated** + **stale-NPPES** as secondary
triggers, package into a **$599-899/yr Provider Compliance Bundle**.
---
## 7. Companion Databases — VERIFIED (downloaded & inspected)
All free, all joinable to NPPES by **NPI**. Counts below are from the live files
pulled on the session date. This is the data that turns "stale record" into a
real, dateable "your X expired" hook.
### 7.1 CMS Revalidation Due Date List (the flagship)
`revalidation_base.csv`: **~2.9M rows, 2.42M distinct NPIs.**
Columns: Enrollment ID, **National Provider Identifier (NPI)**, First/Last Name,
Organization Name, Enrollment State Code, Enrollment Type, Provider Type Text,
Enrollment Specialty, **Revalidation Due Date**, **Adjusted Due Date**,
Individual Total Reassign To, Receiving Benefits Reassignment.
**Verified population:**
- **261,878 enrollments have a concrete due date set** (rest are "TBD" = CMS
hasn't scheduled them yet).
- **217,968 are PAST DUE (overdue revalidation)** — these are the hottest leads.
- 43,910 are upcoming (future-dated) — perfect for "due soon" pre-emptive offers.
This is the direct analog to the FCC RMD recertification date. Sell **Medicare
PECOS revalidation filing ($399)** to the 217,968 overdue + watch service to the
upcoming ones. Join to NPPES to get their address/phone for outreach.
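
The TBD / overdue / upcoming split above can be reproduced with a small classifier. This is a sketch under assumptions: due dates are taken to be `MM/DD/YYYY` strings or the literal `TBD`, and preferring `Adjusted Due Date` when it is set is my assumption, not something the file documents. Function names are hypothetical.

```python
from datetime import date, datetime


def effective_due(row):
    """Pick the adjusted date when present (assumption), else the base due date."""
    adjusted = (row.get("Adjusted Due Date") or "").strip()
    if adjusted and adjusted.upper() != "TBD":
        return adjusted
    return (row.get("Revalidation Due Date") or "").strip()


def classify_due_date(raw, today):
    """Return (status, days): days overdue for 'overdue', days left for 'upcoming'."""
    raw = (raw or "").strip()
    if not raw or raw.upper() == "TBD":
        return ("tbd", None)
    due = datetime.strptime(raw, "%m/%d/%Y").date()
    delta = (today - due).days
    return ("overdue", delta) if delta > 0 else ("upcoming", -delta)


row = {"Revalidation Due Date": "05/01/2026", "Adjusted Due Date": "TBD"}
print(classify_due_date(effective_due(row), today=date(2026, 5, 10)))
```

Keeping the day count alongside the status lets the same function drive both the "overdue" campaign and the "due soon" watch offer.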
### 7.2 OIG LEIE (Exclusions)
`UPDATED.csv`: **83,256 excluded providers/entities.**
Columns: LASTNAME, FIRSTNAME, MIDNAME, BUSNAME, GENERAL, SPECIALTY, UPIN, **NPI**,
DOB, ADDRESS, CITY, STATE, ZIP, EXCLTYPE, EXCLDATE, REINDATE, WAIVERDATE, WVRSTATE.
**Verified:** only **8,608 have a valid joinable NPI** (most exclusions predate
NPI or are entities w/ `0000000000`). Use this two ways:
- As a **screening product** sold to OTHER providers ($99/yr) — "we check you and
your staff against LEIE monthly."
- As a **remediation hook** to the 8,608 excluded-with-NPI (reinstatement help),
though excluded providers are a harder, riskier audience.
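
One way to decide whether a LEIE row has a "valid joinable NPI" is a format check plus the NPI check digit, which is a Luhn checksum computed over the prefix `80840` followed by the ten NPI digits. A minimal sketch; the function names are hypothetical, and this is not necessarily the rule used to arrive at the 8,608 count:

```python
def luhn_ok(digits):
    """Standard Luhn check over a string of decimal digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def joinable_npi(raw):
    """True when raw is a 10-digit, non-placeholder NPI with a valid check digit."""
    npi = (raw or "").strip()
    if len(npi) != 10 or not npi.isdigit() or npi == "0000000000":
        return False
    return luhn_ok("80840" + npi)
```

Rows that fail this check still matter for the screening product, but they need a name/address match rather than an NPI join.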
### 7.3 Medicare Opt-Out Affidavits
`OptOut_*.csv`: **56,300 opt-out affidavits.**
Columns: First/Last Name, **npi**, Specialty, **Optout Effective Date**,
**Optout End Date**, address, City, State, Zip, Eligible to Order and Refer,
Last updated.
**Verified:** **22,379 have an opt-out period ending within 12 months.** Opt-out
auto-renews every 2 years unless cancelled — a real dateable event. Sell
**opt-out renewal / re-enrollment decision support**.
### 7.4 Order & Referring File
`OrderReferring_*.csv`: **~2.0M rows.**
Columns: **NPI**, LAST_NAME, FIRST_NAME, PARTB, DME, HHA, PMD, HOSPICE (Y/N flags).
Tells us which providers are eligible to order/refer for each program. Use to
**qualify leads** (a provider missing eligibility they should have = enrollment
gap = service opportunity).
### 7.5 PPEF Public Provider Enrollment
`PPEF_Enrollment_Extract_*.csv`: **~2.98M rows.**
Columns: **NPI**, MULTIPLE_NPI_FLAG, PECOS_ASCT_CNTL_ID, ENRLMT_ID,
PROVIDER_TYPE_CD/DESC, STATE_CD, names, ORG_NAME. This is the authoritative
"who is actively enrolled in Medicare" list. Join against NPPES to find:
- NPPES providers **NOT in PPEF** = not Medicare-enrolled (enrollment opportunity).
- Cross-check enrollment state vs NPPES practice state (mismatch = cleanup).
### Join architecture
```
NPPES (10M providers, addr/phone/fax) <- outreach contact + base universe
⨝ NPI ⨝
Revalidation Due (217,968 overdue) <- flagship "expired" hook + $399
⨝ NPI ⨝
LEIE (8,608 excluded w/ NPI) <- screening product + urgent flag
⨝ NPI ⨝
Opt-Out (22,379 ending <12mo) <- renewal hook
⨝ NPI ⨝
PPEF / Order-Referring <- enrollment gaps / lead qualification
```
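
In code, the diagram above amounts to a left join from NPPES onto NPI sets built from each companion file. A minimal in-memory sketch; the function and key names are hypothetical, and a real run over ~10M rows would more likely stream the CSV or use a database:

```python
def build_profiles(nppes_rows, overdue_npis, leie_npis, optout_npis, ppef_npis):
    """Left-join NPPES rows to companion-file NPI sets, keyed by NPI."""
    profiles = {}
    for row in nppes_rows:
        npi = row["NPI"]
        profiles[npi] = {
            "contact": row,                             # address / phone / fax
            "overdue_revalidation": npi in overdue_npis,
            "leie_excluded": npi in leie_npis,
            "optout_ending": npi in optout_npis,
            "medicare_enrolled": npi in ppef_npis,      # False = not found in PPEF
        }
    return profiles


nppes = [{"NPI": "1234567893", "Provider Business Practice Location Address State Name": "MO"}]
print(build_profiles(nppes, {"1234567893"}, set(), set(), set())["1234567893"]["overdue_revalidation"])
```

Using sets for the companion files keeps each membership test constant-time, so the join cost is dominated by a single pass over NPPES.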
### Headline takeaway
The single best, defensible, dateable hook is the **217,968 enrollments with
OVERDUE Medicare revalidation** (a per-enrollment count, so the number of distinct
providers is at most that), each enrichable with NPPES address/phone/fax for
outreach. That is a larger and harder-deadline audience than the FCC RMD list,
and the $399 revalidation filing is a clean flagship product.
---
## 8. Free Email Append for NPI — VERIFIED FINDINGS
**Yes, there is a partial free email source, plus a free verification path.**
Investigated and tested live on the session date.
### 8.1 NPPES Endpoint file = free, NPI-keyed email addresses
The NPPES dissemination ZIP contains a separate **`endpoint_pfile`** (123 MB)
that we had not previously parsed. It holds electronic endpoints keyed by **NPI**.
Verified contents:
- **597,927 endpoint rows, covering 491,761 distinct NPIs.**
- **390,639 rows are email-formatted** (`user@domain.tld`).
- Endpoint types: DIRECT 356,394 · CONNECT 91,616 · SOAP 56,543 · FHIR 46,764 ·
OTHERS 45,938 · REST 672.
**The honest catch:** most are **Direct Secure Messaging (HISP) addresses**, not
normal inboxes. The top domains are health-system Direct gateways
(`ehrdirect.mayoclinicmsg.org`, `direct.iuhealth.org`, `upmcdirect.com`, …).
Direct addresses route only inside the DirectTrust network — **you cannot cold-
email them from a normal mail server; they will not deliver.** So the raw 390k is
NOT a usable marketing email list.
**BUT — the usable slice:** a meaningful subset are **real consumer/business
inboxes** the provider self-published:
- **~19,759 rows on common consumer webmail** (gmail.com 12,427, plus yahoo,
hotmail, outlook, aol, icloud).
- Verified samples are clearly personal/practice inboxes: `tcneurology@gmail.com`,
`veinsofkc@yahoo.com`, `kendalncarlsondmd@gmail.com`, `scottcopt@aol.com`, etc.
- Plus an additional long tail of non-Direct **practice-domain** emails
(clinic websites) that are also normal inboxes.
So the genuinely free, cold-emailable slice from NPPES endpoints is on the order
of **tens of thousands** (consumer webmail + real practice domains), not the full
391k. Still free, still NPI-keyed (joins to revalidation/LEIE/etc.), and exactly
the small-practice owner-operators who are our buyer.
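
Separating the cold-emailable slice from Direct addresses can be sketched as a small classifier. The rules here are heuristics I am assuming, not the file's own labels: trust the endpoint type when it says `DIRECT`, treat a short list of consumer domains as webmail, and fall back to a substring hint for Direct gateways. Anything left over is treated as a practice domain and still needs verification.

```python
CONSUMER_WEBMAIL = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com", "aol.com", "icloud.com"}
DIRECT_HINTS = ("direct", "hisp")  # heuristic substrings seen in Direct gateway domains


def classify_endpoint(address, endpoint_type=""):
    """Return 'not_email', 'direct_secure', 'consumer_webmail' or 'practice_domain'."""
    address = (address or "").strip().lower()
    local, sep, domain = address.rpartition("@")
    if not sep or not local or "." not in domain:
        return "not_email"  # e.g. a FHIR/SOAP URL
    if endpoint_type.strip().upper() == "DIRECT":
        return "direct_secure"
    if domain in CONSUMER_WEBMAIL:
        return "consumer_webmail"
    if any(hint in domain for hint in DIRECT_HINTS):
        return "direct_secure"
    return "practice_domain"


print(classify_endpoint("frontdesk@gmail.com"))
print(classify_endpoint("dr.smith@direct.iuhealth.org"))
```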
### 8.2 Free SMTP/MX verification is possible from our infra
Tested from this host:
- **Port 25 egress is OPEN** (connected to `gmail-smtp-in.l.google.com:25`, got
`220` banner).
- **MX lookups work** (resolved MX for gmail.com, mayoclinic.org).
That means we can run **free email verification** ourselves (MX check + SMTP
RCPT-TO probe) with no paid validation vendor, to:
1. Filter the endpoint emails down to deliverable ones.
2. Verify guessed emails for the domain-inference path below.
> Caution: aggressive SMTP probing can get an IP greylisted/blocked. Throttle,
> rotate, and prefer MX-only validation where possible. Do it from a non-sending
> IP so it never touches our warmed MTA reputation.
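
A minimal sketch of the RCPT-TO probe using only the standard library's `smtplib`. MX resolution is not in the standard library, so the MX host is passed in by the caller. `helo_name` and `mail_from` are placeholders to be set to the probing host's real identity. An "accepted" result is weak evidence only, since catch-all domains accept every recipient.

```python
import smtplib


def interpret_rcpt(code):
    """Map an SMTP RCPT reply code to a coarse verdict."""
    if 200 <= code < 300:
        return "accepted"
    if 500 <= code < 600:
        return "rejected"
    return "unknown"  # 4xx: greylisting or a temporary failure, retry later


def probe_rcpt(mx_host, address, helo_name, mail_from, timeout=10):
    """Ask mx_host whether it would accept mail for address; no message is sent."""
    try:
        with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
            smtp.helo(helo_name)
            smtp.mail(mail_from)
            code, _ = smtp.rcpt(address)
            return interpret_rcpt(code)
    except (smtplib.SMTPException, OSError):
        return "unknown"  # connection refused, timeout, or protocol error
```

Treating every failure as "unknown" rather than "rejected" keeps a blocked or throttled probe from deleting good addresses from the list.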
### 8.3 Free domain-inference append (for the rest)
For NPIs without a usable endpoint email but with an org name + practice address:
1. Find the practice **website** (search name + city; or guess `name.com`).
2. Generate candidate emails (`info@`, `office@`, `contact@`, `first.last@`).
3. **MX + SMTP verify for free** (8.2) and keep only deliverable.
This is zero-cost compute, just our time/infra. Lower hit rate than a paid append
vendor but free.
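
Step 2 above (candidate generation) is simple enough to sketch. The role prefixes are the ones listed in that step; the function name and the name-cleaning rule (ASCII letters only) are assumptions:

```python
import re

ROLE_LOCALS = ("info", "office", "contact")


def candidate_emails(domain, first_name="", last_name=""):
    """Return role-based candidates plus first.last@ when both names are given."""
    domain = domain.strip().lower()
    candidates = [f"{local}@{domain}" for local in ROLE_LOCALS]
    # Keep ASCII letters only, so "O'Neil" becomes "oneil".
    first = re.sub(r"[^a-z]", "", first_name.lower())
    last = re.sub(r"[^a-z]", "", last_name.lower())
    if first and last:
        candidates.append(f"{first}.{last}@{domain}")
    return candidates


print(candidate_emails("ExampleClinic.com", "Ann", "O'Neil"))
```

Each candidate still has to pass the MX + SMTP check from 8.2 before it is used.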
### 8.4 Bottom line
| Path | Cost | Yield | Cold-emailable? |
|---|---|---|---|
| NPPES endpoint Direct addresses (~356k) | free | high count | ❌ no (HISP-only routing) |
| NPPES endpoint consumer/practice inboxes (~20k+) | free | tens of thousands | ✅ yes |
| Domain-inference + free SMTP verify | free (compute) | medium, varies | ✅ yes |
| Paid B2B email append vendor | $ per match | highest | ✅ yes |
**Recommendation:** build a free pipeline = (a) extract the cold-emailable subset
of endpoint emails, (b) domain-infer + free-SMTP-verify the rest, (c) fall back to
phone/fax/mail for non-matches. This recovers a real email channel for a
meaningful chunk of the 217,968 overdue-revalidation targets at **zero vendor
cost**, and we verify deliverability ourselves since port 25 + MX both work here.
---
## 9. NPI Outreach Pipeline — BUILT & RUN (`scripts/build_npi_outreach_lists.py`)
A reusable pipeline that joins the free NPPES endpoint emails to the CMS
revalidation list and cross-flags LEIE + opt-out. Run against live data:
### Verified output (session date)
| Segment | Rows | Use |
|---|---|---|
| **All cold-emailable NPIs** | **120,408** | broad Provider Compliance Bundle campaign — START HERE |
| ↳ of which overdue revalidation | 1,909 | hottest: lead with the $399 revalidation hook |
| ↳ of which upcoming revalidation | 500 | "due soon" pre-emptive offer |
| ↳ no current reval flag | 117,999 | general compliance bundle / screening / HIPAA |
| **Direct-secure (DirectTrust later)** | 3,897 (overdue) / 235,747 total | park until DirectTrust signup, then send via HISP |
Cold-emailable universe = **120,408 normal inboxes** (consumer webmail + practice
domains), all NPI-keyed. Direct/HISP universe = **235,747** addresses held for the
DirectTrust channel once we sign up; that becomes a huge, spam-resistant,
high-trust second wave.
### Strategy confirmed
- **Start now:** email the 120,408 cold inboxes the Provider Compliance Bundle,
leading the 1,909 overdue with the revalidation deadline.
- **Phase 2 (DirectTrust):** once registered, the 235,747 Direct addresses are a
second, higher-deliverability channel (DirectTrust is closed/trusted, not
spam-filtered like normal email).
- MX/SMTP-verify the cold list first (port 25 + MX confirmed working on our infra)
to strip dead addresses before sending and protect MTA reputation.
Output CSVs: `npi_all_cold_emailable.csv`, `npi_overdue_cold_emailable.csv`,
`npi_overdue_direct_secure.csv` (NPI, email, name, specialty, state, due date,
days overdue, LEIE flag, opt-out ending).
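
The script itself is not reproduced here. The sketch below only illustrates how a record could be routed to the three output files named above, given an email class and a revalidation status; those labels and the function name are hypothetical, not taken from the script:

```python
def output_files(email_class, reval_status):
    """Return the output CSV names a record would be written to."""
    files = []
    if email_class in {"consumer_webmail", "practice_domain"}:
        files.append("npi_all_cold_emailable.csv")
        if reval_status == "overdue":
            files.append("npi_overdue_cold_emailable.csv")
    elif email_class == "direct_secure" and reval_status == "overdue":
        files.append("npi_overdue_direct_secure.csv")
    return files


print(output_files("practice_domain", "overdue"))
```

An overdue cold-emailable record lands in both cold files, which matches the table: the 1,909 overdue rows are a subset of the 120,408.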