From 091ebbd7f97b46cf0773cddf27ef7a6376b16137 Mon Sep 17 00:00:00 2001
From: justin
Date: Fri, 5 Jun 2026 01:00:54 -0500
Subject: [PATCH] docs: verified free NPI email-append paths (NPPES endpoint file + free SMTP/MX verify)

---
 docs/new-sector-compliance-targets.md | 74 +++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/docs/new-sector-compliance-targets.md b/docs/new-sector-compliance-targets.md
index 2caf01a..a4a8bee 100644
--- a/docs/new-sector-compliance-targets.md
+++ b/docs/new-sector-compliance-targets.md
@@ -403,3 +403,77 @@ The single best, defensible, dateable hook is **217,968 providers with OVERDUE
 Medicare revalidation**, each enrichable with NPPES address/phone/fax for
 outreach. That is a larger and harder-deadline audience than the FCC RMD list,
 and the $399 revalidation filing is a clean flagship product.
+
+---
+
+## 8. Free Email Append for NPI — VERIFIED FINDINGS
+
+**Yes, there is a partial free email source, plus a free verification path.**
+Investigated and tested live on the session date.
+
+### 8.1 NPPES Endpoint file = free, NPI-keyed email addresses
+The NPPES dissemination ZIP contains a separate **`endpoint_pfile`** (123 MB)
+that we had not previously parsed. It holds electronic endpoints keyed by **NPI**.
+Verified contents:
+- **597,927 endpoint rows, covering 491,761 distinct NPIs.**
+- **390,639 rows are email-formatted** (`user@domain.tld`).
+- Endpoint types: DIRECT 356,394 · CONNECT 91,616 · SOAP 56,543 · FHIR 46,764 ·
+  OTHERS 45,938 · REST 672.
+
+**The honest catch:** most are **Direct Secure Messaging (HISP) addresses**, not
+normal inboxes. The top domains are health-system Direct gateways
+(`ehrdirect.mayoclinicmsg.org`, `direct.iuhealth.org`, `upmcdirect.com`, …).
+Direct addresses route only inside the DirectTrust network — **you cannot
+cold-email them from a normal mail server; they will not deliver.** So the raw
+390k is NOT a usable marketing email list.
+
+**BUT — the usable slice:** a meaningful subset are **real consumer/business
+inboxes** the provider self-published:
+- **~19,759 rows on common consumer webmail** (gmail.com 12,427, plus yahoo,
+  hotmail, outlook, aol, icloud).
+- Verified samples are clearly personal/practice inboxes: `tcneurology@gmail.com`,
+  `veinsofkc@yahoo.com`, `kendalncarlsondmd@gmail.com`, `scottcopt@aol.com`, etc.
+- Plus an additional long tail of non-Direct **practice-domain** emails
+  (clinic websites) that are also normal inboxes.
+
+So the genuinely free, cold-emailable slice from NPPES endpoints is on the order
+of **tens of thousands** (consumer webmail + real practice domains), not the full
+390k. Still free, still NPI-keyed (joins to revalidation/LEIE/etc.), and exactly
+the small-practice owner-operators who are our buyers.
+
+### 8.2 Free SMTP/MX verification is possible from our infra
+Tested from this host:
+- **Port 25 egress is OPEN** (connected to `gmail-smtp-in.l.google.com:25`, got
+  `220` banner).
+- **MX lookups work** (resolved MX for gmail.com, mayoclinic.org).
+
+That means we can run **free email verification** ourselves (MX check + SMTP
+RCPT-TO probe) with no paid validation vendor, to:
+1. Filter the endpoint emails down to deliverable ones.
+2. Verify guessed emails for the domain-inference path below.
+
+> Caution: aggressive SMTP probing can get an IP greylisted/blocked. Throttle,
+> rotate, and prefer MX-only validation where possible. Do it from a non-sending
+> IP so it never touches our warmed MTA reputation.
+
+### 8.3 Free domain-inference append (for the rest)
+For NPIs without a usable endpoint email but with an org name + practice address:
+1. Find the practice **website** (search name + city; or guess `name.com`).
+2. Generate candidate emails (`info@`, `office@`, `contact@`, `first.last@`).
+3. **MX + SMTP verify for free** (8.2) and keep only deliverable.
+This is zero-cost compute, just our time/infra.
+Lower hit rate than a paid append vendor but free.
+
+### 8.4 Bottom line
+| Path | Cost | Yield | Cold-emailable? |
+|---|---|---|---|
+| NPPES endpoint Direct addresses (~356k) | free | high count | ❌ no (HISP-only routing) |
+| NPPES endpoint consumer/practice inboxes (~20k+) | free | tens of thousands | ✅ yes |
+| Domain-inference + free SMTP verify | free (compute) | medium, varies | ✅ yes |
+| Paid B2B email append vendor | $ per match | highest | ✅ yes |
+
+**Recommendation:** build a free pipeline = (a) extract the cold-emailable subset
+of endpoint emails, (b) domain-infer + free-SMTP-verify the rest, (c) fall back to
+phone/fax/mail for non-matches. This recovers a real email channel for a
+meaningful chunk of the 217,968 overdue-revalidation targets at **zero vendor
+cost**, and we verify deliverability ourselves since port 25 + MX both work here.
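
Reviewer sketch for step (a) of the recommendation, extracting the cold-emailable subset of endpoint emails. This is a minimal illustration, not the pipeline itself. The column names (`NPI`, `Endpoint Type`, `Endpoint`), the helper names, and the `"direct" in domain` heuristic are assumptions that should be checked against the real `endpoint_pfile` header. The webmail set is just the six providers named in 8.1, and the NPIs in the demo are made-up placeholders.

```python
import csv
import io
import re

# Loose shape check only (user@domain.tld); this is not full RFC validation.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")

# Consumer webmail providers named in section 8.1 (assumed .com domains).
WEBMAIL = {"gmail.com", "yahoo.com", "hotmail.com",
           "outlook.com", "aol.com", "icloud.com"}


def classify_endpoint(endpoint_type, endpoint):
    """Label one endpoint value: not_email, webmail, direct, or practice_domain."""
    value = (endpoint or "").strip().lower()
    if not EMAIL_RE.match(value):
        return "not_email"  # URLs for FHIR/SOAP/REST endpoints land here
    domain = value.rsplit("@", 1)[1]
    # Webmail is checked first: a gmail address is a normal inbox even if
    # the row happens to be typed DIRECT.
    if domain in WEBMAIL:
        return "webmail"
    # Heuristic (assumption): Direct rows, or gateway-style domains such as
    # direct.iuhealth.org, are treated as HISP-only addresses.
    if (endpoint_type or "").strip().upper() == "DIRECT" or "direct" in domain:
        return "direct"
    return "practice_domain"


def cold_emailable(rows, npi_col="NPI", type_col="Endpoint Type",
                   value_col="Endpoint"):
    """Return (npi, email, label) for rows that look like normal inboxes."""
    keep = []
    for row in rows:
        label = classify_endpoint(row.get(type_col), row.get(value_col))
        if label in ("webmail", "practice_domain"):
            keep.append((row[npi_col], row[value_col].strip().lower(), label))
    return keep


if __name__ == "__main__":
    # Tiny in-memory stand-in for the endpoint file.
    demo = io.StringIO(
        "NPI,Endpoint Type,Endpoint\n"
        "1000000001,DIRECT,someone@direct.iuhealth.org\n"
        "1000000002,DIRECT,tcneurology@gmail.com\n"
        "1000000003,FHIR,https://example.org/fhir\n"
        "1000000004,OTHERS,office@example-clinic.com\n"
    )
    for hit in cold_emailable(csv.DictReader(demo)):
        print(hit)
```

Against the real file, the same function could be fed `csv.DictReader(open(path, newline="", encoding="utf-8"))`, where `path` is wherever the endpoint CSV was unzipped. The substring heuristic will misfile any genuine practice whose domain happens to contain "direct", so the kept rows still need the MX/SMTP check from 8.2 before use.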