docs: verify NPPES + EPA RCRA field schemas against live files
parent 70d05e0607
commit 5e4e73674a
1 changed files with 70 additions and 29 deletions
@@ -15,24 +15,44 @@ sectors and, critically, **how to reach the license holders besides postal mail.
 ## 1. NPPES / Healthcare Providers (NPI)

 **Source:** CMS NPPES monthly full-replacement dissemination file (free bulk CSV,
-millions of rows). Cross-joinable with OIG LEIE (exclusions) and the CMS
-revalidation list, both free.
+~10M rows). Verified live against `npidata_pfile_20050523-20260510.csv`
+(`download.cms.gov/nppes/`). Cross-joinable with OIG LEIE (exclusions) and the
+CMS revalidation list, both free.

-**Email in file:** No. Practice/mailing address, phone, fax only.
+**Email in file:** ❌ **VERIFIED — no email field exists** (file has 104 columns;
+none is email). Contact info available: **mailing + practice TELEPHONE
+(cols 27, 35), mailing + practice FAX (cols 28, 36)**, full mailing + practice
+addresses, and Authorized Official telephone (col 47). So channel = fax, phone,
+mail, or email-append. Not email-native.

-### Detectable from the file
+### Verified columns we care about (104-col file)
+| Col # | Field (exact) |
+|---|---|
+| 1, 2 | NPI, Entity Type Code (1=individual, 2=org) |
+| 5–11 | Org legal name / provider name + credential |
+| 21–28 | Mailing address, **mailing telephone (27)**, **mailing fax (28)** |
+| 29–36 | Practice location address, **practice telephone (35)**, **practice fax (36)** |
+| 37 | Provider Enumeration Date |
+| 38 | Last Update Date |
+| 39, 40, 41 | NPI Deactivation Reason Code, Deactivation Date, Reactivation Date |
+| 43–47 | Authorized Official name/title + **telephone (47)** |
+| 48–103 | Up to **15× {Taxonomy Code, License Number, License State Code, Primary Taxonomy Switch}** |
+
+> Note: the public file does NOT contain a "Is Sole Proprietor" or EIN-validated
+> field in a usable way (EIN col 4 is usually masked). Earlier guess corrected.
+
+### Detectable from the file (verified)
 | Signal | Field(s) | Obligation | Service |
 |---|---|---|---|
-| Stale `Last Update Date` (>1–2 yrs) | Last Update Date | NPPES update within 30 days of any change | NPPES refresh/attestation |
-| Deactivated NPI | NPI Deactivation Date / Reactivation Date | Deactivated NPI can't bill | NPI reactivation |
-| Old enumeration + never updated | Provider Enumeration Date vs Last Update Date | Likely overdue Medicare revalidation (5-yr) | PECOS revalidation |
-| Taxonomy vs license-state mismatch | Taxonomy, License Number, License State | Specialty/license inconsistency | License/taxonomy reconcile |
-| No primary taxonomy flagged | taxonomy primary switch | Billing/credentialing errors | Taxonomy cleanup |
-| Org (Type 2) missing Authorized Official | Authorized Official block | Incomplete org NPI | Org NPI correction |
-| Sole-proprietor flag vs entity-type conflict | Is Sole Proprietor, Entity Type Code | Enrollment/tax classification issue | Enrollment review |
+| Stale `Last Update Date` (>1–2 yrs) | col 38 | NPPES update within 30 days of any change | NPPES refresh/attestation |
+| Deactivated NPI | cols 39–41 | Deactivated NPI can't bill | NPI reactivation |
+| Old enumeration + never updated | col 37 vs 38 | Likely overdue Medicare revalidation (5-yr) | PECOS revalidation |
+| Taxonomy w/ license but no license-state | taxonomy/license/state sets | License/specialty inconsistency | License/taxonomy reconcile |
+| No primary taxonomy flagged (switch all N) | Primary Taxonomy Switch_n | Billing/credentialing errors | Taxonomy cleanup |
+| Org (Type 2) missing Authorized Official | cols 2, 43–47 | Incomplete org NPI | Org NPI correction |
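
A minimal sketch of how the date-based signals in the table above could be computed from raw CSV rows, using the 1-indexed column positions this doc lists (37, 38, 40, 41). Assumptions not in the source: dates are `MM/DD/YYYY` strings, "stale" means older than 730 days, "old" means enumerated more than five years ago, and `nppes_signals` is a hypothetical helper name.

```python
from datetime import date, datetime

# 1-indexed column positions from the table above. Check them against the
# header row of the file you actually download before relying on them.
COL_ENUMERATION_DATE = 37
COL_LAST_UPDATE_DATE = 38
COL_DEACTIVATION_DATE = 40
COL_REACTIVATION_DATE = 41


def parse_date(value):
    """Parse MM/DD/YYYY (assumed format); return None when blank or malformed."""
    try:
        return datetime.strptime(value.strip(), "%m/%d/%Y").date()
    except ValueError:
        return None


def nppes_signals(row, today, stale_days=730, old_days=5 * 365):
    """Return the names of the signals that fire for one row (a list of cells)."""

    def cell(col):
        # Convert a 1-indexed column number to a list index; tolerate short rows.
        return row[col - 1] if len(row) >= col else ""

    enumerated = parse_date(cell(COL_ENUMERATION_DATE))
    updated = parse_date(cell(COL_LAST_UPDATE_DATE))
    deactivated = parse_date(cell(COL_DEACTIVATION_DATE))
    reactivated = parse_date(cell(COL_REACTIVATION_DATE))

    signals = []
    if updated and (today - updated).days > stale_days:
        signals.append("stale_last_update")
    # Deactivated, and not reactivated on or after the deactivation date.
    if deactivated and (reactivated is None or reactivated < deactivated):
        signals.append("deactivated")
    # Enumerated long ago and the record has not changed since.
    if enumerated and updated == enumerated and (today - enumerated).days > old_days:
        signals.append("old_never_updated")
    return signals


# Illustrative row: enumerated in 2010 and never touched again.
example = [""] * 104
example[36] = example[37] = "01/15/2010"
print(nppes_signals(example, today=date(2026, 5, 10)))
```

In practice the rows would come from `csv.reader` over the dissemination file; passing `today` explicitly keeps the thresholds reproducible across runs.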

 **Inferable only (not in file):** exact revalidation due date (PECOS), HIPAA
-posture, active billing, sanctions (use OIG LEIE join).
+posture, active billing, sanctions (use OIG LEIE join), email.

 **Best cross-join hook:** NPPES ⨝ OIG LEIE ⨝ CMS revalidation list.
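
The NPPES ⨝ OIG LEIE half of that join might look like the sketch below. It assumes, without confirmation from this doc, that the LEIE CSV has an `NPI` header and uses `0000000000` as a placeholder when no NPI is on record; check both against the file you download. `excluded_npis` and `flag_excluded` are hypothetical helper names, and the revalidation-list leg is left out.

```python
import csv
import io

UNKNOWN_NPI = "0000000000"  # assumed placeholder for "no NPI on record"


def excluded_npis(leie_csv):
    """Collect usable NPIs from an LEIE-style CSV file object with an `NPI` header."""
    npis = set()
    for row in csv.DictReader(leie_csv):
        npi = (row.get("NPI") or "").strip()
        if npi and npi != UNKNOWN_NPI:
            npis.add(npi)
    return npis


def flag_excluded(nppes_rows, excluded):
    """Yield NPPES rows (lists of cells) whose NPI (col 1) is in the exclusion set."""
    for row in nppes_rows:
        if row and row[0].strip() in excluded:
            yield row


# Tiny in-memory stand-ins for the two files (illustrative values only).
leie_sample = io.StringIO("LASTNAME,NPI\nDOE,1234567890\nROE,0000000000\n")
nppes_sample = [["1234567890", "1"], ["1111111111", "2"]]
matches = list(flag_excluded(nppes_sample, excluded_npis(leie_sample)))
print(len(matches))
```

Loading the exclusion NPIs into a set first keeps the pass over the much larger NPPES file streaming, with a constant-time lookup per row.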

@@ -62,29 +82,50 @@ Closest analog to FCC RMD in size and clock.

 ## 3. EPA RCRA Hazardous Waste Handlers (via ECHO / RCRAInfo / FRS)

-**Source:** ECHO downloadable files, RCRAInfo public data, Facility Registry
-Service. Richest enforcement data of the three. Cross-join with TRI.
+**Source:** ECHO bulk files (`echo.epa.gov/files/echodownloads/`) — verified live.
+Two relevant downloads:
+- **`ECHO_EXPORTER`** (137 cols) — one row per facility across all programs, holds
+the compliance signals. Column dict: `echo_exporter_columns_*.xlsx`.
+- **`rcra_downloads.zip`** — 6 RCRA-specific CSVs: `RCRA_FACILITIES.csv` (15 cols),
+`RCRA_VIOLATIONS.csv`, `RCRA_EVALUATIONS.csv`, `RCRA_ENFORCEMENTS.csv`,
+`RCRA_NAICS.csv`, `RCRA_VIOSNC_HISTORY.csv`.

-**Email in file:** Largely absent. Facility/owner contact name, phone, mailing
-address present.
+**Email in file:** ❌ **VERIFIED — no email anywhere in ECHO bulk.**
+`RCRA_FACILITIES.csv` has only: `ID_NUMBER, FACILITY_NAME, ACTIVITY_LOCATION,
+FULL_ENFORCEMENT, HREPORT_UNIVERSE_RECORD, STREET_ADDRESS, CITY_NAME, STATE_CODE,
+ZIP_CODE, LATITUDE83, LONGITUDE83, FED_WASTE_GENERATOR, TRANSPORTER, ACTIVE_SITE,
+OPERATING_TSDF`. **No contact name, no phone, no email** in ECHO RCRA. Owner/
+operator contact NAME + PHONE (still no email) exists only in the deeper RCRAInfo
+handler download (`rcrapublic.epa.gov`), where a PHONE field is present.
+So channel = phone (from RCRAInfo) + mail + email-append. Not email-native.

-### Detectable from the data
+### Verified ECHO_EXPORTER RCRA signal columns
+`RCRA_FLAG`, `RCRA_IDS`, `RCRA_PERMIT_TYPES`, `RCRA_NAICS`,
+`RCRA_INSPECTION_COUNT`, `RCRA_DAYS_LAST_EVALUATION`, `RCRA_INFORMAL_COUNT`,
+`RCRA_FORMAL_ACTION_COUNT`, `RCRA_DATE_LAST_FORMAL_ACTION`, `RCRA_PENALTIES`,
+`RCRA_LAST_PENALTY_DATE`, `RCRA_LAST_PENALTY_AMT`, `RCRA_QTRS_WITH_NC`,
+`RCRA_COMPLIANCE_STATUS`, `RCRA_SNC_FLAG`, `RCRA_3YR_COMPL_QTRS_HISTORY`. Plus
+facility-level: `FAC_DATE_LAST_INSPECTION`, `FAC_SNC_FLG`, `FAC_COMPLIANCE_STATUS`.
+
+### Detectable from the data (verified)
 | Signal | Field(s) | Obligation | Service |
 |---|---|---|---|
-| Generator status LQG/SQG/VSQG | handler classification | Biennial report + manifest + training | Generator program |
-| Biennial report not filed | RCRAInfo biennial flag | LQG Biennial Report (odd yrs, by Mar 1) | Biennial filing |
-| Open/current violation | ECHO CurrViolation/history | Return-to-compliance | Violation remediation |
-| SNC / HPV flag | ECHO SNC/SVQ flags | High enforcement priority | Audit prep + corrective |
-| Old inspection + LQG | last inspection date | Overdue inspection risk | Self-audit |
-| Permit expired/expiring | permit status/expiration | TSDF permit renewal | Permit renewal |
-| Stale SQG re-notification | notification date | SQG re-notify (~4 yrs, state-dependent) | Re-notification |
-| NAICS implies waste, no RCRA ID | FRS NAICS w/o RCRA link | Should be registered as generator | Generator registration |
-| EPCRA/Tier II non-filer | facility + chemical thresholds | Tier II annual report (by Mar 1) | Tier II / SPCC filing |
+| Generator status (LQG/SQG/VSQG) | `FED_WASTE_GENERATOR` (1/2/3/N), `RCRA_PERMIT_TYPES` | Biennial report + manifest + training | Generator program |
+| Open/current violation | `RCRA_COMPLIANCE_STATUS`, `RCRA_QTRS_WITH_NC` | Return-to-compliance | Violation remediation |
+| SNC flag | `RCRA_SNC_FLAG`, `FAC_SNC_FLG` | High enforcement priority | Audit prep + corrective |
+| Old/never evaluated + LQG | `RCRA_DAYS_LAST_EVALUATION`, `FAC_DATE_LAST_INSPECTION` | Overdue inspection risk | Self-audit |
+| Recent penalty / formal action | `RCRA_PENALTIES`, `RCRA_DATE_LAST_FORMAL_ACTION` | Active enforcement | Remediation/defense |
+| TSDF without active permit | `OPERATING_TSDF`, `RCRA_PERMIT_TYPES` | TSDF permit renewal | Permit renewal |
+| NAICS implies waste, no RCRA ID | `RCRA_NAICS` / FRS NAICS w/o `RCRA_FLAG` | Should be registered as generator | Generator registration |
+| Cross-program: RCRA + TRI reporter | `RCRA_FLAG` + `TRI_FLAG` | EPCRA/Tier II overlap | Tier II / SPCC filing |
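
As a rough illustration of the ECHO_EXPORTER rows in the table above, the sketch below derives three of the signals from columns named in this doc. The value encodings are assumptions, not verified here: flag columns are taken to hold `Y`/`N`, `RCRA_DAYS_LAST_EVALUATION` is taken to be a day count that is blank when a facility was never evaluated, and the five-year threshold is arbitrary. It does not apply the LQG filter, since `FED_WASTE_GENERATOR` lives in `RCRA_FACILITIES.csv`. `rcra_signals` is a hypothetical helper name.

```python
import csv
import io


def to_int(value):
    """Return the cell as an int, or None for blank or non-numeric cells."""
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        return None


def rcra_signals(row, overdue_days=5 * 365):
    """Return signal names for one ECHO_EXPORTER row (a dict keyed by column name)."""
    if row.get("RCRA_FLAG") != "Y":
        return []  # not an RCRA facility, nothing to report
    signals = []
    if row.get("RCRA_SNC_FLAG") == "Y":
        signals.append("snc")
    if to_int(row.get("RCRA_QTRS_WITH_NC")):
        signals.append("quarters_in_noncompliance")
    days = to_int(row.get("RCRA_DAYS_LAST_EVALUATION"))
    if days is None or days > overdue_days:
        signals.append("old_or_never_evaluated")
    return signals


# In-memory stand-in for the bulk file; the RCRA_IDS values are made up.
sample = io.StringIO(
    "RCRA_IDS,RCRA_FLAG,RCRA_SNC_FLAG,RCRA_QTRS_WITH_NC,RCRA_DAYS_LAST_EVALUATION\n"
    "EXAMPLE0001,Y,Y,3,2500\n"
    "EXAMPLE0002,Y,N,0,\n"
    "EXAMPLE0003,N,N,0,100\n"
)
results = {r["RCRA_IDS"]: rcra_signals(r) for r in csv.DictReader(sample)}
print(results)
```

Reading by header name with `csv.DictReader` rather than by position should keep the sketch working if the 137-column layout is reordered.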

-**Inferable only:** SPCC plan existence, actual chemical inventory, contact email.
+**Inferable only (not in file):** biennial-report-not-filed status (need RCRAInfo
+BR module, not in ECHO bulk), SPCC plan existence, actual chemical inventory,
+contact email. (Earlier "biennial flag" claim corrected — ECHO bulk does not
+expose a clean biennial-filed flag.)

-**Cross-join opportunity:** ECHO ⨝ TRI ⨝ FRS NAICS to find facilities that
-should be reporting but aren't.
+**Cross-join opportunity:** ECHO_EXPORTER `RCRA_FLAG` + `TRI_FLAG` + `FAC_NAICS_CODES`
+to find facilities that should be reporting but aren't.
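
One way the "should be reporting but aren't" filter could be sketched: keep rows whose `FAC_NAICS_CODES` include an industry of interest but whose `RCRA_FLAG` is not set. Everything beyond the three column names is an assumption: the NAICS prefix watch list is purely hypothetical, the multi-value separator is guessed to be spaces or commas, and flags are taken to be `Y`/`N`. `TRI_FLAG` is left on each row so a caller can prioritize TRI reporters.

```python
import re

# Hypothetical watch list of NAICS prefixes; substitute your own.
WASTE_NAICS_PREFIXES = ("325", "332")


def naics_codes(cell):
    """Split a FAC_NAICS_CODES cell into codes (separator assumed: spaces or commas)."""
    return [code for code in re.split(r"[\s,]+", cell or "") if code]


def unregistered_candidates(rows, prefixes=WASTE_NAICS_PREFIXES):
    """Yield rows whose NAICS codes match a prefix but whose RCRA_FLAG is not 'Y'."""
    prefixes = tuple(prefixes)  # str.startswith needs a tuple, not a list
    for row in rows:
        if row.get("RCRA_FLAG") == "Y":
            continue  # already has an RCRA record
        codes = naics_codes(row.get("FAC_NAICS_CODES"))
        if any(code.startswith(prefixes) for code in codes):
            yield row


# Illustrative rows shaped like csv.DictReader output.
facilities = [
    {"RCRA_FLAG": "N", "TRI_FLAG": "Y", "FAC_NAICS_CODES": "325199 424690"},
    {"RCRA_FLAG": "Y", "TRI_FLAG": "Y", "FAC_NAICS_CODES": "325199"},
    {"RCRA_FLAG": "N", "TRI_FLAG": "N", "FAC_NAICS_CODES": "722511"},
]
candidates = list(unregistered_candidates(facilities))
print(len(candidates))
```

A match here is only a lead: an industry code says nothing about actual waste volumes, so each candidate still needs manual review.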

 ---
