# NOTE: logs go to /opt/performancewest/logs/ (deploy-owned). The deploy user
# cannot write /var/log, so a /var/log redirect makes cron silently fail before
# the command runs. Ensure /opt/performancewest/logs exists + is deploy-owned.
#
# Healthcare data refresh: re-check every emailable NPI against the live
# government sources (CMS Revalidation list, OIG LEIE) + MX re-classification
# (Google-host detection) so warmup sends never go stale. Runs Mon/Wed/Fri 06:00
# Central, ~1h before the 07:00 weekday send. Mon/Wed/Fri (vs weekly) shrinks the
# CMS data-lag window to ~2-3 days, so a provider who just completed their
# revalidation stops being targeted faster (fewer "already done" replies).
# Takes ~8 min. SAM is opt-in (--sam-pages); SAM exclusions rarely carry an NPI,
# so OIG LEIE is the NPI-bearing exclusion source. Pipeline:
#   1. hc_data_refresh.py -- re-verify NPIs vs CMS/OIG + MX reclassify
#   2. download CMS revalidation_base.csv (institutional revalidation dates)
#   3. enrich_institutional_revalidation.py -- merge reval dates into the
#      institutional CSV consumed by the pw-hc-nppes builder
#   4. build_healthcare_campaigns_cron.py --prune-only -- evict
#      newly-Google-hosted + suppressed subscribers from the warmup lists
0 6 * * 1,3,5 deploy cd /opt/performancewest && python3 -u scripts/hc_data_refresh.py >> /opt/performancewest/logs/pw-hc-refresh.log 2>&1 && curl -s "https://data.cms.gov/sites/default/files/2026-05/96484587-20ec-4070-a4de-cd7de3ec0093/revalidation_base.csv" -o data/npi_build/revalidation_base.csv 2>>/opt/performancewest/logs/pw-hc-refresh.log && python3 -u scripts/enrich_institutional_revalidation.py data/hc_nppes_institutional_verified.csv data/npi_build/revalidation_base.csv data/hc_nppes_institutional_enriched.csv >> /opt/performancewest/logs/pw-hc-refresh.log 2>&1 && python3 -u scripts/build_healthcare_campaigns_cron.py --prune-only >> /opt/performancewest/logs/pw-hc-refresh.log 2>&1