trucking: weekly FMCSA source refresh so new non-compliant carriers are caught
The FMCSA census was a one-time snapshot (last loaded ~May 30) with NO refresh timer -- carriers newly falling out of MCS-150/UCR compliance were never picked up.

New scripts/workers/fmcsa_source_refresh.py orchestrates the full pipeline (census download -> enrichment -> deficiency flag -> verify new emails -> MX-tag new) and runs weekly via cron pw-fmcsa-refresh (Sun 09:00 UTC), codified in the mail-pipeline Ansible role.

Idempotent + incremental: the census upsert preserves email_verified / listmonk_sent_at / deficiency_flags, so existing carriers keep their send state and only census fields refresh; new DOTs flow into verification then campaigns. A carrier that refiled gets a fresh mcs150_parsed, so the builder's overdue WHERE clause stops targeting it automatically. Verify is capped per run (20k) so it never stalls on millions of rows.

(Healthcare already auto-catches newly-revalidation-overdue providers within its 63k institutional pool via pw-hc-refresh Mon/Wed/Fri.)
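The state-preserving upsert described above can be sketched as follows. This is a minimal illustration, not the code from this commit: it uses an in-memory SQLite database (3.24+ for ON CONFLICT), and the table name `carriers`, the column subset, the `legal_name` field, and the `overdue_dots` query are assumptions made for the example. Only `email_verified`, `listmonk_sent_at`, `deficiency_flags` and `mcs150_parsed` are names taken from the commit message.

```python
import sqlite3

# Hypothetical schema: the real table layout is not shown in this commit.
SCHEMA = """
CREATE TABLE carriers (
    dot_number       INTEGER PRIMARY KEY,
    legal_name       TEXT,     -- census field (assumed for illustration)
    mcs150_parsed    TEXT,     -- census field, ISO date of last MCS-150 filing
    email_verified   INTEGER,  -- send state, must survive a refresh
    listmonk_sent_at TEXT,     -- send state, must survive a refresh
    deficiency_flags TEXT      -- send state, must survive a refresh
)
"""

# Only census columns are listed in DO UPDATE SET, so the send-state
# columns of an existing row are left untouched on conflict.
UPSERT = """
INSERT INTO carriers (dot_number, legal_name, mcs150_parsed)
VALUES (:dot_number, :legal_name, :mcs150_parsed)
ON CONFLICT(dot_number) DO UPDATE SET
    legal_name    = excluded.legal_name,
    mcs150_parsed = excluded.mcs150_parsed
"""


def upsert_census(conn, rows):
    """Insert new DOTs and refresh census fields of existing ones."""
    with conn:  # commit on success, roll back on error
        conn.executemany(UPSERT, rows)


def overdue_dots(conn, cutoff):
    """Hypothetical stand-in for the builder's overdue WHERE clause."""
    cur = conn.execute(
        "SELECT dot_number FROM carriers "
        "WHERE mcs150_parsed < ? ORDER BY dot_number",
        (cutoff,),
    )
    return [row[0] for row in cur]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    # An existing carrier that was already verified and mailed.
    conn.execute(
        "INSERT INTO carriers VALUES "
        "(1001, 'Old Name LLC', '2021-03-01', 1, '2024-05-30', 'mcs150_overdue')"
    )
    refresh = [
        {"dot_number": 1001, "legal_name": "Old Name LLC", "mcs150_parsed": "2025-01-15"},
        {"dot_number": 2002, "legal_name": "New Carrier Inc", "mcs150_parsed": "2020-06-01"},
    ]
    upsert_census(conn, refresh)
    upsert_census(conn, refresh)  # running twice changes nothing further
    print(overdue_dots(conn, "2023-01-01"))
```

In this sketch the refiled carrier (1001) drops out of the overdue query because its `mcs150_parsed` moved forward, while its send-state columns are unchanged; the new DOT (2002) arrives with NULL send state and would be picked up by verification.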
parent 4171f48736
commit 899b880e7f
4 changed files with 144 additions and 0 deletions
@@ -112,6 +112,7 @@
- pw-hc-campaign
- pw-hc-nppes
- pw-hc-refresh
- pw-fmcsa-refresh
- pw-mta-warmup
- pw-listmonk-rampcap
- pw-hc-rampcap
12
infra/cron/pw-fmcsa-refresh
Normal file
@@ -0,0 +1,12 @@
# Weekly FMCSA trucking-source refresh. Re-ingests the full FMCSA motor-carrier
# census from the live Socrata API and re-runs enrichment -> flagging ->
# verification -> MX-tagging, so the daily campaign builders automatically catch
# carriers who NEWLY fell out of compliance (e.g. an MCS-150 update that just
# lapsed) and drop carriers who have since refiled. Idempotent + incremental:
# the upsert preserves email_verified / listmonk_sent_at / deficiency_flags, so
# existing carriers keep their send state and only census fields refresh; new
# DOTs flow into verification then campaigns. The original census load was a
# one-time snapshot with no refresh timer -- this closes that gap. Runs Sunday
# 09:00 UTC (off-peak, well before Monday's 08:00 UTC trucking builder).
# Takes a while (full ~2M-row download + verify batch), so it runs off-peak.
0 9 * * 0 deploy cd /opt/performancewest && docker compose exec -T workers python3 -m scripts.workers.fmcsa_source_refresh >> /opt/performancewest/logs/pw-fmcsa-refresh.log 2>&1
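The module invoked by this cron line is not shown in this view. As a rough sketch of the shape the commit message describes (ordered stages plus a per-run verification cap), something like the following would fit. The function names `select_verify_batch` and `run_pipeline` are hypothetical, and aborting on the first failing stage is an assumption of this sketch rather than confirmed behaviour; only the 20k cap and the stage order come from the commit message.

```python
from itertools import islice
from typing import Callable, Iterable, List, Tuple

VERIFY_CAP = 20_000  # per-run cap stated in the commit message


def select_verify_batch(pending: Iterable[int], cap: int = VERIFY_CAP) -> List[int]:
    """Take at most `cap` unverified DOT numbers; the rest wait for a later run."""
    return list(islice(pending, cap))


def run_pipeline(stages: List[Tuple[str, Callable[[], None]]]) -> List[str]:
    """Run stages in order and return the names that completed.

    Assumption: an exception in any stage propagates and aborts the run,
    so later stages never operate on a half-refreshed census.
    """
    completed = []
    for name, stage in stages:
        stage()
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder stages standing in for the real steps named in the commit.
    stages = [
        ("census download", lambda: None),
        ("enrichment", lambda: None),
        ("deficiency flag", lambda: None),
        ("verify new emails", lambda: select_verify_batch(range(2_000_000))),
        ("MX-tag new", lambda: None),
    ]
    print(run_pipeline(stages))
```

Using `islice` keeps the cap cheap even when the pending set is a lazy cursor over millions of rows, since nothing past the first 20k items is consumed.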