The FMCSA census was a one-time snapshot (last loaded ~May 30) with NO refresh
timer -- carriers newly falling out of MCS-150/UCR compliance were never picked
up. New scripts/workers/fmcsa_source_refresh.py orchestrates the full pipeline
(census download -> enrichment -> deficiency flag -> verify new emails ->
MX-tag new) and runs weekly via cron pw-fmcsa-refresh (Sun 09:00 UTC), codified
in the mail-pipeline Ansible role.
Idempotent + incremental: the census upsert preserves email_verified /
listmonk_sent_at / deficiency_flags, so existing carriers keep their send state
and only census fields refresh; new DOTs flow into verification then campaigns.
A carrier who refiled gets a fresh mcs150_parsed, so the builder's overdue
WHERE clause stops targeting them automatically. Verify is capped per run
(20k) so it never stalls on millions of rows.
(Healthcare already auto-catches newly-revalidation-overdue providers within
its 63k institutional pool via pw-hc-refresh Mon/Wed/Fri.)
trucking-system.md: comprehensive doc covering data pipeline, compliance
checker (14 checks), 21 service slugs, vendors, database tables, frontend
pages, order flow, and campaign segmentation (29,792 eligible).
trucking-architecture.svg: visual diagram of full system — data sources,
processing layer, database, frontend, campaigns, and order flow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>