new-site

Author	SHA1	Message	Date
justin	8e5590b492	mail: DMARC aggregate-report parser + dedicated dmarc@ mailbox ingestion Tool 2 of the deliverability monitoring pair (Tool 1 = mail_reputation_monitor). DMARC rua reports from dozens of operators (Google, Yahoo, Comcast, Cox, Bell, Mimecast, Cisco ESA, GMX, mail.com, ...) were landing in ops@ (dmarc@ was a DL), burying real mail and never parsed. Now ingested + queryable: - dmarc@performancewest.net converted DL -> dedicated Carbonio mailbox; isolated IMAP creds in server .env, surfaced to workers in docker-compose.yml (mirrors OPS_IMAP_*). 29 historical reports moved ops@ -> dmarc@ via IMAP. - scripts/dmarc_report_parser.py: IMAP fetch unseen -> decompress .gz/.zip/.xml (namespace-agnostic: classic + urn:ietf:params:xml:ns:dmarc-2.0 GMX/mail.com) -> parse aggregate XML -> upsert dmarc_report (keyed (org_name,report_id), no-op on re-parse) + dmarc_record per source IP. dmarc_pass = dkim_aligned OR spf_aligned. Marks \Seen. --dry-run/--all/--alert (7d per-IP summary + Telegram if one of OUR IPs <95% pass, or EXTERNAL IP sends >=20 failing msgs as us = spoofing under p=reject). psycopg2 imported lazily so --dry-run runs without the driver. - api/migrations/102_dmarc_aggregate.sql: dmarc_report + dmarc_record tables. - infra/cron/pw-dmarc-parser: 06:20 UTC daily --alert (after reputation, before scrub). - docs/deliverability.md: DMARC section DONE; query examples. Verified: dry-run --all parses all 28 reports (1 non-report test probe), 0 unknown after the namespace fix.	2026-06-19 08:50:20 -05:00
justin	b45332b5f7	infra(cron): nightly mail-reputation snapshot (pw-mail-reputation). Runs mail_reputation_monitor --alert at 06:10 UTC, piping the day's postfix log (sudo cat, same pattern as pw-warmup-tg-alert) into the DB-connected workers container. Builds the daily SNDS-equivalent reputation trend and Telegram-alerts on operator regressions. Installed to /etc/cron.d/pw-mail-reputation.	2026-06-19 08:38:35 -05:00
justin	72c69a05c9	infra(cron): daily Listmonk consumer-domain reconciliation (pw-listmonk-scrub). Runs scrub_listmonk_consumer against both listmonk and listmonk_hc at 06:30 UTC, before the campaign builders, so any ENABLED subscriber matching the authoritative exclusion list is blocklisted retroactively. Keeps list-based campaigns (FCC Direct Contacts, CRTC/USF, etc.) from leaking onto consumer mailboxes after a new domain (e.g. Apple/iCloud) is added to the exclusion list. Installed to /etc/cron.d/pw-listmonk-scrub on the host.	2026-06-19 00:00:46 -05:00
justin	899b880e7f	trucking: weekly FMCSA source refresh so new non-compliant carriers are caught The FMCSA census was a one-time snapshot (last loaded ~May 30) with NO refresh timer -- carriers newly falling out of MCS-150/UCR compliance were never picked up. New scripts/workers/fmcsa_source_refresh.py orchestrates the full pipeline (census download -> enrichment -> deficiency flag -> verify new emails -> MX-tag new) and runs weekly via cron pw-fmcsa-refresh (Sun 09:00 UTC), codified in the mail-pipeline Ansible role. Idempotent + incremental: the census upsert preserves email_verified / listmonk_sent_at / deficiency_flags, so existing carriers keep their send state and only census fields refresh; new DOTs flow into verification then campaigns. A carrier who refiled gets a fresh mcs150_parsed, so the builder's overdue WHERE clause stops targeting them automatically. Verify is capped per run (20k) so it never stalls on millions of rows. (Healthcare already auto-catches newly-revalidation-overdue providers within its 63k institutional pool via pw-hc-refresh Mon/Wed/Fri.)	2026-06-17 20:44:54 -05:00
justin	4dc5690666	infra: codify the email-campaign pipeline in Ansible (new mail-pipeline role). The entire outbound campaign pipeline lived ONLY on the host and was never in IaC -- a fresh rebuild would have silently shipped NO campaigns, NO IP warmup/ramp, and NO bounce processing. New mail-pipeline role + deploy-mail-pipeline.yml playbook deploy it from the canonical repo copies: cron.d (infra/cron/): - pw-trucking-campaign-builder, pw-ifta-campaign, pw-ucr-campaign - pw-hc-campaign, pw-hc-nppes, pw-hc-refresh - pw-mta-warmup, pw-listmonk-rampcap, pw-hc-rampcap - pw-ip-rehab, pw-warmup-tg-alert helper scripts (-> /usr/local/bin): - pw-mta-warmup, pw-listmonk-rampcap, pw-hc-rampcap, pw-warmup-tg-alert - postfix-bounce-notify.sh, postfix-hc-bounce-notify.sh, listmonk-bounce-sync.py systemd services: - pw-bounce-watcher.service (was missing from repo), pw-hc-bounce-watcher.service. Also creates the deploy-owned {{project_dir}}/logs dir (deploy can't write /var/log, so a missing dir made cron redirects fail). Added the 6 cron.d files that existed only on the host, the trucking bounce-watcher unit, and synced infra/cron/pw-hc-refresh to the live version (revalidation download + enrich steps). Role wired into site.yml after the mail (OpenDKIM) role. Part of the email-deliverability incident hardening.	2026-06-17 20:26:01 -05:00
justin	2caab6aa69	hc: warmup must run DAILY for the full 21-day ramp (not weekdays-only) The HC warmup crons were '* * 1-5' (Mon-Fri), silently skipping weekends -- but a proper warmup needs CONTINUOUS daily volume for 21 days (mailbox providers reward consistency; gaps stall reputation). The Jun 14 'HC 0 sent' alert was just a skipped Sunday, but the weekend skips also broke ramp continuity. - pw-hc-campaign + pw-hc-nppes: '* * 1-5' -> '* * *' (daily), vendored + applied live. - Re-aligned the warmup start stamp from calendar-day 9 to send-day 5 so the volume ramp matches reputation actually built (it had skipped ~4 weekend days, running the ramp ahead of real history). - Fixed the stale 'Mon-Fri only' comment in daily_slice(). - Vendored nppes cron now carries the enriched-CSV + 4-segment config.	2026-06-14 21:02:08 -05:00
justin	ff4ab262a8	hc: cron to feed NPPES institutional base (63k verified) into warmup, MX-throttled Adds /etc/cron.d/pw-hc-nppes (weekdays 07:30) that imports the verified NPPES institutional general-compliance base into the OIG screening segment, throttled per MX operator. Separate from the 07:00 reval-segment run so the two pipelines stay independent. Vendored the cron file under infra/cron/.	2026-06-12 22:11:12 -05:00
justin	25f4a7503b	warmup: IP rehab for .91-.93 so they can be reallocated The 3 IPs (mta02-04 / .91-.93) retired after the May 30-31 over-volume blast are NOT on any DNSBL (Spamhaus/Barracuda/SpamCop/SORBS all clean) and have clean PTRs + SPF/DKIM/DMARC -- the damage was provider-internal reputation, which recovers with slow clean sending. scripts/ip_rehab.py sends a tiny ramping trickle (10/IP/day -> cap 60) of genuine CAN-SPAM-compliant compliance check-in mail to clean business-domain, never-bounced recipients via dedicated heavily-throttled postfix transports rehab02/03/04 (30s/msg, bound to .91/.92/.93). Routing uses an X-PW-Rehab-IP header + header_checks FILTER to override the transport_maps randmap warmup rotation (verified: mail routes via rehab transports, status=sent). Daily cron pw-ip-rehab. After ~2-3 weeks of clean sending the IPs can be reallocated.	2026-06-09 20:27:47 -05:00
justin	9fa2c86f01	fix(warmup): HC cron logged to /var/log (deploy can't write) -> cron silently died. The HC warmup builder ran from cron at 07:00 but the >> /var/log/pw-hc-campaign.log redirect failed (deploy user cannot write /var/log), and a failed output redirect makes cron abort the command BEFORE it runs -> HC sent 0/day since the log file was removed. Route HC cron logs to /opt/performancewest/logs/ (deploy-owned) so the redirect always succeeds. Builder itself was fine (verified: imports + sends work, 0 bounces). Also removed the stale 'campaign-warmup.sh 122' root-cron line that pointed at a finished campaign + no longer existed.	2026-06-09 16:06:28 -05:00
justin	7c39a858cc	monitoring: daily warmup IP-reputation Telegram alert. End-of-day (20:00 Central) check of campaign deliverability across both sending pools (main out05-09 + healthcare hcout). Sends a Telegram alert ONLY when there is a reputation problem -- delivery below 65% or a spam/policy-block (550-5.7.1) spike above 150/day -- so healthy days stay silent. Reuses the existing TELEGRAM_BOT_TOKEN/CHAT_ID from /opt/performancewest/.env. Logs every run to /var/log/pw-warmup-healthcheck.log for history. Excludes internal/probe noise so the delivery figure reflects real external recipients.	2026-06-08 21:06:41 -05:00
justin	2156a5e05f	hc refresh: run Mon/Wed/Fri instead of weekly to shrink CMS data-lag. The 'already revalidated' replies come from the CMS data-lag window (a provider completes their revalidation but CMS's public Due Date List still shows them overdue for weeks). Running the refresh 3x/week instead of weekly shrinks that window from up to 7 days to ~2-3, so a provider who just completed stops being targeted faster. No change to the overdue window or audience size -- this is the lever that reduces stale-data complaints without losing prospects.	2026-06-08 10:53:36 -05:00
justin	9cb10b18e0	feat(hc): deliverability prune -- evict newly-Google-hosted subscribers Belt-and-suspenders for the edge you flagged: a domain already in a warmup list could flip its MX to Google Workspace between weekly refreshes, after which it would hard-bounce from the cold IP. The import-time guard only catches NEW adds. - prune_holdouts(): enumerates each warmup list's subscribers, matches them against the FRESH master CSV (re-classified weekly), and removes any whose domain is now Google-hosted. DELIVERABILITY-ONLY -- it never evicts for audience reasons (an overdue provider drifting out of the 1-90 day window was a valid target when warmed; re-litigating that just wastes warmup progress). - --prune (run alongside warming) and --prune-only (prune then exit). - Wired into the weekly refresh cron as a --prune-only chained step, so MX is re-checked and holdouts removed every Monday before the weekday sends. Verified end-to-end: with no Google domains in lists it's a 0-op; injecting a simulated Google-flipped domain into the master, the prune correctly detects and (in a real run) would remove it from every list it's on.	2026-06-08 03:39:56 -05:00
justin	feb677f6ce	fix(hc warmup): only mail slightly-overdue providers (deliverability) Mailing heavily-overdue NPIs (months/years past due) risks hitting practices that have closed, merged, or abandoned the inbox -> hard bounces, which are the fastest way to wreck a warming IP's reputation. The warmup now restricts the reval_overdue selector to an inclusive [HC_OVERDUE_MIN, HC_OVERDUE_MAX] window (default 1-90 days) and the OIG 'any' selector likewise excludes heavily-overdue and dropped-off-list rows. On the current cohort this trims the overdue audience 178->96 and the OIG audience 399->317, holding out the stale long tail (181-365d + 366d+). upcoming/active providers are unaffected.	2026-06-08 03:27:22 -05:00
justin	167c4a3847	infra/cron: multi-segment hc warmup + weekly data-refresh cron Tracks the deployed cron.d files in the repo: - pw-hc-campaign: updated comment to reflect the now multi-segment warmup (revalidation + OIG + NPPES + reactivation + bundle); command unchanged. - pw-hc-refresh (NEW): Mon 06:00 Central weekly data refresh, ~1h before the 07:00 weekday send, so every send uses fresh CMS/OIG status.	2026-06-08 03:15:47 -05:00
justin	95698852ce	healthcare warmup: gate Google/Workspace domains out of week 1 (they hard-reject cold IPs 550-5.7.1); send 501 non-Google practice domains first, defer 222 Google to week 2-3; cron uses hc_warmup_nongoogle.csv	2026-06-06 04:02:00 -05:00
justin	2bc86268f7	healthcare: HC warmup campaign cron (Mon-Fri 7AM Central) - imports overdue-first verified slice into listmonk-hc + runs Medicare-revalidation campaign via hc HOT stream; rate-throttled by pw-hc-rampcap	2026-06-06 03:57:08 -05:00

16 commits
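
The parsing step described in commit 8e5590b492 (decompress .gz/.zip/.xml, parse namespace-agnostically, set dmarc_pass = dkim_aligned OR spf_aligned) can be sketched roughly as below. This is a minimal illustration and not the contents of scripts/dmarc_report_parser.py: the function names (decompress, strip_namespaces, parse_report) and the returned dict layout are assumptions made for the example, and the IMAP fetch, database upsert, and alerting are left out. It relies on the standard aggregate-report layout, where policy_evaluated/dkim and policy_evaluated/spf carry the aligned results.

```python
import gzip
import io
import zipfile
import xml.etree.ElementTree as ET


def decompress(payload: bytes) -> bytes:
    """Return raw XML bytes from a .gz, .zip, or plain .xml attachment."""
    if payload[:2] == b"\x1f\x8b":        # gzip magic bytes
        return gzip.decompress(payload)
    if payload[:4] == b"PK\x03\x04":      # zip local-file-header magic
        with zipfile.ZipFile(io.BytesIO(payload)) as zf:
            return zf.read(zf.namelist()[0])
    return payload                        # assume plain XML


def strip_namespaces(root: ET.Element) -> ET.Element:
    """Drop '{namespace}' prefixes so classic (no namespace) and
    urn:ietf:params:xml:ns:dmarc-2.0 reports share one set of paths."""
    for el in root.iter():
        if isinstance(el.tag, str) and "}" in el.tag:
            el.tag = el.tag.split("}", 1)[1]
    return root


def parse_report(payload: bytes) -> dict:
    """Parse one aggregate report into metadata plus per-source-IP rows."""
    root = strip_namespaces(ET.fromstring(decompress(payload)))
    records = []
    for rec in root.iter("record"):
        dkim = rec.findtext("row/policy_evaluated/dkim", "").strip().lower() == "pass"
        spf = rec.findtext("row/policy_evaluated/spf", "").strip().lower() == "pass"
        records.append({
            "source_ip": rec.findtext("row/source_ip", "").strip(),
            "count": int(rec.findtext("row/count", "0")),
            "dkim_aligned": dkim,
            "spf_aligned": spf,
            # DMARC passes when either aligned mechanism passes.
            "dmarc_pass": dkim or spf,
        })
    return {
        "org_name": root.findtext("report_metadata/org_name", "").strip(),
        "report_id": root.findtext("report_metadata/report_id", "").strip(),
        "records": records,
    }
```

Detecting the container by magic bytes rather than by filename keeps the sketch independent of how each operator names its attachment. The (org_name, report_id) pair in the result matches the upsert key named in the commit message, so a re-parse of the same report would map onto the same row.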