new-site

Author	SHA1	Message	Date
justin	744f0a89cf	healthcare: bound NPPES-stale window [3,10]yr + restore verify_ok gate - Add NPPES_STALE_MAX_YEARS (default 10): a record untouched for many years is a stronger signal the practice closed/moved, and a bounce burns the warming IP. Observed institutional distribution clusters 3-7yrs with ~0 beyond 8, so 10 is a safe ceiling that mails the whole real pool while excluding any outlier ancient record. MIN stays 3 (keeps the 'out of date' claim credible). - Restore the SMTP-verification gate (verify_ok) that the shared institutional_verified selector had -- the swap to nppes_stale dropped it; we only mail inboxes we already proved live. - enrich: process the re-fetch queue STALEST-FIRST so a bounded (--limit) or --max-age refresh spends its budget on the most-overdue cache entries (and new NPIs) first, never starving them behind merely-aging ones. - Selector unit-tested (10 cases incl. window edges, verify gate, deactivated).	2026-06-20 15:28:12 -05:00
justin	9e155d214c	healthcare: cite REAL NPPES last_updated date in 'outdated' email The NPPES 'may be out of date' email previously asserted staleness with no per-record evidence (softened earlier to a generic 'periodic review required'). NPPES is fully public and every record carries basic.last_updated, so we now cite the actual government date the provider can verify on the registry. - enrich_nppes_last_updated.py: joins real basic.last_updated / enumeration_date / deactivated onto the institutional list via a cached, resumable per-NPI crawl (no batch endpoint exists). Adds nppes_last_updated, nppes_enumeration, nppes_years_stale, nppes_deactivated. - cron: new 'nppes_stale' selector mails ONLY records >= 3yrs stale (env HC_NPPES_STALE_MIN_YEARS) and excludes deactivated NPIs; empty date => no match, so we never claim staleness without the government date to back it. - template: headline + official-record card now show the real last_updated date and ~N-years-ago, sourced to npiregistry.cms.hhs.gov. - attribs + test SAMPLE expose the new fields; verified render + plaintext.	2026-06-20 15:21:15 -05:00
justin	5c3b4291e7	feat(deliverability): send bulk campaigns from dedicated subdomain send.performancewest.net Isolates bulk sending reputation onto a dedicated subdomain so the root domain stays clean for transactional/verification mail (and recovers faster). Replies still go to the root domain via Reply-To, so the customer-facing reply experience is unchanged. - build_trucking_campaigns.py: add env-overridable FROM_EMAIL (noreply@send.performancewest.net); use it for both scheduled + test sends instead of inheriting base["from_email"] from the DB base campaign. - build_healthcare_campaigns_cron.py: FROM_EMAIL -> compliance@send.performancewest.net (env-overridable). - bounce-watcher.sh / hc-bounce-watcher.sh: track the new subdomain envelope sender (keep legacy root-domain sender so the pre-cutover queue still drains; HC also tracks by hcout transport regardless of sender). Infra already live (separate, non-code): subdomain DNS (A/MX/SPF/DKIM selector=send/DMARC p=reject) on the Hestia master, OpenDKIM signs d=send.performancewest.net (verified end-to-end), egress .94/.107. Root SPF trimmed to the real IPs; pointless IP-rehab cron disabled.	2026-06-18 23:07:23 -05:00
justin	a32a3b05a0	email: add plaintext MIME part + stable Message-ID hostname Two deliverability hardening fixes from the email audit: 1. Plaintext (altbody): all campaigns were HTML-only. Listmonk only emits multipart/alternative when altbody is set, and HTML-only bulk mail is a spam-score signal. New scripts/_email_plaintext.py renders a readable text/plain part from the HTML body (dependency-free; preserves Listmonk {{ .Subscriber }}/{{ UnsubscribeURL }} template tags, turns links into 'text (url)'). Wired into the trucking builder (and thus UCR + IFTA, which reuse create_and_schedule_campaign) and the healthcare builder. 2. Stable container hostname: Listmonk derived its Message-ID from the random docker container id -> @localhost.localdomain (spam-score signal). Pin both listmonk + listmonk-hc hostname to perfwest.performancewest.net, matching Listmonk's SMTP hello_hostname. Part of the email-deliverability incident hardening.	2026-06-17 20:09:02 -05:00
justin	2caab6aa69	hc: warmup must run DAILY for the full 21-day ramp (not weekdays-only) The HC warmup crons were '* * 1-5' (Mon-Fri), silently skipping weekends -- but a proper warmup needs CONTINUOUS daily volume for 21 days (mailbox providers reward consistency; gaps stall reputation). The Jun 14 'HC 0 sent' alert was just a skipped Sunday, but the weekend skips also broke ramp continuity. - pw-hc-campaign + pw-hc-nppes: '* * 1-5' -> '* * *' (daily), vendored + applied live. - Re-aligned the warmup start stamp from calendar-day 9 to send-day 5 so the volume ramp matches reputation actually built (it had skipped ~4 weekend days, running the ramp ahead of real history). - Fixed the stale 'Mon-Fri only' comment in daily_slice(). - Vendored nppes cron now carries the enriched-CSV + 4-segment config.	2026-06-14 21:02:08 -05:00
justin	b73edadb89	hc: unlock the full 62k verified institutional pool for broad offers The OIG-screening + NPPES-update segments were effectively limited to ~1,437 providers because the warmup 'any' selector excluded not-on-reval-list rows as a deliverability proxy -- but that excludes almost the ENTIRE institutional list (org NPIs aren't individual Medicare enrollees). Since we already SMTP-verified all 63k inboxes, add an 'institutional_verified' selector that trusts our own verification instead of reval-list presence. Result: OIG + NPPES-update now address 62,422 (43x more), giving multiple broad offers to test engagement on. - enrich_institutional_revalidation.py: fast local join of the institutional list to the CMS Revalidation Due Date List bulk file (revalidation_base.csv) by NPI -> adds reval_due_date/days_overdue/reval_status. ~1,437 are genuine Medicare enrollees (197 overdue / 164 due-soon) -> flagship $599 reval pitch. - npi_reactivation stays on leie_or_deactivated (only REAL deactivations -- no false 'your NPI is deactivated' claims to active orgs).	2026-06-14 01:07:40 -05:00
justin	3f7ecf9d13	hc: persist mx_provider on imported subscribers (per-operator audit) So we can verify/analyze the per-MX-operator throttle distribution from the listmonk DB after import (and re-throttle future segment membership).	2026-06-12 22:28:49 -05:00
justin	5237c81385	hc: per-MX-operator warmup throttle (spread load across receiving systems) Reputation is tracked per receiving mail operator, not per recipient domain, so the daily warmup slice is now distributed across MX operators with per-operator daily caps (ramping with the warmup day): Microsoft/Google/Proofpoint/etc. capped individually, long-tail operators each get a generous default. This lets total daily volume be much higher than a flat cap without hammering any single system. mx_throttled() respects the mx_provider column the verifier now writes; falls back to flat slicing if absent.	2026-06-12 22:09:29 -05:00
justin	6c8c823e5e	hc: refresh attribs when cross-adding an existing subscriber to a segment add_subscriber only attached an already-existing subscriber to the new list without updating attribs, so the due-soon template's days_until merge field was blank for providers already imported by another segment. Now PUT the merged attribs (existing + this segment's npi/practice/due-date/days_until) before adding to the list.	2026-06-12 19:37:01 -05:00
justin	c8c9a04c1d	hc: add 'revalidation due soon' warmup segment (proactive, grows supply) The HC warmup pool is supply-constrained (~400 verified providers, all fed by the same narrow 'revalidation 1-90 days OVERDUE' slice). This adds a mirror-image proactive segment that targets providers whose Medicare revalidation is UPCOMING within the next 1-90 days, drawn from the same CMS Revalidation Due Date List -- no new data source needed. 'Handle it before your deadline' is a strong pitch and roughly doubles the deliverable pool. - New selector reval_due_soon (status=upcoming, days_until in [HC_DUE_SOON_MIN, HC_DUE_SOON_MAX] default 1-90). - New segment revalidation_due_soon reusing the existing /order/npi-revalidation service ($599) with template hc_revalidation_due_soon.html. - attribs_for now exposes days_until (positive days to due date). - Added to ACTIVE_SEGMENTS.	2026-06-12 19:33:49 -05:00
justin	a78d60a127	hc: auto-reply for 'already revalidated' replies + permanent suppression A lead replied with proof their Medicare revalidation was already approved (CMS data-lag: the public Revalidation Due Date List still showed them overdue weeks after approval). Two of these arrived same-day, so: - Carbonio auto-reply (deployed on co.carrierone.com): created mailbox hc-replies@ on the info@ distribution list with a Sieve that auto-acknowledges 'my revalidation is already complete' replies (tag + mark read + file into a 'Reval Completed (auto-acked)' folder + on-brand reply explaining the CMS lag). CRITICAL: info@ is the shared reply-to for ALL campaigns (healthcare, trucking, telecom), so every rule is anchored to Medicare/revalidation context -- a trucking 'MCS-150 done, this is bogus' or telecom 'RMD done' reply does NOT trigger it (tested + passing). A buyer guard ('please file / how much') also suppresses the auto-reply so a human handles the sale. Carbonio 25.x Sieve quirks documented (vacation/imap4flags/body :text all unsupported; use reply/flag/tag/body :contains). - Permanent suppression: new data/hc_suppress.txt do-not-contact list the warmup honors at import AND --prune removes from the live lists. Seeded with the two completed providers (Pangea Lab, Yakima Valley FWC); both also blocklisted in listmonk_hc and removed from lists 3 + 4.	2026-06-08 10:37:49 -05:00
justin	9cb10b18e0	feat(hc): deliverability prune -- evict newly-Google-hosted subscribers Belt-and-suspenders for the edge you flagged: a domain already in a warmup list could flip its MX to Google Workspace between weekly refreshes, after which it would hard-bounce from the cold IP. The import-time guard only catches NEW adds. - prune_holdouts(): enumerates each warmup list's subscribers, matches them against the FRESH master CSV (re-classified weekly), and removes any whose domain is now Google-hosted. DELIVERABILITY-ONLY -- it never evicts for audience reasons (an overdue provider drifting out of the 1-90 day window was a valid target when warmed; re-litigating that just wastes warmup progress). - --prune (run alongside warming) and --prune-only (prune then exit). - Wired into the weekly refresh cron as a --prune-only chained step, so MX is re-checked and holdouts removed every Monday before the weekday sends. Verified end-to-end: with no Google domains in lists it's a 0-op; injecting a simulated Google-flipped domain into the master, the prune correctly detects and (in a real run) would remove it from every list it's on.	2026-06-08 03:39:56 -05:00
justin	54b92b1f06	fix(hc deliverability): MX-based Google-host exclusion during warmup Found via live mail.log: Google-Workspace-hosted PRACTICE domains (custom domains whose MX is aspmx.l.google.com, e.g. moosepharmacy.com, hc2kidney.com) were getting hard 550-5.7.1 rejects from Google's cold-IP bulk filter -- exactly the bounces that wreck a warming IP's reputation. The original google/non-google split classified by the email's domain STRING, which can't see that a custom domain silently uses Google Workspace; only an MX lookup reveals it (33% of our domains, 228/689, are Google-hosted this way). - hc_data_refresh.py: new MX classification (one lookup per unique domain via dnspython, cached) writes an mx_provider=google/other flag into the master and propagates it into the channel CSVs (auto-adding the column). --skip-mx for a fast status-only run. - build_healthcare_campaigns_cron.py: warm_segment now drops mx_provider=google rows during warmup (HC_SKIP_GOOGLE=1 default; set 0 once IPs are warm). This is defense-in-depth -- correct regardless of which CSV the cron is pointed at. Verified: today's sends (nongoogle CSV) had 0 Google bounces; the guard cuts the Google-containing week1_verified cohort's revalidation candidates 82->8.	2026-06-08 03:32:12 -05:00
justin	feb677f6ce	fix(hc warmup): only mail slightly-overdue providers (deliverability) Mailing heavily-overdue NPIs (months/years past due) risks hitting practices that have closed, merged, or abandoned the inbox -> hard bounces, which are the fastest way to wreck a warming IP's reputation. The warmup now restricts the reval_overdue selector to an inclusive [HC_OVERDUE_MIN, HC_OVERDUE_MAX] window (default 1-90 days) and the OIG 'any' selector likewise excludes heavily-overdue and dropped-off-list rows. On the current cohort this trims the overdue audience 178->96 and the OIG audience 399->317, holding out the stale long tail (181-365d + 366d+). upcoming/active providers are unaffected.	2026-06-08 03:27:22 -05:00
justin	c79a7715e1	fix(hc): bugs found in self-audit of the new refresh + warmup + templates Refresh (hc_data_refresh.py): - CRITICAL: drop optout_ending from REFRESHED_FIELDS -- the refresh never computes it, so propagating it blanked the channel CSVs and would starve the compliance_bundle segment (whose selector IS optout_ending). - MAJOR: only rewrite leie_excluded when OIG was actually pulled (guard was 'not skip_oig OR not skip_sam', so a --skip-oig run blanked all exclusion flags). Also write 'Y' (matching the original list builder) not '1'. - Use 'no_reval_flag' (the original vocabulary) instead of 'not_on_list' when an NPI drops off the reval list, and clear reval_due_date too. - Throttle politeness: move time.sleep(0.05) above the early-continue paths so EVERY CMS request is spaced, not just the minority that are on the list. - Guard blank-NPI rows (leave their status untouched instead of mislabeling). - Master write preserves any columns beyond HEADER (no silent column drop). Warmup cron (build_healthcare_campaigns_cron.py): - Fix the daily-slice split: it summed to less than the budget (dropped ~2/day) and could OVERSHOOT on tiny totals (each 'other' floored to >=1). Now uses divmod for an even remainder and reclaims rounding onto the lead, so sum(per_seg) == total_slice exactly for every input (verified 0,1,2,7,100,300). Templates: the non-revalidation emails rendered {{ .Subscriber.Attribs.detail }} (a reval due date) under a 'Practice'/'Status'/'Record' label -- a wrong/confusing personalization on a live send (esp. OIG, selector 'any'). All four now show the practice name; 'detail' is retired from rendering (revalidation uses reval_due_date/days_overdue directly).	2026-06-08 03:23:47 -05:00
justin	4f455475c0	hc: weekly data-refresh pipeline + multi-segment warmup cron Two gaps closed: 1. hc_data_refresh.py (NEW): weekly source-data refresh. Re-checks every emailable NPI against the LIVE government sources so sends never go stale: - CMS Revalidation Due Date List (data.cms.gov per-NPI API; handles both ISO and US date formats, normalizes to MM/DD/YYYY). - OIG LEIE full CSV download (the NPI-bearing exclusion source). - SAM.gov v4 exclusions (key in .secrets/sam-api-key) -- OFF by default since SAM exclusions rarely carry an NPI and the full set is ~167k records; it's opt-in via --sam-pages. SAM's real value is the live per-name screening service, not a bulk NPI join. Writes the master CSV atomically (temp+rename). A provider who has since revalidated flips overdue->upcoming/not_on_list, so we stop nagging them. 2. build_healthcare_campaigns_cron.py: was revalidation-only (one hardcoded list/campaign/CSV/template). Now multi-segment: imports SEGMENTS from the single-source-of-truth registry, warms ALL five programs in parallel, each with its own list, dated campaign, and per-segment import-state file (so dedup is per-segment). A per-segment selector maps master-CSV rows to the right program (reval_overdue / reval_upcoming / leie_or_deactivated / optout_ending / any). Daily ramp slice is split across segments (revalidation leads at 50%, rest share the remainder) so every program collects engagement data while the IPs warm. Back-compat: seeds revalidation import-state from the legacy hc_imported_emails.txt once.	2026-06-08 03:06:29 -05:00
justin	483f185861	feat(healthcare): prove revalidation is real via official CMS data + self-verify Skepticism ("is this even real?") is the top objection. The data IS accurate (verified our subscribers' NPIs match the official CMS Revalidation Due Date List exactly), so this is a credibility-presentation fix: 1. Email: replace the plain detail row with an "Official record - CMS Medicare Revalidation Due Date List" card (NPI, legal name, due date, days overdue) plus a "Verify on CMS.gov" button. Clearly labeled as our presentation of public CMS data, not a CMS screenshot (no impersonation). 2. API: npi/lookup now pulls the revalidation due date LIVE from the public CMS dataset (data.cms.gov) instead of the empty local table, and returns a revalidation{ due_date, source, cms_legal_name, verify_url } proof object. 3. Tool: /tools/npi-compliance-check shows a live "official record" card with a self-verify link when CMS returns a due date. Builder now stores reval_due_date/days_overdue as separate attribs for the card (existing 194 subscribers backfilled from their detail string).	2026-06-07 23:54:01 -05:00
justin	4233c90a4f	hc email: reframe value-add to 'No 2FA. No government portals.' (we have a portal; the pain is CMS 2FA/identity-proofing); cron creates fresh dated campaign when prior is finished; add hc bounce watcher (Postfix->listmonk-hc webhook, hard/complaint->blocklist)	2026-06-06 16:47:12 -05:00
justin	95698852ce	healthcare warmup: gate Google/Workspace domains out of week 1 (they hard-reject cold IPs 550-5.7.1); send 501 non-Google practice domains first, defer 222 Google to week 2-3; cron uses hc_warmup_nongoogle.csv	2026-06-06 04:02:00 -05:00
justin	2bc86268f7	healthcare: HC warmup campaign cron (Mon-Fri 7AM Central) - imports overdue-first verified slice into listmonk-hc + runs Medicare-revalidation campaign via hc HOT stream; rate-throttled by pw-hc-rampcap	2026-06-06 03:57:08 -05:00
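
The nppes_stale selector described in 9e155d214c and 744f0a89cf can be sketched roughly as below. This is an illustration only, not the code from the repository: the function name, the helper `_truthy`, and the accepted flag spellings are assumptions. The column names (verify_ok, nppes_deactivated, nppes_years_stale) and the default [3, 10] year window come from the commit messages.

```python
def _truthy(value):
    """Hypothetical helper: treat common CSV flag spellings as true."""
    return str(value or "").strip().lower() in {"1", "y", "yes", "true"}


def nppes_stale(row, min_years=3.0, max_years=10.0):
    """Return True if a master-CSV row qualifies for the 'outdated record' email.

    Assumed rules, taken from the commit messages:
      - the inbox must already be SMTP-verified (verify_ok gate)
      - deactivated NPIs are excluded
      - an empty or unparseable staleness value never matches
      - staleness must fall in the inclusive [min_years, max_years] window
    """
    if not _truthy(row.get("verify_ok")):
        return False
    if _truthy(row.get("nppes_deactivated")):
        return False
    raw = str(row.get("nppes_years_stale") or "").strip()
    if not raw:
        return False  # no government date, so no staleness claim
    try:
        years = float(raw)
    except ValueError:
        return False
    return min_years <= years <= max_years
```

In the real cron the bounds would presumably be read from the environment (HC_NPPES_STALE_MIN_YEARS is named in 9e155d214c); here they are plain keyword arguments to keep the sketch self-contained.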

20 commits
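
The daily-slice fix in c79a7715e1 (lead segment at 50%, divmod for the rest, rounding reclaimed onto the lead) could look something like the following. This is one possible reading of the commit message, not the repository's implementation: `split_daily_slice` and the segment names other than revalidation, npi_reactivation and compliance_bundle are placeholders.

```python
def split_daily_slice(total, segments, lead_share=0.5):
    """Split a daily send budget across segments so the parts sum to total.

    The first segment is the lead and gets lead_share of the budget
    (rounded down). The rest is divided evenly among the other segments
    with divmod, and any remainder is handed back to the lead, so no
    sends are dropped and tiny totals never overshoot.
    """
    if total < 0:
        raise ValueError("total must be non-negative")
    if not segments:
        return {}
    lead, others = segments[0], list(segments[1:])
    if not others:
        return {lead: total}
    lead_n = int(total * lead_share)
    base, extra = divmod(total - lead_n, len(others))
    per_seg = {name: base for name in others}
    per_seg[lead] = lead_n + extra  # remainder reclaimed onto the lead
    return per_seg


# Placeholder segment names for illustration only.
SEGMENTS = ["revalidation", "oig_screening", "nppes_update",
            "npi_reactivation", "compliance_bundle"]

for budget in (0, 1, 2, 7, 100, 300):
    parts = split_daily_slice(budget, SEGMENTS)
    assert sum(parts.values()) == budget
```

Unlike a version that floors every non-lead segment to at least 1, this one lets small segments receive 0 on tiny budgets, which is what keeps the sum exact for the inputs the commit lists (0, 1, 2, 7, 100, 300).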