Commit graph

303 commits

Author SHA1 Message Date
justin
386467bedf mcs150: trim FMCSA instruction pages from form templates
The official MCS-150/150B/150C PDFs ship with 8 (150/150B) or 4 (150C)
FMCSA instruction/example pages before the actual fillable form. We were
generating + faxing/submitting all of them. Trimmed the source templates
down to the FORM pages only:
  MCS-150  11 -> 3 pages (289 fields preserved)
  MCS-150B 12 -> 4 pages (349 fields preserved)
  MCS-150C  6 -> 2 pages (33 fields preserved)

The filler iterates writer.pages (no absolute index) and signature
anchors are derived dynamically via enumerate(reader.pages), so no
page-specific markup needed fixing. Removed one-off diag script.
2026-06-10 13:25:07 -05:00
justin
4447905864 workers: don't re-upload handler-returned MinIO keys as local files
handle_process_compliance_service assumed handlers return local temp
paths and re-uploaded each to MinIO. The MCS-150 handler uploads itself
and returns the MinIO key, so the re-upload tried to read a nonexistent
local file and logged a 'File not found' error after the order was
already correctly held at the admin gate. Now we skip files that don't
exist locally and keep the returned key as-is.
2026-06-10 12:47:16 -05:00
justin
71404810c4 mcs150: backfill signer name from signed cert + re-stamp signed form
- When intake lacks signer_name, backfill it from the name the client
  typed when signing the perjury certification (that name is exactly what
  belongs in the form's print/type-name field, certifyName).
- After a client-approved re-dispatch, re-point the signed esign record at
  the freshly filled form and re-stamp the signature, so the signed PDF an
  admin reviews reflects the current complete form (not a stale earlier
  fill). Field layout (and thus signature anchors) is unchanged across
  fills, so the recorded anchor coordinates stay valid.
2026-06-10 12:41:27 -05:00
justin
bd0d8c0e2c verify_mcs150: correct handler class name 2026-06-10 12:39:41 -05:00
justin
79c460ef25 mcs150: render verification harness + auto-generate appearance streams
- fill_mcs150 now uses auto_regenerate=True so pypdf writes appearance
  streams for every text field. Preview/Chrome ignore /NeedAppearances and
  were showing blank widgets over the values; generated /AP streams make
  the text render in all viewers.
- New verify_mcs150.py reads each widget's /AP /N appearance stream (the
  literal drawn glyphs) to confirm expected values actually render, since
  the container has no OCR/raster tooling. Exits non-zero on any miss.
2026-06-10 12:37:25 -05:00
justin
d5e66786a2 mcs150: enrich intake from FMCSA carrier census before PDF fill
The MCS-150 biennial update re-confirms the carrier's existing FMCSA
record. Previously the PDF filler only had whatever the intake form
collected; rescued/sparse orders (or orders where the carrier's data
lives in FMCSA, not the intake) produced near-empty forms. Now we pull
the carrier census (legal name, address, EIN, fleet counts) from the
FMCSA carrier API and merge it under any customer-provided intake values
(customer edits win), so the form is pre-filled with the carrier's
current registered data. Refactored the FMCSA fetch into a shared
_fetch_fmcsa_carrier helper used by both enrichment and status check.
2026-06-10 12:35:52 -05:00
justin
7e5946d65a fix(mcs150 pdf): set /NeedAppearances so filled field values actually render
Customer saw the MCS-150 looking blank / 'data covered by the form fields': the
values were correctly written to the AcroForm /V, but pypdf left the template's
empty /AP appearance streams in place and NeedAppearances was false, so viewers
rendered the blank widget over the value. Setting AcroForm /NeedAppearances=true
makes viewers regenerate appearances from the values. (The missing signature was
a downstream effect of the separate fobj_put MinIO-upload bug, now fixed -- with
no PDF in MinIO the anchor extraction + signature stamping both failed.)
2026-06-10 12:20:45 -05:00
justin
b28dda7c5a feat(telegram): include a presigned PDF view link in the admin-review alert
When an MCS-150/USDOT order hits the pre-submission admin-verification gate, the
Telegram FULFILLMENT NEEDED alert now appends a presigned link to the prepared
PDF (via the public minio.performancewest.net endpoint, IP-allowlisted to admin)
so you can review the document straight from the alert before approving. Added
notify_fulfillment_todo(view_url=...) + a _presigned_view_url helper (public
endpoint + explicit region to avoid the region-probe that 403s from the worker).
2026-06-10 12:13:43 -05:00
justin
09d928a582 fix(mcs150): MinIO upload used wrong method fobj_put -> fput_object
The MCS-150/USDOT PDF was generated fine but the MinIO upload threw 'Minio object
has no attribute fobj_put' (wrong method name + signature), so the prepared filing
PDF was never persisted -- nothing for an admin to review at the verification gate,
and the esign-completed re-dispatch failed with 'File not found'. Use the correct
minio fput_object(bucket, key, file_path). Affects every MCS-150/USDOT filing.
2026-06-10 12:08:12 -05:00
justin
058d4d426a feat(compliance): admin verification gate + durable submission evidence
Per request: after the customer signs but BEFORE we submit to the government, hold
the order for a human to verify the prepared filing is correct.

- MCS-150 handler (mcs150-update + usdot-reactivation): new admin-verification gate
  after the signature gate -- if not admin_approved, set fulfillment_status=
  'ready_to_file', create a HIGH-priority 'VERIFY before filing' admin todo, and
  STOP (no FMCSA submission). job_server injects admin_approved from the dispatch
  payload (mirrors client_approved).
- New admin endpoint POST /api/v1/admin/compliance-orders/:id/approve-submit
  (requireAdmin): verifies status=ready_to_file, re-dispatches the worker with
  admin_approved=true to proceed to actual submission.
- Durable submission EVIDENCE: the web/fax submitters only wrote confirmation
  screenshots to an ephemeral temp dir. Now _upload_submission_evidence copies the
  FMCSA confirmation screenshot + attested PDF + fax_log_id to MinIO under
  filings/<slug>/<order>/evidence/ and records the keys on the order, so we keep
  proof of every government submission.

(state-trucking + the FCC handlers already gate via admin todos / auto_filing.py;
this brings MCS-150 to parity and adds evidence retention.)
2026-06-09 22:50:30 -05:00
justin
e87715aee7 fix(portal): onboarding/login links last 7 days, not 60 min
The rescue onboarding emails hardcoded a 60-minute expiry -- way too short for a
paid customer who hasn't engaged yet (they may not check email for hours/days),
so Paul's and Mitchell's links expired before they used them. Onboarding links
now last 7 days (ONBOARDING_TTL_MINUTES); the standard security password-RESET
window bumped 30min -> 2h. Re-issued fresh 7-day links to all 3 affected
customers (none had set a password yet) via reissue-onboarding-links.mjs, cc'd.
2026-06-09 22:50:09 -05:00
justin
a6d2f10149 chore: Mark Adams rescue (3rd real customer stuck on the login bug)
Paid Jun 1 for MCS-150 (card), no customers row -> couldn't log in -> no intake ->
filing stuck 'NEEDS MANUAL FILING'. Created his customers row + sent login +
intake email (cc justin). All 3 real paying customers now rescued; the underlying
card/PayPal login bug is fixed so new orders won't hit this.
2026-06-09 22:34:52 -05:00
justin
1c2e263bb7 warmup(ip-rehab): bias recipients to multi-subscriber business domains (cut bounce)
Day-0 batch saw ~45% bounce because 'no listmonk bounce record' is a weak
liveness signal. Now require the recipient's domain to have >=2 enabled
subscribers (a real org, not a one-off typo'd address), which materially lowers
the dead-mailbox bounce rate on the recovering IPs.
2026-06-09 20:31:45 -05:00
justin
25f4a7503b warmup: IP rehab for .91-.93 so they can be reallocated
The 3 IPs (mta02-04 / .91-.93) retired after the May 30-31 over-volume blast are
NOT on any DNSBL (Spamhaus/Barracuda/SpamCop/SORBS all clean) and have clean PTRs
+ SPF/DKIM/DMARC -- the damage was provider-internal reputation, which recovers
with slow clean sending. scripts/ip_rehab.py sends a tiny ramping trickle
(10/IP/day -> cap 60) of genuine CAN-SPAM-compliant compliance check-in mail to
clean business-domain, never-bounced recipients via dedicated heavily-throttled
postfix transports rehab02/03/04 (30s/msg, bound to .91/.92/.93). Routing uses an
X-PW-Rehab-IP header + header_checks FILTER to override the transport_maps randmap
warmup rotation (verified: mail routes via rehab transports, status=sent). Daily
cron pw-ip-rehab. After ~2-3 weeks of clean sending the IPs can be reallocated.
2026-06-09 20:27:47 -05:00
justin
6f361d307d warmup: add Microsoft consumer (hotmail/outlook/live/msn) to cold-mail exclusions
Main-pool delivery was stuck ~60-67% (897+ daily spam-blocks). Audit of the main
listmonk found ~47k consumer mailboxes still enabled in OLD cold trucking lists
(built before the exclusion existed): 19k gmail, 27.5k yahoo-family + Microsoft
consumer. gmail alone was 101 of today's 909 550-5.7.1 blocks. Blocklisted all
~47k consumer addresses NOT in the protected FCC lists 3/6 (per the keep-gmails
decision for those warm lists). Also added Microsoft consumer domains to
BLOCKED_EMAIL_DOMAINS so the daily trucking builder stops re-adding them
(Microsoft SmartScreen silently junks/defers cold B2B mail -- a reputation drag
even though it doesn't 550 like Google). Enabled base 95455 -> 48707 (real
business/ISP domains).
2026-06-09 20:10:54 -05:00
justin
3dbf5b3bb2 chore: Mitchell Allen rescue scripts (customers row, SO backfill, re-dispatch, login+signature email) 2026-06-09 14:58:52 -05:00
justin
e2467752dc chore: export ensureComplianceSalesOrder for rescue/backfill use 2026-06-09 14:44:28 -05:00
justin
68e6b60951 fix: worker emails (localhost:25 -> SMTP relay) + create ERPNext SO on webhook payment
Two bugs found tracing Mitchell Allen's batch CB-95BA6C90 (5 DOT services, card):

1) Worker authorization/signing-link/status emails were sent via
   smtplib.SMTP('localhost', 25), which has no MTA in the workers container ->
   every send failed '[Errno 111] Connection refused', so customers never got
   their e-sign links and orders sat 'awaiting client signature' forever. Routed
   all 9 hardcoded localhost:25 sites (state_trucking, mcs150_update, boc3_filing,
   hazmat_phmsa, mailbox_setup, dot_esign, completion_emails) through the
   authenticated SMTP relay (SMTP_HOST/PORT/STARTTLS/login) + added a shared
   worker_email.send_worker_email helper.

2) The ERPNext Sales Order for compliance/compliance_batch was only created in
   the /checkout/create-session endpoint, but CARD orders confirm via the Stripe
   WEBHOOK -> handlePaymentComplete, which never created the SO. Result: every
   webhook-confirmed order had erpnext_sales_order=NULL and workers logged
   'Sales Order not found 404' then built from PG. Added idempotent
   ensureComplianceSalesOrder() to handlePaymentComplete so ALL payment methods
   (card-webhook, PayPal, crypto) create + link the SO.
2026-06-09 14:40:46 -05:00
justin
220f301453 test(e2e): fix compliance_orders seed columns (no total_cents); regression PASS
e2e-paypal-portal-fix.mjs now passes against live prod: completing a compliance
order creates the customers row (id, name=E2E Tester, company from intake_data,
no password) -> customer can register/reset + log in. PayPal login bug fixed.
2026-06-09 14:35:04 -05:00
justin
3c65dd8748 fix(checkout): pull company from intake_data (compliance has no customer_company col)
compliance_orders stores company in intake_data JSON, not a column; read it from
there (company/legal_name/entity_name) with graceful fallback. Fix e2e test seed
accordingly.
2026-06-09 14:31:42 -05:00
justin
9987b1e30d fix(checkout): create Postgres customers row on order completion (PayPal login bug)
Portal login + forgot-password read the Postgres customers table (bcrypt), NOT
ERPNext. ensureCompliancePortalUser (the common path for Stripe/PayPal/crypto via
handlePaymentComplete) only provisioned the ERPNext customer/website-user and
never created the customers row -- so customers (notably PayPal, who reach this
path directly) had no account to log into or reset a password against. Now upserts
the customers row (no password; ON CONFLICT keeps any existing hash) with name +
company so they can register/reset and log in immediately.

Also: narrowed the placeholder-email skip from 'any synthetic@ or pipeline.com' to
exactly 'synthetic@pipeline.com' (the FMCSA-census placeholder) so real customers
on those real consumer domains aren't wrongly skipped -- which is what bit Paul
Wilson. Added cc support to sendEmail. e2e-paypal-portal-fix.mjs is the regression
test (seeds a compliance order, runs handlePaymentComplete, asserts the customers
row is created). Rescue scripts for the affected customer included.
2026-06-09 14:28:19 -05:00
justin
20c11e6180 fix(formation/NV): name search returns unknown (admin-verify), not a fake result
Nevada's entity search (esos.nv.gov/SilverFlume) is behind Imperva Incapsula bot
protection that blocks headless + the residential proxy IPs, and NV has no public
open-data API/bulk dataset. The old adapter scraped the blocked challenge page and
returned available=False for everything (would tell customers a free name is taken)
and also had a NameError bug (f"${CODE}..."). Now NV detects the Incapsula
challenge and returns available=None ('could not determine') with a note to verify
manually on SilverFlume -- never a false 'taken', so it never wrongly blocks an
order. TX remains fully automated via the open-data API.
2026-06-09 08:41:22 -05:00
justin
f94ad1682b fix(formation/TX): name search via Texas open-data API, not scraping
The TX Comptroller web search is now a JS form (old input#entityName selector
dead) and SOSDirect is login-gated, so the scraper returned garbage. Replaced
search_name with the Texas Socrata 'Active Franchise Taxpayers' dataset
(data.texas.gov/resource/9cir-efmm.json) over SoQL -- free, no-auth, no-login,
no bot-blocks. Exact normalized match => unavailable; no rows => available; API
error => available=None (never a false 'taken'). Verified: unique name = 0 rows
(available), 'APPLE INC.' = exact match (taken).
2026-06-09 08:34:37 -05:00
justin
76c4d55603 fix(formation): name-search returns null (not false) on adapter error
E2E harness exposed that the NV name-search adapter times out on a stale input
selector and search_name() swallowed the error and returned available=False --
i.e. it would tell customers an AVAILABLE name is taken. Now returns
available=None ('could not determine') on adapter error / unknown state, which the
API already maps to null. The NV/TX portal selectors still need a scraping fix
(separate task; e2e harness is the acceptance test) before enabling a self-serve
formation checkout. Documented full e2e results + the bugs caught (missing ERPNext
Items, entity_type case, missing /name-search route) in the readiness doc.
2026-06-09 08:06:43 -05:00
justin
4c0decd175 fix(formation): add working /name-search worker route + e2e harness
Two latent bugs the e2e harness caught:
1. api entities.ts GET /states/:code/name-search calls WORKER_URL/name-search,
   but job_server had NO such route -> 404 -> silently fell back to stale
   entity_cache on every live name check. Added a synchronous /name-search route
   returning {available,exact_match,similar_names,state}.
2. both the new route AND the existing async handle_name_search imported a
   nonexistent search_name_sync(); fixed to drive the real async search_name()
   via an event loop (same pattern as /entity-status).

scripts/e2e-formation-order.mjs: replays the real formation order flow (live
name search -> formation_orders insert -> ERPNext customer + Sales Order with
BUSINESS-FORMATION + STATE-FILING-FEE line items -> verify SO total + DB linkage
-> cleanup) without a real Stripe charge or state filing. Run in the api container.
Also created the missing ERPNext Items (BUSINESS-FORMATION, STATE-FILING-FEE,
FOREIGN-QUAL-SINGLE/MULTI) that the formation SO references.
2026-06-09 07:51:54 -05:00
justin
37393e5bbc scripts(otc): dedupe by CIK; commit the 861-company lead list
The master file lists warrants/units as separate tickers under one CIK, so the
pull now dedupes to one row per company (other tickers kept in all_tickers).

data/otc_leads.csv: 861 unique active US-domestic microcap OTC issuers
(<$75M float, all actively filing, 100% with business address + phone). By
incorporation: DE 365, NV 325 (DE+NV=690 = the reincorporation targets), WY 44,
FL 39, MD 38. Dropped from the 2,771 OTC universe: 1,672 foreign, 62
accelerated/large filers, 73 delinquent/dark. EDGAR has no email -> phone +
address captured for enrichment / direct mail / call.
2026-06-09 07:10:54 -05:00
justin
1b3cbf2fbf scripts(otc): SEC EDGAR lead-pull + reincorporation-destination research
scripts/otc_lead_pull.py: pulls company_tickers_exchange.json + per-issuer
submissions/CIK*.json from SEC EDGAR (10 req/sec, declared User-Agent), filters
to active US-domestic microcaps (drops foreign ADRs, accelerated/large filers
that keep counsel on retainer, and delinquent/dark shells), writes a CSV with
state of incorporation, business/mailing address, phone, SIC, filer size bucket,
last filing date. DE/NV prioritized.

docs 4c: where companies actually reincorporate TO -- Nevada #1 (281 filings),
Texas the fast riser (99 all-time, but 27 vs NV 33 since 2024), Florida modest
(23), Wyoming niche (8). Lead with 'leaving Delaware?' and let client pick
NV/TX/FL; same flat-fee conversion productizes across all three.
2026-06-09 06:58:01 -05:00
justin
90bccfda32 fix(checkout): route dot-new-carrier-bundle on success page + worker pipeline
Follow-on to the trucking new-carrier slug fix:
- success page: add dot-new-carrier-bundle to DOT_SLUGS + NEW_CARRIER_SLUGS so
  the order-confirmation 'what to expect' messaging classifies it as trucking.
- pipeline_orchestrator: the trucking onboarding PIPELINE was keyed under the
  bare 'new-carrier-bundle' slug, which is the TELECOM bundle's slug (also a
  collision at the worker layer). Re-keyed to 'dot-new-carrier-bundle' so a
  trucking bundle never runs the telecom pipeline (and vice versa).
2026-06-08 23:48:56 -05:00
justin
09b32d6e75 fix(bounce-watchers): QID regex broke on out05-09/hcout transports
Both Postfix bounce watchers extracted the queue-ID with regex
postfix/[a-z]*[ (main) and postfix/[a-z0-9]*[ (hc), which only matched simple
transports like qmgr/cleanup. When the MTA started rotating sends through the
numbered warmup transports (out05-09/smtp, hcout1-3/smtp -- a transport/subprocess
name WITH a slash), the QID extraction returned empty, so status=bounced lines
never matched a tracked campaign QID and NO bounces were posted to listmonk since
~May 31. Result: dead mailboxes never got blocklisted and kept being retried,
dragging the warmup bounce rate.

Fix: regex postfix/[^[]*[ matches any transport/subprocess name up to the [pid].
Verified live on both watchers (out07/smtp and hcout1/smtp test bounces now
detected + posted).

Separately (server-side, not in repo): listmonk bounce.actions was null (no
auto-action), so even posted bounces did nothing. Set hard=1->blocklist,
complaint=1->blocklist; verified a real bounce now records + blocklists.
2026-06-08 15:02:32 -05:00
justin
b973c6c132 exclusions: block Google consumer mailboxes (gmail) from cold/warmup sends
Warmup audit (2026-06-08) found the main sending pool was eating a 37% bounce
rate, and 556 of those were Google 550-5.7.1 'likely unsolicited mail' spam
blocks -- of which gmail.com alone was 427 (77%). Google's cold-IP filter is the
strictest of the big providers and consumer gmail has the highest complaint
sensitivity, so mailing it from a warming IP is pure reputation damage.

Added GOOGLE_CONSUMER_DOMAINS (gmail.com, googlemail.com) to BLOCKED_EMAIL_DOMAINS,
which the daily trucking builder already enforces in its recipient SQL
(lower(domain) <> ALL(blocked)). Takes effect on the next nightly build.

Custom domains silently on Google Workspace are a smaller (~5%) MX-only signal,
already handled in the healthcare builder via the mx_provider flag; can be ported
to the main pool later if the residual warrants it.
2026-06-08 13:50:27 -05:00
justin
a78d60a127 hc: auto-reply for 'already revalidated' replies + permanent suppression
A lead replied with proof their Medicare revalidation was already approved (CMS
data-lag: the public Revalidation Due Date List still showed them overdue weeks
after approval). Two of these arrived same-day, so:

- Carbonio auto-reply (deployed on co.carrierone.com): created mailbox
  hc-replies@ on the info@ distribution list with a Sieve that auto-acknowledges
  'my revalidation is already complete' replies (tag + mark read + file into a
  'Reval Completed (auto-acked)' folder + on-brand reply explaining the CMS lag).
  CRITICAL: info@ is the shared reply-to for ALL campaigns (healthcare, trucking,
  telecom), so every rule is anchored to Medicare/revalidation context -- a
  trucking 'MCS-150 done, this is bogus' or telecom 'RMD done' reply does NOT
  trigger it (tested + passing). A buyer guard ('please file / how much') also
  suppresses the auto-reply so a human handles the sale.
  Carbonio 25.x Sieve quirks documented (vacation/imap4flags/body :text all
  unsupported; use reply/flag/tag/body :contains).

- Permanent suppression: new data/hc_suppress.txt do-not-contact list the warmup
  honors at import AND --prune removes from the live lists. Seeded with the two
  completed providers (Pangea Lab, Yakima Valley FWC); both also blocklisted in
  listmonk_hc and removed from lists 3 + 4.
2026-06-08 10:37:49 -05:00
justin
9cb10b18e0 feat(hc): deliverability prune -- evict newly-Google-hosted subscribers
Belt-and-suspenders for the edge you flagged: a domain already in a warmup list
could flip its MX to Google Workspace between weekly refreshes, after which it
would hard-bounce from the cold IP. The import-time guard only catches NEW adds.

- prune_holdouts(): enumerates each warmup list's subscribers, matches them
  against the FRESH master CSV (re-classified weekly), and removes any whose
  domain is now Google-hosted. DELIVERABILITY-ONLY -- it never evicts for
  audience reasons (an overdue provider drifting out of the 1-90 day window was
  a valid target when warmed; re-litigating that just wastes warmup progress).
- --prune (run alongside warming) and --prune-only (prune then exit).
- Wired into the weekly refresh cron as a --prune-only chained step, so MX is
  re-checked and holdouts removed every Monday before the weekday sends.

Verified end-to-end: with no Google domains in lists it's a 0-op; injecting a
simulated Google-flipped domain into the master, the prune correctly detects and
(in a real run) would remove it from every list it's on.
2026-06-08 03:39:56 -05:00
justin
54b92b1f06 fix(hc deliverability): MX-based Google-host exclusion during warmup
Found via live mail.log: Google-Workspace-hosted PRACTICE domains (custom
domains whose MX is aspmx.l.google.com, e.g. moosepharmacy.com, hc2kidney.com)
were getting hard 550-5.7.1 rejects from Google's cold-IP bulk filter -- exactly
the bounces that wreck a warming IP's reputation. The original google/non-google
split classified by the email's domain STRING, which can't see that a custom
domain silently uses Google Workspace; only an MX lookup reveals it (33% of our
domains, 228/689, are Google-hosted this way).

- hc_data_refresh.py: new MX classification (one lookup per unique domain via
  dnspython, cached) writes an mx_provider=google/other flag into the master and
  propagates it into the channel CSVs (auto-adding the column). --skip-mx for a
  fast status-only run.
- build_healthcare_campaigns_cron.py: warm_segment now drops mx_provider=google
  rows during warmup (HC_SKIP_GOOGLE=1 default; set 0 once IPs are warm). This is
  defense-in-depth -- correct regardless of which CSV the cron is pointed at.

Verified: today's sends (nongoogle CSV) had 0 Google bounces; the guard cuts the
Google-containing week1_verified cohort's revalidation candidates 82->8.
2026-06-08 03:32:12 -05:00
justin
feb677f6ce fix(hc warmup): only mail slightly-overdue providers (deliverability)
Mailing heavily-overdue NPIs (months/years past due) risks hitting practices
that have closed, merged, or abandoned the inbox -> hard bounces, which are the
fastest way to wreck a warming IP's reputation. The warmup now restricts the
reval_overdue selector to an inclusive [HC_OVERDUE_MIN, HC_OVERDUE_MAX] window
(default 1-90 days) and the OIG 'any' selector likewise excludes heavily-overdue
and dropped-off-list rows. On the current cohort this trims the overdue audience
178->96 and the OIG audience 399->317, holding out the stale long tail
(181-365d + 366d+). upcoming/active providers are unaffected.
2026-06-08 03:27:22 -05:00
justin
c79a7715e1 fix(hc): bugs found in self-audit of the new refresh + warmup + templates
Refresh (hc_data_refresh.py):
- CRITICAL: drop optout_ending from REFRESHED_FIELDS -- the refresh never
  computes it, so propagating it blanked the channel CSVs and would starve the
  compliance_bundle segment (whose selector IS optout_ending).
- MAJOR: only rewrite leie_excluded when OIG was actually pulled (guard was
  'not skip_oig OR not skip_sam', so a --skip-oig run blanked all exclusion
  flags). Also write 'Y' (matching the original list builder) not '1'.
- Use 'no_reval_flag' (the original vocabulary) instead of 'not_on_list' when an
  NPI drops off the reval list, and clear reval_due_date too.
- Throttle politeness: move time.sleep(0.05) above the early-continue paths so
  EVERY CMS request is spaced, not just the minority that are on the list.
- Guard blank-NPI rows (leave their status untouched instead of mislabeling).
- Master write preserves any columns beyond HEADER (no silent column drop).

Warmup cron (build_healthcare_campaigns_cron.py):
- Fix the daily-slice split: it summed to less than the budget (dropped ~2/day)
  and could OVERSHOOT on tiny totals (each 'other' floored to >=1). Now uses
  divmod for an even remainder and reclaims rounding onto the lead, so
  sum(per_seg) == total_slice exactly for every input (verified 0,1,2,7,100,300).

Templates: the non-revalidation emails rendered {{ .Subscriber.Attribs.detail }}
(a reval due date) under a 'Practice'/'Status'/'Record' label -- a wrong/
confusing personalization on a live send (esp. OIG, selector 'any'). All four
now show the practice name; 'detail' is retired from rendering (revalidation
uses reval_due_date/days_overdue directly).
2026-06-08 03:23:47 -05:00
justin
85dc3d5c3b hc refresh: propagate fresh status into the channel CSVs the cron reads
The channel CSVs (hc_warmup_nongoogle/google/week1_verified) are email-keyed
subsets of the master with extra deliverability columns (verify_ok/verify_reason).
The refresh now writes the fresh status fields (reval_due_date, days_overdue,
reval_status, leie_excluded, optout_ending, name/specialty/state) back into each,
preserving the extra columns and row membership, so a single weekly run updates
everything the campaign cron consumes -- not just the master.
2026-06-08 03:13:00 -05:00
justin
4f455475c0 hc: weekly data-refresh pipeline + multi-segment warmup cron
Two gaps closed:

1. hc_data_refresh.py (NEW): weekly source-data refresh. Re-checks every
   emailable NPI against the LIVE government sources so sends never go stale:
   - CMS Revalidation Due Date List (data.cms.gov per-NPI API; handles both ISO
     and US date formats, normalizes to MM/DD/YYYY).
   - OIG LEIE full CSV download (the NPI-bearing exclusion source).
   - SAM.gov v4 exclusions (key in .secrets/sam-api-key) -- OFF by default since
     SAM exclusions rarely carry an NPI and the full set is ~167k records; it's
     opt-in via --sam-pages. SAM's real value is the live per-name screening
     service, not a bulk NPI join.
   Writes the master CSV atomically (temp+rename). A provider who has since
   revalidated flips overdue->upcoming/not_on_list, so we stop nagging them.

2. build_healthcare_campaigns_cron.py: was revalidation-only (one hardcoded
   list/campaign/CSV/template). Now multi-segment: imports SEGMENTS from the
   single-source-of-truth registry, warms ALL five programs in parallel, each
   with its own list, dated campaign, and per-segment import-state file (so
   dedup is per-segment). A  per segment maps master-CSV rows to the
   right program (reval_overdue / reval_upcoming / leie_or_deactivated /
   optout_ending / any). Daily ramp slice is split across segments (revalidation
   leads at 50%, rest share the remainder) so every program collects engagement
   data while the IPs warm. Back-compat: seeds revalidation import-state from the
   legacy hc_imported_emails.txt once.
2026-06-08 03:06:29 -05:00
justin
0b0ff9d311 hc campaigns: make the HTML templates the single source of truth
build_healthcare_campaigns.py had a divergent inline HTML generator (old teal-
header + yellow issue-box layout, missing the official-record card and the per-
segment verify-it-yourself blocks) that nobody called -- the live cron reads the
hand-tuned data/hc_campaigns/hc_*.html files directly. Removed the dead
generator + cmd_render(); render() now READS the canonical template file so the
files can't drift from a parallel generator. SEGMENTS is now a metadata registry
(subject, template, cta_path, price, list_name, campaign_name, selector) that the
multi-segment cron will consume. Verified --list and that send-test still reads
the real bodies.
2026-06-08 02:57:49 -05:00
justin
483f185861 feat(healthcare): prove revalidation is real via official CMS data + self-verify
Skepticism ("is this even real?") is the top objection. The data IS accurate
(verified our subscribers' NPIs match the official CMS Revalidation Due Date List
exactly), so this is a credibility-presentation fix:

1. Email: replace the plain detail row with an "Official record - CMS Medicare
   Revalidation Due Date List" card (NPI, legal name, due date, days overdue)
   plus a "Verify on CMS.gov" button. Clearly labeled as our presentation of
   public CMS data, not a CMS screenshot (no impersonation).
2. API: npi/lookup now pulls the revalidation due date LIVE from the public CMS
   dataset (data.cms.gov) instead of the empty local table, and returns a
   revalidation{ due_date, source, cms_legal_name, verify_url } proof object.
3. Tool: /tools/npi-compliance-check shows a live "official record" card with a
   self-verify link when CMS returns a due date.

Builder now stores reval_due_date/days_overdue as separate attribs for the card
(existing 194 subscribers backfilled from their detail string).
2026-06-07 23:54:01 -05:00
justin
a732423f04 fix(deploy): port catalog generator + drift-check to Python (prod has no node)
The host-side generator ran 'node scripts/*.mjs' in deploy.sh, but the prod box
has python3 only (no node outside containers), so the site deploy failed at the
generation step. Reimplemented both in Python (byte-identical output, verified
via diff against the node version; matches scripts/sync_nav.py tooling).
2026-06-07 19:26:01 -05:00
justin
09e21a6c97 refactor(pricing): single source of truth for the service catalog
Previously two hand-maintained price lists (API COMPLIANCE_SERVICES + site
SERVICE_META) drifted apart -- that is how the healthcare +$200 raise charged
$399 while displaying $599. Eliminate the drift class entirely:

- Move the catalog to api/src/service-catalog.ts (the authority; checkout
  charges from it). compliance-orders.ts imports it.
- scripts/gen-service-catalog.mjs generates site/src/lib/service-catalog.generated.ts
  from the API source. intake_manifest.ts re-exports SERVICE_META from it, so all
  ~60 site pages keep working unchanged.
- deploy.sh regenerates + drift-checks before building (site build context is
  ./site only and cannot read ../api, so generation happens host-side).
- scripts/check-service-catalog-drift.mjs fails the build if the generated file
  ever diverges from the API (verified: passes aligned, fails on mismatch).

To change a price now, edit ONE file: api/src/service-catalog.ts.
2026-06-07 19:11:34 -05:00
justin
a4d67bcf9b hc-warmup: add list-hygiene script (drop undeliverable addrs, smtp_valid first)
Keeps only deliverable addresses (smtp_valid + catch_all_detected), drops
mx_unreachable + smtp_unknown rejects that defer/bounce and damage the warming
HC IP reputation. Sorts smtp_valid first so the daily slice hits verified
mailboxes first. Used to clean hc_warmup_nongoogle.csv (501 -> 399 rows).
2026-06-07 18:08:36 -05:00
justin
e5db147319 esign: make signing copy fully generic - remove all ink references from website/API
Client-facing and website code now describes only a generic per-document signing
authorization; nothing visible to signers or recorded in the website/API code or
DB schema references ink, paper, reproduction, or any fulfillment mechanics.

- rename esign-ink-consent.ts -> esign-sign-consent.ts; INK_CONSENT_TEXT ->
  SIGN_CONSENT_TEXT (generic: 'use my signature to complete and submit this
  single filing', no ink/paper/reproduce language); helpers ink* -> sign*
- portal-esign-generic.ts: API field ink_reproduction -> require_sign_consent,
  ink_consent_text -> sign_consent_text, request field ink_consent -> sign_consent
- signing page (site/public/portal/esign): all ids/vars/comments ink* -> sign*;
  no 'ink' string remains
- npi_provider metadata flag ink_reproduction -> require_sign_consent
- migration 090/092 + live DB column comments rewritten to drop ink/plotter
  wording (DB column names kept as ink_consent* for compat, internal only)
- order-timeline.ts buffer comments neutralized
- tests: 37 checks, consent text asserted to omit ink/plotter/paper/reproduce/etc

DB columns ink_consent* retained (internal, never sent to clients) to avoid a
risky rename of already-applied prod columns.
2026-06-07 05:06:26 -05:00
justin
a4bad723bc esign: ink-reproduction consent gate + patent-risk research
Consent gate (the legal linchpin from the wet-signature memo):
- migration 092 adds ink_consent/ink_consent_at/ink_consent_text to esign_records
- extract pure, unit-tested gate logic into esign-ink-consent.ts (DRY single
  source for route + signing page): isInkReproduction / inkConsentRequired /
  inkConsentSatisfied + verbatim client-safe INK_CONSENT_TEXT
- portal-esign-generic.ts: GET surfaces ink_reproduction + consent text; POST
  gates DRAWN signatures on ink-path docs on explicit consent, stores it
- signing page locks the signature block until consent is checked (drawn only)
- npi_provider marks cms855/cms10114 esign metadata ink_reproduction=true
- 33 unit checks: gate truth table + consent text omits all internal mechanics
  (plotter/machine/CMS/MAC/etc) and keeps required legal reassurances

Patent-risk memo (docs/legal/patent-risk-mechanical-wet-signature.md):
- prior-art-dated risk analysis (autopen 1803/1942, plotters, CNC = public domain
  => low risk on core concept; e-sign workflow space litigious)
- firsthand recent-grant sweep (1.58M USPTO grants 2021-2025, queried via DuckDB):
  ZERO patents on machine-applies-signature-in-ink; e-sign players hold only
  electronic-workflow patents. Not an FTO; flags where attorney search is needed
2026-06-07 04:44:11 -05:00
justin
894d989445 Add portable Line-us pen-arm support to ink-signature pipeline
Adds a second machine class (small fan-shaped reach arm) alongside the
CR-10/AxiDraw rectangular-bed plotters, so wet signatures can be produced
while away from the home station.

ink_signature_plotter.py:
- PlotterConfig gains dialect (marlin|lineus) + name; new LineUsConfig
  (native units, pen height = per-move Z, reach annulus from shoulder pivot).
- Named machine profiles (cr10 default, axidraw, lineus) via load_profile().
- bed_mm_to_lineus_units(), check_reach() (annulus for lineus, rectangle for
  marlin), compute_jig_offset_for_box() (solves jig from the ACTUAL fitted ink
  extent so a wide cell line doesn't over-constrain a small arm).
- emit_gcode() dispatches to emit_marlin_gcode()/emit_lineus_gcode().
- send_lineus(): WiFi TCP 1337 (NUL-terminated, ok-acked) or USB serial,
  dry_run=True default (same gating as the CR-10 path).

ink_signature_cli.py: --profile, --solve-jig (auto-applies jig offset),
--lineus-host/--lineus-usb, reach-check that refuses to --plot out-of-reach
on Line-us.

Tests: 43 checks (was 30) covering profiles, reach check, jig solve, lineus
emitter, dry-run sender. Docs updated with profiles + portable workflow.
2026-06-07 03:45:46 -05:00
justin
28b1af341d Wire fulfillment alerts to Telegram + surface order progress in portal + even out ERPNext sync
Telegram notifications:
- Add shared scripts/workers/telegram_notify.py (send_telegram, notify_fulfillment_todo,
  create_admin_todo) so every worker alerts the operator the same way; fire-and-forget.
- Fire notify_fulfillment_todo after each admin_todos insert across all 8 service
  handlers (9 sites) so no fulfillment task waits unseen.
  (Orders + quotes + tickets already notified via checkout/quotes/tickets routes.)

Client portal order progress:
- order-timeline: derive real per-step status from live signals (payment paid,
  e-signature signed, fulfillment_status) instead of a static template; add
  current_step to the response.
- Extract pure applyLiveStatus into order-timeline-status.ts (DB-free) + unit test
  (api/test/test_timeline_status.ts, 8 cases).
- portal /me now returns compliance_orders.fulfillment_status.
- Dashboard renders a client-safe Progress badge (In progress / Action needed /
  Filed-awaiting-confirmation / Completed); batches show the most actionable status.
  No back-office mechanics exposed.

ERPNext sync parity:
- Create a Sales Order for formation and fcc_carrier_registration orders (previously
  only canada_crtc + compliance synced); write erpnext_sales_order back to each table.
  Non-blocking, matches existing pattern.

Verified: API tsc clean, timeline unit tests 8/8, Astro build 58 pages,
cms10114/ink/paper_batch Python tests still green, no mechanics leaks.
2026-06-07 03:17:46 -05:00
justin
b0a8563a93 ink-signature: pen-plotter pipeline for original wet-ink CMS signatures
The Standard no-login CMS path needs an ORIGINAL ink signature on paper
(CMS-10114: 'Stamped, faxed or copied signatures will not be accepted'). This
adds a pipeline to redraw the provider's own captured strokes in real ink with a
pen on a CR-10 V2 (or any Marlin/GRBL machine) — original, in ink, never copied.

- migration 090: esign_records.signature_vector (JSONB stroke paths, 0..1).
- signing page now captures normalized stroke paths alongside the PNG; API
  stores a size-bounded vector for drawn signatures.
- ink_signature_plotter.py (hardware-independent): fit strokes to the signature
  anchor box, PDF-pt -> bed-mm via jig offset, emit Marlin/GRBL G-code (Z pen or
  M280 servo/BLTouch), SVG toolpath preview, and render_signature_on_pdf (a
  digital twin that proves the toolpath lands on the cert line). Gated serial
  sender (dry_run default).
- ink_signature_cli.py: end-to-end load-record -> gcode+preview, --test-box jig
  calibration, --plot to stream over USB.
- Corrected CMS-10114 signature anchor to sit inside the Section 4A signing cell
  (above the bottom rule, below the label).
- docs/ink-signature-plotter.md documents the CR-10 retrofit + interpretive risk.

Tests: test_ink_signature.py 30/30, test_cms10114.py 27/27, test_paper_batch.py
15/15, API tsc clean, Astro build 58 pages.
2026-06-07 02:34:17 -05:00
justin
e6a630ada1 healthcare: verify CMS-10114 update path, correct NPI Enumerator address, build CMS-10114 filler
Verified firsthand against the live CMS-10114 (Rev. 02/25, OMB 0938-0931):
- Section 1A confirms paper is valid for Change of Information (#2) AND
  Reactivation (#4), not just initial enumeration. Resolves the UNCERTAIN flag.
- Current mailing address is CMS NPI Enumerator Services, Mail Stop DO-01-51,
  7500 Security Blvd, Baltimore MD 21244. The old Fargo PO Box 6059 is retired;
  corrected in mac_routing.NPI_ENUMERATOR + all docs.
- No electronic no-login equivalent exists for CMS (NPI Registry API is
  read-only; PECOS/NPPES-IA require login), unlike FMCSA's ask.fmcsa ticket form.
  So tiers stay: Standard=paper CMS-10114 (no login), Expedited=NPPES surrogate.

New: cms10114_pdf_filler.py fills the flat official form via text overlay
(reason checkbox + NPI + Section 2A identity + Section 4A cert name + signature
anchor); wired into npi_provider._generate_10114_for_signing for nppes-update.
Signed forms route to the NPI Enumerator via the existing daily batch.

Tests: test_cms10114.py 27/27, test_paper_batch.py 15/15, Astro build 58 pages.
2026-06-07 02:04:41 -05:00
justin
7ea18dd3d8 healthcare: optional surrogate-access intake question (expedited path)
- NpiIntakeStep: add positively-framed 'can you grant electronic I&A Surrogate
  access?' question for all filing slugs (reval/reactivation/nppes-update/
  enrollment/bundle). Optional, never required, never mentions paper; captured
  as intake_data.surrogate_access (yes/no/blank). Astro build green (58 pages).
- npi_provider.py: surface the surrogate answer in the admin todo so fulfillment
  knows EXPEDITED (online via surrogate) vs STANDARD (e-sign + daily mail batch).
2026-06-07 00:33:33 -05:00
justin
138fec17e9 healthcare: daily batched paper-filing fulfillment
Standard (no-login) CMS filings are mailed in one Priority Mail envelope per
destination agency, batched each postal working-day morning to save postage.

- migration 089: paper_filing_batches table + esign_records.paper_batch_id /
  filing_destination_key (idempotent: a filing is batched at most once).
- batch_cover_sheet.py: per-agency cover sheet (sender/dest/date/manifest) +
  merged print-job PDF (cover + all enclosed signed filings).
- daily_paper_batch.py worker: gather signed+unbatched cms855/cms10114 filings,
  group by destination (MAC by state via mac_routing; Fargo for CMS-10114),
  build cover+merged PDF per agency, persist batch, mark filings batched.
  Self-gates on postal working days (skips weekends + federal/USPS holidays).
  Phase 1 = human prints+mails; phase 2 = wire print-mail API.
- worker-crons: pw-paper-batch systemd timer (Mon-Fri 13:30 UTC, self-gated).
- test_paper_batch.py: 15/15 pass (working-day gating, routing, cover+merge).
2026-06-07 00:30:01 -05:00