Commit graph

211 commits

Author SHA1 Message Date
justin
a1db921c71 mcs150/workers: don't fill MCS-150 for non-form services; quiet ERPNext workflow advance
- MCS150UpdateHandler is the catch-all for many admin-assisted DOT services
  (UCR, MC authority, audit prep, ETA, name reservation, registered agent,
  annual report). It was filling an MCS-150 PDF for ALL of them -- e.g. a UCR
  order produced a wrong MCS-150 PDF. Now only MCS150_FORM_SLUGS fill the form;
  others get an admin-review todo (PDF 'not generated') for manual handling.
  Signature flow was already correctly scoped (UCR is not in DOT_SIGNING).
- handle_process_compliance_service forced the Sales Order workflow_state to
  'Review' via set_value, which bypasses ERPNext's allowed transitions and
  threw WorkflowPermissionError (Received -> Review) on every run. The Postgres
  fulfillment_status is the source of truth; the ERPNext workflow_state is a
  cosmetic mirror. Now try the proper apply_workflow action and stay quiet
  (debug, not warning) when no valid Review transition exists.
2026-06-10 17:22:38 -05:00
justin
a04146da2b crtc: remove Canadian accountant/accounting-setup service (no longer offered)
We no longer offer Canadian accountant/accounting setup. Removed all
service-offering content:
- Marketing page (services/telecom/canada-crtc): the 'Set Up Canadian
  Accounting (we help)' next-steps card, the '3 hours of complimentary
  accounting consultation' deliverable bullet, and the whole 'Accounting
  Support' section (assigned accountant, portal chat, $75/hr, 3 complimentary
  hours).
- Order page (order/canada-crtc): the '3 hrs Canadian accounting support'
  included-feature bullet and the 'Preferred accounting software'
  (Xero/QuickBooks) form field + its accounting-hours helper text.
- Fulfillment (canada_crtc.py): dropped the bank-setup email line offering
  '3 hours of Canadian accounting consultation'.

Kept factual GST/HST tax advisories and the bank's QuickBooks/Xero
transaction-sync feature (third-party bank capability, not our service).
2026-06-10 16:51:33 -05:00
justin
7d8a08d9d3 mcs150: scope intake-completion email to actual MCS-150-form services
MCS150UpdateHandler is the catch-all for many admin-assisted DOT services
(UCR, MC authority, audit prep, ETA, name reservation, registered agent,
annual report). My new intake-completeness gate was firing the 'confirm your
MCS-150 details' email for ALL of them -- e.g. a UCR order wrongly emailed the
customer about MCS-150 details. Scope the gate to MCS150_FORM_SLUGS (the
services that actually file an MCS-150: mcs150-update, dot-registration,
usdot-reactivation, dot-full-compliance).
2026-06-10 14:52:36 -05:00
justin
1ff8b88ac8 fix: stop suppressing synthetic@pipeline.com (real customer address)
Paul Wilson (Compound Technologies) signed up with synthetic@pipeline.com,
which is a genuine, deliverable EarthLink address (pipeline.com MX ->
earthlink-vadesecure.net; he confirmed receipt by phone). Our code had
hardcoded pipeline.com + the synthetic@ prefix as a 'non-deliverable
FMCSA-census placeholder' and silently suppressed every automated email to
him (checkout provisioning, order-creation validation, intake reminders,
set-password invites). Nothing in the codebase actually generates that
address, so the placeholder rationale was wrong. Removed pipeline.com and the
synthetic@ rule from all four suppression sites; only RFC-reserved
example.com/test.com/invalid remain blocked.
2026-06-10 14:41:19 -05:00
justin
a3aeedd716 mcs150: census-prefilled intake-completion flow + completeness gate
Closes the data gap for orders that bypass the full intake (e.g. the DOT
compliance-remediation pipeline) and for all MCS-150 variants:

- Worker intake-completeness gate (mcs150_update): before filling, check the
  customer-required operational fields the FMCSA census cannot supply
  (operation classification, cargo, CURRENT annual mileage, email; plus
  signer/address for new-registration/reactivation, and states-of-operation
  for 150B hazmat). If missing, email the customer a census-pre-filled intake
  link and hold the order at fulfillment_status='awaiting_intake' with an admin
  todo, instead of fabricating a blank filing. The existing intake PUT endpoint
  already re-dispatches the worker on submit, so filing auto-resumes.
- Intake wizard (Wizard.astro): when resuming ?order=CO-xxx for a DOT/MCS order,
  seed still-empty fields from the FMCSA census (name/address/fleet/interstate)
  so the customer only confirms the operational details.
- /api/v1/dot/census now also returns total_drivers + a normalized
  carrier_operation_code for the prefill.
- MCS150Step.astro extended to collect every field the filler needs across all
  variants: mailing address, cdl_drivers, primary_vehicle_type,
  reason_for_filing, usdot_revoked, cell/fax, hazmat-safety-permit block
  (needs_hmsp, operating states, security plan), and intermodal-equipment
  provider counts; all prefill from intake_data.

verify_mcs150_variants.py covers 150/150B/150C end-to-end (ALL PASS).
2026-06-10 14:03:28 -05:00
justin
38739e023c mcs150: complete all-variant field mapping (150/150B/150C)
Adds the previously-unmapped fields so every variant fills fully:
- Q25 hazmat C/S/B/NB matrix (HAZMAT_ROW_MAP x HAZMAT_COL_MAP, 156 boxes)
- MCS-150B states-of-operation checkboxes (full name or 2-letter code), HMSP
  Hazard/Permit/Security radios, and accident count (32accidentNumber)
- MCS-150C intermodal equipment counts (20owned/leased/serviced) + correct
  field renumbering (17dunbrad/18irs/19eMail) + USDOT Button + named-export
  Reason/Mailing radios
- Structured fleet via intake['vehicles'] = {vehicle_type: {owned, term_leased,
  trip_leased}} across all Q26 vehicle rows; non-CMV count; cell/fax; second
  officer
- _set_button now resolves a candidate tuple against each field's actual export
  states, so numeric (/0../4) and named (/Yes,/Biennial...) radios both work

verify_mcs150_variants.py exercises all three variants end-to-end: ALL PASS.
2026-06-10 13:55:55 -05:00
justin
96f31e7c31 mcs150: only check Q29 passenger-cert box for passenger carriers
certifyBox is the Q29 Passenger Carrier Compliance Certification YES box
(page 3, y=530), not a general perjury checkbox. It was being checked
unconditionally, which wrongly marked freight/property carriers as passenger
carriers. Now only check it when the carrier is a passenger carrier; the
Q31 perjury declaration is made via the signature.
2026-06-10 13:44:34 -05:00
justin
b95ee04752 mcs150: fill all checkboxes/radios correctly + stamp explicit checkmarks
Fixes a batch of missing fields the FMCSA census does not provide and the
filler was mis-mapping:

- Corrected the question->field mapping to match the actual form: Q22 =
  COMPANY OPERATIONS (interstate/intrastate, 22xBox), Q23 = OPERATION
  CLASSIFICATIONS (for-hire/private/govt, 23xBox). These were swapped, and
  the bogus entity-type->23xBox map (no entity-type question exists on this
  form revision) was removed.
- Added proper radio-group handling for Reason for Filing (Biennial Update),
  Mailing-address (Same as principal vs below), and Q28 USDOT-revoked, with
  correct option indices (these are /0../n radios, not /Yes checkboxes; the
  old code set them to /Yes and never selected the right option).
- Map interstate/intrastate from the FMCSA census carrierOperationCode, and
  populate email/phone/mileage/cargo from intake.
- AcroForm checkbox/radio appearances use a ZapfDingbats glyph that
  poppler/Preview fail to render (value set but box looks empty). Now stamp
  an explicit X overlay into the page content for every 'on' box so it shows
  in every viewer and in the faxed output.
2026-06-10 13:41:48 -05:00
justin
386467bedf mcs150: trim FMCSA instruction pages from form templates
The official MCS-150/150B/150C PDFs ship with 8 (150/150B) or 4 (150C)
FMCSA instruction/example pages before the actual fillable form. We were
generating + faxing/submitting all of them. Trimmed the source templates
down to the FORM pages only:
  MCS-150  11 -> 3 pages (289 fields preserved)
  MCS-150B 12 -> 4 pages (349 fields preserved)
  MCS-150C  6 -> 2 pages (33 fields preserved)

The filler iterates writer.pages (no absolute index) and signature
anchors are derived dynamically via enumerate(reader.pages), so no
page-specific markup needed fixing. Removed one-off diag script.
2026-06-10 13:25:07 -05:00
justin
4447905864 workers: don't re-upload handler-returned MinIO keys as local files
handle_process_compliance_service assumed handlers return local temp
paths and re-uploaded each to MinIO. The MCS-150 handler uploads itself
and returns the MinIO key, so the re-upload tried to read a nonexistent
local file and logged a 'File not found' error after the order was
already correctly held at the admin gate. Now we skip files that don't
exist locally and keep the returned key as-is.
2026-06-10 12:47:16 -05:00
justin
71404810c4 mcs150: backfill signer name from signed cert + re-stamp signed form
- When intake lacks signer_name, backfill it from the name the client
  typed when signing the perjury certification (that name is exactly what
  belongs in the form's print/type-name field, certifyName).
- After a client-approved re-dispatch, re-point the signed esign record at
  the freshly filled form and re-stamp the signature, so the signed PDF an
  admin reviews reflects the current complete form (not a stale earlier
  fill). Field layout (and thus signature anchors) is unchanged across
  fills, so the recorded anchor coordinates stay valid.
2026-06-10 12:41:27 -05:00
justin
bd0d8c0e2c verify_mcs150: correct handler class name 2026-06-10 12:39:41 -05:00
justin
79c460ef25 mcs150: render verification harness + auto-generate appearance streams
- fill_mcs150 now uses auto_regenerate=True so pypdf writes appearance
  streams for every text field. Preview/Chrome ignore /NeedAppearances and
  were showing blank widgets over the values; generated /AP streams make
  the text render in all viewers.
- New verify_mcs150.py reads each widget's /AP /N appearance stream (the
  literal drawn glyphs) to confirm expected values actually render, since
  the container has no OCR/raster tooling. Exits non-zero on any miss.
2026-06-10 12:37:25 -05:00
justin
d5e66786a2 mcs150: enrich intake from FMCSA carrier census before PDF fill
The MCS-150 biennial update re-confirms the carrier's existing FMCSA
record. Previously the PDF filler only had whatever the intake form
collected; rescued/sparse orders (or orders where the carrier's data
lives in FMCSA, not the intake) produced near-empty forms. Now we pull
the carrier census (legal name, address, EIN, fleet counts) from the
FMCSA carrier API and merge it under any customer-provided intake values
(customer edits win), so the form is pre-filled with the carrier's
current registered data. Refactored the FMCSA fetch into a shared
_fetch_fmcsa_carrier helper used by both enrichment and status check.
2026-06-10 12:35:52 -05:00
justin
7e5946d65a fix(mcs150 pdf): set /NeedAppearances so filled field values actually render
Customer saw the MCS-150 looking blank / 'data covered by the form fields': the
values were correctly written to the AcroForm /V, but pypdf left the template's
empty /AP appearance streams in place and NeedAppearances was false, so viewers
rendered the blank widget over the value. Setting AcroForm /NeedAppearances=true
makes viewers regenerate appearances from the values. (The missing signature was
a downstream effect of the separate fobj_put MinIO-upload bug, now fixed -- with
no PDF in MinIO the anchor extraction + signature stamping both failed.)
2026-06-10 12:20:45 -05:00
justin
b28dda7c5a feat(telegram): include a presigned PDF view link in the admin-review alert
When an MCS-150/USDOT order hits the pre-submission admin-verification gate, the
Telegram FULFILLMENT NEEDED alert now appends a presigned link to the prepared
PDF (via the public minio.performancewest.net endpoint, IP-allowlisted to admin)
so you can review the document straight from the alert before approving. Added
notify_fulfillment_todo(view_url=...) + a _presigned_view_url helper (public
endpoint + explicit region to avoid the region-probe that 403s from the worker).
2026-06-10 12:13:43 -05:00
justin
09d928a582 fix(mcs150): MinIO upload used wrong method fobj_put -> fput_object
The MCS-150/USDOT PDF was generated fine but the MinIO upload threw 'Minio object
has no attribute fobj_put' (wrong method name + signature), so the prepared filing
PDF was never persisted -- nothing for an admin to review at the verification gate,
and the esign-completed re-dispatch failed with 'File not found'. Use the correct
minio fput_object(bucket, key, file_path). Affects every MCS-150/USDOT filing.
2026-06-10 12:08:12 -05:00
justin
058d4d426a feat(compliance): admin verification gate + durable submission evidence
Per request: after the customer signs but BEFORE we submit to the government, hold
the order for a human to verify the prepared filing is correct.

- MCS-150 handler (mcs150-update + usdot-reactivation): new admin-verification gate
  after the signature gate -- if not admin_approved, set fulfillment_status=
  'ready_to_file', create a HIGH-priority 'VERIFY before filing' admin todo, and
  STOP (no FMCSA submission). job_server injects admin_approved from the dispatch
  payload (mirrors client_approved).
- New admin endpoint POST /api/v1/admin/compliance-orders/:id/approve-submit
  (requireAdmin): verifies status=ready_to_file, re-dispatches the worker with
  admin_approved=true to proceed to actual submission.
- Durable submission EVIDENCE: the web/fax submitters only wrote confirmation
  screenshots to an ephemeral temp dir. Now _upload_submission_evidence copies the
  FMCSA confirmation screenshot + attested PDF + fax_log_id to MinIO under
  filings/<slug>/<order>/evidence/ and records the keys on the order, so we keep
  proof of every government submission.

(state-trucking + the FCC handlers already gate via admin todos / auto_filing.py;
this brings MCS-150 to parity and adds evidence retention.)
2026-06-09 22:50:30 -05:00
justin
e87715aee7 fix(portal): onboarding/login links last 7 days, not 60 min
The rescue onboarding emails hardcoded a 60-minute expiry -- way too short for a
paid customer who hasn't engaged yet (they may not check email for hours/days),
so Paul's and Mitchell's links expired before they used them. Onboarding links
now last 7 days (ONBOARDING_TTL_MINUTES); the standard security password-RESET
window bumped 30min -> 2h. Re-issued fresh 7-day links to all 3 affected
customers (none had set a password yet) via reissue-onboarding-links.mjs, cc'd.
2026-06-09 22:50:09 -05:00
justin
a6d2f10149 chore: Mark Adams rescue (3rd real customer stuck on the login bug)
Paid Jun 1 for MCS-150 (card), no customers row -> couldn't log in -> no intake ->
filing stuck 'NEEDS MANUAL FILING'. Created his customers row + sent login +
intake email (cc justin). All 3 real paying customers now rescued; the underlying
card/PayPal login bug is fixed so new orders won't hit this.
2026-06-09 22:34:52 -05:00
justin
1c2e263bb7 warmup(ip-rehab): bias recipients to multi-subscriber business domains (cut bounce)
Day-0 batch saw ~45% bounce because 'no listmonk bounce record' is a weak
liveness signal. Now require the recipient's domain to have >=2 enabled
subscribers (a real org, not a one-off typo'd address), which materially lowers
the dead-mailbox bounce rate on the recovering IPs.
2026-06-09 20:31:45 -05:00
justin
25f4a7503b warmup: IP rehab for .91-.93 so they can be reallocated
The 3 IPs (mta02-04 / .91-.93) retired after the May 30-31 over-volume blast are
NOT on any DNSBL (Spamhaus/Barracuda/SpamCop/SORBS all clean) and have clean PTRs
+ SPF/DKIM/DMARC -- the damage was provider-internal reputation, which recovers
with slow clean sending. scripts/ip_rehab.py sends a tiny ramping trickle
(10/IP/day -> cap 60) of genuine CAN-SPAM-compliant compliance check-in mail to
clean business-domain, never-bounced recipients via dedicated heavily-throttled
postfix transports rehab02/03/04 (30s/msg, bound to .91/.92/.93). Routing uses an
X-PW-Rehab-IP header + header_checks FILTER to override the transport_maps randmap
warmup rotation (verified: mail routes via rehab transports, status=sent). Daily
cron pw-ip-rehab. After ~2-3 weeks of clean sending the IPs can be reallocated.
2026-06-09 20:27:47 -05:00
justin
6f361d307d warmup: add Microsoft consumer (hotmail/outlook/live/msn) to cold-mail exclusions
Main-pool delivery was stuck ~60-67% (897+ daily spam-blocks). Audit of the main
listmonk found ~47k consumer mailboxes still enabled in OLD cold trucking lists
(built before the exclusion existed): 19k gmail, 27.5k yahoo-family + Microsoft
consumer. gmail alone was 101 of today's 909 550-5.7.1 blocks. Blocklisted all
~47k consumer addresses NOT in the protected FCC lists 3/6 (per the keep-gmails
decision for those warm lists). Also added Microsoft consumer domains to
BLOCKED_EMAIL_DOMAINS so the daily trucking builder stops re-adding them
(Microsoft SmartScreen silently junks/defers cold B2B mail -- a reputation drag
even though it doesn't 550 like Google). Enabled base 95455 -> 48707 (real
business/ISP domains).
2026-06-09 20:10:54 -05:00
justin
3dbf5b3bb2 chore: Mitchell Allen rescue scripts (customers row, SO backfill, re-dispatch, login+signature email) 2026-06-09 14:58:52 -05:00
justin
e2467752dc chore: export ensureComplianceSalesOrder for rescue/backfill use 2026-06-09 14:44:28 -05:00
justin
68e6b60951 fix: worker emails (localhost:25 -> SMTP relay) + create ERPNext SO on webhook payment
Two bugs found tracing Mitchell Allen's batch CB-95BA6C90 (5 DOT services, card):

1) Worker authorization/signing-link/status emails were sent via
   smtplib.SMTP('localhost', 25), which has no MTA in the workers container ->
   every send failed '[Errno 111] Connection refused', so customers never got
   their e-sign links and orders sat 'awaiting client signature' forever. Routed
   all 9 hardcoded localhost:25 sites (state_trucking, mcs150_update, boc3_filing,
   hazmat_phmsa, mailbox_setup, dot_esign, completion_emails) through the
   authenticated SMTP relay (SMTP_HOST/PORT/STARTTLS/login) + added a shared
   worker_email.send_worker_email helper.

2) The ERPNext Sales Order for compliance/compliance_batch was only created in
   the /checkout/create-session endpoint, but CARD orders confirm via the Stripe
   WEBHOOK -> handlePaymentComplete, which never created the SO. Result: every
   webhook-confirmed order had erpnext_sales_order=NULL and workers logged
   'Sales Order not found 404' then built from PG. Added idempotent
   ensureComplianceSalesOrder() to handlePaymentComplete so ALL payment methods
   (card-webhook, PayPal, crypto) create + link the SO.
2026-06-09 14:40:46 -05:00
justin
220f301453 test(e2e): fix compliance_orders seed columns (no total_cents); regression PASS
e2e-paypal-portal-fix.mjs now passes against live prod: completing a compliance
order creates the customers row (id, name=E2E Tester, company from intake_data,
no password) -> customer can register/reset + log in. PayPal login bug fixed.
2026-06-09 14:35:04 -05:00
justin
3c65dd8748 fix(checkout): pull company from intake_data (compliance has no customer_company col)
compliance_orders stores company in intake_data JSON, not a column; read it from
there (company/legal_name/entity_name) with graceful fallback. Fix e2e test seed
accordingly.
2026-06-09 14:31:42 -05:00
justin
9987b1e30d fix(checkout): create Postgres customers row on order completion (PayPal login bug)
Portal login + forgot-password read the Postgres customers table (bcrypt), NOT
ERPNext. ensureCompliancePortalUser (the common path for Stripe/PayPal/crypto via
handlePaymentComplete) only provisioned the ERPNext customer/website-user and
never created the customers row -- so customers (notably PayPal, who reach this
path directly) had no account to log into or reset a password against. Now upserts
the customers row (no password; ON CONFLICT keeps any existing hash) with name +
company so they can register/reset and log in immediately.

Also: narrowed the placeholder-email skip from 'any synthetic@ or pipeline.com' to
exactly 'synthetic@pipeline.com' (the FMCSA-census placeholder) so real customers
on those real consumer domains aren't wrongly skipped -- which is what bit Paul
Wilson. Added cc support to sendEmail. e2e-paypal-portal-fix.mjs is the regression
test (seeds a compliance order, runs handlePaymentComplete, asserts the customers
row is created). Rescue scripts for the affected customer included.
2026-06-09 14:28:19 -05:00
justin
20c11e6180 fix(formation/NV): name search returns unknown (admin-verify), not a fake result
Nevada's entity search (esos.nv.gov/SilverFlume) is behind Imperva Incapsula bot
protection that blocks headless + the residential proxy IPs, and NV has no public
open-data API/bulk dataset. The old adapter scraped the blocked challenge page and
returned available=False for everything (would tell customers a free name is taken)
and also had a NameError bug (f"${CODE}..."). Now NV detects the Incapsula
challenge and returns available=None ('could not determine') with a note to verify
manually on SilverFlume -- never a false 'taken', so it never wrongly blocks an
order. TX remains fully automated via the open-data API.
2026-06-09 08:41:22 -05:00
justin
f94ad1682b fix(formation/TX): name search via Texas open-data API, not scraping
The TX Comptroller web search is now a JS form (old input#entityName selector
dead) and SOSDirect is login-gated, so the scraper returned garbage. Replaced
search_name with the Texas Socrata 'Active Franchise Taxpayers' dataset
(data.texas.gov/resource/9cir-efmm.json) over SoQL -- free, no-auth, no-login,
no bot-blocks. Exact normalized match => unavailable; no rows => available; API
error => available=None (never a false 'taken'). Verified: unique name = 0 rows
(available), 'APPLE INC.' = exact match (taken).
2026-06-09 08:34:37 -05:00
justin
76c4d55603 fix(formation): name-search returns null (not false) on adapter error
E2E harness exposed that the NV name-search adapter times out on a stale input
selector and search_name() swallowed the error and returned available=False --
i.e. it would tell customers an AVAILABLE name is taken. Now returns
available=None ('could not determine') on adapter error / unknown state, which the
API already maps to null. The NV/TX portal selectors still need a scraping fix
(separate task; e2e harness is the acceptance test) before enabling a self-serve
formation checkout. Documented full e2e results + the bugs caught (missing ERPNext
Items, entity_type case, missing /name-search route) in the readiness doc.
2026-06-09 08:06:43 -05:00
justin
4c0decd175 fix(formation): add working /name-search worker route + e2e harness
Two latent bugs the e2e harness caught:
1. api entities.ts GET /states/:code/name-search calls WORKER_URL/name-search,
   but job_server had NO such route -> 404 -> silently fell back to stale
   entity_cache on every live name check. Added a synchronous /name-search route
   returning {available,exact_match,similar_names,state}.
2. both the new route AND the existing async handle_name_search imported a
   nonexistent search_name_sync(); fixed to drive the real async search_name()
   via an event loop (same pattern as /entity-status).

scripts/e2e-formation-order.mjs: replays the real formation order flow (live
name search -> formation_orders insert -> ERPNext customer + Sales Order with
BUSINESS-FORMATION + STATE-FILING-FEE line items -> verify SO total + DB linkage
-> cleanup) without a real Stripe charge or state filing. Run in the api container.
Also created the missing ERPNext Items (BUSINESS-FORMATION, STATE-FILING-FEE,
FOREIGN-QUAL-SINGLE/MULTI) that the formation SO references.
2026-06-09 07:51:54 -05:00
justin
37393e5bbc scripts(otc): dedupe by CIK; commit the 861-company lead list
The master file lists warrants/units as separate tickers under one CIK, so the
pull now dedupes to one row per company (other tickers kept in all_tickers).

data/otc_leads.csv: 861 unique active US-domestic microcap OTC issuers
(<$75M float, all actively filing, 100% with business address + phone). By
incorporation: DE 365, NV 325 (DE+NV=690 = the reincorporation targets), WY 44,
FL 39, MD 38. Dropped from the 2,771 OTC universe: 1,672 foreign, 62
accelerated/large filers, 73 delinquent/dark. EDGAR has no email -> phone +
address captured for enrichment / direct mail / call.
2026-06-09 07:10:54 -05:00
justin
1b3cbf2fbf scripts(otc): SEC EDGAR lead-pull + reincorporation-destination research
scripts/otc_lead_pull.py: pulls company_tickers_exchange.json + per-issuer
submissions/CIK*.json from SEC EDGAR (10 req/sec, declared User-Agent), filters
to active US-domestic microcaps (drops foreign ADRs, accelerated/large filers
that keep counsel on retainer, and delinquent/dark shells), writes a CSV with
state of incorporation, business/mailing address, phone, SIC, filer size bucket,
last filing date. DE/NV prioritized.

docs 4c: where companies actually reincorporate TO -- Nevada #1 (281 filings),
Texas the fast riser (99 all-time, but 27 vs NV 33 since 2024), Florida modest
(23), Wyoming niche (8). Lead with 'leaving Delaware?' and let client pick
NV/TX/FL; same flat-fee conversion productizes across all three.
2026-06-09 06:58:01 -05:00
justin
90bccfda32 fix(checkout): route dot-new-carrier-bundle on success page + worker pipeline
Follow-on to the trucking new-carrier slug fix:
- success page: add dot-new-carrier-bundle to DOT_SLUGS + NEW_CARRIER_SLUGS so
  the order-confirmation 'what to expect' messaging classifies it as trucking.
- pipeline_orchestrator: the trucking onboarding PIPELINE was keyed under the
  bare 'new-carrier-bundle' slug, which is the TELECOM bundle's slug (also a
  collision at the worker layer). Re-keyed to 'dot-new-carrier-bundle' so a
  trucking bundle never runs the telecom pipeline (and vice versa).
2026-06-08 23:48:56 -05:00
justin
09b32d6e75 fix(bounce-watchers): QID regex broke on out05-09/hcout transports
Both Postfix bounce watchers extracted the queue-ID with regex
postfix/[a-z]*[ (main) and postfix/[a-z0-9]*[ (hc), which only matched simple
transports like qmgr/cleanup. When the MTA started rotating sends through the
numbered warmup transports (out05-09/smtp, hcout1-3/smtp -- a transport/subprocess
name WITH a slash), the QID extraction returned empty, so status=bounced lines
never matched a tracked campaign QID and NO bounces were posted to listmonk since
~May 31. Result: dead mailboxes never got blocklisted and kept being retried,
dragging the warmup bounce rate.

Fix: regex postfix/[^[]*[ matches any transport/subprocess name up to the [pid].
Verified live on both watchers (out07/smtp and hcout1/smtp test bounces now
detected + posted).

Separately (server-side, not in repo): listmonk bounce.actions was null (no
auto-action), so even posted bounces did nothing. Set hard=1->blocklist,
complaint=1->blocklist; verified a real bounce now records + blocklists.
2026-06-08 15:02:32 -05:00
justin
b973c6c132 exclusions: block Google consumer mailboxes (gmail) from cold/warmup sends
Warmup audit (2026-06-08) found the main sending pool was eating a 37% bounce
rate, and 556 of those were Google 550-5.7.1 'likely unsolicited mail' spam
blocks -- of which gmail.com alone was 427 (77%). Google's cold-IP filter is the
strictest of the big providers and consumer gmail has the highest complaint
sensitivity, so mailing it from a warming IP is pure reputation damage.

Added GOOGLE_CONSUMER_DOMAINS (gmail.com, googlemail.com) to BLOCKED_EMAIL_DOMAINS,
which the daily trucking builder already enforces in its recipient SQL
(lower(domain) <> ALL(blocked)). Takes effect on the next nightly build.

Custom domains silently on Google Workspace are a smaller (~5%) MX-only signal,
already handled in the healthcare builder via the mx_provider flag; can be ported
to the main pool later if the residual warrants it.
2026-06-08 13:50:27 -05:00
justin
a78d60a127 hc: auto-reply for 'already revalidated' replies + permanent suppression
A lead replied with proof their Medicare revalidation was already approved (CMS
data-lag: the public Revalidation Due Date List still showed them overdue weeks
after approval). Two of these arrived same-day, so:

- Carbonio auto-reply (deployed on co.carrierone.com): created mailbox
  hc-replies@ on the info@ distribution list with a Sieve that auto-acknowledges
  'my revalidation is already complete' replies (tag + mark read + file into a
  'Reval Completed (auto-acked)' folder + on-brand reply explaining the CMS lag).
  CRITICAL: info@ is the shared reply-to for ALL campaigns (healthcare, trucking,
  telecom), so every rule is anchored to Medicare/revalidation context -- a
  trucking 'MCS-150 done, this is bogus' or telecom 'RMD done' reply does NOT
  trigger it (tested + passing). A buyer guard ('please file / how much') also
  suppresses the auto-reply so a human handles the sale.
  Carbonio 25.x Sieve quirks documented (vacation/imap4flags/body :text all
  unsupported; use reply/flag/tag/body :contains).

- Permanent suppression: new data/hc_suppress.txt do-not-contact list the warmup
  honors at import AND --prune removes from the live lists. Seeded with the two
  completed providers (Pangea Lab, Yakima Valley FWC); both also blocklisted in
  listmonk_hc and removed from lists 3 + 4.
2026-06-08 10:37:49 -05:00
justin
9cb10b18e0 feat(hc): deliverability prune -- evict newly-Google-hosted subscribers
Belt-and-suspenders for the edge you flagged: a domain already in a warmup list
could flip its MX to Google Workspace between weekly refreshes, after which it
would hard-bounce from the cold IP. The import-time guard only catches NEW adds.

- prune_holdouts(): enumerates each warmup list's subscribers, matches them
  against the FRESH master CSV (re-classified weekly), and removes any whose
  domain is now Google-hosted. DELIVERABILITY-ONLY -- it never evicts for
  audience reasons (an overdue provider drifting out of the 1-90 day window was
  a valid target when warmed; re-litigating that just wastes warmup progress).
- --prune (run alongside warming) and --prune-only (prune then exit).
- Wired into the weekly refresh cron as a --prune-only chained step, so MX is
  re-checked and holdouts removed every Monday before the weekday sends.

Verified end-to-end: with no Google domains in lists it's a 0-op; injecting a
simulated Google-flipped domain into the master, the prune correctly detects and
(in a real run) would remove it from every list it's on.
2026-06-08 03:39:56 -05:00
justin
54b92b1f06 fix(hc deliverability): MX-based Google-host exclusion during warmup
Found via live mail.log: Google-Workspace-hosted PRACTICE domains (custom
domains whose MX is aspmx.l.google.com, e.g. moosepharmacy.com, hc2kidney.com)
were getting hard 550-5.7.1 rejects from Google's cold-IP bulk filter -- exactly
the bounces that wreck a warming IP's reputation. The original google/non-google
split classified by the email's domain STRING, which can't see that a custom
domain silently uses Google Workspace; only an MX lookup reveals it (33% of our
domains, 228/689, are Google-hosted this way).

- hc_data_refresh.py: new MX classification (one lookup per unique domain via
  dnspython, cached) writes an mx_provider=google/other flag into the master and
  propagates it into the channel CSVs (auto-adding the column). --skip-mx for a
  fast status-only run.
- build_healthcare_campaigns_cron.py: warm_segment now drops mx_provider=google
  rows during warmup (HC_SKIP_GOOGLE=1 default; set 0 once IPs are warm). This is
  defense-in-depth -- correct regardless of which CSV the cron is pointed at.

Verified: today's sends (nongoogle CSV) had 0 Google bounces; the guard cuts the
Google-containing week1_verified cohort's revalidation candidates 82->8.
2026-06-08 03:32:12 -05:00
justin
feb677f6ce fix(hc warmup): only mail slightly-overdue providers (deliverability)
Mailing heavily-overdue NPIs (months/years past due) risks hitting practices
that have closed, merged, or abandoned the inbox -> hard bounces, which are the
fastest way to wreck a warming IP's reputation. The warmup now restricts the
reval_overdue selector to an inclusive [HC_OVERDUE_MIN, HC_OVERDUE_MAX] window
(default 1-90 days) and the OIG 'any' selector likewise excludes heavily-overdue
and dropped-off-list rows. On the current cohort this trims the overdue audience
178->96 and the OIG audience 399->317, holding out the stale long tail
(181-365d + 366d+). upcoming/active providers are unaffected.
2026-06-08 03:27:22 -05:00
justin
c79a7715e1 fix(hc): bugs found in self-audit of the new refresh + warmup + templates
Refresh (hc_data_refresh.py):
- CRITICAL: drop optout_ending from REFRESHED_FIELDS -- the refresh never
  computes it, so propagating it blanked the channel CSVs and would starve the
  compliance_bundle segment (whose selector IS optout_ending).
- MAJOR: only rewrite leie_excluded when OIG was actually pulled (guard was
  'not skip_oig OR not skip_sam', so a --skip-oig run blanked all exclusion
  flags). Also write 'Y' (matching the original list builder) not '1'.
- Use 'no_reval_flag' (the original vocabulary) instead of 'not_on_list' when an
  NPI drops off the reval list, and clear reval_due_date too.
- Throttle politeness: move time.sleep(0.05) above the early-continue paths so
  EVERY CMS request is spaced, not just the minority that are on the list.
- Guard blank-NPI rows (leave their status untouched instead of mislabeling).
- Master write preserves any columns beyond HEADER (no silent column drop).

Warmup cron (build_healthcare_campaigns_cron.py):
- Fix the daily-slice split: it summed to less than the budget (dropped ~2/day)
  and could OVERSHOOT on tiny totals (each 'other' floored to >=1). Now uses
  divmod for an even remainder and reclaims rounding onto the lead, so
  sum(per_seg) == total_slice exactly for every input (verified 0,1,2,7,100,300).

Templates: the non-revalidation emails rendered {{ .Subscriber.Attribs.detail }}
(a reval due date) under a 'Practice'/'Status'/'Record' label -- a wrong/
confusing personalization on a live send (esp. OIG, selector 'any'). All four
now show the practice name; 'detail' is retired from rendering (revalidation
uses reval_due_date/days_overdue directly).
2026-06-08 03:23:47 -05:00
justin
85dc3d5c3b hc refresh: propagate fresh status into the channel CSVs the cron reads
The channel CSVs (hc_warmup_nongoogle/google/week1_verified) are email-keyed
subsets of the master with extra deliverability columns (verify_ok/verify_reason).
The refresh now writes the fresh status fields (reval_due_date, days_overdue,
reval_status, leie_excluded, optout_ending, name/specialty/state) back into each,
preserving the extra columns and row membership, so a single weekly run updates
everything the campaign cron consumes -- not just the master.
2026-06-08 03:13:00 -05:00
justin
4f455475c0 hc: weekly data-refresh pipeline + multi-segment warmup cron
Two gaps closed:

1. hc_data_refresh.py (NEW): weekly source-data refresh. Re-checks every
   emailable NPI against the LIVE government sources so sends never go stale:
   - CMS Revalidation Due Date List (data.cms.gov per-NPI API; handles both ISO
     and US date formats, normalizes to MM/DD/YYYY).
   - OIG LEIE full CSV download (the NPI-bearing exclusion source).
   - SAM.gov v4 exclusions (key in .secrets/sam-api-key) -- OFF by default since
     SAM exclusions rarely carry an NPI and the full set is ~167k records; it's
     opt-in via --sam-pages. SAM's real value is the live per-name screening
     service, not a bulk NPI join.
   Writes the master CSV atomically (temp+rename). A provider who has since
   revalidated flips overdue->upcoming/not_on_list, so we stop nagging them.

2. build_healthcare_campaigns_cron.py: was revalidation-only (one hardcoded
   list/campaign/CSV/template). Now multi-segment: imports SEGMENTS from the
   single-source-of-truth registry, warms ALL five programs in parallel, each
   with its own list, dated campaign, and per-segment import-state file (so
   dedup is per-segment). A  per segment maps master-CSV rows to the
   right program (reval_overdue / reval_upcoming / leie_or_deactivated /
   optout_ending / any). Daily ramp slice is split across segments (revalidation
   leads at 50%, rest share the remainder) so every program collects engagement
   data while the IPs warm. Back-compat: seeds revalidation import-state from the
   legacy hc_imported_emails.txt once.
2026-06-08 03:06:29 -05:00
justin
0b0ff9d311 hc campaigns: make the HTML templates the single source of truth
build_healthcare_campaigns.py had a divergent inline HTML generator (old teal-
header + yellow issue-box layout, missing the official-record card and the per-
segment verify-it-yourself blocks) that nobody called -- the live cron reads the
hand-tuned data/hc_campaigns/hc_*.html files directly. Removed the dead
generator + cmd_render(); render() now READS the canonical template file so the
files can't drift from a parallel generator. SEGMENTS is now a metadata registry
(subject, template, cta_path, price, list_name, campaign_name, selector) that the
multi-segment cron will consume. Verified --list and that send-test still reads
the real bodies.
2026-06-08 02:57:49 -05:00
justin
483f185861 feat(healthcare): prove revalidation is real via official CMS data + self-verify
Skepticism ("is this even real?") is the top objection. The data IS accurate
(verified our subscribers' NPIs match the official CMS Revalidation Due Date List
exactly), so this is a credibility-presentation fix:

1. Email: replace the plain detail row with an "Official record - CMS Medicare
   Revalidation Due Date List" card (NPI, legal name, due date, days overdue)
   plus a "Verify on CMS.gov" button. Clearly labeled as our presentation of
   public CMS data, not a CMS screenshot (no impersonation).
2. API: npi/lookup now pulls the revalidation due date LIVE from the public CMS
   dataset (data.cms.gov) instead of the empty local table, and returns a
   revalidation{ due_date, source, cms_legal_name, verify_url } proof object.
3. Tool: /tools/npi-compliance-check shows a live "official record" card with a
   self-verify link when CMS returns a due date.

Builder now stores reval_due_date/days_overdue as separate attribs for the card
(existing 194 subscribers backfilled from their detail string).
2026-06-07 23:54:01 -05:00
justin
a732423f04 fix(deploy): port catalog generator + drift-check to Python (prod has no node)
The host-side generator ran 'node scripts/*.mjs' in deploy.sh, but the prod box
has python3 only (no node outside containers), so the site deploy failed at the
generation step. Reimplemented both in Python (byte-identical output, verified
via diff against the node version; matches scripts/sync_nav.py tooling).
2026-06-07 19:26:01 -05:00
justin
09e21a6c97 refactor(pricing): single source of truth for the service catalog
Previously two hand-maintained price lists (API COMPLIANCE_SERVICES + site
SERVICE_META) drifted apart -- that is how the healthcare +$200 raise charged
$399 while displaying $599. Eliminate the drift class entirely:

- Move the catalog to api/src/service-catalog.ts (the authority; checkout
  charges from it). compliance-orders.ts imports it.
- scripts/gen-service-catalog.mjs generates site/src/lib/service-catalog.generated.ts
  from the API source. intake_manifest.ts re-exports SERVICE_META from it, so all
  ~60 site pages keep working unchanged.
- deploy.sh regenerates + drift-checks before building (site build context is
  ./site only and cannot read ../api, so generation happens host-side).
- scripts/check-service-catalog-drift.mjs fails the build if the generated file
  ever diverges from the API (verified: passes aligned, fails on mismatch).

To change a price now, edit ONE file: api/src/service-catalog.ts.
2026-06-07 19:11:34 -05:00
justin
a4d67bcf9b hc-warmup: add list-hygiene script (drop undeliverable addrs, smtp_valid first)
Keeps only deliverable addresses (smtp_valid + catch_all_detected), drops
mx_unreachable + smtp_unknown rejects that defer/bounce and damage the warming
HC IP reputation. Sorts smtp_valid first so the daily slice hits verified
mailboxes first. Used to clean hc_warmup_nongoogle.csv (501 -> 399 rows).
2026-06-07 18:08:36 -05:00