Customer portal login previously checked a bcrypt customers.password_hash
in Postgres, while portal.performancewest.net validated against ERPNext —
two stores that drifted (the Paul Wilson lockout). Consolidate on ERPNext:
- erpnext-client: add verifyWebsiteUserPassword() — delegates the credential
check to Frappe /api/method/login (Host header = site name; 200 = ok, 401 = bad).
- portal-auth /login: verify against ERPNext, then mint the pw_customer cookie.
- portal-auth /register: create+set the ERPNext password (authority) and upsert
a password-less customers profile row; takeover guard still honors any legacy
PG password until the column is dropped.
- portal-auth /reset-password + /forgot-password: write the new password to
ERPNext; forgot-password now also works for ERPNext-only users (creates the
PG profile row on demand).
- Legacy customers with only a PG bcrypt password must reset via forgot-password.
- checkout: refresh the stale comment (customers row is now a profile, no pw).
Build + typecheck green.
Root cause of recurring 'Password not found for Email Account Performance West
Outgoing': the account was shipped as a fixture with awaiting_password=1 and no
password. Email Account SMTP passwords are encrypted per-site and cannot live in
a fixture, so every `bench migrate` reimported the fixture and re-broke
outgoing mail (login notifications, password resets, welcome emails).
- Remove the Email Account fixture (it cannot carry the encrypted secret).
- Add email_account_sync.sync_outgoing_password: idempotent, exception-safe
upsert that reconciles the account + password from SMTP_* env and clears
awaiting_password.
- Wire it to after_migrate (repairs at end of every deploy/migrate, right after
fixtures import) and the daily scheduler (heals out-of-band restore/restart
drift).
- Pass SMTP_* into the erpnext + erpnext-scheduler containers so the sync has
the secret (they previously had no SMTP env).
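A minimal sketch of the idempotent reconcile step, using a plain dict as a stand-in for the Email Account record. The env names SMTP_HOST / SMTP_PASSWORD and the field names are assumptions (the source only says SMTP_*), and the real sync writes through Frappe rather than a dict:

```python
from typing import Mapping, MutableMapping


def sync_outgoing_password(
    account: MutableMapping[str, object], env: Mapping[str, str]
) -> bool:
    """Reconcile the account from SMTP_* env; return True if anything changed."""
    password = env.get("SMTP_PASSWORD", "")  # assumed variable name
    if not password:
        # No secret available: leave the account alone instead of blanking it.
        return False
    desired = {
        "smtp_server": env.get("SMTP_HOST", account.get("smtp_server")),  # assumed
        "password": password,
        "awaiting_password": 0,
    }
    changed = any(account.get(key) != value for key, value in desired.items())
    if changed:
        account.update(desired)
    return changed
```

Running it twice with the same env changes nothing the second time, which is what makes it safe to call from both after_migrate and the daily scheduler.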
The verifier returned (True, 'mx_unreachable') when it couldn't complete a port-25
probe to ANY MX — marking 438,163 addresses email_verified=TRUE. But these are NOT
dead: they're dominated by Comcast (13.7k), AT&T/SBCGlobal (13.5k), Verizon, Cox,
Charter, Frontier, etc. — major ISPs that deliberately tarpit/refuse probes from
unknown IPs. Confirmed from prod: the Comcast MX connects and returns 220. A failed
probe does not mean the address is undeliverable.
Fix: return (False, 'mx_probe_blocked') — MX exists, deliverability UNKNOWN, must
be confirmed by a real send. Excluded from PW campaigns; prime burner-verification
target (burner_list_verify upgrades it to send_confirmed on delivery). Existing
438,163 mx_unreachable rows reclassified in prod to mx_probe_blocked / verified=FALSE.
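Roughly, the corrected classification looks like the sketch below. The function name, its inputs, and the 'no_mx' / 'smtp_rejected' labels are hypothetical; only 'mx_probe_blocked' and 'smtp_valid' come from the verifier:

```python
from typing import List, Optional, Tuple


def classify_probe(mx_hosts: List[str], rcpt_code: Optional[int]) -> Tuple[bool, str]:
    """Map a probe outcome to (email_verified, email_verify_result)."""
    if not mx_hosts:
        return (False, "no_mx")  # hypothetical label
    if rcpt_code is None:
        # MX exists but no probe completed: deliverability is UNKNOWN,
        # so never report the address as verified.
        return (False, "mx_probe_blocked")
    if 200 <= rcpt_code < 300:
        return (True, "smtp_valid")
    return (False, "smtp_rejected")  # hypothetical label
```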
The smtp_valid pool is only ~3k unsent — too small to sustain campaigns. SMTP
probing can't confirm catch-all/mx_unreachable deliverability; only a REAL send
can. burner_list_verify.py reconciles a verification send from a DISPOSABLE burner
domain (isolated from PW/carrierone reputation):
- hard bounce -> fmcsa_carriers.email_verify_result='hard_bounced' (excluded)
- delivered -> 'send_confirmed' (proven deliverable; PW campaigns send to it)
It tails the burner MTA mail.log (reuses bounce-watcher's status= pattern) and
writes back idempotently. The PW trucking filter now treats smtp_valid +
send_confirmed as sendable. docs/campaign-deliverability-plan.md captures the full
diagnosis, the burner design, and CAN-SPAM guardrails.
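A sketch of the log-line classification, assuming the burner MTA writes Postfix-style delivery lines (to=<...>, dsn=, status=). The regex and function name are illustrative, not the bounce-watcher's actual pattern:

```python
import re
from typing import Optional, Tuple

# Assumed Postfix-style delivery line: to=<rcpt>, ..., dsn=X.Y.Z, status=word (...)
LINE_RE = re.compile(
    r"to=<(?P<rcpt>[^>]+)>.*?\bdsn=(?P<dsn>\d\.\d+\.\d+).*?\bstatus=(?P<status>\w+)"
)


def classify_log_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (recipient, email_verify_result) or None if the line decides nothing."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    rcpt = match.group("rcpt").lower()
    status, dsn = match.group("status"), match.group("dsn")
    if status == "sent":
        return (rcpt, "send_confirmed")
    if status == "bounced" and dsn.startswith("5."):
        return (rcpt, "hard_bounced")
    # Deferred or soft failures prove nothing either way; leave the row as is.
    return None
```

Because each line maps to at most one final result per recipient, replaying the same log writes the same values, which keeps the writeback idempotent.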
Remaining (needs a domain + isolated MTA identity — operator/infra decision):
stand up the burner domain, the verification-send worker, and a writeback cron.
Root cause of zero conversions since Jun 9 + the Gmail/Outlook block storm:
the send filter was '(email_verified IS TRUE OR result IN ...)'. The verifier
sets email_verified=TRUE optimistically for mx_unreachable (domain exists but
its mail server never answered the RCPT probe) — 438,163 such rows. Those HARD
BOUNCE on send, producing ~1,100 bounces/day (~47% rate) and blocklisting half
the 120k subscriber base, so real prospects never saw the offer.
Fix: key the send filter ONLY off email_verify_result, never the broken boolean.
Recovery mode (default): send only 'smtp_valid' to drive bounce rate to ~0 and
rebuild reputation; set CAMPAIGN_INCLUDE_CATCH_ALL=1 to re-add catch-all domains
once recovered. Mirrors the healthcare list-cleaning approach (HC bounces ~2-3%,
which proves the fix). Note: only ~3k smtp_valid unsent remain — list growth via
real-send bounce verification (separate burner domain) is the follow-up.
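In sketch form, the recovery-mode filter reduces to a set-membership test on email_verify_result. The 'catch_all' label and the function names are assumptions; the real filter lives in the campaign SQL:

```python
import os
from typing import FrozenSet, Mapping, Optional


def sendable_results(env: Optional[Mapping[str, str]] = None) -> FrozenSet[str]:
    """Results the campaign may send to; recovery mode is the default."""
    env = os.environ if env is None else env
    allowed = {"smtp_valid"}
    if env.get("CAMPAIGN_INCLUDE_CATCH_ALL") == "1":
        allowed.add("catch_all")  # assumed label for catch-all domains
    return frozenset(allowed)


def is_sendable(row: Mapping[str, object], env: Optional[Mapping[str, str]] = None) -> bool:
    # email_verified is deliberately ignored: only the result string is trusted.
    return row.get("email_verify_result") in sendable_results(env)
```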
The MCS-150 intake-completion email linked customers to /order/dot-compliance,
which is the sales/checkout page -- it ignores ?order= and asks the customer to
re-pick services and pay again, so they 'cannot enter any data' (Paul Wilson's
report). Link to the per-service intake wizard /order/<slug>?order=... instead,
which loads the paid order, pre-fills from the FMCSA census, and drops payment.
Also add a Trailers field to the DOT intake fleet section and wire it through to
the MCS-150 PDF Q26 trailer row, so carriers can update trucks AND trailers.
- Added a line asking them to call their insurance agent to confirm Form E
ability before clicking yes/no, so we pick the right path first time.
- Reply-To now routes to info@performancewest.net (monitored), overridable via
SC_COC_REPLY_TO env.
deploy.sh used 'git pull origin main', which silently ABORTS when the tracked
tree is dirty (generated site files, or any drift), stranding new commits on an
old checkout — this bit us twice today (prod stuck at b125d46 while origin had
the COC work). Replaced with:
git fetch origin main && git reset --hard origin/main
The deploy box is a pure mirror of origin (all real changes land via git), so a
hard reset is safe and untracked files (data/*, .secrets/) are preserved. Added
a post-reset check that HEAD == origin/main, exiting 1 loudly otherwise, so
a strand can never again be masked by a '| tail' in the caller.
The Dockerfile copies form PDFs explicitly by name; the SC COC template was
missing, so fill_sc_coc() would raise FileNotFoundError in the container. Added it.
Routes SC intrastate-authority orders to the real SCDMV COC product instead of a
PSC certificate (which doesn't apply to property carriers):
- sc_coc_filing.py: emails the carrier a one-click yes/no — does your insurer
have / can they file a Form E (SC intrastate liability, $750k or $300k by
GVWR) with SCDMV? Records the answer; builds the filled COC package.
- state_trucking._handle_sc_coc_gate: SC intrastate gate —
no answer -> email the question once, HOLD
answered no -> broker referral opened, HOLD (ops todo)
answered yes -> proceed to bill the exact $25 SCDMV COC fee (at cost) + file
- API POST /compliance-orders/:id/sc-insurance: records yes/no in intake_data
(no schema change); NO opens an insurance_lead broker-referral ticket +
Telegram; YES re-dispatches the worker to bill the $25 + file.
- site/order/sc-insurance: customer one-click yes/no page (auto-submits when
the email links straight to ?have=yes|no).
Non-SC intrastate still uses the PSC/PUC email path or a manual todo.
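The SC gate amounts to a small three-way decision. A sketch, with hypothetical argument names and action strings (the real handler reads the answer from intake_data):

```python
from typing import Optional


def sc_coc_gate(answer: Optional[str], question_sent: bool) -> str:
    """Decide the next step for an SC intrastate order from the Form E answer."""
    if answer is None:
        # Ask exactly once, then keep holding until the customer replies.
        return "hold" if question_sent else "email_question_and_hold"
    if answer == "no":
        return "open_broker_referral_and_hold"
    if answer == "yes":
        return "bill_coc_fee_and_file"
    raise ValueError(f"unexpected answer: {answer!r}")
```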
The custom_incorporation_province field had default='BC', which stamped 'BC'
on EVERY Sales Order (US trucking, formation, compliance) — not just Canadian
CRTC orders. This leaked a meaningless 'BC' onto e.g. an SC scrap-metal carrier's
order. Removed the default and added a blank option so it's empty unless it's an
actual Canadian incorporation. Existing non-canada_crtc orders cleared in prod
via db_set (13 fixed; the 2 real canada_crtc orders keep BC).
SC for-hire PROPERTY carriers (not passenger/HHG/hazwaste) register intrastate
via the SCDMV Certificate of Compliance (COC), not a PSC certificate. This adds:
- sc_coc_pdf_filler.fill_sc_coc(): fills the official SCDMV Form COC from
intake (business name, officers, physical/mailing address, phone), picks
New vs Renewal, and stamps the coverage class (E-L low-value / E-LC).
Field names in the source PDF are auto-generated + offset from their labels;
mapped here by verified on-page geometry. Verified by render.
- suggest_coverage_class(): E-L for low-value cargo (scrap/dump/aggregate),
else E-LC (safer default).
- gov_fee: SC intrastate fee corrected from $0 placeholder to the real $25
COC new-application fee (renewals $0), billed at cost.
The carrier's INSURER files the Form E (liability) + Form H (cargo, E-LC only)
directly with SCDMV; we collect the COC app + $25 and submit it.
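One way the coverage-class suggestion could work, as a sketch. The keyword set and the rule that every listed cargo type must be low-value are assumptions, not the shipped logic:

```python
from typing import Iterable

# Assumed keywords for low-value cargo, taken from the examples above.
LOW_VALUE_CARGO = frozenset({"scrap", "dump", "aggregate"})


def suggest_coverage_class(cargo_types: Iterable[str]) -> str:
    """Return 'E-L' only when all cargo is low-value; otherwise the safer 'E-LC'."""
    cargo = {c.strip().lower() for c in cargo_types if c.strip()}
    if cargo and cargo <= LOW_VALUE_CARGO:
        return "E-L"
    return "E-LC"
```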
Non-attorney 'Service' filer account registered under Performance West
(filings@performancewest.net). Credentials live only in the server .env
(blank default in template, never committed). Consumed by the upcoming SC
intrastate Playwright e-filer.
Intrastate operating authority is state-specific + application-based like IRP, so
it reuses the same email/POA + invoice-reconciliation flow:
- intrastate_filing.send_intrastate_submission: emails the state PSC/PUC the
authority application with the signed POA attached (subject tag [PW-ISA CO-..]),
reusing irp_filing's MinIO download + census enrich helpers.
- The shared poller (irp_invoice_poller) now matches BOTH [PW-IRP] and [PW-ISA]
tags, parses the fee, Telegram-alerts, and bills the customer the exact amount
with the correct service slug.
- state_trucking gov-fee gate routes intrastate-authority to the PSC/PUC email
path; if no submission email is configured for the base state it falls back
to a manual todo (safe default — no emailing guessed agency addresses).
Per-state ISA_<ST>_EMAIL env (blank until the exact agency address is verified).
SC/GA/TX scaffolded. Customer still only sees an exact-fee payment link; you only
approve the final filing.
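A sketch of how the shared poller might recognize both subject tags. The regex, the case-insensitive match, and the CO-<alphanumeric> order format are assumptions based on the tags shown above:

```python
import re
from typing import Optional, Tuple

# Matches [PW-IRP CO-XXXX] and [PW-ISA CO-XXXX] anywhere in a subject line.
TAG_RE = re.compile(
    r"\[PW-(?P<kind>IRP|ISA)\s+(?P<order>CO-[A-Z0-9]+)\]", re.IGNORECASE
)


def parse_subject_tag(subject: str) -> Optional[Tuple[str, str]]:
    """Return (kind, order_number) for a tagged reply, else None."""
    match = TAG_RE.search(subject)
    if match is None:
        return None
    return (match.group("kind").upper(), match.group("order").upper())
```

The kind tells the poller which service slug to bill under; untagged mail is ignored.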
deploy.sh ran sync_nav.py / gen-service-catalog.py which dirty site/public +
site/src in place; that made 'git pull' abort, so recent commits never reached
prod until pulled manually. Reset those generated paths before pulling so deploys
always fast-forward. Also document the IRP POA signer-name/title follow-up.
- send_irp_submission now REQUIRES and ATTACHES the signed Power of Attorney PDF
(downloaded from MinIO) — the state won't act on a third-party filing without
it, and 'on file, available on request' stalls the request. If the POA isn't
available we don't email and fall back to a manual todo.
- Backfill missing legal_name + registered address from the FMCSA census so the
submission isn't sent with a blank address (root cause of the empty
'Legal/registered address: , ,' line). Customer-supplied values win.
- state_trucking passes signed_auth_key through to the IRP submitter.
- Fix 'Object of type date is not JSON serializable' when creating the admin
todo (json.dumps(..., default=str)) — broke the intrastate (bash-fee) path.
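The serialization failure and its fix can be reproduced in a few lines. The payload shape and helper name here are made up for illustration:

```python
import json
from datetime import date


def todo_to_json(payload: dict) -> str:
    """Serialize a todo payload; non-JSON types such as date fall back to str()."""
    return json.dumps(payload, default=str)


def plain_dumps_error(payload: dict) -> str:
    """Return the error message plain json.dumps raises, or '' if it succeeds."""
    try:
        json.dumps(payload)
    except TypeError as exc:
        return str(exc)
    return ""
```

With default=str a date is written as its ISO string, so the todo is created instead of raising.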
The gov-fee email now lists exactly what the amount covers (full breakdown) so
the customer can check it for accuracy, with two clear actions: a ✅ pay link and
a ❓ 'something looks wrong' link to /order/dispute.
New /order/dispute page shows the fee breakdown and lets the customer describe
what's wrong; it opens an 'issue' support ticket pre-tagged with the order
(amount + label + their note) via /api/v1/tickets, so ops corrects the fee
before any payment is taken. The /order/pay page also shows the itemized
breakdown and a dispute link.
- gov_fee: add AGENCY_PROCESSING_FEE (per-service card/convenience fee passed
through so the customer pays the true all-in cost); estimate_gov_fee now folds
it into the billed total. IFTA/intrastate/UCR fees are published/near-exact.
- IRP fees can't be looked up — only the base state computes them. New
irp_filing.py: emails the base-state IRP unit a Schedule A/B request (Reply-To
the IRP filings mailbox, [PW-IRP CO-...] subject tag), and a 15-min cron
(irp_invoice_poller) scans the mailbox for the state's invoice reply, parses
the exact apportioned fee, Telegram-alerts you, and bills the customer the
EXACT amount via a gov-fee child order + payment link. Then it proceeds to
ready_to_file for your final approval.
- state_trucking gov-fee gate now routes IRP to the email/invoice path and
IFTA/intrastate to immediate exact-fee billing.
- Mailbox is configurable (IRP_FILINGS_IMAP_* in app.env.j2); falls back to
OPS_IMAP_* filtered by the [PW-IRP] tag until a dedicated mailbox exists.
Telegram alerts fire on IRP submission sent, invoice received (billed), and
un-parseable replies (so you can read + enter the fee manually).
At-cost services (IRP/IFTA/intrastate) only collected our service fee at
checkout; the variable state fee was never billed, so orders stalled at
authorization_signed and the filing card would have had to front large IRP fees.
New end-to-end, hands-off flow (you only approve the final filing):
1. After authorization is signed, state_trucking auto-estimates the gov fee
from intake (base/op states, power units, weight) via gov_fee.estimate_gov_fee.
2. Creates a CHILD compliance order (CG-..., service_fee=0, gov_fee=estimate,
parent_order_number set, migration 099) that flows through the EXISTING
checkout/payment/webhook machinery.
3. Emails the customer a payment link to /order/pay (new self-contained page)
showing every method with correct surcharges — ACH 0% (Stripe 0.8%/ cap
absorbed, no GoCardless needed), card/PayPal 3%, Klarna 6%, crypto 0%.
4. Order holds at awaiting_government_fee_approval until paid.
5. On payment, handlePaymentComplete detects the child (parent_order_number)
and re-dispatches the PARENT with gov_fee_paid=true, which proceeds to
prepare + queue the filing and stops at ready_to_file for your approval.
IRP fees are estimates billed at cost (refund overage / rebill shortfall); IFTA
decals + most intrastate fees are near-exact. Tunable via env.
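The per-method totals on /order/pay can be sketched as below. The rates come from step 3; applying the surcharge multiplicatively and rounding half-up to cents are assumptions:

```python
from decimal import Decimal, ROUND_HALF_UP

# Surcharge rate per payment method, as listed in step 3.
SURCHARGE = {
    "ach": Decimal("0"),
    "card": Decimal("0.03"),
    "paypal": Decimal("0.03"),
    "klarna": Decimal("0.06"),
    "crypto": Decimal("0"),
}


def total_due(gov_fee: str, method: str) -> Decimal:
    """Gov fee plus the method's surcharge, rounded to cents."""
    rate = SURCHARGE[method]  # KeyError for an unknown method is intentional
    amount = Decimal(gov_fee) * (Decimal("1") + rate)
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```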
relay_integration.py line 34 called logging.getenv (no such attr), which threw
AttributeError on import -> load_card_from_erpnext() crashed for every caller
(BOC-3 and now UCR filing payment). Drop the bogus line; LOG is set correctly on
the next line. The bug has been present since the initial commit.
Adds scripts/workers/services/ucr_playwright.py — a UCR.gov National Registration
System automation that, given a USDOT + fleet size, runs the register/pay flow,
pays the federal UCR fee with the matched PW filing card (Relay/Stripe Issuing),
and captures a confirmation screenshot + number. Conventions match
boc3_playwright / fmcsa_web_submitter: dev-mode dry-run guard, undetected
(patchright) browser, CAPTCHA detection, screenshot evidence, dataclass result.
Safety: verifies the displayed fee against the federal schedule before paying and
refuses to auto-charge a surprising amount (UCR_MAX_AUTO_FEE_USD) — falls back to
manual filing instead.
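The pre-payment safety check is roughly the following. The function name and return shape are hypothetical, and the expected fee would come from the federal schedule rather than being passed in by hand:

```python
from decimal import Decimal
from typing import Tuple


def safe_to_autopay(
    displayed_fee: Decimal, expected_fee: Decimal, max_auto_fee: Decimal
) -> Tuple[bool, str]:
    """Allow the automated charge only for an expected, bounded amount."""
    if displayed_fee != expected_fee:
        return (False, "fee_mismatch")
    if displayed_fee > max_auto_fee:  # max_auto_fee mirrors UCR_MAX_AUTO_FEE_USD
        return (False, "over_auto_limit")
    return (True, "ok")
```

Any False result means the order falls back to manual filing rather than charging the card.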
Wires it into MCS150UpdateHandler: when an approved (admin_approved) order has
slug ucr-registration, _file_ucr_registration runs the automation, uploads the
confirmation screenshot to MinIO, records filing_status + confirmation, and sets
fulfillment_status=completed on success. On CAPTCHA / fee-mismatch / failure it
reverts to ready_to_file with a high-priority 'file manually' todo. This replaces
the old behavior where approving a UCR just sat at authorization_signed.
Admin-assisted services (UCR, MC authority, etc.) have no automated submission,
so approving them only flips to authorization_signed and then sits there -- there
was no way to advance to completed. Add POST /mark-filed (filed_waiting_state |
completed, optional confirmation #, transactional + audit-logged) and drawer
buttons 'Mark as filed (waiting on agency)' / 'Mark completed' shown for orders in
authorization_signed / ready_to_file / filed_waiting_state. Confirmation number
is recorded into intake_data.filing_status.manual_confirmation.
UCR (and other admin-assisted DOT services) route through MCS150UpdateHandler,
which hardcoded 'MCS-150' and self.SERVICE_SLUG in the admin todo, the Telegram
fulfillment notification, and the customer status email -- so approving Paul's
UCR produced an 'MCS-150 Review / mcs150-update / PDF: not generated' alert and
an 'MCS-150 biennial update' customer email, both wrong.
Add SERVICE_DISPLAY_NAMES + _service_label(slug); use the actual slug everywhere.
Admin-assisted services now show 'UCR Annual Registration — FILE NOW ... file
manually on the portal (no auto-generated form)' instead of MCS-150/PDF wording,
and the customer email names the right service.
- Documents now flag is_image and the drawer renders screenshots / confirmation
images as inline clickable thumbnails (click to open full size); PDFs keep the
View link. Evidence keys are labeled (Filing confirmation screenshot, etc.),
the worker-temp screenshot_path (not a MinIO key) is dropped in favor of the
durable evidence copy, and non-file evidence (fax_log_id) is skipped.
- Wrap approve's status-update + audit-insert in a transaction so a failure can
no longer leave an order out of ready_to_file without dispatching (the earlier
audit CHECK violation did exactly that to Paul's UCR; it has been reset).
The admin compliance-orders approve/re-arm actions write order_audit_log rows
with order_type='compliance', but the CHECK constraint (from migration 004)
only allowed formation/service/quote -- so every approve failed with a 500
('Approve failed.'). Expand the constraint to include compliance + compliance_batch.
Admin-assisted DOT services (UCR, BOC-3) routed to this handler were marked
ready_to_file with whatever intake existed -- e.g. a UCR with only a DOT number,
missing legal name / state / fleet-size bracket (which sets the UCR fee tier).
That made the admin 'ready to file' status misleading and the order unfileable.
Now, for ADMIN_ASSISTED_REQUIRED services we first enrich intake from the FMCSA
census (legal_name, address_state, power_units) + the order email, and derive
the UCR fleet_size_bracket from power units (UCR_FLEET_BRACKETS). If every
required field is then present we persist it and mark intake validated (falls
through to the admin review gate -> ready_to_file). If anything is still
missing, we persist what we have, set fulfillment_status=awaiting_intake, and
email the customer to complete intake -- instead of falsely showing ready_to_file.
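Deriving the bracket from power units is a simple range lookup. A sketch using the published UCR fleet tiers; the label strings and function name are assumptions and need not match UCR_FLEET_BRACKETS:

```python
import bisect

# Inclusive upper bound of each tier except the last, open-ended one.
_UPPER_BOUNDS = [2, 5, 20, 100, 1000]
_LABELS = ["0-2", "3-5", "6-20", "21-100", "101-1000", "1001+"]


def fleet_size_bracket(power_units: int) -> str:
    """Map a census power-unit count to its UCR fleet-size bracket."""
    if power_units < 0:
        raise ValueError("power_units must be non-negative")
    # bisect_left finds the first tier whose upper bound is >= power_units.
    return _LABELS[bisect.bisect_left(_UPPER_BOUNDS, power_units)]
```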
Filter the documents list to objects that exist in storage, so stray keys (a
template pdf_minio_path, or a phantom mcs150 esign_records row on a UCR order
from the shared remediation pipeline) no longer surface as dead rows. The UI
drops the now-unreachable 'not generated yet' branch.
The dot-compliance-remediation pipeline seeds filing_status.pdf_minio_path on
every order in a batch, but only MCS-150-producing slugs (mcs150-update,
dot-registration, usdot-reactivation, dot-full-compliance) ever generate it.
For admin-assisted services like UCR it was a phantom 'Prepared filing PDF /
not generated yet' row. Gate the prepared-filing artifacts on FORM_PRODUCING_SLUGS
(mirrors the worker's MCS150_FORM_SLUGS) and give the empty state a clearer
explanation.
Paul Wilson's UCR (CO-FE07212A) sat at fulfillment_status=ready_to_file with
intake_data_validated=false, so the Approve & File button would have dispatched
it for government submission with incomplete intake and no document to review.
Backend: /approve now refuses an order whose intake_data_validated is false
unless {force:true} is passed (409 code=intake_incomplete); the override is
recorded in order_audit_log. The fulfillment_status=ready_to_file requirement
is unchanged, so awaiting_intake orders (e.g. Mitchell's MCS-150s) still 409.
UI: the drawer shows an amber 'intake not complete' warning above the approve
button, and approving an intake-incomplete order triggers an explicit
override confirmation before sending force=true.
The DB can record a pdf_minio_path before the object is uploaded (e.g. a
prepared-filing path written for an order whose prep never completed -- Paul
Wilson / Mark Adams MCS-150s). The documents list now HEAD-checks each key and
returns an exists flag; the UI shows 'not generated yet' instead of a dead View
button, and the stream endpoint returns a clean 404 for a missing object.
Adds a Documents section to the compliance-order detail drawer so you can
review the actual filing PDFs before approving an order:
GET /api/v1/admin/compliance-orders/:id/documents list viewable objects
GET /api/v1/admin/compliance-orders/:id/document?key=&token= stream one
Key discovery pulls from esign_records (unsigned + signed docs per order),
intake_data.filing_status (pdf_minio_path, attested_pdf, evidence/*), and the
order's engagement_letter / rmd_packet columns.
Rather than hand out presigned URLs (MinIO's public host is IP-allowlisted to a
few office IPs, so links break elsewhere), the API streams the object through
itself from internal minio:9000, gated by the admin JWT. The stream endpoint
accepts the token via ?token= (new middleware requireAdminQueryOrHeader) so a
PDF opens in a new tab, and refuses any key that isn't one of the order's own
documents.
The shared security snippet blocked any path matching /(admin|administrator|
login.action|struts) with 'return 444', which drops the connection. That bare
'admin' token also matched our own operations dashboard at /admin and the new
/admin/compliance-orders, so the browser showed 'This site can't be reached'.
Dropped the bare 'admin' token; administrator/login.action/struts stay blocked.
Applied live on prod (sudo edit + nginx reload); this updates the source of
truth so the ansible nginx role won't reintroduce it.
The admin SPA only managed formation_orders; compliance service orders
(telecom/DOT/healthcare) had no admin surface, so you couldn't see what was
paid, what was stuck on intake, or approve a prepared filing for submission.
API (api/src/routes/admin.ts), all requireAdmin:
GET /api/v1/admin/compliance-orders list, grouped by batch, filters
GET /api/v1/admin/compliance-orders/stats queue overview counts
GET /api/v1/admin/compliance-orders/:id full detail + audit log
POST /api/v1/admin/compliance-orders/:id/approve approve ready_to_file + dispatch worker
POST /api/v1/admin/compliance-orders/:id/rearm-intake clear reminder stamp so daily nudge resumes
UI: new static page /admin/compliance-orders/ (self-contained, CSP-safe inline
CSS, no external JS framework) reusing the existing pw_admin_token session.
Cards group multi-service batches, flag paid+intake-incomplete in red, show
reminder counts, and expose Approve & Re-arm buttons. Linked from the main
/admin top bar. Every approve/re-arm writes an order_audit_log entry.
Companion to the worker MinIO-retry fix. Makes the worker auto-recover from
process death (crash, manual kill, missed boot trigger), not just MinIO outages.
- start_worker.bat: propagate Python's exit code (exit /b %rc%) so Task
Scheduler can actually detect a failed run (it previously always exited 0).
- reconfigure_task.ps1 (new): re-registers PW-DocserverWorker with
RestartCount=99 / 1-min interval, StartWhenAvailable, and two triggers —
AtStartup plus a 5-min repeating trigger with MultipleInstances=IgnoreNew, so
a dead worker relaunches within ~5 min and never double-runs. Idempotent.
- install.ps1: same self-healing settings for fresh installs.
- Verified on the box: killed the worker -> task relaunched it; firing again
while running stayed at one instance.
Docs updated to match reality:
- docserver/README.md: new 'Reliability / self-healing' section.
- document-generation.md: corrected the stale 'Flask DocServer :5050 / HTTP'
description to the actual MinIO outbound-only transport.
- e2e-test-plan.md: removed the outdated 'Word COM fails under SYSTEM / requires
RDP after every reboot' limitation; now self-healing under SYSTEM session 0.
- infrastructure.md: fixed VM spec (Win Server 2019, Word 16.0, Python 3.13,
SSH port 22422) + self-healing note.
- architecture.md / formation-system.md: trigger + self-healing details.
The worker called sys.exit(1) on any MinIO connection error, so a single
transient 502 from MinIO/its reverse proxy left it dead until a manual restart
or reboot (its scheduled task only runs at system startup). It had been dead
~5 weeks after a 502 on May 9.
- _connect_minio_forever(): retry the initial MinIO connect indefinitely with
capped exponential backoff (5s..120s) instead of exiting.
- main loop: wrap each poll cycle; on any error, log + rebuild the client and
keep polling rather than crashing.
Verified on the box: normal DOCX->PDF still works (~11s e2e); a bogus endpoint
now retries forever without ever calling sys.exit (was the exact May-9 failure).
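A sketch of the retry shape, with the connect and sleep functions injected so it can be exercised without MinIO. The doubling factor is an assumption; the source only fixes the 5s..120s bounds:

```python
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 5.0, cap: float = 120.0) -> Iterator[float]:
    """Yield base, 2*base, 4*base, ... and stay at cap once it is reached."""
    delay = base
    while True:
        yield delay
        delay = min(delay * 2, cap)


def connect_forever(
    connect: Callable[[], T],
    sleep: Callable[[float], None],
    delays: Optional[Iterator[float]] = None,
) -> T:
    """Keep calling connect() until it succeeds; never exit the process."""
    if delays is None:
        delays = backoff_delays()
    for delay in delays:
        try:
            return connect()
        except Exception:
            sleep(delay)
    raise RuntimeError("delay iterator exhausted")  # unreachable with the default
```

In the worker, sleep would be time.sleep and connect would build and ping the MinIO client.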
Two of our three real paid customers (Mark Adams / mark@adamslumber.com and
Paul Wilson / synthetic@pipeline.com) never completed intake. They each hit the
old hard cap of 10 daily reminders (last sent Jun 12 / Jun 13) and the worker
then went permanently silent -- the last two daily runs reminded 0 orders even
though both still owe us intake on paid work. (The third, Mitchell Allen /
mitchell@allenscrapmetal.com, did complete intake; his orders are dispatched.)
Replace the dead-stop cap with a two-phase cadence:
- daily for the first DAILY_PHASE (10) nudges -- the initial burst,
- then weekly (WEEKLY_INTERVAL_DAYS) up to an absolute MAX_REMINDERS (60),
so a paid order with missing intake keeps getting a gentle nudge instead of
being dropped. Tunable via INTAKE_REMINDER_DAILY_PHASE /
INTAKE_REMINDER_WEEKLY_INTERVAL_DAYS / INTAKE_REMINDER_MAX. Clearing
intake_reminder_last_at re-arms an order immediately (documented in the
module docstring).
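The two-phase cadence can be sketched as one predicate. The 7-day weekly interval and the function signature are assumptions; the 10 and 60 limits are the defaults named above:

```python
from datetime import datetime, timedelta
from typing import Optional

DAILY_PHASE = 10          # INTAKE_REMINDER_DAILY_PHASE
WEEKLY_INTERVAL_DAYS = 7  # INTAKE_REMINDER_WEEKLY_INTERVAL_DAYS (assumed default)
MAX_REMINDERS = 60        # INTAKE_REMINDER_MAX


def reminder_due(count: int, last_at: Optional[datetime], now: datetime) -> bool:
    """True if an order with `count` reminders already sent should get another."""
    if count >= MAX_REMINDERS:
        return False
    if last_at is None:
        # A cleared intake_reminder_last_at re-arms the order immediately.
        return True
    interval_days = 1 if count < DAILY_PHASE else WEEKLY_INTERVAL_DAYS
    return now - last_at >= timedelta(days=interval_days)
```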
Main pool is calendar-day 12 but reputation is wrecked (54% delivery, Gmail+
Outlook blocks) -- NOT warmed. MX tagging confirmed the cause: 702k carriers on
Google + 135k on Microsoft = the warmup was hammering exactly the two operators
blocking us. Hold Google/MS/Proofpoint/etc. OUT entirely until day 30 (configurable),
sending only to the long-tail operators (yahoo/comcast/charter/centurylink/etc.)
that don't bot-throttle, so reputation can recover; then re-introduce big
operators gradually via mx_daily_caps. 1.24M/1.49M carriers now tagged.
The real bottleneck was the write, not DNS: each per-domain UPDATE full-scanned
fmcsa_carriers (no functional index on lower(split_part(email,'@',2))). Resolve
all domains concurrently into a list, load a temp table, then ONE join-UPDATE =
single table scan. Tags ~12k domains -> hundreds of thousands of carriers fast.
The serial path (verifier's 8s+6s lifetime per domain) was far too slow for
bulk tagging -- 0 tagged in 3 min on dead domains. Self-contained fast resolver
+ ThreadPoolExecutor(40) resolves thousands of domains in minutes.
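The concurrent resolve step looks roughly like this. The resolver is injected (the real one does the DNS lookup plus operator classification), and the function name is hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def resolve_all(
    domains: Iterable[str],
    resolve_mx: Callable[[str], Optional[str]],
    workers: int = 40,
) -> Dict[str, Optional[str]]:
    """Resolve each distinct domain once, in parallel; failures map to None."""
    unique = sorted({d.lower() for d in domains})

    def safe(domain: str) -> Optional[str]:
        try:
            return resolve_mx(domain)
        except Exception:
            # A dead domain must not stall or abort the whole batch.
            return None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so zip pairs each domain with its result.
        return dict(zip(unique, pool.map(safe, unique)))
```

The resulting mapping is what gets loaded into the temp table for the single join-UPDATE.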
The Jun 13-14 Gmail+Outlook block storm came from the main/trucking pool having
NO per-MX throttling (only HC had it) -- it concentrated warmup volume on
Google/Microsoft-Workspace-hosted business domains. Port the HC fix:
- migration 097: fmcsa_carriers.mx_provider column.
- mx_tag_carriers.py: resolve MX once per distinct domain (reuses the verifier's
classifier+cache), tag every carrier with that domain's operator. Bounded per
run, prioritizes unsent verified carriers.
- build_trucking_campaigns: during warmup (day<=6) EXCLUDE tagged Google/MS/
Proofpoint/etc. carriers in fetch_carriers; per-MX cap in select_sendable_
carriers so known operators never dominate the quota. Untagged carriers pass
(not collapsed onto one bucket) until tagging fills in. mx_daily_caps ramps
with the main warmup day; MAIN_SKIP_BIG_MX=0 disables once warmed.
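A sketch of the per-MX cap inside selection, assuming carriers arrive as dicts with an mx_provider key. The signature is hypothetical and the real select_sendable_carriers does more:

```python
from collections import Counter
from typing import Dict, Iterable, List, Mapping


def select_sendable(
    carriers: Iterable[Mapping[str, object]],
    quota: int,
    mx_daily_caps: Dict[str, int],
) -> List[Mapping[str, object]]:
    """Fill the quota in order while holding each known operator to its cap."""
    picked: List[Mapping[str, object]] = []
    used: Counter = Counter()
    for carrier in carriers:
        if len(picked) >= quota:
            break
        provider = carrier.get("mx_provider")
        if provider is not None:
            cap = mx_daily_caps.get(provider)
            if cap is not None and used[provider] >= cap:
                continue  # this operator has hit today's cap
            used[provider] += 1
        # Untagged carriers (provider is None) pass without sharing a bucket.
        picked.append(carrier)
    return picked
```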
The HC warmup crons were '* * 1-5' (Mon-Fri), silently skipping weekends -- but a
proper warmup needs CONTINUOUS daily volume for 21 days (mailbox providers reward
consistency; gaps stall reputation). The Jun 14 'HC 0 sent' alert was just a
skipped Sunday, but the weekend skips also broke ramp continuity.
- pw-hc-campaign + pw-hc-nppes: '* * 1-5' -> '* * *' (daily), vendored + applied live.
- Re-aligned the warmup start stamp from calendar-day 9 to send-day 5 so the
volume ramp matches reputation actually built (it had skipped ~4 weekend days,
running the ramp ahead of real history).
- Fixed the stale 'Mon-Fri only' comment in daily_slice().
- Vendored nppes cron now carries the enriched-CSV + 4-segment config.
otc_reincorporation.html: redomesticate-to-Texas hook (Business Court + TXSE +
DE franchise-tax cost) personalized by state_inc_name/company/ticker, cross-sell
RA/foreign-qual/annual-report/franchise-tax, same-day coupon, lead-capture CTA to
/contact?service=reincorporation (high-touch corporate service, not self-serve),
careful 'not a law firm / not legal advice' disclaimers + CAN-SPAM address.
build_otc_campaign.py: emails only verified-email issuers from the harvest+scrape
+verify pipeline, --de-nv-only for the best reincorp fit, reuses trucking sender
plumbing + coupon. Per-deal value is high so capped modestly (400/run default).
scrape_otc_emails.py: fetch each issuer domain's IR/contact pages (gzip,
HTML-only, early-abort, prefer ir@/investor@/info@), extract a contact email.
Skip filing-agent domains (DFN/Donnelley/Broadridge/etc.) that leak into the
extracted domain -- those are not the issuer's site. Same filter added to the
harvester's DOMAIN_NOISE for future runs. Phone (100%) is the fallback channel
for email misses.
Pilot -> production: harvest_otc_issuers.py pulls the OTC/None universe (2,771),
keeps US-domestic (requires BOTH a US state-of-incorporation AND a US-state
business address -- disambiguates the 'DE'=Delaware-vs-Germany trap that leaked
Infineon etc.), and extracts each issuer's website DOMAIN directly from its
latest 10-K/8-K/DEF-14A filing (free, no scrape; ~58-60% find rate in testing).
Outputs cik,name,ticker,state_inc,phone,city,state,zip,domain -- ready for the
domain->email scrape + verify step. Phone is 100% (clean fallback call channel).
Reincorporation-to-TX / RA / foreign-qual / franchise-tax / annual-report fit.