new-site

Author	SHA1	Message	Date
justin	899b880e7f	trucking: weekly FMCSA source refresh so new non-compliant carriers are caught The FMCSA census was a one-time snapshot (last loaded ~May 30) with NO refresh timer -- carriers newly falling out of MCS-150/UCR compliance were never picked up. New scripts/workers/fmcsa_source_refresh.py orchestrates the full pipeline (census download -> enrichment -> deficiency flag -> verify new emails -> MX-tag new) and runs weekly via cron pw-fmcsa-refresh (Sun 09:00 UTC), codified in the mail-pipeline Ansible role. Idempotent + incremental: the census upsert preserves email_verified / listmonk_sent_at / deficiency_flags, so existing carriers keep their send state and only census fields refresh; new DOTs flow into verification then campaigns. A carrier who refiled gets a fresh mcs150_parsed, so the builder's overdue WHERE clause stops targeting them automatically. Verify is capped per run (20k) so it never stalls on millions of rows. (Healthcare already auto-catches newly-revalidation-overdue providers within its 63k institutional pool via pw-hc-refresh Mon/Wed/Fri.)	2026-06-17 20:44:54 -05:00
justin	4dc5690666	infra: codify the email-campaign pipeline in Ansible (new mail-pipeline role) The entire outbound campaign pipeline lived ONLY on the host and was never in IaC -- a fresh rebuild would have silently shipped NO campaigns, NO IP warmup/ramp, and NO bounce processing. New mail-pipeline role + deploy-mail-pipeline.yml playbook deploy it from the canonical repo copies: cron.d (infra/cron/): - pw-trucking-campaign-builder, pw-ifta-campaign, pw-ucr-campaign - pw-hc-campaign, pw-hc-nppes, pw-hc-refresh - pw-mta-warmup, pw-listmonk-rampcap, pw-hc-rampcap - pw-ip-rehab, pw-warmup-tg-alert helper scripts (-> /usr/local/bin): - pw-mta-warmup, pw-listmonk-rampcap, pw-hc-rampcap, pw-warmup-tg-alert - postfix-bounce-notify.sh, postfix-hc-bounce-notify.sh, listmonk-bounce-sync.py systemd services: - pw-bounce-watcher.service (was missing from repo), pw-hc-bounce-watcher.service Also creates the deploy-owned {{project_dir}}/logs dir (deploy can't write /var/log, so a missing dir made cron redirects fail). Added the 6 cron.d files that existed only on the host, the trucking bounce-watcher unit, and synced infra/cron/pw-hc-refresh to the live version (revalidation download + enrich steps). Role wired into site.yml after the mail (OpenDKIM) role. Part of the email-deliverability incident hardening.	2026-06-17 20:26:01 -05:00
justin	2e4388a803	mail: add logrotate for Postfix mail.log (postlogd copytruncate) mail.log had no logrotate rule and grew unbounded to ~1GB (~150MB/day) since Jun 8. This host logs via Postfix's built-in postlogd (maillog_file mode), not rsyslog (no rsyslog.service exists), so postlogd holds the file open -- a plain rename+create would leave it writing to the stale inode. Use copytruncate (no daemon signal needed). Rotate daily, keep 14 days compressed. Applied live: forced first rotation, compressed the 1GB archive (->99MB), verified logging + bounce watchers + DKIM signing intact. Part of the email-deliverability incident hardening (follows DKIM fix `4d59019`).	2026-06-17 19:47:13 -05:00
justin	4d5901921e	mail: fix OpenDKIM not signing campaign mail (Docker-injected) + codify in Ansible Root cause of the Jun 2026 deliverability collapse / 'no new sales': opendkim.conf was in single-key mode with no InternalHosts, so it signed only 127.0.0.1. Transactional/cron mail (injected locally) was signed, but ALL campaign mail -- injected over the Docker bridge from the Listmonk containers (172.18.0.5 trucking, 172.18.0.25 healthcare) -- went out UNSIGNED. Gmail/Yahoo require DKIM on bulk mail since Feb 2024, so cold campaigns were junked/blocked (~23% delivery, 550-5.7.1). Proof: 2,620 campaign msgs that day, 0 DKIM sigs. The correct table files already existed on the server but were never wired into opendkim.conf. Fix points the daemon at key.table/signing.table and sets InternalHosts/ExternalIgnoreList to trusted.hosts (which includes 172.16.0.0/12, the Docker subnet). Fixes BOTH streams: HC submission ports 2526-2528 inherit the global smtpd_milters and *@performancewest.net covers compliance@. Verified by injecting from a Docker IP through port 25 and port 2526 -- both now get 'DKIM-Signature field added'. Codified as new Ansible role 'mail' so it can't silently regress (OpenDKIM was previously not in IaC at all).	2026-06-17 19:31:19 -05:00
justin	01b3e1d234	chore(env): scaffold ISA_SC_DMS_USER/PASS for SC PSC MyDMS e-file portal Non-attorney 'Service' filer account registered under Performance West (filings@performancewest.net). Credentials live only in the server .env (blank default in template, never committed). Consumed by the upcoming SC intrastate Playwright e-filer.	2026-06-16 08:19:17 -05:00
justin	c27cfd3242	docs(crons): note IRP invoice poller now also handles intrastate [PW-ISA] replies	2026-06-16 07:59:38 -05:00
justin	b125d46663	feat(intrastate): automate state PUC/PSC authority filing (email + invoice + auto-bill) Intrastate operating authority is state-specific + application-based like IRP, so it reuses the same email/POA + invoice-reconciliation flow: - intrastate_filing.send_intrastate_submission: emails the state PSC/PUC the authority application with the signed POA attached (subject tag [PW-ISA CO-..]), reusing irp_filing's MinIO download + census enrich helpers. - The shared poller (irp_invoice_poller) now matches BOTH [PW-IRP] and [PW-ISA] tags, parses the fee, Telegram-alerts, and bills the customer the exact amount with the correct service slug. - state_trucking gov-fee gate routes intrastate-authority to the PSC/PUC email path; if no submission email is configured for the base state it falls back to a manual todo (safe default — no emailing guessed agency addresses). Per-state ISA_<ST>_EMAIL env (blank until the exact agency address is verified). SC/GA/TX scaffolded. Customer still only sees an exact-fee payment link; you only approve the final filing.	2026-06-16 07:57:57 -05:00
justin	ea695d6828	feat(govfee): exact fees + agency processing fees; IRP email/invoice reconciliation - gov_fee: add AGENCY_PROCESSING_FEE (per-service card/convenience fee passed through so the customer pays the true all-in cost); estimate_gov_fee now folds it into the billed total. IFTA/intrastate/UCR fees are published/near-exact. - IRP fees can't be looked up — only the base state computes them. New irp_filing.py: emails the base-state IRP unit a Schedule A/B request (Reply-To the IRP filings mailbox, [PW-IRP CO-...] subject tag), and a 15-min cron (irp_invoice_poller) scans the mailbox for the state's invoice reply, parses the exact apportioned fee, Telegram-alerts you, and bills the customer the EXACT amount via a gov-fee child order + payment link. Then it proceeds to ready_to_file for your final approval. - state_trucking gov-fee gate now routes IRP to the email/invoice path and IFTA/intrastate to immediate exact-fee billing. - Mailbox is configurable (IRP_FILINGS_IMAP_* in app.env.j2); falls back to OPS_IMAP_* filtered by the [PW-IRP] tag until a dedicated mailbox exists. Telegram alerts fire on IRP submission sent, invoice received (billed), and un-parseable replies (so you can read + enter the fee manually).	2026-06-16 04:58:14 -05:00
justin	d65f5ea279	nginx: stop blocking /admin (bot-scan rule matched our own dashboard) The shared security snippet blocked any path matching /(admin|administrator|login.action|struts) with 'return 444', which drops the connection. That bare 'admin' token also matched our own operations dashboard at /admin and the new /admin/compliance-orders, so the browser showed 'This site can't be reached'. Dropped the bare 'admin' token; administrator/login.action/struts stay blocked. Applied live on prod (sudo edit + nginx reload); this updates the source of truth so the ansible nginx role won't reintroduce it.	2026-06-16 00:05:54 -05:00
justin	2caab6aa69	hc: warmup must run DAILY for the full 21-day ramp (not weekdays-only) The HC warmup crons were '* * 1-5' (Mon-Fri), silently skipping weekends -- but a proper warmup needs CONTINUOUS daily volume for 21 days (mailbox providers reward consistency; gaps stall reputation). The Jun 14 'HC 0 sent' alert was just a skipped Sunday, but the weekend skips also broke ramp continuity. - pw-hc-campaign + pw-hc-nppes: '* * 1-5' -> '* * *' (daily), vendored + applied live. - Re-aligned the warmup start stamp from calendar-day 9 to send-day 5 so the volume ramp matches reputation actually built (it had skipped ~4 weekend days, running the ramp ahead of real history). - Fixed the stale 'Mon-Fri only' comment in daily_slice(). - Vendored nppes cron now carries the enriched-CSV + 4-segment config.	2026-06-14 21:02:08 -05:00
justin	dd4ed3ea38	warmup: ROLL BACK main pool to 200/h after Gmail spam-blocked IPs at 400/h Day 9 (2026-06-13) alert: main pool 54% delivery, 202 Gmail spam-blocks (550-5.7.1 'Gmail has detected') on warming IPs .94-.98. The 4k/day (400/h) ramp was too aggressive AND the trucking pool lacks the per-MX throttling the HC pool got -- Google-Workspace-hosted business domains (weberfarms.net, uatruck.com, etc.) concentrated and Gmail blocked us. Held at 200/h (~2k/day) through day 20 to recover, then slow step to 300/h. Applied live (cap already set to 200/h).	2026-06-13 20:10:13 -05:00
justin	ff4ab262a8	hc: cron to feed NPPES institutional base (63k verified) into warmup, MX-throttled Adds /etc/cron.d/pw-hc-nppes (weekdays 07:30) that imports the verified NPPES institutional general-compliance base into the OIG screening segment, throttled per MX operator. Separate from the 07:00 reval-segment run so the two pipelines stay independent. Vendored the cron file under infra/cron/.	2026-06-12 22:11:12 -05:00
justin	887bf9a14a	warmup: grow main (trucking) pool faster -- 3k -> 4k/day now, 5k at day 14 The main sending IPs are cleanly warmed: today 3,845 sent at 0.18% bounce, ZERO deferrals, ZERO ISP rate-limit/blocklist/Spamhaus hits. The script's own note records these IPs historically sustained ~2,500/day at 68-76% delivery; collapses only ever came from 17k-29k spikes. So we have ample headroom to accelerate the trucking ramp safely: day 7-13: 300/h -> 400/h (~4,000/day) [applied now, day 8] day 14+: new 500/h (~5,000/day) [hard ceiling, well under ~17k] Also vendored pw-listmonk-rampcap into the repo (infra/postfix/) -- it previously lived only on the server at /usr/local/bin. Live script updated and applied (listmonk cap now 400/h).	2026-06-11 00:13:41 -05:00
justin	c8a0824143	firewall: allow ezstorehost (207.174.124.51) to reach Forgejo SSH Add ezstorehost to trusted_admin in both layers — the nft input set and the DOCKER-USER iptables chain (Forgejo is containerised; DNAT means the post-DNAT dport 22 rule applies). Required for static-tenant deploys from ezStorehost-infra to clone repos over ssh://. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 22:45:43 -05:00
justin	1854753c70	monitoring: add .91-.93 IP rehab to daily Telegram warmup alert Tracks the rehab pool (rehab02-04 / .91-.93) delivery + bounce + Spamhaus ZEN DNSBL status in the daily report and alert body. Alerts only if a rehab IP lands on a DNSBL or rehab delivery drops <40% with real volume (recipient quality slipped) -- a recovering IP naturally bounces more so the threshold is lenient.	2026-06-09 20:34:41 -05:00
justin	25f4a7503b	warmup: IP rehab for .91-.93 so they can be reallocated The 3 IPs (mta02-04 / .91-.93) retired after the May 30-31 over-volume blast are NOT on any DNSBL (Spamhaus/Barracuda/SpamCop/SORBS all clean) and have clean PTRs + SPF/DKIM/DMARC -- the damage was provider-internal reputation, which recovers with slow clean sending. scripts/ip_rehab.py sends a tiny ramping trickle (10/IP/day -> cap 60) of genuine CAN-SPAM-compliant compliance check-in mail to clean business-domain, never-bounced recipients via dedicated heavily-throttled postfix transports rehab02/03/04 (30s/msg, bound to .91/.92/.93). Routing uses an X-PW-Rehab-IP header + header_checks FILTER to override the transport_maps randmap warmup rotation (verified: mail routes via rehab transports, status=sent). Daily cron pw-ip-rehab. After ~2-3 weeks of clean sending the IPs can be reallocated.	2026-06-09 20:27:47 -05:00
justin	9fa2c86f01	fix(warmup): HC cron logged to /var/log (deploy can't write) -> cron silently died The HC warmup builder ran from cron at 07:00 but the >> /var/log/pw-hc-campaign.log redirect failed (deploy user cannot write /var/log), and a failed output redirect makes cron abort the command BEFORE it runs -> HC sent 0/day since the log file was removed. Route HC cron logs to /opt/performancewest/logs/ (deploy-owned) so the redirect always succeeds. Builder itself was fine (verified: imports + sends work, 0 bounces). Also removed the stale 'campaign-warmup.sh 122' root-cron line that pointed at a finished campaign + no longer existed.	2026-06-09 16:06:28 -05:00
justin	9b9d317916	infra/k8s: shkeeper liveness+readiness probes (fix recurring crypto.performancewest.net downtime) crypto.performancewest.net kept going down because the shkeeper-deployment web pod periodically HANGS (HTTP server deadlocks while the apscheduler background thread keeps the process alive). The helm chart (shkeeper-1.7.15) ships NO liveness or readiness probe, so k8s saw the hung pod as Running and never restarted it, and kept routing traffic to the dead backend -> site down until a manual restart. Added HTTP probes on / :5000 (302 = healthy): liveness auto-restarts a hung pod, readiness pulls it from the Service endpoints. Applied live via kubectl patch (chart does not expose probes via values; re-apply after any helm upgrade -- command in the file header). Verified: new pod comes up READY 1/1 (probe passes) and crypto.performancewest.net serves 302 again.	2026-06-09 04:57:50 -05:00
justin	7c39a858cc	monitoring: daily warmup IP-reputation Telegram alert End-of-day (20:00 Central) check of campaign deliverability across both sending pools (main out05-09 + healthcare hcout). Sends a Telegram alert ONLY when there is a reputation problem -- delivery below 65% or a spam/policy-block (550-5.7.1) spike above 150/day -- so healthy days stay silent. Reuses the existing TELEGRAM_BOT_TOKEN/CHAT_ID from /opt/performancewest/.env. Logs every run to /var/log/pw-warmup-healthcheck.log for history. Excludes internal/probe noise so the delivery figure reflects real external recipients.	2026-06-08 21:06:41 -05:00
justin	2156a5e05f	hc refresh: run Mon/Wed/Fri instead of weekly to shrink CMS data-lag The 'already revalidated' replies come from the CMS data-lag window (a provider completes their revalidation but CMS's public Due Date List still shows them overdue for weeks). Running the refresh 3x/week instead of weekly shrinks that window from up to 7 days to ~2-3, so a provider who just completed stops being targeted faster. No change to the overdue window or audience size -- this is the lever that reduces stale-data complaints without losing prospects.	2026-06-08 10:53:36 -05:00
justin	9cb10b18e0	feat(hc): deliverability prune -- evict newly-Google-hosted subscribers Belt-and-suspenders for the edge you flagged: a domain already in a warmup list could flip its MX to Google Workspace between weekly refreshes, after which it would hard-bounce from the cold IP. The import-time guard only catches NEW adds. - prune_holdouts(): enumerates each warmup list's subscribers, matches them against the FRESH master CSV (re-classified weekly), and removes any whose domain is now Google-hosted. DELIVERABILITY-ONLY -- it never evicts for audience reasons (an overdue provider drifting out of the 1-90 day window was a valid target when warmed; re-litigating that just wastes warmup progress). - --prune (run alongside warming) and --prune-only (prune then exit). - Wired into the weekly refresh cron as a --prune-only chained step, so MX is re-checked and holdouts removed every Monday before the weekday sends. Verified end-to-end: with no Google domains in lists it's a 0-op; injecting a simulated Google-flipped domain into the master, the prune correctly detects and (in a real run) would remove it from every list it's on.	2026-06-08 03:39:56 -05:00
justin	feb677f6ce	fix(hc warmup): only mail slightly-overdue providers (deliverability) Mailing heavily-overdue NPIs (months/years past due) risks hitting practices that have closed, merged, or abandoned the inbox -> hard bounces, which are the fastest way to wreck a warming IP's reputation. The warmup now restricts the reval_overdue selector to an inclusive [HC_OVERDUE_MIN, HC_OVERDUE_MAX] window (default 1-90 days) and the OIG 'any' selector likewise excludes heavily-overdue and dropped-off-list rows. On the current cohort this trims the overdue audience 178->96 and the OIG audience 399->317, holding out the stale long tail (181-365d + 366d+). upcoming/active providers are unaffected.	2026-06-08 03:27:22 -05:00
justin	167c4a3847	infra/cron: multi-segment hc warmup + weekly data-refresh cron Tracks the deployed cron.d files in the repo: - pw-hc-campaign: updated comment to reflect the now multi-segment warmup (revalidation + OIG + NPPES + reactivation + bundle); command unchanged. - pw-hc-refresh (NEW): Mon 06:00 Central weekly data refresh, ~1h before the 07:00 weekday send, so every send uses fresh CMS/OIG status.	2026-06-08 03:15:47 -05:00
justin	138fec17e9	healthcare: daily batched paper-filing fulfillment Standard (no-login) CMS filings are mailed in one Priority Mail envelope per destination agency, batched each postal working-day morning to save postage. - migration 089: paper_filing_batches table + esign_records.paper_batch_id / filing_destination_key (idempotent: a filing is batched at most once). - batch_cover_sheet.py: per-agency cover sheet (sender/dest/date/manifest) + merged print-job PDF (cover + all enclosed signed filings). - daily_paper_batch.py worker: gather signed+unbatched cms855/cms10114 filings, group by destination (MAC by state via mac_routing; Fargo for CMS-10114), build cover+merged PDF per agency, persist batch, mark filings batched. Self-gates on postal working days (skips weekends + federal/USPS holidays). Phase 1 = human prints+mails; phase 2 = wire print-mail API. - worker-crons: pw-paper-batch systemd timer (Mon-Fri 13:30 UTC, self-gated). - test_paper_batch.py: 15/15 pass (working-day gating, routing, cover+merge).	2026-06-07 00:30:01 -05:00
justin	bf4e8c2277	infra: MTA-STS HTTPS vhost (cert issued, policy live)	2026-06-06 21:03:30 -05:00
justin	34daa0c1d3	infra: MTA-STS status note - cert pending stable HE.net DNS propagation	2026-06-06 19:37:37 -05:00
justin	7bd2f70de4	infra: MTA-STS policy + vhost + README (cert pending DNS propagation)	2026-06-06 19:36:27 -05:00
justin	4233c90a4f	hc email: reframe value-add to 'No 2FA. No government portals.' (we have a portal; the pain is CMS 2FA/identity-proofing); cron creates fresh dated campaign when prior is finished; add hc bounce watcher (Postfix->listmonk-hc webhook, hard/complaint->blocklist)	2026-06-06 16:47:12 -05:00
justin	6738a335af	infra: nginx vhost for listmonk-hc admin portal (lists-hc.performancewest.net -> 127.0.0.1:9101, LE cert)	2026-06-06 07:02:50 -05:00
justin	95698852ce	healthcare warmup: gate Google/Workspace domains out of week 1 (they hard-reject cold IPs 550-5.7.1); send 501 non-Google practice domains first, defer 222 Google to week 2-3; cron uses hc_warmup_nongoogle.csv	2026-06-06 04:02:00 -05:00
justin	2bc86268f7	healthcare: HC warmup campaign cron (Mon-Fri 7AM Central) - imports overdue-first verified slice into listmonk-hc + runs Medicare-revalidation campaign via hc HOT stream; rate-throttled by pw-hc-rampcap	2026-06-06 03:57:08 -05:00
justin	695c3e2431	security: drop all CBC TLS suites (Qualys WEAK -> AEAD-only, still A+); sync ansible nginx templates (ciphers + ywxi CSP); capture host firewall as IaC	2026-06-06 00:49:21 -05:00
justin	90d8b94f3f	feat(email): wire listmonk-hc into deploy + dev override + hc ramp-cap - deploy.sh/deploy-dev.sh: bring up listmonk-hc (upstream image, excluded from build); document the one-time listmonk_hc DB create + --install. - docker-compose.dev.override.yml: dev-only override (committed) that drops the prod host-port bindings and pins dev's own postgres volume (dev-pgdata) via compose !override tags. deploy-dev ships it as docker-compose.override.yml so syncing the canonical compose to the shared host no longer breaks dev's api-postgres (port :5432 clash + volume switch). Discovered + fixed while validating listmonk-hc on dev. - pw-hc-rampcap.sh: healthcare analogue of pw-listmonk-rampcap, ramps the listmonk_hc cap 100->1000/h off /etc/postfix/hc-warmup-start, fully independent of the trucking ramp/cap.	2026-06-05 19:19:45 -05:00
justin	70d742df08	feat(mta): healthcare HOT-stream Postfix setup (dedicated hc IPs, isolated) Adds 3 hc submission ports (2526/2527/2528) in the single Postfix instance, each content_filter'd onto a dedicated hc transport (hcout1/2/3) binding the hc IPs .107/.108/.109 with hc HELO identity (hcmta01-03) and hotter concurrency. listmonk-hc round-robins the 3 ports. Discovered + documented the constraint that drove this shape: transport_maps randmap is owned by the shared trivial-rewrite(8) and is global, so neither a per-smtpd -o transport_maps nor a FILTER randmap:{...} can scope a separate IP pool (FILTER parses randmap as a literal transport). content_filter=hcoutN: (empty nexthop) overrides transport_maps and keeps the real recipient domain. Verified end-to-end on the server: :2527 -> hcout2 (.108) -> real gmail MX; trucking transport_maps (.94-.96) untouched. Idempotent, postfix-check gated with auto-rollback.	2026-06-05 19:07:02 -05:00
justin	a79d6b1906	feat(healthcare): add gost proxy-relay so Chromium can use the residential proxy Chromium rejects authenticated SOCKS5 ('Browser does not support socks5 proxy authentication'). Add a gost (ginuerzh/gost:2.11.5) 'proxy-relay' sidecar that listens unauthenticated on socks5://proxy-relay:11080 and forwards to the authenticated residential upstream (HEALTHCARE_PROXY_UPSTREAM_URL). Workers point Playwright at the relay via HEALTHCARE_PROXY_URL=socks5://proxy-relay:11080. env template: split into HEALTHCARE_PROXY_UPSTREAM_URL (authenticated, password percent-encoded so '#' -> %23) and HEALTHCARE_PROXY_URL (the relay address). Validated end-to-end on dev: workers Chromium -> proxy-relay -> residential egress IP 76.228.206.147; NPPES + PECOS both HTTP 200.	2026-06-05 18:39:26 -05:00
justin	17318f6e7d	feat(healthcare): route NPPES/PECOS Playwright flows through residential SOCKS proxy CMS healthcare portals (NPPES, PECOS, I&A) block datacenter IPs, so the healthcare browser automation needs to egress via the residential proxy on hg409y7ez04.sn.mynetname.net (username 'performancewest'). - undetected_browser: use_proxy now accepts an env-var name, so callers can select a domain-specific proxy. _proxy_config(proxy_env) reads it and falls back to UNDETECTED_PROXY_URL. Healthcare uses 'HEALTHCARE_PROXY_URL'. - probe_npi_undetected: launches with use_proxy='HEALTHCARE_PROXY_URL' when set. - npi_provider: documents that the (future) automated NPPES/PECOS flows must use the healthcare proxy. - Plumb HEALTHCARE_PROXY_URL (+ UNDETECTED_PROXY_URL fallback) through the ansible env template and docker-compose workers env. The credential itself is NOT in the repo. Set the full URL in the ansible vault as vault_healthcare_proxy_url: socks5://performancewest:<password>@hg409y7ez04.sn.mynetname.net:<port> Verified parsing + Playwright proxy-dict wiring with a unit test.	2026-06-05 14:36:01 -05:00
justin	c027d49f43	Fix trucking campaign cron send date	2026-06-04 03:19:35 -05:00
justin	b48fc3a406	Retire burned MTA IPs in warmup script	2026-06-03 23:37:27 -05:00
justin	5c35140a22	Configure trucking deficiency campaign cron env	2026-06-03 23:04:41 -05:00
justin	6d4c323ab6	feat: daily intake-reminder worker for paid orders with incomplete intake Adds a systemd-timed worker that nudges customers who paid but never completed their intake form (which stalls fulfillment). - migration 087: intake_reminder_count + intake_reminder_last_at on compliance_orders (makes the daily run idempotent and bounded), plus a partial index for the paid-order eligibility scan. - scripts/workers/intake_reminder.py: each run emails any paid order with intake_data_validated != TRUE, capped at 10 reminders/order, at most one consolidated email per customer per day (groups a customer's incomplete services into one email). Reuses the post-payment intake URL format (/order/{slug}?order={n}) and the API's email validation, skipping placeholder/invalid addresses (synthetic@, pipeline.com, etc.). Sends via smtplib with SMTP_PASS (verified working in the worker container). - worker-crons: pw-intake-reminder timer, daily ~noon ET (16:00 UTC).	2026-06-03 00:20:37 -05:00
justin	2b13c36c93	ansible: sync portal nginx template with live working config The pw-portal-tls.conf.j2 template was stale (basic 47-line version) while the live /etc/nginx/sites-enabled/pw-portal.conf was hand-maintained with branding, /assets/ and /files/ serving. A future ansible run would have clobbered the working config. Sync the template to the live config (templatized) and document why /files/ must be served from /opt/erpnext-assets, not the docker volume.	2026-06-02 22:20:08 -05:00
justin	2fab98c0a8	postfix: multi-IP warmup sending pool (20 IPs, gradual rotation) - 20 IPs (.90-.109 / mta01-mta20) with FCrDNS + SPF in HestiaCP - .90 (mta01) dedicated Yahoo/AOL recovery IP (yahooslow, 20s trickle) - .91-.109 (out02-out20) rotation pool via transport_maps randmap - pw-mta-warmup: cron-driven scheduler grows the active rotation pool 3 -> 5 -> 8 -> 12 -> 16 -> 19 IPs over ~25 days - mta_setup.sh: idempotent installer (backups + postfix-check-gated reload) New IPs verified clean on Spamhaus/Barracuda/SpamCop/SORBS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:03:30 -05:00
justin	0b7a35a58e	trucking campaigns: daily builder + MX verifier concurrency + tracking column - build_trucking_campaigns.py: nightly script that creates 8 Listmonk campaigns per day (4 TZ x 2 types: MCS-150 overdue 2k/TZ, inactive USDOT 1k/TZ) at 4AM ET / 5AM ET (CT) / 6AM ET (MT) / 7AM ET (PT). Deduplicates via listmonk_sent_at column. - migration 083: add listmonk_sent_at + listmonk_campaign_type to fmcsa_carriers - email_verifier.py: bump max_workers from 5 to 20 for 4x faster throughput - cron: daily pw-trucking-campaigns at 08:00 UTC (3 AM EST) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 10:07:44 -05:00
justin	e0ba8acc90	add pipeline orchestrator, mailbox 1583 flow, EIN + virtual-mailbox services - Pipeline orchestrator: chains sequential fulfillment for new carrier bundles (formation → EIN → USDOT → MC → BOC-3 → MCS-150 → D&A → UCR) - Mailbox setup: Anytime Mailbox provisioning with USPS 1583 e-sign + online notarization - New services: ein-application ($79), virtual-mailbox ($149/yr) - Registered all new handlers in SERVICE_HANDLERS - Pipeline cron: every 5 minutes	2026-05-30 22:56:54 -05:00
justin	479f3dfc45	add entity upgrade bundle service + deploy completion/IMAP crons Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-30 22:12:11 -05:00
justin	ad3d189b2b	post-completion flow: survey, referral program, review ask - Migration 081: referral_codes, referral_uses, exit_surveys tables - API: POST /api/v1/survey, POST /api/v1/referral/check, GET /api/v1/referral/:email - Worker: completion_emails.py — sends completion + 24h follow-up (survey + referral) - Survey page: /survey/?order=X&rating=N — star rating, feedback, Google review ask - Referral: REF-FIRSTNAME codes, $25 credit per referred order, no limit - Low ratings (1-3 stars) trigger Telegram alert for admin follow-up - Cron: every 15 minutes	2026-05-30 21:22:14 -05:00
justin	97dd08c821	Fix flagged items: CRTC email submission, BITS todo, selector docs, stale plans - CRTC letter now auto-emailed to secretary.general@crtc.gc.ca after eSign - BITS admin todo updated to reference electronic + physical submission - COLIN selectors.py: documented verification status per step - BC config: added CRTC Secretary General email address - plan.md: marked completed items (eSign, portal auth, CRTC email) - go-live-todo.md: marked Compliance Calendar DocType as imported Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-04 11:33:45 -05:00
justin	78c04b8bc3	Add Playwright failure monitoring: Telegram alerts + screenshots + health check When any Playwright submission fails (selector not found, timeout, etc.): 1. Full-page screenshot captured and uploaded to MinIO 2. Telegram alert sent immediately with error details + screenshot link 3. Email alert to ops with same info 4. Admin todo includes screenshot MinIO path for debugging 5. Client order stays pending for manual completion Proactive selector health check (daily 7am CT cron): - Navigates to each portal (FCC RMD, USAC E-File, FCC CPNI/ECFS) - Verifies all critical selectors are still present in the DOM - If selectors are missing (UI changed): alerts via Telegram + email BEFORE any real client order fails - Reports which service slugs are affected Integrated into: - RMD filing handler (fccprod.servicenowservices.com) - Form 499-A handler (forms.universalservice.org) - Form 499-Q handler (already had error handling) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-04 02:44:02 -05:00
justin	572f0cbf93	Implement 499-Q quarterly filing lifecycle After 499-A+Q bundle is filed, the handler now creates actual compliance_orders for each remaining quarterly 499-Q filing: Schedule: Q1 due Feb 1, Q2 due May 1, Q3 due Aug 1, Q4 due Nov 1 Each quarterly order: - Created as paid (covered by bundle price) - Has due_date, quarter, period_end_date in intake_data - Links to parent 499-A order - Tracks reminder status (30d/14d/7d sent flags) Notification worker (quarterly_499q_notify.py): - Runs daily at 8am CT via systemd timer - Sends HTML reminder emails at 30, 14, 7 days before due - Email includes intake link for client to submit quarterly data - Late warning at 7 days: "USAC may estimate higher contributions" - Idempotent: won't re-send same reminder level Added fcc-499q service slug ($0, not sold standalone). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-03 02:28:04 -05:00
justin	a4a5500bfc	Add Prometheus + Grafana + Alertmanager monitoring stack Full observability stack with Telegram alerting: Components: - Prometheus: metrics collection, 90-day retention - Grafana: dashboards at monitoring.performancewest.net - Alertmanager: routes alerts to Telegram bot - node-exporter: OS metrics (CPU, RAM, disk, network) - cAdvisor: container metrics (CPU, memory, restarts) - postgres-exporter: PostgreSQL connection/query metrics - nginx-exporter: request rate, 5xx errors, connections - blackbox-exporter: HTTP/TCP endpoint probing + SSL cert checks Alert rules: - Service down (HTTP probe, TCP port, container missing) - Container restart loops - High CPU/memory/disk/load - PostgreSQL down or high connections - SSL cert expiring (14d warning, 3d critical) - Slow HTTP responses, high 5xx rate Blackbox probes all public endpoints: performancewest.net, api, dev, crm, lists, analytics, minio, crypto, pay Telegram alerts: critical=1h repeat, warning=6h repeat, auto-resolve notifications Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-01 02:08:39 -05:00
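
The alert rule described in commit 7c39a858cc (alert only when delivery falls below 65% or 550-5.7.1 policy blocks exceed 150/day) can be sketched roughly as below. This is a minimal illustration, not the repository's script: the PoolStats class, its field names, and the alert_reasons function are hypothetical, and skipping the delivery check on a zero-volume day is an assumption. Only the two thresholds come from the commit message.

```python
from dataclasses import dataclass

# Thresholds quoted in commit 7c39a858cc; everything else here is illustrative.
DELIVERY_FLOOR = 0.65        # alert when the delivery rate is below 65%
POLICY_BLOCK_CEILING = 150   # alert when 550-5.7.1 blocks exceed 150/day


@dataclass
class PoolStats:
    """Hypothetical daily counters for one sending pool."""
    name: str
    attempted: int       # messages handed to external recipients
    delivered: int       # messages accepted by the remote MX
    policy_blocks: int   # 550-5.7.1 spam/policy rejections


def alert_reasons(stats: PoolStats) -> list[str]:
    """Return the reasons to alert; an empty list means the day stays silent."""
    reasons = []
    # Assumption: a day with no volume has no meaningful delivery rate.
    if stats.attempted > 0:
        rate = stats.delivered / stats.attempted
        if rate < DELIVERY_FLOOR:
            reasons.append(
                f"{stats.name}: delivery {rate:.0%} below {DELIVERY_FLOOR:.0%}"
            )
    if stats.policy_blocks > POLICY_BLOCK_CEILING:
        reasons.append(
            f"{stats.name}: {stats.policy_blocks} policy blocks "
            f"above {POLICY_BLOCK_CEILING}/day"
        )
    return reasons


if __name__ == "__main__":
    # Figures shaped like the day-9 alert in commit dd4ed3ea38
    # (54% delivery, 202 Gmail spam-blocks); the attempted count is made up.
    for line in alert_reasons(PoolStats("main", 1000, 540, 202)):
        print(line)
```

Returning a list of reasons rather than a boolean lets one Telegram message carry every tripped threshold, and an empty list maps directly to the "healthy days stay silent" behaviour the commit describes.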

52 commits