Commit graph

30 commits

Author SHA1 Message Date
justin
3ca960aca5 docs+infra(deliverability): document bulk subdomain; ansible signs send.performancewest.net
- infra/ansible/roles/mail: refactor OpenDKIM to support multiple signing domains
  via opendkim_signing_domains list (root + send.performancewest.net). Loops
  keygen/ownership/keytable/signingtable so the live two-domain setup is
  reproducible from ansible.
- infra/ansible group_vars: add bulk_mail_subdomain + campaign_from_* +
  campaign_reply_to documentation vars (map to CAMPAIGN_FROM / HC_CAMPAIGN_FROM
  env read by the builder scripts). smtp_from (transactional) stays on root.
- docs/deliverability.md: rewrite TL;DR with the carrierone-vs-performancewest
  A/B proof (same server/IPs, different From domain -> Inbox vs Junk) and the
  ~85% Microsoft / 14% Google / <1% Yahoo audience mix; add the bulk-subdomain
  section, SPF trim, rehab-disabled, and the Hestia DNS automation runbook.
2026-06-18 23:12:05 -05:00
justin
cf021e2f91 feat(healthcare): OIG/SAM exclusion screening as $79/mo Stripe Subscription
Convert OIG/SAM from one-time $299/yr to recurring $79/month (card+ACH only) -
the first real recurring-billing product in the system. Exclusion screening is
a *monthly* federal obligation, so recurring monitoring fits the requirement and
is the biggest valuation lever (vs a one-time annual run).

Catalog (single source of truth):
- service-catalog.ts: add billing_interval + allowed_methods to ComplianceService;
  oig-sam-screening -> 7900c, billing_interval:"month", allowed_methods:[card,ach],
  name "(Monthly Monitoring)".
- gen-service-catalog.py + check-service-catalog-drift.py: carry/guard the two new
  fields; regenerate site catalog.

Checkout (api/src/routes/checkout.ts):
- mode:"subscription" with recurring price_data when billing_interval is set;
  surcharge absorbed for recurring (clean $79/mo); server-side METHOD_NOT_ALLOWED
  re-validation against allowed_methods.
- ensureColumns + migration 100: compliance_orders.stripe_subscription_id,
  bundle_upsell_sent_at (+ subscription index).

Webhooks (api/src/routes/webhooks.ts):
- record stripe_subscription_id on checkout.session.completed (subscription mode).
- invoice.paid (subscription_cycle only) -> re-dispatch screening for the cycle;
  invoice.payment_failed -> admin alert + first-failure customer nudge;
  customer.subscription.deleted -> mark order cancelled. (API 2026-03-25 moved the
  subscription link to invoice.parent.subscription_details.subscription.)

Fulfillment:
- job_server.py: pass recurring_cycle/invoice_id into the order.
- npi_provider.py: OIG handler labels renewal cycles "[Monthly cycle]" + re-screen
  note; bundle action runs only the FIRST screening + flags the $79/mo upsell.

Bundle land-and-expand:
- Provider Compliance Bundle now includes only the first OIG/SAM screening (was
  giving away $948/yr of monitoring inside an $899 bundle).
- new worker scripts/workers/bundle_upsell.py (+ pw-bundle-upsell timer): ~3 weeks
  after a paid bundle, emails the customer to continue $79/mo monitoring; dedup via
  bundle_upsell_sent_at; skips customers who already have an OIG/SAM order.

Surfaces updated to $79/mo: PaymentStep (filters methods, "Billed every month,
cancel anytime"), order pages, healthcare index, npi-compliance-check tool (also
fixed stale $699 bundle drift -> $899), hc_oig_screening + hc_compliance_bundle
emails.

Docs: billing.md gains a "Stripe-native Subscriptions" section + a reality-check
banner (Adyen/ERPNext-gateway model documented there is NOT live; Stripe is the
real rail). Fixed run-migrations.yml container name bug
(performancewest-postgres-1 -> performancewest-api-postgres-1, overridable).

Tests: api/tests/recurring-subscription.test.ts (28 assertions) covers catalog
gating, method validation, surcharge suppression, recurring line-item build,
invoiceSubscriptionId extraction, renewal-cycle gating. tsc clean; site build
clean; catalog drift OK.

Manual deploy step: enable invoice.paid, invoice.payment_failed,
customer.subscription.deleted on the Stripe webhook endpoint.
2026-06-18 07:54:38 -05:00
justin
a04ecf7df3 chore(email): decommission SMTP2GO references — local MTA only
SMTP2GO is no longer used: Listmonk relays through the local Postfix MTA
(172.18.0.1:25 from the Docker network), which DKIM-signs and delivers
direct-to-recipient-MX; transactional mail goes through Carbonio. Verified
zero smtp2go in any live container env + postfix has no external relayhost.

Removed the stale references so a rebuild/new dev can't re-introduce it:
- api/src/config.ts: SMTP_HOST default mail.smtp2go.com -> co.carrierone.com
- scripts/workers/crypto_payment_worker.py: same default fix
- infra/ansible all.yml: listmonk_smtp_* now 172.18.0.1:25, no auth (+comment)
- app.env.j2 / email.ts / crm.md / go-live-todo.md / architecture.svg: docs
2026-06-17 22:46:59 -05:00
justin
899b880e7f trucking: weekly FMCSA source refresh so new non-compliant carriers are caught
The FMCSA census was a one-time snapshot (last loaded ~May 30) with NO refresh
timer -- carriers newly falling out of MCS-150/UCR compliance were never picked
up. New scripts/workers/fmcsa_source_refresh.py orchestrates the full pipeline
(census download -> enrichment -> deficiency flag -> verify new emails ->
MX-tag new) and runs weekly via cron pw-fmcsa-refresh (Sun 09:00 UTC), codified
in the mail-pipeline Ansible role.

Idempotent + incremental: the census upsert preserves email_verified /
listmonk_sent_at / deficiency_flags, so existing carriers keep their send state
and only census fields refresh; new DOTs flow into verification then campaigns.
A carrier who refiled gets a fresh mcs150_parsed, so the builder's overdue
WHERE clause stops targeting them automatically. Verify is capped per run
(20k) so it never stalls on millions of rows.

(Healthcare already auto-catches newly-revalidation-overdue providers within
its 63k institutional pool via pw-hc-refresh Mon/Wed/Fri.)
2026-06-17 20:44:54 -05:00
justin
4dc5690666 infra: codify the email-campaign pipeline in Ansible (new mail-pipeline role)
The entire outbound campaign pipeline lived ONLY on the host and was never in
IaC -- a fresh rebuild would have silently shipped NO campaigns, NO IP warmup/
ramp, and NO bounce processing. New mail-pipeline role + deploy-mail-pipeline.yml
playbook deploy it from the canonical repo copies:

  cron.d (infra/cron/):
    - pw-trucking-campaign-builder, pw-ifta-campaign, pw-ucr-campaign
    - pw-hc-campaign, pw-hc-nppes, pw-hc-refresh
    - pw-mta-warmup, pw-listmonk-rampcap, pw-hc-rampcap
    - pw-ip-rehab, pw-warmup-tg-alert
  helper scripts (-> /usr/local/bin):
    - pw-mta-warmup, pw-listmonk-rampcap, pw-hc-rampcap, pw-warmup-tg-alert
    - postfix-bounce-notify.sh, postfix-hc-bounce-notify.sh, listmonk-bounce-sync.py
  systemd services:
    - pw-bounce-watcher.service (was missing from repo), pw-hc-bounce-watcher.service

Also creates the deploy-owned {{project_dir}}/logs dir (deploy can't write
/var/log, so a missing dir made cron redirects fail). Added the 6 cron.d files
that existed only on the host, the trucking bounce-watcher unit, and synced
infra/cron/pw-hc-refresh to the live version (revalidation download + enrich
steps). Role wired into site.yml after the mail (OpenDKIM) role.

Part of the email-deliverability incident hardening.
2026-06-17 20:26:01 -05:00
justin
2e4388a803 mail: add logrotate for Postfix mail.log (postlogd copytruncate)
mail.log had no logrotate rule and grew unbounded to ~1GB (~150MB/day)
since Jun 8. This host logs via Postfix's built-in postlogd (maillog_file
mode), not rsyslog (no rsyslog.service exists), so postlogd holds the file
open -- a plain rename+create would leave it writing to the stale inode.
Use copytruncate (no daemon signal needed). Rotate daily, keep 14 days
compressed. Applied live: forced first rotation, compressed the 1GB
archive (->99MB), verified logging + bounce watchers + DKIM signing intact.

Part of the email-deliverability incident hardening (follows DKIM fix 4d59019).
2026-06-17 19:47:13 -05:00
justin
4d5901921e mail: fix OpenDKIM not signing campaign mail (Docker-injected) + codify in Ansible
Root cause of the Jun 2026 deliverability collapse / 'no new sales':
opendkim.conf was in single-key mode with no InternalHosts, so it signed only
127.0.0.1. Transactional/cron mail (injected locally) was signed, but ALL
campaign mail -- injected over the Docker bridge from the Listmonk containers
(172.18.0.5 trucking, 172.18.0.25 healthcare) -- went out UNSIGNED. Gmail/Yahoo
require DKIM on bulk mail since Feb 2024, so cold campaigns were junked/blocked
(~23% delivery, 550-5.7.1). Proof: 2,620 campaign msgs that day, 0 DKIM sigs.

The correct table files already existed on the server but were never wired into
opendkim.conf. Fix points the daemon at key.table/signing.table and sets
InternalHosts/ExternalIgnoreList to trusted.hosts (which includes 172.16.0.0/12,
the Docker subnet). Fixes BOTH streams: HC submission ports 2526-2528 inherit
the global smtpd_milters and *@performancewest.net covers compliance@.

Verified by injecting from a Docker IP through port 25 and port 2526 -- both now
get 'DKIM-Signature field added'. Codified as new Ansible role 'mail' so it
can't silently regress (OpenDKIM was previously not in IaC at all).
2026-06-17 19:31:19 -05:00
justin
01b3e1d234 chore(env): scaffold ISA_SC_DMS_USER/PASS for SC PSC MyDMS e-file portal
Non-attorney 'Service' filer account registered under Performance West
(filings@performancewest.net). Credentials live only in the server .env
(blank default in template, never committed). Consumed by the upcoming SC
intrastate Playwright e-filer.
2026-06-16 08:19:17 -05:00
justin
c27cfd3242 docs(crons): note IRP invoice poller now also handles intrastate [PW-ISA] replies 2026-06-16 07:59:38 -05:00
justin
b125d46663 feat(intrastate): automate state PUC/PSC authority filing (email + invoice + auto-bill)
Intrastate operating authority is state-specific + application-based like IRP, so
it reuses the same email/POA + invoice-reconciliation flow:
  - intrastate_filing.send_intrastate_submission: emails the state PSC/PUC the
    authority application with the signed POA attached (subject tag [PW-ISA CO-..]),
    reusing irp_filing's MinIO download + census enrich helpers.
  - The shared poller (irp_invoice_poller) now matches BOTH [PW-IRP] and [PW-ISA]
    tags, parses the fee, Telegram-alerts, and bills the customer the exact amount
    with the correct service slug.
  - state_trucking gov-fee gate routes intrastate-authority to the PSC/PUC email
    path; if no submission email is configured for the base state it falls back
    to a manual todo (safe default — no emailing guessed agency addresses).

Per-state ISA_<ST>_EMAIL env (blank until the exact agency address is verified).
SC/GA/TX scaffolded. Customer still only sees an exact-fee payment link; you only
approve the final filing.
2026-06-16 07:57:57 -05:00
justin
ea695d6828 feat(govfee): exact fees + agency processing fees; IRP email/invoice reconciliation
- gov_fee: add AGENCY_PROCESSING_FEE (per-service card/convenience fee passed
  through so the customer pays the true all-in cost); estimate_gov_fee now folds
  it into the billed total. IFTA/intrastate/UCR fees are published/near-exact.

- IRP fees can't be looked up — only the base state computes them. New
  irp_filing.py: emails the base-state IRP unit a Schedule A/B request (Reply-To
  the IRP filings mailbox, [PW-IRP CO-...] subject tag), and a 15-min cron
  (irp_invoice_poller) scans the mailbox for the state's invoice reply, parses
  the exact apportioned fee, Telegram-alerts you, and bills the customer the
  EXACT amount via a gov-fee child order + payment link. Then it proceeds to
  ready_to_file for your final approval.

- state_trucking gov-fee gate now routes IRP to the email/invoice path and
  IFTA/intrastate to immediate exact-fee billing.

- Mailbox is configurable (IRP_FILINGS_IMAP_* in app.env.j2); falls back to
  OPS_IMAP_* filtered by the [PW-IRP] tag until a dedicated mailbox exists.

Telegram alerts fire on IRP submission sent, invoice received (billed), and
un-parseable replies (so you can read + enter the fee manually).
2026-06-16 04:58:14 -05:00
justin
d65f5ea279 nginx: stop blocking /admin (bot-scan rule matched our own dashboard)
The shared security snippet blocked any path matching /(admin|administrator|
login.action|struts) with 'return 444', which drops the connection. That bare
'admin' token also matched our own operations dashboard at /admin and the new
/admin/compliance-orders, so the browser showed 'This site can't be reached'.
Dropped the bare 'admin' token; administrator/login.action/struts stay blocked.
Applied live on prod (sudo edit + nginx reload); this updates the source of
truth so the ansible nginx role won't reintroduce it.
2026-06-16 00:05:54 -05:00
justin
138fec17e9 healthcare: daily batched paper-filing fulfillment
Standard (no-login) CMS filings are mailed in one Priority Mail envelope per
destination agency, batched each postal working-day morning to save postage.

- migration 089: paper_filing_batches table + esign_records.paper_batch_id /
  filing_destination_key (idempotent: a filing is batched at most once).
- batch_cover_sheet.py: per-agency cover sheet (sender/dest/date/manifest) +
  merged print-job PDF (cover + all enclosed signed filings).
- daily_paper_batch.py worker: gather signed+unbatched cms855/cms10114 filings,
  group by destination (MAC by state via mac_routing; Fargo for CMS-10114),
  build cover+merged PDF per agency, persist batch, mark filings batched.
  Self-gates on postal working days (skips weekends + federal/USPS holidays).
  Phase 1 = human prints+mails; phase 2 = wire print-mail API.
- worker-crons: pw-paper-batch systemd timer (Mon-Fri 13:30 UTC, self-gated).
- test_paper_batch.py: 15/15 pass (working-day gating, routing, cover+merge).
2026-06-07 00:30:01 -05:00
justin
695c3e2431 security: drop all CBC TLS suites (Qualys WEAK -> AEAD-only, still A+); sync ansible nginx templates (ciphers + ywxi CSP); capture host firewall as IaC 2026-06-06 00:49:21 -05:00
justin
a79d6b1906 feat(healthcare): add gost proxy-relay so Chromium can use the residential proxy
Chromium rejects authenticated SOCKS5 ('Browser does not support socks5 proxy
authentication'). Add a gost (ginuerzh/gost:2.11.5) 'proxy-relay' sidecar that
listens unauthenticated on socks5://proxy-relay:11080 and forwards to the
authenticated residential upstream (HEALTHCARE_PROXY_UPSTREAM_URL). Workers point
Playwright at the relay via HEALTHCARE_PROXY_URL=socks5://proxy-relay:11080.

env template: split into HEALTHCARE_PROXY_UPSTREAM_URL (authenticated, password
percent-encoded so '#' -> %23) and HEALTHCARE_PROXY_URL (the relay address).

Validated end-to-end on dev: workers Chromium -> proxy-relay -> residential
egress IP 76.228.206.147; NPPES + PECOS both HTTP 200.
2026-06-05 18:39:26 -05:00
justin
17318f6e7d feat(healthcare): route NPPES/PECOS Playwright flows through residential SOCKS proxy
CMS healthcare portals (NPPES, PECOS, I&A) block datacenter IPs, so the
healthcare browser automation needs to egress via the residential proxy on
hg409y7ez04.sn.mynetname.net (username 'performancewest').

- undetected_browser: use_proxy now accepts an env-var name, so callers can
  select a domain-specific proxy. _proxy_config(proxy_env) reads it and falls
  back to UNDETECTED_PROXY_URL. Healthcare uses 'HEALTHCARE_PROXY_URL'.
- probe_npi_undetected: launches with use_proxy='HEALTHCARE_PROXY_URL' when set.
- npi_provider: documents that the (future) automated NPPES/PECOS flows must
  use the healthcare proxy.
- Plumb HEALTHCARE_PROXY_URL (+ UNDETECTED_PROXY_URL fallback) through the
  ansible env template and docker-compose workers env.

The credential itself is NOT in the repo. Set the full URL in the ansible
vault as vault_healthcare_proxy_url:
  socks5://performancewest:<password>@hg409y7ez04.sn.mynetname.net:<port>
Verified parsing + Playwright proxy-dict wiring with a unit test.
2026-06-05 14:36:01 -05:00
justin
c027d49f43 Fix trucking campaign cron send date 2026-06-04 03:19:35 -05:00
justin
5c35140a22 Configure trucking deficiency campaign cron env 2026-06-03 23:04:41 -05:00
justin
6d4c323ab6 feat: daily intake-reminder worker for paid orders with incomplete intake
Adds a systemd-timed worker that nudges customers who paid but never completed
their intake form (which stalls fulfillment).

- migration 087: intake_reminder_count + intake_reminder_last_at on
  compliance_orders (makes the daily run idempotent and bounded), plus a
  partial index for the paid-order eligibility scan.
- scripts/workers/intake_reminder.py: each run emails any paid order with
  intake_data_validated != TRUE, capped at 10 reminders/order, at most one
  consolidated email per customer per day (groups a customer's incomplete
  services into one email). Reuses the post-payment intake URL format
  (/order/{slug}?order={n}) and the API's email validation, skipping
  placeholder/invalid addresses (synthetic@, pipeline.com, etc.). Sends via
  smtplib with SMTP_PASS (verified working in the worker container).
- worker-crons: pw-intake-reminder timer, daily ~noon ET (16:00 UTC).
2026-06-03 00:20:37 -05:00
justin
2b13c36c93 ansible: sync portal nginx template with live working config
The pw-portal-tls.conf.j2 template was stale (basic 47-line version) while the
live /etc/nginx/sites-enabled/pw-portal.conf was hand-maintained with branding,
/assets/ and /files/ serving. A future ansible run would have clobbered the
working config. Sync the template to the live config (templatized) and document
why /files/ must be served from /opt/erpnext-assets, not the docker volume.
2026-06-02 22:20:08 -05:00
justin
0b7a35a58e trucking campaigns: daily builder + MX verifier concurrency + tracking column
- build_trucking_campaigns.py: nightly script that creates 8 Listmonk campaigns
  per day (4 TZ x 2 types: MCS-150 overdue 2k/TZ, inactive USDOT 1k/TZ)
  at 4AM ET / 5AM ET (CT) / 6AM ET (MT) / 7AM ET (PT). Deduplicates via
  listmonk_sent_at column.
- migration 083: add listmonk_sent_at + listmonk_campaign_type to fmcsa_carriers
- email_verifier.py: bump max_workers from 5 to 20 for 4x faster throughput
- cron: daily pw-trucking-campaigns at 08:00 UTC (3 AM EST)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-31 10:07:44 -05:00
justin
e0ba8acc90 add pipeline orchestrator, mailbox 1583 flow, EIN + virtual-mailbox services
- Pipeline orchestrator: chains sequential fulfillment for new carrier bundles
  (formation → EIN → USDOT → MC → BOC-3 → MCS-150 → D&A → UCR)
- Mailbox setup: Anytime Mailbox provisioning with USPS 1583 e-sign + online notarization
- New services: ein-application ($79), virtual-mailbox ($149/yr)
- Registered all new handlers in SERVICE_HANDLERS
- Pipeline cron: every 5 minutes
2026-05-30 22:56:54 -05:00
justin
479f3dfc45 add entity upgrade bundle service + deploy completion/IMAP crons
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-30 22:12:11 -05:00
justin
ad3d189b2b post-completion flow: survey, referral program, review ask
- Migration 081: referral_codes, referral_uses, exit_surveys tables
- API: POST /api/v1/survey, POST /api/v1/referral/check, GET /api/v1/referral/:email
- Worker: completion_emails.py — sends completion + 24h follow-up (survey + referral)
- Survey page: /survey/?order=X&rating=N — star rating, feedback, Google review ask
- Referral: REF-FIRSTNAME codes, $25 credit per referred order, no limit
- Low ratings (1-3 stars) trigger Telegram alert for admin follow-up
- Cron: every 15 minutes
2026-05-30 21:22:14 -05:00
justin
97dd08c821 Fix flagged items: CRTC email submission, BITS todo, selector docs, stale plans
- CRTC letter now auto-emailed to secretary.general@crtc.gc.ca after eSign
- BITS admin todo updated to reference electronic + physical submission
- COLIN selectors.py: documented verification status per step
- BC config: added CRTC Secretary General email address
- plan.md: marked completed items (eSign, portal auth, CRTC email)
- go-live-todo.md: marked Compliance Calendar DocType as imported

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-04 11:33:45 -05:00
justin
78c04b8bc3 Add Playwright failure monitoring: Telegram alerts + screenshots + health check
When any Playwright submission fails (selector not found, timeout, etc.):
1. Full-page screenshot captured and uploaded to MinIO
2. Telegram alert sent immediately with error details + screenshot link
3. Email alert to ops with same info
4. Admin todo includes screenshot MinIO path for debugging
5. Client order stays pending for manual completion

Proactive selector health check (daily 7am CT cron):
- Navigates to each portal (FCC RMD, USAC E-File, FCC CPNI/ECFS)
- Verifies all critical selectors are still present in the DOM
- If selectors are missing (UI changed): alerts via Telegram + email
  BEFORE any real client order fails
- Reports which service slugs are affected

Integrated into:
- RMD filing handler (fccprod.servicenowservices.com)
- Form 499-A handler (forms.universalservice.org)
- Form 499-Q handler (already had error handling)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-04 02:44:02 -05:00
justin
572f0cbf93 Implement 499-Q quarterly filing lifecycle
After 499-A+Q bundle is filed, the handler now creates actual
compliance_orders for each remaining quarterly 499-Q filing:

Schedule: Q1 due Feb 1, Q2 due May 1, Q3 due Aug 1, Q4 due Nov 1

Each quarterly order:
- Created as paid (covered by bundle price)
- Has due_date, quarter, period_end_date in intake_data
- Links to parent 499-A order
- Tracks reminder status (30d/14d/7d sent flags)

Notification worker (quarterly_499q_notify.py):
- Runs daily at 8am CT via systemd timer
- Sends HTML reminder emails at 30, 14, 7 days before due
- Email includes intake link for client to submit quarterly data
- Late warning at 7 days: "USAC may estimate higher contributions"
- Idempotent: won't re-send same reminder level

Added fcc-499q service slug ($0, not sold standalone).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-03 02:28:04 -05:00
justin
a4a5500bfc Add Prometheus + Grafana + Alertmanager monitoring stack
Full observability stack with Telegram alerting:

Components:
- Prometheus: metrics collection, 90-day retention
- Grafana: dashboards at monitoring.performancewest.net
- Alertmanager: routes alerts to Telegram bot
- node-exporter: OS metrics (CPU, RAM, disk, network)
- cAdvisor: container metrics (CPU, memory, restarts)
- postgres-exporter: PostgreSQL connection/query metrics
- nginx-exporter: request rate, 5xx errors, connections
- blackbox-exporter: HTTP/TCP endpoint probing + SSL cert checks

Alert rules:
- Service down (HTTP probe, TCP port, container missing)
- Container restart loops
- High CPU/memory/disk/load
- PostgreSQL down or high connections
- SSL cert expiring (14d warning, 3d critical)
- Slow HTTP responses, high 5xx rate

Blackbox probes all public endpoints:
  performancewest.net, api, dev, crm, lists, analytics,
  minio, crypto, pay

Telegram alerts: critical=1h repeat, warning=6h repeat,
  auto-resolve notifications

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 02:08:39 -05:00
justin
97e8664cbf Add security-updates Ansible role for automated patching
Comprehensive security update automation:

1. Debian OS (unattended-upgrades) — tightened to security-only:
   - Removed general Debian updates (prevents feature/breaking changes)
   - Only Debian-Security origins auto-installed
   - Email admin on every upgrade via ops@performancewest.net
   - Auto-reboot at 4 AM if kernel update requires it
   - needrestart auto-restarts services after library updates

2. Docker CE — major version guard:
   - Patch updates within pinned major version auto-applied
   - Major version jumps held + admin alerted for manual review
   - docker-ce, docker-ce-cli, containerd.io all version-guarded

3. Container base images — daily at 3:30 AM:
   - Pulls latest base images for all docker-compose services
   - Compares image digests — only rebuilds if changed
   - Restarts only affected services (not full stack)
   - Alerts admin on rebuild failures requiring manual intervention
   - Covers both prod and dev compose projects

4. k3s — weekly Sunday at 3:45 AM:
   - Patch updates within current minor auto-applied
   - Minor/major upgrades alert admin for manual review
   - Verifies node Ready status after update
   - Alerts on failures with investigation instructions

5. Admin notifications via SMTP:
   - [INFO] for successful patches
   - [WARNING] for available major upgrades needing review
   - [CRITICAL] for failures requiring immediate intervention
   - Falls back to syslog if SMTP unavailable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-30 01:24:57 -05:00
justin
f8cd37ac8c Initial commit — Performance West telecom compliance platform
Includes: API (Express/TypeScript), Astro site, Python workers,
document generators, FCC compliance tools, Canada CRTC formation,
Ansible infrastructure, and deployment scripts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-27 06:54:22 -05:00