96 commits

Author SHA1 Message Date
justin
f856434642 Fix service probes: correct endpoints and permissive HTTP module
- Workers: use http_internal module (HTTP/1.0 SimpleHTTPServer)
- ERPNext: use /api/method/ping, accept 401/403 (still means alive)
- Listmonk: use /health not /api/health (403 without auth)
- Forgejo: port 3000 not 3030
- Dev API: probe via HTTPS public URL (blackbox can't reach Docker)
- Added http_internal blackbox module accepting HTTP/1.0 + 401/403

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:33:48 -05:00
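
The permissive probe rule described in f856434642 can be sketched in Python. This is only an illustration of the acceptance logic, not blackbox-exporter code or its configuration; the name `probe_is_alive`, the two constants, and the 2xx baseline are assumptions made for the example.

```python
# Hypothetical model of the "http_internal" acceptance rule; not exporter code.
ALLOWED_VERSIONS = {"HTTP/1.0", "HTTP/1.1"}  # assumption: HTTP/1.0 allowed for SimpleHTTPServer
EXTRA_OK_STATUSES = {401, 403}               # auth errors still prove the service answered


def probe_is_alive(status_code: int, http_version: str) -> bool:
    """Return True when a response should count as 'service is up'."""
    status_ok = 200 <= status_code < 300 or status_code in EXTRA_OK_STATUSES
    return status_ok and http_version in ALLOWED_VERSIONS
```

Under this rule a 403 from ERPNext or Listmonk counts as alive, while a 503 still fails the probe.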
justin
2f9005693e Add deep service health monitoring for all PW dependencies
Each service gets its own Prometheus probe verifying actual functionality:
- API: /status endpoint (checks DB connectivity, returns 503 if down)
- Workers: /health endpoint (job server responsive)
- ERPNext: API method call (MariaDB + Redis + app all working)
- MinIO: /minio/health/live (storage accessible)
- Listmonk: /api/health (email service + DB)
- Ollama: root endpoint (LLM inference available)
- Umami: /api/heartbeat (analytics tracking)
- Forgejo: root page (git server accessible)
- PostgreSQL: pg_up metric from postgres-exporter
- All HTTPS endpoints: SSL + reachability from outside

Service-specific alerts with context:
- API down = DB may be unreachable
- Workers down = compliance orders not processing
- ERPNext down = CRM inaccessible
- MinIO down = document storage unavailable

Custom Grafana dashboard: "Performance West — Services Overview"
- Service status grid (UP/DOWN with colors)
- Response time charts (internal + HTTPS)
- SSL certificate expiry gauges
- Container CPU/memory per service
- PostgreSQL connections, nginx req/s, active alerts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:30:23 -05:00
justin
cc463a662f Fix MinIO health probe: use internal Docker URL instead of public
MinIO returns 403 when accessed via minio.performancewest.net because
it interprets the Host header as a bucket name. Switch blackbox probe
to internal http://minio:9000/minio/health/live which works correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:26:46 -05:00
justin
0a31313956 Fix nginx-exporter: back to bridge network with host.docker.internal
host network mode prevented Prometheus from reaching the exporter.
Switched back to bridge with extra_hosts + explicit port mapping.
Added timeout flag to prevent hanging on stub_status fetch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:21:27 -05:00
justin
433827138b Fix nginx-exporter: use host network mode for direct stub_status access
nginx-exporter couldn't reach host nginx via host.docker.internal
(connection timeout). Switch to network_mode: host so it can access
127.0.0.1:8888 directly. Prometheus scrapes via host.docker.internal
with extra_hosts mapping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:19:57 -05:00
justin
27cc925c4d Fix nginx-exporter port and add alertmanager scrape target
- nginx stub_status moved to port 8888 (port 80 was being caught
  by other server blocks and returning 301)
- nginx-exporter updated to scrape :8888
- Added alertmanager scrape job to Prometheus config (was missing,
  so alertmanager dashboard had no data)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:17:31 -05:00
justin
b38b1af872 Disable Grafana brute force lockout during initial setup
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:11:30 -05:00
justin
b298ec12b7 Remove fixed uid from Grafana datasource provisioning — Grafana 13 rejects it on fresh boot
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:09:10 -05:00
justin
fc324cf7b9 Fix Grafana datasource UID to match dashboard references
Community dashboards reference datasource uid=prometheus but the
auto-generated UID was random. Pin to uid=prometheus for compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:07:03 -05:00
justin
a4a5500bfc Add Prometheus + Grafana + Alertmanager monitoring stack
Full observability stack with Telegram alerting:

Components:
- Prometheus: metrics collection, 90-day retention
- Grafana: dashboards at monitoring.performancewest.net
- Alertmanager: routes alerts to Telegram bot
- node-exporter: OS metrics (CPU, RAM, disk, network)
- cAdvisor: container metrics (CPU, memory, restarts)
- postgres-exporter: PostgreSQL connection/query metrics
- nginx-exporter: request rate, 5xx errors, connections
- blackbox-exporter: HTTP/TCP endpoint probing + SSL cert checks

Alert rules:
- Service down (HTTP probe, TCP port, container missing)
- Container restart loops
- High CPU/memory/disk/load
- PostgreSQL down or high connections
- SSL cert expiring (14d warning, 3d critical)
- Slow HTTP responses, high 5xx rate

Blackbox probes all public endpoints:
  performancewest.net, api, dev, crm, lists, analytics,
  minio, crypto, pay

Telegram alerts: critical=1h repeat, warning=6h repeat,
  auto-resolve notifications

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 02:08:39 -05:00
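
The SSL expiry thresholds listed in a4a5500bfc (14d warning, 3d critical) amount to a simple classification. A minimal Python sketch follows; the function name `cert_severity` is hypothetical, and treating the boundaries as inclusive is an assumption, since the commit does not say how the actual alert rules compare.

```python
from datetime import datetime, timedelta, timezone

WARNING_WINDOW = timedelta(days=14)   # from the commit message
CRITICAL_WINDOW = timedelta(days=3)   # from the commit message


def cert_severity(not_after: datetime, now: datetime) -> str:
    """Classify a certificate by the time left before it expires."""
    remaining = not_after - now
    if remaining <= CRITICAL_WINDOW:   # inclusive boundary is an assumption
        return "critical"
    if remaining <= WARNING_WINDOW:
        return "warning"
    return "ok"
```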
justin
97e8664cbf Add security-updates Ansible role for automated patching
Comprehensive security update automation:

1. Debian OS (unattended-upgrades) — tightened to security-only:
   - Removed general Debian updates (prevents feature/breaking changes)
   - Only Debian-Security origins auto-installed
   - Email admin on every upgrade via ops@performancewest.net
   - Auto-reboot at 4 AM if kernel update requires it
   - needrestart auto-restarts services after library updates

2. Docker CE — major version guard:
   - Patch updates within pinned major version auto-applied
   - Major version jumps held + admin alerted for manual review
   - docker-ce, docker-ce-cli, containerd.io all version-guarded

3. Container base images — daily at 3:30 AM:
   - Pulls latest base images for all docker-compose services
   - Compares image digests — only rebuilds if changed
   - Restarts only affected services (not full stack)
   - Alerts admin on rebuild failures requiring manual intervention
   - Covers both prod and dev compose projects

4. k3s — weekly Sunday at 3:45 AM:
   - Patch updates within current minor auto-applied
   - Minor/major upgrades alert admin for manual review
   - Verifies node Ready status after update
   - Alerts on failures with investigation instructions

5. Admin notifications via SMTP:
   - [INFO] for successful patches
   - [WARNING] for available major upgrades needing review
   - [CRITICAL] for failures requiring immediate intervention
   - Falls back to syslog if SMTP unavailable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-30 01:24:57 -05:00
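
The digest comparison in item 3 of 97e8664cbf (rebuild only what changed, restart only affected services) could look roughly like the Python sketch below. The function name and the dictionary shapes are hypothetical; the commit does not show how the role stores or compares digests.

```python
def services_to_rebuild(
    service_images: dict[str, str],
    old_digests: dict[str, str],
    new_digests: dict[str, str],
) -> list[str]:
    """Return services whose base image digest changed after a pull.

    service_images maps service name -> base image reference.
    old_digests / new_digests map image reference -> digest before / after the pull.
    """
    changed = []
    for service, image in service_images.items():
        # An image missing from either snapshot is treated as changed.
        if old_digests.get(image) is None or old_digests.get(image) != new_digests.get(image):
            changed.append(service)
    return sorted(changed)
```

Services that share an unchanged base image are left alone, which matches the "not full stack" restart behaviour the commit describes.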
justin
611b8a9600 Validate Q1b and Q2 before proceeding to Step 2
Users could skip Q1b (customer type) and Q2 (voice delivery) and
hit Next — the wizard silently defaulted to retail. Now validates:
- Q1b must be answered (customer type selected)
- Q2 must be answered if voice is checked and not wholesale-only

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 11:46:11 -05:00
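
The two validation rules in 611b8a9600 can be expressed as a small check. This Python sketch is illustrative only; the function name, parameter names, and the `"wholesale"` value are assumptions, and the wizard's own code is not shown in the commit.

```python
from typing import Optional


def validate_step1(
    voice_checked: bool,
    customer_type: Optional[str],   # None means Q1b is unanswered
    voice_delivery: Optional[str],  # None means Q2 is unanswered
) -> list[str]:
    """Return the questions that still block the move to Step 2."""
    missing = []
    if customer_type is None:
        missing.append("Q1b")
    # Q2 is hidden for wholesale-only carriers, so it is not required for them.
    if voice_checked and customer_type != "wholesale" and voice_delivery is None:
        missing.append("Q2")
    return missing
```

An empty list means the user may proceed; nothing is silently defaulted to retail.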
justin
fbf3b8a1ea Add terminate-only STIR/SHAKEN option across RMD pipeline
STIRShakenStep intake:
- New "Terminate only" option for carriers that only receive pre-signed
  calls and don't originate
- Contextual hints for each option explaining requirements
- Show/hide vendor and upstream fields based on selection

RMD letter generator:
- New terminate_only section explaining verification-only posture,
  citing 47 CFR § 64.6301 (signing obligation on originating provider)
- Added to needs_exhibit_a list

RMD Exhibit A generator:
- New terminate_only STIR/SHAKEN paragraph with SBC verification language
- Fixed scope paragraph: wholesale/facilities carriers no longer get
  "small provider without Class 4 switch" boilerplate
- Fixed OCN paragraph: wholesale carriers get neutral wording instead
  of "no OCN required for small retail provider"

RMD filing handler:
- Maps stir_shaken_status to rmd_option for Exhibit A generation
- Passes entity metadata (ocn, wholesale, gateway, contact) to generator
- Maps terminate_only → partial_implementation for FCC RMD form radio

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 10:59:28 -05:00
justin
050b19a43a Enable STIR/SHAKEN card in compliance checker with originate/terminate toggle
- Uncomment STIR/SHAKEN check in fcc-lookup.ts — shows self-reported
  implementation status from RMD filing
- Add toggle: "Do you originate calls or only terminate?"
  - Terminate only → green, signing cert not required, file RMD as
    partial implementation
  - Originate or both → red, must have own STI certificate as of
    June 2025
- Toggle integrates with pending-question system (CTA waits for answer)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 10:55:00 -05:00
justin
b02b5b4c1f Add STIR/SHAKEN originate vs terminate guidance in Q3
Wholesale providers that only receive/terminate pre-signed calls don't
need a STIR/SHAKEN signing certificate. Info box explains: originating
providers must sign with own cert (as of June 2025), but
terminating-only providers just verify signatures (software config)
and file RMD as "partial implementation."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 10:52:18 -05:00
justin
3ea47b52ed Fix Q5 showing retail variant for wholesale carriers
Q5 was shown immediately on voice/broadband checkbox, before Q1b
(customer type) was answered — always defaulting to retail variant.
Now Q5 only appears after Q1b is answered, and the correct variant
(retail vs wholesale) is set at that point.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 10:48:53 -05:00
justin
653837617e Clarify non-interconnected VoIP with examples
Explain that non-interconnected VoIP means voice apps without phone
numbers (e.g. Microsoft Teams, Discord) to distinguish from
interconnected VoIP which connects to the PSTN.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 10:08:51 -05:00
justin
dec69ffc0e Add contact-us notice for non-standard service types
Non-interconnected VoIP, satellite, paging, private radio, and other
specialized services have different registration requirements. Show
a yellow info box under Q1 directing these users to contact us for
a custom assessment instead of using the wizard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 10:02:02 -05:00
justin
337528b08a Split Q5 into retail vs wholesale variants
Retail carriers: ask where end-user customers are located, explain
state PUC nexus (registration triggered by customer location, not
incorporation). Wholesale carriers: simplified question about carrier
customer regions, explains they generally don't need state PUC
registration since the retail carrier holds the state obligation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 10:01:46 -05:00
justin
9c4d65c7a9 Skip Q2 voice delivery for wholesale-only carriers
Wholesale voice carriers run their own switching by definition — the
Q2 options (reseller, UCaaS, own switch) are retail delivery models.
When wholesale is selected in Q1b, Q2 is hidden and Q3 (infrastructure
needs: LCR, DIDs, interconnection, STIR/SHAKEN) is shown directly.
voiceDelivery is auto-set to own_switch for wholesale.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 09:59:03 -05:00
justin
3273a7020e Rewrite Q5 to ask where customers are located, explain state PUC nexus
State PUC registration is triggered by customer location, not
incorporation state. Telecom services use local infrastructure
(switches, numbers, towers) creating attributional nexus. Rewritten
Q5 explains this and provides guidance: ~27 states need full
certification, ~9 registration only, ~5 minimal. Shows PUC info
box for multi-state/nationwide selections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 09:44:41 -05:00
justin
f6809730e5 Recommend Canada CRTC registration for international carriers
When carrier selects "Yes" to international services on Q6, show a
Canada CRTC recommendation box explaining: direct Canadian DIDs,
lower termination rates vs US gateway routing, CRTC interconnection.
Links to /order/canada-crtc with pricing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 09:39:03 -05:00
justin
eee3af0919 Add CALEA retail/wholesale toggle to FCC compliance checker
When CALEA SSI card shows red/amber, ask if carrier serves end users
(retail) or only other carriers (wholesale). Wholesale-only carriers
are exempt as "interconnecting carriers" under CALEA — card turns
green. Retail/both keeps the red status. Toggle integrates with
existing pending-question system (CTA waits for all toggles answered).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 09:38:35 -05:00
justin
c6863f7eae Add CALEA wholesale exemption, international Q6, Section 214 add-on
- CALEA: wholesale-only carriers exempt as "interconnecting carriers" per
  FCC rules. Only retail/both customer types trigger CALEA SSI requirement.
- New Q6 "Will you offer international services?" appears after Q5.
  Triggers International Section 214 Authorization add-on ($1,499).
- Section 214 info box explains when it's required.
- Customer type question (Q1b) is now visually separate from service
  type checkboxes to avoid confusion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 09:35:27 -05:00
justin
790e980ef8 Separate customer type into its own question section
The retail/wholesale radios were visually mixed in with the voice/broadband
checkboxes, making it easy to misread "Wholesale" as a service type.
Moved to a distinct Q1b section "Who are your customers?" that only
appears after checking voice or broadband. Single selection covers
both services (retail / wholesale / both).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 09:33:42 -05:00
justin
95d4779660 Split retail/wholesale into per-service radios for voice and broadband
Voice and broadband can have independent customer models (e.g. wholesale
voice + retail broadband). Each service type now gets its own inline
retail/wholesale/both radio when checked. Derivation logic updated:
- Voice carriers always need RMD+CPNI regardless of mode
- BDC only required when broadband has retail end users
- CALEA still triggered by voice or facilities-based broadband

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 09:12:18 -05:00
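
The derivation rules listed in 95d4779660 can be sketched as a pure function. This is a Python illustration of the three bullets as written in that commit only (later commits add a CALEA wholesale exemption); the function name, mode strings, and the `facilities_based` flag are hypothetical.

```python
from typing import Optional


def required_filings(
    voice_mode: Optional[str] = None,      # "retail", "wholesale", "both", or None if unchecked
    broadband_mode: Optional[str] = None,  # same values
    facilities_based: bool = False,        # assumption: applies to the broadband service
) -> set[str]:
    """Derive which filings apply from the per-service customer model."""
    filings: set[str] = set()
    if voice_mode is not None:
        filings |= {"RMD", "CPNI"}         # voice always needs both, regardless of mode
    if broadband_mode in ("retail", "both"):
        filings.add("BDC")                 # only when broadband has retail end users
    if voice_mode is not None or (broadband_mode is not None and facilities_based):
        filings.add("CALEA")
    return filings
```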
justin
b473bf1783 Fix pricing calculation: remove feedback loop in updatePrice
wizard.addonFee was being subtracted and then recalculated each call,
causing prices to accumulate/subtract randomly on checkbox toggle.
Simplified to just sum base + checked addons + formation fees.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 08:55:21 -05:00
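
The fix in b473bf1783 replaces an incremental update with a recomputation from scratch. A minimal Python sketch of that idea follows, using the prices stated in 830f5ae738; the function and key names are hypothetical, and the real `updatePrice` lives in the wizard's front-end code, which is not shown here.

```python
BASE_FEE = 1299                                   # base package
ADDON_FEES = {"stir_shaken": 499, "ocn": 2650}    # flat add-ons
STATE_PUC_FEE = 399                               # per state


def compute_total(checked_addons: set[str], puc_states: int = 0, formation_fee: int = 0) -> int:
    """Recompute the total from current selections; no running state is kept."""
    addon_total = sum(ADDON_FEES[name] for name in checked_addons)
    return BASE_FEE + addon_total + STATE_PUC_FEE * puc_states + formation_fee
```

Because the total depends only on the current selections, toggling a checkbox on and off returns to the same price, which is the property the feedback loop had broken.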
justin
2927b5cebb Add FCC Carrier/ISP Registration: API, checkout, handler, dispatch
Phase 3-5:
- API: POST /api/v1/fcc-carrier-registration (order creation with pricing)
- API: GET /api/v1/fcc-carrier-registration/:id (status)
- API: GET /api/v1/fcc-carrier-registration/state-fees (formation fees)
- Checkout: fcc_carrier_registration order type with Stripe line items
- Payment handler: dispatch worker + send confirmation email
- Pipeline handler: 8-step CRTC-style pipeline (formation → CORES → 499 →
  DC Agent → State PUC → RMD/CPNI/CALEA/BDC → add-ons → final review)
- Job server dispatch map entry
- Service page CTA updated to link to order page

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 08:48:36 -05:00
justin
830f5ae738 All standard registrations included in base $1,299, only add-ons are extra
Base package includes CORES/FRN, Form 499, DC Agent, RMD, CPNI, CALEA,
BDC — all shown as included with checkmarks. Wizard determines which
are relevant (grayed out if not needed for service type).

Only STIR/SHAKEN (+$499), OCN (+$2,650), State PUC (+$399/state),
and formation are itemized add-ons.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 08:40:56 -05:00
justin
2312edf5df Add FCC Carrier/ISP Registration: migration + order page
Phase 1-2 of the new registration product:
- Migration 075: fcc_carrier_registrations table with full pipeline status,
  service wizard answers, entity choice, pricing, idempotency tracking
- Order page with 5-step wizard:
  1. Service wizard (voice/broadband/wholesale + delivery method + infra needs)
  2. Registration checklist (auto-determined + add-ons with dynamic pricing)
  3. Entity choice (existing FRN search OR new formation with nexus guidance)
  4. Contact & officer info
  5. Review & payment with engagement clickwrap

Still needed: API endpoint, checkout integration, worker pipeline handler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 08:39:03 -05:00
justin
94ce14dc17 Explain IPES = VoIP provider in plain language, expand service description
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 08:14:55 -05:00
justin
118d24cc1a Rename 'IPES & ISP Registrations' to 'FCC Carrier / ISP Registration'
Updated across 61 static HTML files (nav links), bundles catalog,
service page title/description/heading, and llms.txt.
URL stays /services/telecom/ipes-isp (no redirect needed).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 08:06:47 -05:00
justin
424a7f3b2d Add missing updateResource import
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:49:07 -05:00
justin
28b407eea6 Fix portal user linking: use updateResource instead of set_value for child table
set_value doesn't work for child tables like portal_users. Use
updateResource (PUT /api/resource/Customer) which handles it correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:48:05 -05:00
justin
02d2415d7a Fix escaped backtick that broke Docker Astro build
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:42:12 -05:00
justin
e1b95a20fb Show 'Intake Already Completed' screen with Revise button on revisit
When the user revisits a completed intake (intake_data_validated=true),
shows a success screen with Go to Portal and Revise buttons instead of
the blank form. Revise adds ?revise=1 to bypass the check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:40:48 -05:00
justin
314a711e95 Fix: add batch_id + engagement columns to job_server PG query
batch_id was missing from the SELECT, so order_data.batch_id was always
None. This meant the batch email skip in _request_entity_intake never
triggered, causing duplicate intake emails for every batch order.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:35:56 -05:00
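
The failure mode in 314a711e95 (a column left out of the SELECT silently reads as None downstream) can be reproduced in a few lines. This Python sketch uses an in-memory SQLite table in place of the real PostgreSQL query; the table layout, `fetch_order`, and `should_send_intake_email` are hypothetical stand-ins for the job server's query and the skip check in `_request_entity_intake`.

```python
import sqlite3


def fetch_order(conn: sqlite3.Connection, order_id: int, columns: list[str]) -> dict:
    """Load one order as a dict containing only the selected columns."""
    conn.row_factory = sqlite3.Row
    # Column names are trusted constants here, never user input.
    sql = f"SELECT {', '.join(columns)} FROM orders WHERE id = ?"
    return dict(conn.execute(sql, (order_id,)).fetchone())


def should_send_intake_email(order_data: dict) -> bool:
    """Assumed skip rule: orders in a batch do not get an individual email."""
    return order_data.get("batch_id") is None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, batch_id TEXT)")
conn.execute("INSERT INTO orders (id, batch_id) VALUES (1, 'B-42')")

buggy = fetch_order(conn, 1, ["id"])              # batch_id not selected
fixed = fetch_order(conn, 1, ["id", "batch_id"])  # batch_id selected
```

With the narrower column list the batch order looks like a standalone order, so the skip never triggers; adding the column restores it.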
justin
e49efb7207 Scroll to page title on step navigation instead of wizard body
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:31:34 -05:00
justin
f6f4853ab6 Scroll to top of wizard on Next/Back step navigation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:29:32 -05:00
justin
27108b9080 Change prefill notice to 'public sources' instead of 'FCC records'
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:28:40 -05:00
justin
1f87ad4554 Add pre-fill review notice on every step, fix CORES address detection
- Yellow banner on each step: "We've pre-filled this from your FCC records.
  Please review carefully and correct any information that is inaccurate."
- Only shows when accessed via token/FRN (post-payment intake)
- CORES address: filter by company name suffix (LLC/Inc/Corp) instead of
  requiring a number — addresses like "PO Box 123" now work

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:26:25 -05:00
justin
42f331101e Skip CORES address suggestion when it's just the company name
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:24:10 -05:00
justin
63f74e8486 Style officer suggestions as clickable cards with arrow hint
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:22:25 -05:00
justin
834d2fc1ee Keep officer suggestions visible after selection, highlight chosen one
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:18:11 -05:00
justin
dcdc6df879 Fix CategoryStep crashing non-499-A intake pages
CategoryStep script runs on ALL pages using the Wizard component.
It tried to find #pw-wizard (its inner quiz div) and called
querySelectorAll on it — null on CPNI/RMD/etc pages, crashing the
entire script bundle. This prevented FRN auto-fill, officer
suggestions, and all other intake functionality.

Guard: if #pw-wizard doesn't exist, skip all CategoryStep logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:13:06 -05:00
justin
acf63eb819 Officer suggestions: use FCC data (RMD contact, CORES address) instead of entity_cache
Entity cache has no RA/officer data yet. Instead, fetch the FCC lookup
(quick mode) and offer RMD contact name + address and CORES principal
address as clickable suggestions to auto-fill Officer 1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 01:02:18 -05:00
justin
6a0162f0a9 Simplify Officer step: remove count dropdown, officers 2+3 optional
- Removed "How many officers" dropdown — all 3 always visible
- Officers 2 and 3 marked as (optional) in legend
- Only Officer 1 validated (name, title, street, city required)
- Blank optional officers skipped in saved data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-29 00:56:45 -05:00
justin
59c2d06736 Always show corp suggestions on Officer step, check intake_data for name
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 23:37:22 -05:00
justin
f5d307a1e8 Fix corp search LIMIT type: cast to int
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 23:05:43 -05:00
justin
159a576157 Add corporate record suggestions on Officer step, search all states
- OfficerStep searches entity_cache for matching corporations when loaded
- Shows clickable suggestions with RA name and address to auto-fill Officer 1
- Pre-fills contact_name/email/phone from entity data (helps data-only filers)
- Corp search endpoint: state param now optional (searches all states)
- Corp search returns registered_agent and principal_address fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 23:02:16 -05:00