justin 2f9005693e Add deep service health monitoring for all PW dependencies
Each service gets its own Prometheus probe verifying actual functionality:
- API: /status endpoint (checks DB connectivity, returns 503 if down)
- Workers: /health endpoint (job server responsive)
- ERPNext: API method call (MariaDB + Redis + app all working)
- MinIO: /minio/health/live (storage accessible)
- Listmonk: /api/health (email service + DB)
- Ollama: root endpoint (LLM inference available)
- Umami: /api/heartbeat (analytics tracking)
- Forgejo: root page (git server accessible)
- PostgreSQL: pg_up metric from postgres-exporter
- All HTTPS endpoints: SSL + reachability from outside
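The per-service probes listed above could be approximated as follows. This is a minimal sketch, not the repository's actual configuration. It assumes blackbox-exporter-style semantics, where a 2xx response counts as success and anything else (including the API's 503) counts as down. The `Probe` type, the helper names, and the hostnames in the usage note are hypothetical; only the endpoint paths come from the commit message.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Probe:
    """One health check target (hypothetical structure for illustration)."""
    service: str
    url: str


def status_to_success(status: int) -> int:
    """Map an HTTP status to a probe_success-style value: 1 for 2xx, else 0."""
    return 1 if 200 <= status < 300 else 0


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for url, or 0 if the endpoint is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses (for example the API's 503) still carry a status.
        return exc.code
    except OSError:
        # Covers URLError, connection refused, DNS failure, and timeouts.
        return 0


def run_probes(
    probes: Iterable[Probe],
    fetch: Callable[[str], int] = fetch_status,
) -> Dict[str, int]:
    """Run every probe and return {service: 1 or 0}."""
    return {p.service: status_to_success(fetch(p.url)) for p in probes}
```

For example, `run_probes([Probe("api", "http://api.internal/status")])` would report `{"api": 0}` when the API answers 503 because its database is unreachable. The `fetch` parameter is injectable so the mapping logic can be exercised without network access.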

Service-specific alerts with context:
- API down = DB may be unreachable
- Workers down = compliance orders not processing
- ERPNext down = CRM inaccessible
- MinIO down = document storage unavailable
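The service-to-impact mapping above can be sketched as a lookup that turns probe results into alert summaries. This is an illustrative sketch only: the `IMPACT` table and the `alert_summaries` function are hypothetical names, the summary wording is taken from the commit message, and the real alerts are presumably defined as Prometheus alerting rules rather than in Python.

```python
from typing import Dict, List

# Impact text per service, copied from the commit message.
IMPACT: Dict[str, str] = {
    "api": "DB may be unreachable",
    "workers": "compliance orders not processing",
    "erpnext": "CRM inaccessible",
    "minio": "document storage unavailable",
}


def alert_summaries(probe_results: Dict[str, int]) -> List[str]:
    """Build one summary line per service whose probe result is 0 (down)."""
    summaries = []
    for service, up in sorted(probe_results.items()):
        if up == 0:
            # Services without a documented impact get a generic note.
            impact = IMPACT.get(service, "impact not documented")
            summaries.append(f"{service} down: {impact}")
    return summaries
```

Keeping the impact text next to the service name means the person receiving the alert sees the business consequence immediately, without having to recall what each service does.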

Custom Grafana dashboard: "Performance West — Services Overview"
- Service status grid (UP/DOWN with colors)
- Response time charts (internal + HTTPS)
- SSL certificate expiry gauges
- Container CPU/memory per service
- PostgreSQL connections, nginx req/s, active alerts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 03:30:23 -05:00
Name | Last commit | Date
.claude/projects/-home-justin-projects-performancewest-new-site/memory | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
api | Enable STIR/SHAKEN card in compliance checker with originate/terminate toggle | 2026-04-29 10:55:00 -05:00
chrome-extension/fcc-access-helper | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
docs | Add engagement authorization, remove price headers from intake pages, fix duplicate emails | 2026-04-28 02:50:02 -05:00
docserver | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
frappe_adyen | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
frappe_ca_registry | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
frappe_crypto | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
infra | Add Prometheus + Grafana + Alertmanager monitoring stack | 2026-05-01 02:08:39 -05:00
mcp | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
monitoring | Add deep service health monitoring for all PW dependencies | 2026-05-01 03:30:23 -05:00
node-compile-cache/v25.1.0-x64-392347a2-1000 | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
performancewest_erpnext | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
scripts | Add terminate-only STIR/SHAKEN option across RMD pipeline | 2026-04-29 10:59:28 -05:00
site | Validate Q1b and Q2 before proceeding to Step 2 | 2026-04-29 11:46:11 -05:00
src | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
.gitignore | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
CLAUDE.md | Update CLAUDE.md with complete deployment guide, infrastructure map, and key patterns | 2026-04-28 02:54:44 -05:00
deploy.sh | Add deploy.sh for git-based deployment | 2026-04-28 02:52:45 -05:00
docker-compose.yml | Fix nginx-exporter: back to bridge network with host.docker.internal | 2026-05-01 03:21:27 -05:00