justin a4a5500bfc Add Prometheus + Grafana + Alertmanager monitoring stack
Full observability stack with Telegram alerting:

Components:
- Prometheus: metrics collection, 90-day retention
- Grafana: dashboards at monitoring.performancewest.net
- Alertmanager: routes alerts to Telegram bot
- node-exporter: OS metrics (CPU, RAM, disk, network)
- cAdvisor: container metrics (CPU, memory, restarts)
- postgres-exporter: PostgreSQL connection/query metrics
- nginx-exporter: request rate, 5xx errors, connections
- blackbox-exporter: HTTP/TCP endpoint probing + SSL cert checks

Alert rules:
- Service down (HTTP probe, TCP port, container missing)
- Container restart loops
- High CPU/memory/disk/load
- PostgreSQL down or high connections
- SSL cert expiring (14d warning, 3d critical)
- Slow HTTP responses, high 5xx rate
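
The two SSL thresholds above (14 days for a warning, 3 days for critical) can be sketched as a small classifier. This is only an illustration of the threshold logic, not the rule file from this commit: the function name `cert_severity` is hypothetical, and the use of strict `<` at each boundary is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Thresholds taken from the alert rules listed above.
WARNING_DAYS = 14
CRITICAL_DAYS = 3


def cert_severity(not_after: datetime, now: datetime) -> Optional[str]:
    """Return "critical", "warning", or None for a certificate expiry time.

    Hypothetical helper: an already-expired certificate is treated as critical.
    """
    remaining = not_after - now
    if remaining < timedelta(days=CRITICAL_DAYS):
        return "critical"
    if remaining < timedelta(days=WARNING_DAYS):
        return "warning"
    return None


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    for days in (30, 10, 2):
        print(days, cert_severity(now + timedelta(days=days), now))
```

Checking the critical threshold first means a certificate inside the 3-day window is reported once, at the higher severity, rather than under both labels.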

Blackbox probes all public endpoints:
  performancewest.net, api, dev, crm, lists, analytics,
  minio, crypto, pay
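
As a rough sketch, the endpoint list above could be expanded into full probe URLs like this. It assumes each short name is a subdomain of performancewest.net served over HTTPS, which the commit message does not state outright; `probe_targets` is a hypothetical helper, not code from the repository.

```python
from typing import List, Sequence

BASE_DOMAIN = "performancewest.net"

# Names as listed above; the empty string stands for the bare domain.
SUBDOMAINS = ("", "api", "dev", "crm", "lists", "analytics", "minio", "crypto", "pay")


def probe_targets(
    base: str = BASE_DOMAIN,
    subdomains: Sequence[str] = SUBDOMAINS,
    scheme: str = "https",  # assumption: every endpoint is probed over HTTPS
) -> List[str]:
    """Build one URL per endpoint, e.g. "api" -> "https://api.<base>"."""
    targets = []
    for name in subdomains:
        host = f"{name}.{base}" if name else base
        targets.append(f"{scheme}://{host}")
    return targets


if __name__ == "__main__":
    for url in probe_targets():
        print(url)
```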

Telegram alerts: critical alerts repeat every 1h, warnings every 6h;
  a notification is also sent when an alert resolves
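
The repeat timing above can be modeled in a few lines. In the actual stack this is Alertmanager's job (its routes have a `repeat_interval` setting), so the following is only a simplified model of the 1h and 6h cadence. The name `should_notify` is hypothetical, and grouping and resolved notifications are not modeled.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Repeat intervals per severity, as stated in the commit message.
REPEAT_INTERVAL = {
    "critical": timedelta(hours=1),
    "warning": timedelta(hours=6),
}


def should_notify(severity: str, last_sent: Optional[datetime], now: datetime) -> bool:
    """Return True if a still-firing alert is due for a (re)notification.

    Simplified model: the first notification goes out immediately, later ones
    only once the severity's repeat interval has elapsed.
    """
    if last_sent is None:
        return True
    return now - last_sent >= REPEAT_INTERVAL[severity]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(should_notify("critical", now - timedelta(minutes=90), now))
    print(should_notify("warning", now - timedelta(minutes=90), now))
```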

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-01 02:08:39 -05:00
.claude/projects/-home-justin-projects-performancewest-new-site/memory Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
api Enable STIR/SHAKEN card in compliance checker with originate/terminate toggle 2026-04-29 10:55:00 -05:00
chrome-extension/fcc-access-helper Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
docs Add engagement authorization, remove price headers from intake pages, fix duplicate emails 2026-04-28 02:50:02 -05:00
docserver Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
frappe_adyen Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
frappe_ca_registry Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
frappe_crypto Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
infra Add Prometheus + Grafana + Alertmanager monitoring stack 2026-05-01 02:08:39 -05:00
mcp Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
monitoring Add Prometheus + Grafana + Alertmanager monitoring stack 2026-05-01 02:08:39 -05:00
node-compile-cache/v25.1.0-x64-392347a2-1000 Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
performancewest_erpnext Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
scripts Add terminate-only STIR/SHAKEN option across RMD pipeline 2026-04-29 10:59:28 -05:00
site Validate Q1b and Q2 before proceeding to Step 2 2026-04-29 11:46:11 -05:00
src Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
.gitignore Initial commit — Performance West telecom compliance platform 2026-04-27 06:54:22 -05:00
CLAUDE.md Update CLAUDE.md with complete deployment guide, infrastructure map, and key patterns 2026-04-28 02:54:44 -05:00
deploy.sh Add deploy.sh for git-based deployment 2026-04-28 02:52:45 -05:00
docker-compose.yml Add Prometheus + Grafana + Alertmanager monitoring stack 2026-05-01 02:08:39 -05:00