new-site/infra
justin 9b9d317916 infra/k8s: shkeeper liveness+readiness probes (fix recurring crypto.performancewest.net downtime)
crypto.performancewest.net kept going down because the shkeeper-deployment web
pod periodically HANGS: the HTTP server deadlocks while the apscheduler
background thread keeps the process alive. The helm chart (shkeeper-1.7.15)
ships NO liveness or readiness probe, so k8s saw the hung pod as Running, never
restarted it, and kept routing traffic to the dead backend. The site stayed
down until a manual restart.

Added HTTP probes on / at port 5000. The app answers 302, which counts as
healthy because k8s HTTP probes pass on any status from 200 through 399.
Liveness auto-restarts a hung pod; readiness pulls it from the Service
endpoints. Applied live via kubectl patch because the chart does not expose
probes via values, so the patch must be re-applied after any helm upgrade
(command in the file header). Verified: the new pod comes up READY 1/1 (probe
passes) and crypto.performancewest.net serves 302 again.
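
The probe patch described above could look roughly like the following sketch, which builds a strategic-merge patch body for `kubectl patch`. It is an illustration, not the command from the file header: the container name `shkeeper`, the timing values, and the helper names `http_probe` and `build_probe_patch` are all assumptions. Only the path `/`, port 5000, and the deployment name `shkeeper-deployment` come from the commit message.

```python
import json


def http_probe(path: str, port: int, period: int, failures: int) -> dict:
    """Build one HTTP GET probe spec (timing values are illustrative guesses)."""
    return {
        "httpGet": {"path": path, "port": port},
        "initialDelaySeconds": 30,     # assumed startup grace period
        "periodSeconds": period,
        "timeoutSeconds": 5,           # a deadlocked server never answers, so it times out
        "failureThreshold": failures,
    }


def build_probe_patch(container_name: str, port: int = 5000, path: str = "/") -> dict:
    """Return a strategic-merge patch that adds both probes to one container.

    Strategic merge matches containers by "name", so container_name must
    equal the name used in the chart's Deployment (assumed here).
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container_name,
                            # Liveness: restart the container after repeated failures.
                            "livenessProbe": http_probe(path, port, period=15, failures=3),
                            # Readiness: drop the pod from Service endpoints sooner.
                            "readinessProbe": http_probe(path, port, period=10, failures=2),
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    # "shkeeper" is a hypothetical container name; check the live Deployment.
    print(json.dumps(build_probe_patch("shkeeper")))
```

Saved under a hypothetical name such as `probe_patch.py`, the output could be passed to `kubectl patch deployment shkeeper-deployment --type strategic -p "$(python probe_patch.py)"`, with `-n` set to whatever namespace the release uses. Giving readiness a shorter period and lower threshold than liveness is a design choice in this sketch, so that traffic is diverted before the restart fires.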
2026-06-09 04:57:50 -05:00
Name | Last commit | Date
ansible | healthcare: daily batched paper-filing fulfillment | 2026-06-07 00:30:01 -05:00
cron | monitoring: daily warmup IP-reputation Telegram alert | 2026-06-08 21:06:41 -05:00
fail2ban | Initial commit — Performance West telecom compliance platform | 2026-04-27 06:54:22 -05:00
firewall | security: drop all CBC TLS suites (Qualys WEAK -> AEAD-only, still A+); sync ansible nginx templates (ciphers + ywxi CSP); capture host firewall as IaC | 2026-06-06 00:49:21 -05:00
k8s | infra/k8s: shkeeper liveness+readiness probes (fix recurring crypto.performancewest.net downtime) | 2026-06-09 04:57:50 -05:00
monitoring | monitoring: daily warmup IP-reputation Telegram alert | 2026-06-08 21:06:41 -05:00
mta-sts | infra: MTA-STS HTTPS vhost (cert issued, policy live) | 2026-06-06 21:03:30 -05:00
nginx | infra: nginx vhost for listmonk-hc admin portal (lists-hc.performancewest.net -> 127.0.0.1:9101, LE cert) | 2026-06-06 07:02:50 -05:00
postfix | feat(email): wire listmonk-hc into deploy + dev override + hc ramp-cap | 2026-06-05 19:19:45 -05:00
systemd | hc email: reframe value-add to 'No 2FA. No government portals.' (we have a portal; the pain is CMS 2FA/identity-proofing); cron creates fresh dated campaign when prior is finished; add hc bounce watcher (Postfix->listmonk-hc webhook, hard/complaint->blocklist) | 2026-06-06 16:47:12 -05:00