new-site

justin 7670608c1a fix(monitoring): render alertmanager.yml from template at deploy (fixes crash loop)

Alertmanager does not expand ${ENV} in its YAML, so the committed config with ${TELEGRAM_BOT_TOKEN}/${TELEGRAM_CHAT_ID} crash-looped it (line 24: cannot unmarshal !!str `${TELEG...` into int64) - 11k+ restarts on prod, alerting dead.

- rename alertmanager.yml -> alertmanager.yml.template (keeps ${} placeholders)
- deploy.sh: envsubst the template into the (gitignored) alertmanager.yml from .env, scoped to the two TELEGRAM vars so the {{ }} Go-template message survives
- gitignore the rendered file (contains the bot token)
- warns if the vars are unset

2026-06-07 04:49:53 -05:00
alert_rules.yml	Fix ContainerHighMemory alert: skip containers with no memory limit	2026-05-01 03:54:16 -05:00
alertmanager.yml.template	fix(monitoring): render alertmanager.yml from template at deploy (fixes crash loop)	2026-06-07 04:49:53 -05:00
blackbox.yml	Fix ERPNext and Forgejo probes	2026-05-01 03:35:45 -05:00
grafana-datasources.yml	Remove fixed uid from Grafana datasource provisioning — Grafana 13 rejects it on fresh boot	2026-05-01 03:09:10 -05:00
prometheus.yml	Fix Forgejo probe: use HTTPS public URL (port 3000 conflicts with Grafana internally)	2026-05-01 03:38:36 -05:00
pw-services-dashboard.json	Fix dashboard stale series + enable Prometheus admin API	2026-05-01 03:43:42 -05:00
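
The render step described in the latest commit could look roughly like the sketch below. This is not the actual contents of deploy.sh, which are not shown here: the function name `render_alertmanager_config`, the argument order, and the warning text are assumptions for illustration. It relies on GNU gettext's `envsubst`, which accepts a list of variable references as its argument and substitutes only those, so any other `$` in the file (for example inside the `{{ }}` Go-template message) is left alone.

```shell
#!/bin/sh
# Hypothetical sketch of the template render step; names are assumptions.
set -eu

render_alertmanager_config() {
    template="$1"   # e.g. alertmanager.yml.template (committed)
    output="$2"     # e.g. alertmanager.yml (gitignored, contains the bot token)

    # Warn, rather than abort, when either variable is missing or empty.
    for var in TELEGRAM_BOT_TOKEN TELEGRAM_CHAT_ID; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "warning: $var is unset; $output will not be usable" >&2
        fi
    done

    # Substitute only the two TELEGRAM variables; everything else,
    # including any other $NAME or {{ ... }} text, passes through unchanged.
    envsubst '${TELEGRAM_BOT_TOKEN} ${TELEGRAM_CHAT_ID}' < "$template" > "$output"
}

# Possible usage from a deploy script (paths are assumptions):
#   set -a; . ./.env; set +a    # export every variable defined in .env
#   render_alertmanager_config alertmanager.yml.template alertmanager.yml
```

The variables must be exported for `envsubst` to see them, which is why the usage comment wraps the `.env` load in `set -a` / `set +a`.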