diff --git a/docs/vm-security-hardening.md b/docs/vm-security-hardening.md
new file mode 100644
index 0000000..1535593
--- /dev/null
+++ b/docs/vm-security-hardening.md
@@ -0,0 +1,60 @@
+# VM / host security hardening
+
+Prod app host: `207.174.124.71` (Debian 13 trixie, k3s + Docker, SSH on 22022).
+
+## Baseline (already in place before 2026-06)
+- SSH: `permitrootlogin no`, `passwordauthentication no`, key-only, port 22022.
+- fail2ban active (sshd, nginx-badbots).
+- unattended-upgrades enabled; 0 pending security updates.
+- TLS: Qualys SSL Labs **A+**, SecurityHeaders **A**, full CSP/HSTS-preload
+  (set in `/etc/nginx/snippets/pw-security.conf` + `pw-site.conf`).
+
+## CRITICAL issue found + fixed (2026-06-06): wide-open host ports
+`ss`/external probing showed a large attack surface reachable from the public
+internet (no host firewall; iptables INPUT policy ACCEPT; kube-router does not
+firewall host ports; Docker publishes container ports via 0.0.0.0 DNAT):
+
+  - **5432 Postgres** (customer DB!), **6443 k8s API**, **10250 kubelet**,
+    **3022 Forgejo SSH**, **9100/9101 Listmonk admin**, **3001/3002 APIs**,
+    **8080, 3033, 3100, 5555/5556, 8888, 4322/4323**, and the hc submission
+    ports **2526-2528** — all OPEN to the internet. Confirmed via off-network
+    `/dev/tcp` probes. Only :25 was provider-filtered.
+
+### Fix
+Two layers, installed as a persistent, boot-enabled systemd service
+`pw-firewall.service` (auto-rollback timer was used during rollout):
+
+1. **`/etc/pw-firewall/pw-firewall.nft`** - dedicated `inet pw_fw` table with an
+   input hook at priority -150 (evaluated before kube-router's ACCEPT). Allows
+   loopback, established/related, internal subnets (127/8, 172.16/12, k3s
+   10.42/16 + 10.43/16, docker/cni/flannel/veth ifaces), ICMP, then a public
+   allow-list **{ 22, 22022, 80, 443 }** and DROPs all other NEW inbound on
+   `ens18`. (Port **25 inbound removed** - the VM only SENDS bulk mail; inbound
+   mail is handled by Carbonio elsewhere. Outbound :25 egress is unaffected.)
+
+2. **`/usr/local/sbin/pw-docker-fw.sh`** - because Docker-published ports are
+   DNAT'd and traverse FORWARD (not input), this adds DOCKER-USER rules:
+   RETURN established, **DROP NEW inbound on `ens18`** (scoped to the uplink so
+   container<->container + nginx->container loopback are untouched), RETURN.
+   Re-applied on docker restart via `docker.service.d/pw-firewall.conf`
+   ExecStartPost.
+
+### Verified after fix (off-network)
+- Sensitive ports 5432/3022/9100/9101/8080/3001/3002/3100/3033/6443/10250/2526/25
+  -> **blocked**. Public 80/443/22022 -> **open**. Site HTTPS -> 200.
+- Internal intact: nginx->container loopback (listmonk 9100/9101 = 200), api->DB
+  `select 1`, container->internet DNS egress, k3s node Ready, outbound mail still
+  relays to Gmail MX from the hc IPs.
+
+## Files
+- `/etc/pw-firewall/pw-firewall.nft`
+- `/usr/local/sbin/pw-docker-fw.sh`
+- `/etc/systemd/system/pw-firewall.service` (enabled)
+- `/etc/systemd/system/docker.service.d/pw-firewall.conf`
+
+## TODO / follow-ups
+- Consider binding the Docker-published ports to `127.0.0.1` in compose (defence
+  in depth) so they never bind 0.0.0.0 in the first place - the firewall already
+  covers it, but compose-level `127.0.0.1:PORT:PORT` is cleaner.
+- k8s API (6443) / kubelet (10250): now firewalled; if remote kubectl is ever
+  needed, allow-list the specific admin source IP rather than reopening.
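
The off-network verification in the doc relies on `/dev/tcp` probes. The following is a minimal sketch of how such a check might be scripted, assuming bash and coreutils `timeout` are available on the probing machine. The function names `probe` and `check_ports` and the 3-second timeout are illustrative and not taken from the rollout itself.

```shell
#!/usr/bin/env bash
# Sketch of an off-network port check using bash's /dev/tcp pseudo-device.
# Function names and the timeout value are assumptions for illustration.
set -u

# probe HOST PORT
# Prints "open" if a TCP connection succeeds, otherwise "blocked".
# "blocked" covers both refused connections and silently dropped packets;
# the timeout bounds the wait for the dropped case.
probe() {
    local host=$1 port=$2
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "open"
    else
        echo "blocked"
        return 1
    fi
}

# check_ports HOST EXPECTED PORT...
# Prints one line per port and returns non-zero if any port's state
# differs from EXPECTED ("open" or "blocked").
check_ports() {
    local host=$1 expected=$2 rc=0 port state
    shift 2
    for port in "$@"; do
        state=$(probe "$host" "$port") || true
        printf '%s:%s %s (expected %s)\n' "$host" "$port" "$state" "$expected"
        [ "$state" = "$expected" ] || rc=1
    done
    return "$rc"
}
```

Run from a machine outside the host's network, something like `check_ports <host> blocked 5432 6443 10250` followed by `check_ports <host> open 80 443 22022` would mirror the two verification bullets. A non-zero exit status flags any port whose state differs from the expectation.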