Compare commits

2 commits

Author SHA1 Message Date
justin
4d5901921e mail: fix OpenDKIM not signing campaign mail (Docker-injected) + codify in Ansible
Root cause of the Jun 2026 deliverability collapse / 'no new sales':
opendkim.conf was in single-key mode with no InternalHosts, so it signed only
127.0.0.1. Transactional/cron mail (injected locally) was signed, but ALL
campaign mail -- injected over the Docker bridge from the Listmonk containers
(172.18.0.5 trucking, 172.18.0.25 healthcare) -- went out UNSIGNED. Gmail/Yahoo
require DKIM on bulk mail since Feb 2024, so cold campaigns were junked/blocked
(~23% delivery, 550-5.7.1). Proof: 2,620 campaign msgs that day, 0 DKIM sigs.

The correct table files already existed on the server but were never wired into
opendkim.conf. Fix points the daemon at key.table/signing.table and sets
InternalHosts/ExternalIgnoreList to trusted.hosts (which includes 172.16.0.0/12,
the Docker subnet). Fixes BOTH streams: HC submission ports 2526-2528 inherit
the global smtpd_milters and *@performancewest.net covers compliance@.

Verified by injecting from a Docker IP through port 25 and port 2526 -- both now
get 'DKIM-Signature field added'. Codified as new Ansible role 'mail' so it
can't silently regress (OpenDKIM was previously not in IaC at all).
2026-06-17 19:31:19 -05:00
justin
f7212b3969 scripts: one-off fresh password-set link for Paul Wilson (ERPNext auth) 2026-06-17 10:19:53 -05:00
8 changed files with 286 additions and 0 deletions

@ -115,3 +115,59 @@ echo $(( ($(date +%s) - $(sudo cat /etc/postfix/pw-warmup-start)) / 86400 ))
- `/etc/postfix/main.cf.bak.*`
- `/etc/postfix/transport.bak.*`
- `/usr/local/bin/pw-mta-warmup.bak.*`

## Incident: Jun 17 2026 — campaign mail sent UNSIGNED (no DKIM)

**Symptom:** "no new sales." Campaigns were sending (~3-4k/day) but delivery was
~23% (sent 1,802 vs deferred 5,143 + bounced 580), Gmail returned `550-5.7.1
likely unsolicited mail`, and there were **zero clicks since Jun 8** despite
~600 opens/day.
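
The ~23% figure follows from the three counters in the symptom. A minimal sketch of that arithmetic, assuming delivery rate means sent / (sent + deferred + bounced); the `delivery_rate` helper is a hypothetical name used only for illustration:

```bash
# Hypothetical helper: delivery rate as a percentage with one decimal place.
# Assumes rate = sent / (sent + deferred + bounced).
delivery_rate() {
  awk -v s="$1" -v d="$2" -v b="$3" \
    'BEGIN { t = s + d + b; if (t > 0) printf "%.1f\n", 100 * s / t; else print "0.0" }'
}

# Counters from the incident day: sent 1,802, deferred 5,143, bounced 580.
delivery_rate 1802 5143 580
```
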
**Root cause:** OpenDKIM was signing **nothing** that came from Listmonk.
`/etc/opendkim.conf` was in single-key mode with **no `InternalHosts`**, so it
defaulted to signing only `127.0.0.1`. Cron/transactional mail is injected
locally (127.0.0.1) so it WAS signed — but campaign mail is injected over the
Docker bridge from the Listmonk containers (`172.18.0.5` trucking,
`172.18.0.25` healthcare). Those clients were not "internal," so OpenDKIM
*verified* (instead of *signed*) them: every cold email went out **unsigned**.
Since Feb 2024 Gmail/Yahoo require DKIM on bulk mail, so unsigned campaigns were
junked/blocked. Proof: `2,620` campaign messages that day, `0` "DKIM-Signature
field added" events, while the every-5-min cron mail was signed.
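
The sign-versus-verify decision comes down to whether the connecting client's IP matches an `InternalHosts` entry. A rough sketch of that membership test, assuming IPv4 addresses and CIDR entries only (OpenDKIM's own matcher also accepts hostnames, domain names and negated entries); the function names are hypothetical and this is not OpenDKIM's code:

```bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=.
  set -- $1
  echo "$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))"
}

# Succeed if IP ($1) falls inside ENTRY ($2), a plain address or a CIDR block.
in_cidr() {
  local ip=$1 net=${2%/*} bits=32
  case $2 in */*) bits=${2#*/} ;; esac
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Succeed if IP ($1) matches any of the remaining arguments.
is_internal() {
  local ip=$1 entry
  shift
  for entry in "$@"; do
    in_cidr "$ip" "$entry" && return 0
  done
  return 1
}

# Default scope (no InternalHosts): only 127.0.0.1 counts as internal.
is_internal 172.18.0.5 127.0.0.1 && echo "sign" || echo "not internal"
# With the Docker range listed, the Listmonk container is covered.
is_internal 172.18.0.5 127.0.0.1 172.16.0.0/12 && echo "sign" || echo "not internal"
```
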
The correct table files already existed (`/etc/opendkim/{key.table,
signing.table,trusted.hosts}`, and `trusted.hosts` already listed
`172.16.0.0/12`) — they were simply **never wired into `opendkim.conf`**.

**Fix (now codified in Ansible `roles/mail`):** point `opendkim.conf` at the
tables and set the signing scope —
```
KeyTable refile:/etc/opendkim/key.table
SigningTable refile:/etc/opendkim/signing.table
InternalHosts /etc/opendkim/trusted.hosts # includes 172.16.0.0/12 (Docker)
ExternalIgnoreList /etc/opendkim/trusted.hosts
OversignHeaders From
```
then `systemctl restart opendkim`. This fixes BOTH streams at once: the
healthcare submission instances (ports 2526-2528) inherit the global
`smtpd_milters` and the `*@performancewest.net` signing table covers
`compliance@`. Verified by injecting a message from a Docker IP through both
port 25 and port 2526 and confirming "DKIM-Signature field added" for each.

**Verify DKIM is actually signing campaign mail:**
```bash
# Should be NON-ZERO and roughly track campaign volume:
sudo journalctl -u opendkim --since today | grep -c 'DKIM-Signature field added'
# Cross-check: campaign cleanup events today (should be similar order of magnitude)
sudo grep "^$(date '+%b %e')" /var/log/mail.log | grep -c postfix/cleanup
# Key still matches published DNS:
sudo opendkim-testkey -d performancewest.net -s mail -vvv # expect "key OK"
```
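
The two counts above could be folded into a pass/fail check for a cron job or monitoring probe. This is a sketch under assumptions: the `dkim_signing_ok` name and the 90% default threshold are illustrative choices, not values from this repo, and the counts are passed in as arguments so the logic can be tried without reading the journal.

```bash
# Hypothetical check: succeed when at least MIN_PCT percent of messages were signed.
# Usage: dkim_signing_ok SIGNED TOTAL [MIN_PCT]
dkim_signing_ok() {
  local signed=$1 total=$2 min_pct=${3:-90}
  # No mail at all is not treated as a signing failure.
  [ "$total" -eq 0 ] && return 0
  [ $(( signed * 100 )) -ge $(( total * min_pct )) ]
}

# The incident day: 2,620 campaign messages, 0 signatures.
dkim_signing_ok 0 2620 || echo "ALERT: campaign mail is going out unsigned"
```

In practice the two arguments would come from the `grep -c` commands above, captured into variables. The cleanup count and the signature count will not match exactly, so the threshold needs tuning against a known-good day.
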
**Still TODO from this incident (list quality + content, not yet done):**
- Scrub dead rural/satellite ISPs + dead M365 tenants from audiences and
suppress repeat-deferring/bouncing domains (extend `_email_exclusions.py`).
- Throttle/pause Gmail until reputation recovers (`550-5.7.1` was still firing).
- Add a plaintext (altbody) MIME part — all campaigns are currently HTML-only,
itself a spam signal.
- Fix the self-bounce cron emailing the nonexistent `deploy@performancewest.net`
(~700 self-inflicted `550` bounces/day).
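
For the first TODO item, a starting point could be a tally of failing recipient domains taken from the Postfix log. The sketch below assumes the usual `to=<user@domain>, ... status=deferred|bounced` delivery lines; the `failing_domains` name is hypothetical, the sample lines are made up, and how the result would feed into `_email_exclusions.py` is not covered here.

```bash
# Hypothetical helper: read Postfix log lines on stdin and print
# "count domain" for recipients whose delivery was deferred or bounced,
# most frequent first.
failing_domains() {
  grep -E 'status=(deferred|bounced)' \
    | sed -n 's/^[^<]*to=<[^@>]*@\([^>]*\)>.*/\1/p' \
    | tr '[:upper:]' '[:lower:]' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'
}

# Made-up sample lines in Postfix's delivery-log format.
failing_domains <<'EOF'
Jun 17 10:00:01 mx postfix/smtp[101]: A1: to=<a@deadisp.example>, relay=none, status=deferred (connect timed out)
Jun 17 10:00:02 mx postfix/smtp[102]: A2: to=<b@deadisp.example>, relay=none, status=bounced (host not found)
Jun 17 10:00:03 mx postfix/smtp[103]: A3: to=<c@ok.example>, relay=mx.ok.example[192.0.2.1]:25, status=sent (250 OK)
EOF
```

Against the real log this would be run as `sudo cat /var/log/mail.log | failing_domains`; domains that stay at the top across several days are candidates for suppression.
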

@ -15,6 +15,7 @@
# minio — MinIO object storage + bucket creation
# workers — Python job server + Ollama LLM
# shkeeper — k3s + Helm + SHKeeper (crypto payments: BTC/ETH/USDC/Polygon/TRX/BNB/LTC)
# mail — OpenDKIM signing for outbound Postfix mail (incl. Listmonk campaigns)
# nginx — nginx + certbot TLS for all domains + fail2ban
- name: Provision Performance West server
@ -31,6 +32,7 @@
- workers
- worker-crons
- shkeeper
- mail
- nginx
- monitoring
- security-updates

@ -0,0 +1,22 @@
---
# OpenDKIM signing for outbound mail (Postfix milter).
#
# CRITICAL: campaign mail is injected into Postfix from the Listmonk containers
# over the Docker bridge network, NOT from localhost. OpenDKIM only signs mail
# whose client is in InternalHosts; if the Docker subnet is missing there,
# OpenDKIM *verifies* (rather than *signs*) campaign mail, so every cold email
# goes out UNSIGNED. Since Feb 2024 Gmail/Yahoo require DKIM on bulk mail, so
# unsigned campaigns get junked/blocked (this caused the Jun 2026 deliverability
# collapse: ~23% delivery, Gmail 550-5.7.1). The Docker subnet below MUST be in
# opendkim_internal_hosts.
opendkim_selector: mail
opendkim_signing_domain: performancewest.net
opendkim_socket: "inet:8891@localhost"
# Hosts OpenDKIM will SIGN for (vs verify). Must include the Docker bridge
# subnet so Listmonk container traffic is signed.
opendkim_internal_hosts:
- "127.0.0.1"
- "localhost"
- "172.16.0.0/12" # Docker bridge networks (Listmonk, workers, etc.)
- "10.0.0.0/8"

@ -0,0 +1,10 @@
---
- name: Restart opendkim
  ansible.builtin.systemd:
    name: opendkim
    state: restarted

- name: Reload postfix
  ansible.builtin.command:
    cmd: postfix reload
  changed_when: true

@ -0,0 +1,98 @@
---
- name: Install OpenDKIM + tools
  ansible.builtin.apt:
    name:
      - opendkim
      - opendkim-tools
    state: present

- name: Ensure OpenDKIM key directory exists
  ansible.builtin.file:
    path: "/etc/opendkim/keys/{{ opendkim_signing_domain }}"
    state: directory
    owner: opendkim
    group: opendkim
    mode: "0750"

- name: Generate DKIM keypair if missing
  ansible.builtin.command:
    cmd: >-
      opendkim-genkey
      -b 2048
      -d {{ opendkim_signing_domain }}
      -s {{ opendkim_selector }}
      -D /etc/opendkim/keys/{{ opendkim_signing_domain }}
    creates: "/etc/opendkim/keys/{{ opendkim_signing_domain }}/{{ opendkim_selector }}.private"
  register: dkim_keygen

- name: Fix DKIM private key ownership
  ansible.builtin.file:
    path: "/etc/opendkim/keys/{{ opendkim_signing_domain }}/{{ opendkim_selector }}.private"
    owner: opendkim
    group: opendkim
    mode: "0600"

- name: Show DKIM public DNS record to publish (only when newly generated)
  ansible.builtin.debug:
    msg: >-
      A new DKIM key was generated. Publish the TXT record from
      /etc/opendkim/keys/{{ opendkim_signing_domain }}/{{ opendkim_selector }}.txt
      at {{ opendkim_selector }}._domainkey.{{ opendkim_signing_domain }}
  when: dkim_keygen is changed

- name: Deploy OpenDKIM KeyTable
  ansible.builtin.copy:
    dest: /etc/opendkim/key.table
    content: |
      {{ opendkim_selector }}._domainkey.{{ opendkim_signing_domain }} {{ opendkim_signing_domain }}:{{ opendkim_selector }}:/etc/opendkim/keys/{{ opendkim_signing_domain }}/{{ opendkim_selector }}.private
    owner: root
    group: root
    mode: "0644"
  notify: Restart opendkim

- name: Deploy OpenDKIM SigningTable
  ansible.builtin.copy:
    dest: /etc/opendkim/signing.table
    content: |
      *@{{ opendkim_signing_domain }} {{ opendkim_selector }}._domainkey.{{ opendkim_signing_domain }}
    owner: root
    group: root
    mode: "0644"
  notify: Restart opendkim

- name: Deploy OpenDKIM trusted/internal hosts (MUST include Docker subnet)
  ansible.builtin.template:
    src: trusted.hosts.j2
    dest: /etc/opendkim/trusted.hosts
    owner: root
    group: root
    mode: "0644"
  notify: Restart opendkim

- name: Deploy opendkim.conf (table signing + InternalHosts)
  ansible.builtin.template:
    src: opendkim.conf.j2
    dest: /etc/opendkim.conf
    owner: root
    group: root
    mode: "0644"
    validate: "opendkim -n -f -x %s"
  notify: Restart opendkim

- name: Ensure OpenDKIM is enabled and running
  ansible.builtin.systemd:
    name: opendkim
    enabled: true
    state: started
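
- name: Wire Postfix to the OpenDKIM milter
  # Postfix expects inet:host:port, whereas opendkim_socket uses OpenDKIM's
  # inet:port@host syntax, so the endpoint is spelled out separately here.
  # Keep the port in step with opendkim_socket.
  vars:
    opendkim_postfix_milter: "inet:localhost:8891"
  ansible.builtin.command:
    cmd: "postconf -e {{ item }}"
  loop:
    - "smtpd_milters={{ opendkim_postfix_milter }}"
    - "non_smtpd_milters={{ opendkim_postfix_milter }}"
    - "milter_default_action=accept"
    - "milter_protocol=6"
  register: postfix_milter
  # postconf -e prints nothing to compare against, so report the task as
  # changed; with changed_when: false the "Reload postfix" handler never fires.
  changed_when: true
  notify: Reload postfix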

@ -0,0 +1,22 @@
Syslog yes
SyslogSuccess yes
LogWhy yes
Mode s
Canonicalization relaxed/simple
Socket {{ opendkim_socket }}
PidFile /run/opendkim/opendkim.pid
UserID opendkim:opendkim
UMask 007
# Multi-domain table-based signing. Lets us add domains/selectors without
# touching the daemon config.
KeyTable refile:/etc/opendkim/key.table
SigningTable refile:/etc/opendkim/signing.table
# Hosts we SIGN for (must include the Docker bridge subnet so Listmonk
# container campaign mail is signed, not just localhost cron mail).
InternalHosts /etc/opendkim/trusted.hosts
ExternalIgnoreList /etc/opendkim/trusted.hosts
# Oversign From to prevent header-injection / replay of an extra From.
OversignHeaders From

@ -0,0 +1,7 @@
# OpenDKIM signing/trusted hosts. Mail whose client matches an entry here is
# SIGNED (InternalHosts) and never treated as external to verify
# (ExternalIgnoreList). The Docker bridge subnet is REQUIRED so campaign mail
# injected by the Listmonk containers is signed -- see roles/mail/defaults.
{% for h in opendkim_internal_hosts %}
{{ h }}
{% endfor %}

@ -0,0 +1,69 @@
/**
* One-off: send Paul Wilson (Compound Technologies, Inc) a fresh password-set
* link so he can log in to the portal.
*
* Context: customer portal auth now uses ERPNext as the single source of truth
* for passwords (commit 9c87759). Paul's old Postgres password is no longer
* used for login, and his previous 7-day set-password link has expired. This
* mints a fresh 7-day reset token and emails ONLY the set-password link
* (NOT the earlier "next steps" email). When he clicks it, /reset-password
* writes his chosen password to ERPNext. CC justin@performancewest.net.
*
* Run in the api container (uses its DATABASE_URL + SMTP_* env), piped via stdin:
* docker exec -i performancewest-api-1 node --input-type=module < scripts/rescue-paul-set-password.mjs
*/
import pg from "pg";
import crypto from "crypto";
import nodemailer from "nodemailer";
const EMAIL = "synthetic@pipeline.com";
const CC = "justin@performancewest.net";
const NAME = "Paul Wilson";
const SITE = process.env.DOMAIN ? `https://${process.env.DOMAIN}` : "https://performancewest.net";
const firstName = NAME.split(" ")[0];
const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });
const mailer = nodemailer.createTransport({
  host: process.env.SMTP_HOST || "co.carrierone.com",
  port: parseInt(process.env.SMTP_PORT || "587", 10),
  secure: false,
  auth: { user: process.env.SMTP_USER, pass: process.env.SMTP_PASS },
});
const FROM = process.env.SMTP_FROM || "Performance West <noreply@performancewest.net>";
const log = (m) => console.log("[rescue] " + m);
// Look up his customers row (portal profile + reset-token owner).
const cust = await pool.query(`SELECT id, email FROM customers WHERE email = $1`, [EMAIL]);
if (cust.rows.length === 0) throw new Error(`no customers row for ${EMAIL}`);
const customer = cust.rows[0];
log(`customers row id=${customer.id} email=${customer.email}`);
// Mint a fresh 7-day reset token.
const token = crypto.randomBytes(32).toString("hex");
const expires = new Date(Date.now() + 7 * 24 * 60 * 60 * 1000);
await pool.query(
  `INSERT INTO password_reset_tokens (customer_id, token, expires_at) VALUES ($1, $2, $3)`,
  [customer.id, token, expires],
);
const resetLink = `${SITE}/account/reset-password?token=${token}`;
log(`reset token minted, expires ${expires.toISOString()}`);
await mailer.sendMail({
from: FROM, to: EMAIL, cc: CC,
subject: "Set your Performance West password to log in",
html: `<div style="font-family:Arial,sans-serif;max-width:520px;margin:0 auto;padding:24px;color:#222">
<h2 style="color:#1a2744;margin:0 0 8px">Set your password</h2>
<p>Hi ${firstName},</p>
<p>To log in to the Performance West portal and track your filings, click below to
choose your password. This link is valid for 7 days.</p>
<p style="margin:24px 0"><a href="${resetLink}" style="background:#2d4e78;color:#fff;padding:12px 28px;border-radius:8px;text-decoration:none;font-weight:600">Set my password &rarr;</a></p>
<p style="font-size:13px;color:#666">Or paste this link into your browser:<br>${resetLink}</p>
<p style="font-size:13px;color:#666">Once you're in, you can view your orders and complete any remaining intake forms. Questions? Reply to this email or call 1-888-411-0383.</p>
<p style="font-size:12px;color:#9ca3af">Performance West Inc. &middot; performancewest.net &middot; 1-888-411-0383</p>
</div>`,
text: `Hi ${firstName}, set your Performance West password to log in: ${resetLink} (valid for 7 days). Questions? 1-888-411-0383.`,
});
log(`password-set link sent to ${EMAIL} (cc ${CC})`);
await pool.end();
log("DONE");