new-site/infra/postfix/pw-listmonk-rampcap.sh
justin 1e9dcfcfd1 mail(rampcap): step trucking cap back up to 400/h (day 19-20), 500/h ceiling
The day-9 Gmail block that forced the 200/h hold is resolved: per-MX throttling
shipped, Google is excluded entirely (MAIN_EXCLUDE_OPERATORS=google), and the
OpenDKIM signing bug is fixed. With Google out of the mix, 400/h (~4k/day) is
within the envelope these IPs cleanly sustained at 68-76% delivery with zero
blocks. Lets the post-DKIM re-send backlog drain in ~1 day instead of ~3.
2026-06-22 12:49:54 -05:00

#!/bin/bash
# Ramp the Listmonk hourly send cap (sliding window) in lockstep with the
# Postfix IP warmup, so newly-rotated sending IPs are not blasted.
#
# Driven off the SAME warmup start date as pw-mta-warmup
# (/etc/postfix/pw-warmup-start). Accelerated schedule, justified by historical
# mail.log data showing these IPs cleanly sustained ~2,500 sends/day at 68-76%
# delivery once warm; collapses only ever came from 17k-29k spikes.
#
# Target STEADY-STATE total daily volume (cap is per hour; ~daily/10 active hours):
#   day 0-1  : ~500/day   ->  50/h
#   day 2-3  : ~1,500/day -> 150/h
#   day 4-6  : ~2,500/day -> 250/h
#   day 7-20 : ~4,000/day -> 400/h  (IPs cleanly sustained 2.5k+ at 68-76%
#                                    delivery; 0 deferrals/blocks observed)
#   day 21+  : ~5,000/day -> 500/h  (hard ceiling; never blast past ~17k where
#                                    historical collapses began)
set -euo pipefail
STATE=/etc/postfix/pw-warmup-start
COMPOSE_DIR=/opt/performancewest
PGPASSWORD=pw_dev_2026
[ -f "$STATE" ] || { echo "no warmup start stamp; run pw-mta-warmup --start first" >&2; exit 1; }
START=$(cat "$STATE"); NOW=$(date +%s); DAYS=$(( (NOW - START) / 86400 ))
if [ "$DAYS" -le 1 ]; then RATE=50
elif [ "$DAYS" -le 3 ]; then RATE=150
elif [ "$DAYS" -le 6 ]; then RATE=250
# Recovery -> step back up (2026-06-22): the day-9 Gmail block (400/h, 2026-06-13)
# happened because cold sends concentrated on Google-Workspace business domains
# with no per-MX throttle. All three causes are now fixed: (1) per-MX throttling
# shipped to the trucking pool, (2) Google is EXCLUDED entirely
# (MAIN_EXCLUDE_OPERATORS=google) while its reputation recovers, (3) the OpenDKIM
# signing bug is fixed so mail is no longer junked unsigned. With Google out of
# the mix, 400/h (~4k/day) is within the envelope these IPs cleanly sustained at
# 68-76% delivery with zero blocks. Hold 400/h while we drain the post-DKIM
# re-send backlog, then 500/h hard ceiling.
elif [ "$DAYS" -le 20 ]; then RATE=400
else RATE=500; fi
cd "$COMPOSE_DIR"
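# Run psql inside the api-postgres container. Errors and compose warnings are
# suppressed, so a failed query yields empty output rather than aborting.
psql() { PGPASSWORD="$PGPASSWORD" docker compose exec -T -e PGPASSWORD="$PGPASSWORD" api-postgres \
    psql -U pw -d listmonk -At "$@" 2>/dev/null | grep -v "level=warning" || true; }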
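CUR=$(psql -c "SELECT value FROM settings WHERE key='app.message_sliding_window_rate';")
# The psql wrapper swallows errors, so an empty result means the lookup failed
# (or the row is missing); bail out rather than restart Listmonk for nothing.
[ -n "$CUR" ] || { echo "could not read current cap from listmonk settings" >&2; exit 1; }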
if [ "$CUR" != "$RATE" ]; then
    psql -c "UPDATE settings SET value='$RATE' WHERE key='app.message_sliding_window_rate';
             UPDATE settings SET value='\"1h\"' WHERE key='app.message_sliding_window_duration';
             UPDATE settings SET value='true' WHERE key='app.message_sliding_window';" >/dev/null
    docker compose restart listmonk >/dev/null 2>&1 || true
    logger -t pw-rampcap "day $DAYS -> listmonk cap ${RATE}/h (was ${CUR}/h)"
    echo "$(date "+%F %T") rampcap: day=$DAYS cap=${RATE}/h (changed from ${CUR}/h, listmonk restarted)"
else
    echo "$(date "+%F %T") rampcap: day=$DAYS cap=${RATE}/h (no change)"
fi
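
The day-to-cap ladder above can be checked without touching Postgres or restarting Listmonk. The sketch below is a hypothetical dry-run helper, not part of the deployed script: `rate_for_day` is an illustrative name, and the thresholds are copied by hand from the if/elif chain, so they would need to be kept in sync with it manually.

```shell
#!/bin/bash
# Hypothetical dry-run helper: maps a warmup day number to the hourly cap,
# mirroring the if/elif ladder in pw-listmonk-rampcap.sh. It performs no
# database writes and no container restarts.
rate_for_day() {
    local days=$1
    if   [ "$days" -le 1 ];  then echo 50
    elif [ "$days" -le 3 ];  then echo 150
    elif [ "$days" -le 6 ];  then echo 250
    elif [ "$days" -le 20 ]; then echo 400
    else                          echo 500
    fi
}

# Print the cap and the implied daily volume (~10 active hours, as assumed in
# the schedule comment) on either side of each boundary.
for d in 0 1 2 3 4 6 7 20 21; do
    r=$(rate_for_day "$d")
    echo "day $d -> ${r}/h (~$(( r * 10 ))/day)"
done
```

Running it before changing a threshold makes it easy to confirm which day a step takes effect, for example that day 20 still maps to 400/h and day 21 is the first day at the 500/h ceiling.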