mail: DMARC aggregate-report parser + dedicated dmarc@ mailbox ingestion

Tool 2 of the deliverability monitoring pair (Tool 1 = mail_reputation_monitor).
DMARC rua reports from dozens of operators (Google, Yahoo, Comcast, Cox, Bell,
Mimecast, Cisco ESA, GMX, mail.com, ...) were landing in ops@ (dmarc@ was a DL),
burying real mail and going unparsed. They are now ingested and queryable:

- dmarc@performancewest.net converted DL -> dedicated Carbonio mailbox; isolated
  IMAP creds in server .env, surfaced to workers in docker-compose.yml (mirrors
  OPS_IMAP_*). 29 historical reports moved ops@ -> dmarc@ via IMAP.
- scripts/dmarc_report_parser.py: IMAP fetch unseen -> decompress .gz/.zip/.xml
  (namespace-agnostic: classic + urn:ietf:params:xml:ns:dmarc-2.0 GMX/mail.com) ->
  parse aggregate XML -> upsert dmarc_report (keyed (org_name,report_id), no-op on
  re-parse) + dmarc_record per source IP. dmarc_pass = dkim_aligned OR spf_aligned.
  Marks \Seen. --dry-run/--all/--alert (7d per-IP summary + Telegram if one of OUR
  IPs <95% pass, or EXTERNAL IP sends >=20 failing msgs as us = spoofing under
  p=reject). psycopg2 imported lazily so --dry-run runs without the driver.
- api/migrations/102_dmarc_aggregate.sql: dmarc_report + dmarc_record tables.
- infra/cron/pw-dmarc-parser: 06:20 UTC daily --alert (after reputation, before scrub).
- docs/deliverability.md: DMARC section DONE; query examples.
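
A minimal sketch of the --alert decision described above, assuming the 7-day summary has already been reduced to per-IP totals. The `IpSummary` type, the `alerts` function and the `own_ips` set are hypothetical names for illustration, not the parser's actual code; only the pass rule and the two thresholds (95%, 20 messages) come from this commit message.

```python
from dataclasses import dataclass

# Thresholds as stated in the commit message; the constant names are hypothetical.
OWN_IP_MIN_PASS_RATE = 0.95
EXTERNAL_FAIL_MIN_MSGS = 20


@dataclass
class IpSummary:
    source_ip: str
    total: int   # messages reported for this IP in the 7-day window
    passed: int  # messages for which dmarc_pass was true


def dmarc_pass(dkim_aligned: bool, spf_aligned: bool) -> bool:
    # DMARC passes when at least one of the two mechanisms is aligned.
    return dkim_aligned or spf_aligned


def alerts(summaries, own_ips):
    """Return one warning string per IP that crosses a threshold."""
    out = []
    for s in summaries:
        failed = s.total - s.passed
        if s.source_ip in own_ips:
            # One of our own senders: warn when the pass rate drops below 95%.
            if s.total and s.passed / s.total < OWN_IP_MIN_PASS_RATE:
                out.append(f"own IP {s.source_ip}: {s.passed}/{s.total} DMARC pass")
        elif failed >= EXTERNAL_FAIL_MIN_MSGS:
            # Unknown sender failing alignment while sending as us: likely spoofing.
            out.append(f"external IP {s.source_ip}: {failed} failing messages sent as us")
    return out
```

In this sketch an own IP at exactly 95% does not alert, and an external IP is only flagged on its failing volume, so low-volume noise from forwarders stays out of Telegram.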

Verified: dry-run --all parses all 28 reports (1 non-report test probe), 0 unknown
after the namespace fix.
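
The decompress and namespace-agnostic parse steps can be sketched roughly as follows. This is an illustration under assumptions, not the code in scripts/dmarc_report_parser.py: the function names are hypothetical, the container is detected by magic bytes rather than file extension, and namespaces are handled by comparing local tag names only, so classic and urn:ietf:params:xml:ns:dmarc-2.0 reports take the same path.

```python
import gzip
import io
import zipfile
import xml.etree.ElementTree as ET


def decompress(payload: bytes) -> bytes:
    """Return raw XML bytes from a .gz, .zip or plain .xml attachment."""
    if payload[:2] == b"\x1f\x8b":      # gzip magic bytes
        return gzip.decompress(payload)
    if payload[:4] == b"PK\x03\x04":    # zip local file header
        with zipfile.ZipFile(io.BytesIO(payload)) as zf:
            return zf.read(zf.namelist()[0])  # assumes one report per archive
    return payload


def local(tag: str) -> str:
    # "{urn:...}feedback" -> "feedback"; tags without a namespace pass through.
    return tag.rsplit("}", 1)[-1]


def find(elem, *path):
    """Walk child elements by local name, ignoring any XML namespace."""
    for name in path:
        elem = next((c for c in elem if local(c.tag) == name), None)
        if elem is None:
            return None
    return elem


def text(elem, *path):
    node = find(elem, *path)
    return node.text.strip() if node is not None and node.text else None


def parse_report(xml_bytes: bytes):
    """Parse one aggregate report; return None if the root is not <feedback>."""
    root = ET.fromstring(xml_bytes)
    if local(root.tag) != "feedback":
        return None
    report = {
        "org_name": text(root, "report_metadata", "org_name"),
        "report_id": text(root, "report_metadata", "report_id"),
        "records": [],
    }
    for rec in root:
        if local(rec.tag) != "record":
            continue
        # policy_evaluated carries the alignment-evaluated DKIM/SPF results.
        dkim = text(rec, "row", "policy_evaluated", "dkim") == "pass"
        spf = text(rec, "row", "policy_evaluated", "spf") == "pass"
        report["records"].append({
            "source_ip": text(rec, "row", "source_ip"),
            "count": int(text(rec, "row", "count") or 0),
            "dmarc_pass": dkim or spf,
        })
    return report
```

Matching on local names avoids keeping a list of known namespace URIs, at the cost of accepting any namespace an operator chooses to declare.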
justin 2026-06-19 08:50:20 -05:00
parent b45332b5f7
commit 8e5590b492
5 changed files with 509 additions and 8 deletions

@@ -0,0 +1,17 @@
# Nightly DMARC aggregate-report ingestion. Fetches the day's rua reports from the
# dedicated dmarc@performancewest.net mailbox (Google, Yahoo, Comcast, Cox, Bell,
# Mimecast, Cisco ESA, GMX, mail.com, Microsoft, ...), decompresses + parses the
# XML, and upserts per-source-IP SPF/DKIM/DMARC alignment into dmarc_report /
# dmarc_record. This is the authoritative cross-operator view of who sends mail AS
# us and whether it passes alignment -- the payoff of this session's DKIM/subdomain
# fixes -- and it flags any UNKNOWN IP sending as us (spoofing) under our p=reject.
#
# --alert prints the last-7d per-IP alignment summary and sends a Telegram warning
# if one of our own IPs drops below 95% DMARC pass, or an external IP sends >=20
# failing messages as us. Marks processed messages \Seen so each run only handles
# new reports (idempotent; reports are also keyed (org_name, report_id) in the DB).
#
# The mailbox is reachable over IMAP from the network, but the DB is only reachable
# inside the docker network, so we run inside the workers container (which has
# DMARC_IMAP_* + DATABASE_URL from .env). Runs at 06:20 UTC (after 06:10 reputation, before 06:30 scrub).
20 6 * * * deploy cd /opt/performancewest && docker compose exec -T workers python3 -m scripts.dmarc_report_parser --alert >> /var/log/pw-dmarc-parser.log 2>&1