Two correctness fixes that gate enabling the coupon test:
1. On-the-fly pricing. The coupon block hardcoded '$79 $47' (only true at 40%
off) — a false claim on the 20/30% arms. Now build_trucking_campaigns.py
reads api/src/service-catalog.ts (same source checkout uses) and computes
coupon_price_full / coupon_price_deal per recipient as full - round(full*pct/100),
exactly matching the server. Service-fee-only; non-discountable services
(boc3-filing passthrough) get NO price and fall back to percent-only copy.
Quotes the service the email is ABOUT (mcs150 $79, reactivation $149), not the
bundle the CTA happens to link to. service-catalog.ts now ships in the worker
image; helper degrades to percent-only if it can't be read.
2. CTA URL bug (likely a big driver of the zero-click problem). Main campaign
CTAs render '/order/slug&utm_source=...' (no '?') -> HTTP 404, verified live.
Deficiency CTAs would double-'?' once a coupon added '?code='. lp_link now
owns the query (?dot=...&code=...) so every template appends with a leading
'&' and is valid in all 4 states (main/deficiency x coupon on/off), verified
against live URLs returning 200.
Deficiency _deal_box now shows real was/now prices (percent-only for boc3).
Tests: 7/7 pass (adds URL-wellformed + price-matches-checkout cases).
- CAMPAIGN_COUPON_AB_PCTS="20,30,40" mints one daily code per arm; each
carrier is bucketed by a stable sha256(email) hash so the split is even
(~33/33/33 verified over 30k) and stable across re-sends (no arm-hopping).
- Each arm's code stores its own percent in discount_codes, so the advertised
discount always matches what checkout applies; redemptions are countable per
code (marker campaign-daily:<date>:<pct>).
- Empty/unset keeps legacy single-arm behavior (COUPON_PCT, legacy marker).
- coupon_attribs() now takes per-recipient pct.
- Tests: scripts/tests/test_coupon_ab.py (5 pass). SpamAssassin: both main
campaigns (186/188) score 0.0 HAM across all 3 arms, coupon block renders
clean; harness saved for re-runs.
Found during a bug-review pass of the one-email-per-provider work:
1. assign_all overwrite bug: an email on MULTIPLE rows (shared practice inbox /
multiple NPIs -- 2,592 such emails, 299 with mixed status) was assigned by
the LAST row, so a less-urgent row could clobber an urgent one (overdue ->
free check). Now keeps the most-urgent (lowest-priority) assignment.
2. warm_segment double-import + wrong-row render: all of an email's rows passed
the candidate filter, so it could be imported twice (over-counting the slice)
and attribs_for could render a sibling row's blank due-date in the overdue
email. Now requires row_matches(seg) for the specific row AND dedupes by
email (one row per email).
3. free-check email rendered broken text ('last updated on -- about years
ago', 'Last updated . ~ yrs ago') for any provider whose NPPES date isn't
cached yet (the free check goes to everyone, and the fill is gradual). Wrapped
the example sentence + official-record card in listmonk {{ if
.nppes_last_updated }}...{{ else }}...{{ end }}; added a date-free else
branch. altbody keeps the conditionals (listmonk evaluates body+altbody), and
the test/preview renderer gained a minimal {{ if/else/end }} evaluator so
previews match real sends. Verified both branches render with zero unfilled
tokens.
4. cross-cron double-send: pw-hc-campaign (warmup file) and pw-hc-nppes (63k
file) share state but tracked imports per-segment; 312 emails overlap both
files, so a provider could get an urgent email from one cron AND the free
check from the other. Added load_all_imported() global guard (union of all
segment state) so each provider gets exactly one healthcare email overall.
All verified: assignment regression test (10 cases) + new dup-email/guard checks
pass; all 6 templates render clean.
Make the free NPI compliance check the catch-all for ALL verified institutional
providers, but route anyone with a more important/time-sensitive issue to THAT
email instead -- each provider gets exactly one email, their most urgent.
- SEGMENTS gain a 'priority' (lower=more urgent): reactivation 10, revalidation
overdue 20, due-soon 30, bundle 45, free-NPI-check 100 (catch-all).
- assign_segment()/assign_all(): route each provider to the single
highest-priority active segment whose selector matches; warm_segment() takes
the assignment map and only claims its assigned providers (disjoint pools, no
double-mailing). main() now splits the daily slice by priority order, serving
urgent segments fully before the broad free-check consumes the remainder.
- nppes_outdated selector -> 'institutional_default' (every verified, non-
deactivated row), since the free check's value no longer depends on staleness;
list/campaign renamed 'HC Warmup - Free NPI Check'.
- FIX latent bug: reactivation selector treated 'not on CMS reval list' as
deactivated -- false for org NPIs (would mis-tell active practices they're
deactivated). Now uses the REAL nppes_deactivated flag (or OIG/SAM exclusion).
- Drop blanket oig_screening from the default rotation: it matched every row and
would starve the catch-all, and the free check already screens OIG/SAM and
routes to the paid fix on a hit. Still runnable via --segments.
- Add scripts/test_segment_assignment.py (10 cases incl. 'overdue AND stale ->
overdue wins'); all pass.
Pivot the weakest healthcare email from an 'your record is out of date -> buy an
update' sell into a free, value-first compliance check (the funnel already
exists: /tools/npi-compliance-check + /api/v1/npi/lookup run 5 live gov checks --
NPI status, Medicare revalidation, OIG/SAM exclusions, NPPES freshness -- and
deep-link to the right paid fix).
- Subject: 'A free compliance check for your NPI' (was 'may be out of date').
- Header: 'Free NPI Compliance Check' covering NPPES/revalidation/exclusions/NPI.
- Body: keep the REAL last_updated date as a credibility hook ('we pulled your
public records'), but frame it honestly ('that's usually fine') and pivot to
the broader free check. Adds a 4-item 'your free check covers' card.
- CTA now -> /tools/npi-compliance-check?npi={npi} (prefills + auto-runs their
own check on landing) with @TrackLink + UTM; dropped the straight-to-order
NPPES CTA and the redundant 'look up on NPPES' button.
- Reassurance reframed to free-first ('the check is completely free; a fix is
optional, flat-fee'). cta_path updated in the segment registry.
- Verified: render + plaintext + headless screenshot, CTA tracked, no stray
order link, zero unfilled tokens.
An old NPPES last_updated date does NOT mean the practice closed or that CMS
penalizes them: an NPI never expires and there is no NPPES login schedule. Many
records are stale precisely because nothing changed. Removed the overclaim that
an old record 'has almost certainly drifted' and the false 'attest periodically'
duty. Now states the real rule (correct NPPES within 30 days of a change) and
makes the harm conditional ('if anything has changed since then, your record is
now out of date'). Keeps NPPES distinct from Medicare revalidation/PECOS, which
is the separate segment that actually carries deactivation stakes.
- Add NPPES_STALE_MAX_YEARS (default 10): a record untouched for many years is
a stronger signal the practice closed/moved, and a bounce burns the warming
IP. Observed institutional distribution clusters 3-7yrs with ~0 beyond 8, so
10 is a safe ceiling that mails the whole real pool while excluding any
outlier ancient record. MIN stays 3 (keeps the 'out of date' claim credible).
- Restore the SMTP-verification gate (verify_ok) that the shared
institutional_verified selector had -- the swap to nppes_stale dropped it; we
only mail inboxes we already proved live.
- enrich: process the re-fetch queue STALEST-FIRST so a bounded (--limit) or
--max-age refresh spends its budget on the most-overdue cache entries (and new
NPIs) first, never starving them behind merely-aging ones.
- Selector unit-tested (10 cases incl. window edges, verify gate, deactivated).
The NPPES 'may be out of date' email previously asserted staleness with no
per-record evidence (softened earlier to a generic 'periodic review required').
NPPES is fully public and every record carries basic.last_updated, so we now
cite the actual government date the provider can verify on the registry.
- enrich_nppes_last_updated.py: joins real basic.last_updated /
enumeration_date / deactivated onto the institutional list via a cached,
resumable per-NPI crawl (no batch endpoint exists). Adds nppes_last_updated,
nppes_enumeration, nppes_years_stale, nppes_deactivated.
- cron: new 'nppes_stale' selector mails ONLY records >= 3yrs stale (env
HC_NPPES_STALE_MIN_YEARS) and excludes deactivated NPIs; empty date => no
match, so we never claim staleness without the government date to back it.
- template: headline + official-record card now show the real last_updated
date and ~N-years-ago, sourced to npiregistry.cms.hhs.gov.
- attribs + test SAMPLE expose the new fields; verified render + plaintext.
Diagnosing zero healthcare sales (11k sent, 5479 opens, 0 clicks, 0 orders).
Root cause of clicks=0: Listmonk only registers a link for tracking when the
href ends with the literal @TrackLink marker; all 10 hc templates lacked it
(trucking/CRTC have it). So the entire funnel was unmeasurable below 'open'.
Changes:
- Click tracking: append @TrackLink + UTM to every /order/ CTA across all 10
templates (external gov self-verify links left untracked on purpose).
- Remove all service prices from emails (99/49/49/99yr/9mo). Price is
now revealed on the order page after value is established; catalog
(api/src/service-catalog.ts) stays source of truth. Kept the 0,000 OIG
penalty stat (regulatory fact, not our price). Added a neutral 'flat fee shown
up front' reassurance block where the fee table used to be.
- Compliance/honesty: the nppes_outdated email asserted a per-record
'FLAGGED OUT OF DATE / detected' status, but its selector only checks
deliverability and the data has no NPPES last-updated field -> unsubstantiated
for every recipient. Reframed to a generally-true periodic-attestation message
('PERIODIC REVIEW REQUIRED', 'most practices drift out of date'). Same hedging
applied to npi_reactivation ('may be deactivated ... confirm on official
sources'). Substantiated reval 'past due' claims (backed by the public CMS
Revalidation list) were kept.
- Fixed stale $299 OIG metadata in build script -> $79/mo (reference only).
Docs: docs/healthcare-competitive-pricing.md (benchmark research) and
docs/healthcare-email-compliance-review.md (CAN-SPAM / FTC / impersonation pass;
flags SOC2/HIPAA/PCI badge claims for owner confirmation).
Verified headless: all 10 render with 0 JS errors, exactly 1 tracked CTA each,
no price leaks.
- New 'Choose your province: BC or Ontario' comparison card (entity name,
registered office city, fees, annual return, portal, area codes, corp tax)
inserted above the carriers banner. Previously Ontario was only mentioned
in a buried FAQ; BC outnumbered ON 53:12.
- Tax-comparison H2 + collapse-menu label now read 'British Columbia / Ontario'
and the key-takeaway notes ON is ~12.2% (within ~1pt of BC).
- Made hero chip, 'what we deliver' (registered office + file corporation),
and banking copy province-aware (BC or Ontario) instead of BC-only.
- Verified headless: province card renders, H2 visible (not auto-collapsed),
13 accordions + proof expander intact, 28 Ontario mentions, no new JS errors.
Completes the MX-exclusion plan. Untagged carriers can't be excluded (the big-MX
gate is MX-based, so an unresolved Google/Yahoo domain would slip through), and
were previously UNCAPPED in select_sendable_carriers -- a flood of freshly-imported,
never-resolved domains could dominate a run before pw-mx-tag resolves them.
Added a single shared untagged_cap (env MAIN_UNTAGGED_MX_CAP, default max(quota,200))
so untagged sends are bounded without starving the pool: at the default the bucket
can still fill an entire run's quota (no behavior change today), but the cap can be
tightened to a fraction once pw-mx-tag has drained the backlog -- which is fast,
since only ~3,035 distinct *verified-sendable* untagged domains remain (< one
20k/day tag run). Tagged carriers keep their per-operator caps unchanged.
Verified: compiles; cap logic never starves at default, enforces the limit when
set lower.
Fix 1 (consumer mx: exclusion) and Fix 3 (pw-mx-tag cron) live as of 9eeed47.
Verified: warmup pool 353,909 after fix (not starved), mx:yahoodns.net cap=0
during warmup, cron tags idempotently. Fix 2 (NULL bucket cap) deferred.
Fix 1 (build_trucking_campaigns.py): the warmup big-MX exclusion only covered the
clean-label operators (google/microsoft/proofpoint/...). Consumer mailbox
operators that mx_tag_carriers.py labels with an "mx:" prefix slipped BOTH the
exclusion and the per-MX throttle -- notably mx:yahoodns.net (283k sendable
carriers = Yahoo Small Business/AOL custom domains) and mx:icloud.com (25k), plus
comcast/charter/centurylink/windstream/tds/earthlink. These are custom domains
whose MX points at a consumer provider, invisible to the literal-domain blocklist.
Added CONSUMER_MX_OPERATORS, folded into WARMUP_EXCLUDE_OPERATORS used by both the
fetch_carriers() exclusion SQL and mx_daily_caps() (same day-30 ramp). Behind the
existing MAIN_SKIP_BIG_MX switch.
Validated read-only: after the fix the warmup-eligible pool is 353,909 carriers
(315,892 untagged + ~38k genuinely small/self-hosted operators), so the long tail
still sustains the daily quota -- not starved -- while 0 consumer-MX carriers are
selected during warmup.
Fix 3 (infra/cron/pw-mx-tag): mx_tag_carriers.py was on no cron, so the untagged
(NULL) backlog (~316k) never drained and new FMCSA imports stayed untagged,
slowly re-opening the gap. Added a daily 05:45 UTC cron (--only-unsent
--limit-domains 20000), before the 08:00 builder. Idempotent/bounded (only tags
mx_provider IS NULL). Verified live: a 200-domain test run tagged 216 domains.
(Fix 2 -- bounding the NULL bucket cap -- deferred; the cron will drain it.)
Analysis-only plan (no code shipped). The trucking builder's warmup excludes
big receiving operators (Google/MS/Proofpoint/...) by mx_provider, but three
holes let throttling/consumer MX through during the day<=30 window:
1. Consumer operators tagged with the "mx:" prefix (mx:yahoodns.net = 283,113
sendable carriers, mx:icloud.com = 24,985, comcast/charter/centurylink/...)
are NOT in BIG_MX_OPERATORS, so they slip both the exclusion and the throttle.
These are custom domains whose MX points at Yahoo/iCloud -- invisible to the
literal-domain blocklist, only catchable via MX tagging. Biggest hole.
2. 315,892 untagged (NULL) sendable carriers are sent to unvetted (kept by design
for anti-starvation, but uncapped).
3. mx_tag_carriers.py is on no cron, so the NULL backlog never drains and new
FMCSA imports stay untagged -- slowly re-opening gaps 1 and 2.
Plan proposes: CONSUMER_MX_OPERATORS set folded into exclusion+throttle (behind
the existing MAIN_SKIP_BIG_MX switch), a bounded cap on the NULL bucket, and a
daily pw-mx-tag cron. Includes live numbers, validation steps (dry-run selector
diff, no sends), and open decisions (re-introduction ramp, permanent vs warmup-
only exclusion for Yahoo/iCloud custom domains).
Campaign 509 (CRTC USF Q3, 4,156 sent) shipped with raw <a href> URLs, so
Listmonk never registered the links and recorded ZERO clicks -- even though
Umami logged the real order-page visits AND a carrier phoned in after clicking.
Same mistake docs/fmcsa-trucking-plan.md already flagged ("Use @TrackLink on all
CTAs"); the trucking campaigns do it, the CRTC one didn't.
Listmonk only tracks a link when its href ends with the literal @TrackLink marker
(it strips it and rewrites through lists.performancewest.net/link/). Added a
_track() helper that appends UTM params (so Umami attributes the visit too) +
@TrackLink, applied to both the primary order CTA and the guide-PDF download.
The running campaign 509's body was also patched live in the DB (same two links)
so its remaining sends record clicks. Future CRTC campaigns get it from source.
First live ingest (28 reports) showed our warmup rotation pool (.91-.109, out0x)
mislabeled EXTERNAL because OUR_IPS only listed 4 specific IPs -- every one was
100% DMARC-passing, clearly ours, and would have generated false spoofing alerts.
Replace the literal-IP set with an ipaddress subnet check on 207.174.124.0/24
(our whole block). The only genuinely-external failing sender is 35.174.145.124
(AWS, 32 msgs spoofing us, SPF-fail/no-DKIM, all correctly rejected by p=reject) --
exactly the signal the --alert path is meant to surface.
Tool 2 of the deliverability monitoring pair (Tool 1 = mail_reputation_monitor).
DMARC rua reports from dozens of operators (Google, Yahoo, Comcast, Cox, Bell,
Mimecast, Cisco ESA, GMX, mail.com, ...) were landing in ops@ (dmarc@ was a DL),
burying real mail and never parsed. Now ingested + queryable:
- dmarc@performancewest.net converted DL -> dedicated Carbonio mailbox; isolated
IMAP creds in server .env, surfaced to workers in docker-compose.yml (mirrors
OPS_IMAP_*). 29 historical reports moved ops@ -> dmarc@ via IMAP.
- scripts/dmarc_report_parser.py: IMAP fetch unseen -> decompress .gz/.zip/.xml
(namespace-agnostic: classic + urn:ietf:params:xml:ns:dmarc-2.0 GMX/mail.com) ->
parse aggregate XML -> upsert dmarc_report (keyed (org_name,report_id), no-op on
re-parse) + dmarc_record per source IP. dmarc_pass = dkim_aligned OR spf_aligned.
Marks \Seen. --dry-run/--all/--alert (7d per-IP summary + Telegram if one of OUR
IPs <95% pass, or EXTERNAL IP sends >=20 failing msgs as us = spoofing under
p=reject). psycopg2 imported lazily so --dry-run runs without the driver.
- api/migrations/102_dmarc_aggregate.sql: dmarc_report + dmarc_record tables.
- infra/cron/pw-dmarc-parser: 06:20 UTC daily --alert (after reputation, before scrub).
- docs/deliverability.md: DMARC section DONE; query examples.
Verified: dry-run --all parses all 28 reports (1 non-report test probe), 0 unknown
after the namespace fix.
Runs mail_reputation_monitor --alert at 06:10 UTC, piping the day's postfix log
(sudo cat, same pattern as pw-warmup-tg-alert) into the DB-connected workers
container. Builds the daily SNDS-equivalent reputation trend and Telegram-alerts
on operator regressions. Installed to /etc/cron.d/pw-mail-reputation.
Adds scripts/mail_reputation_monitor.py + migration 101 (mail_reputation_daily).
Sender reputation is judged by the RECEIVING operator (Microsoft/Google/Yahoo/
Proofpoint), and the provider portals (SNDS/Postmaster/CFL) need a login and lag
24-48h. Our postfix logs already carry the ground truth in real time: every send
records the receiving host + SMTP response, and the response classifies WHY:
250 -> accepted
451 4.7.500 -> throttled (Microsoft rate-limiting a cold IP)
550 5.7.x -> reject_reputation (spam/reputation)
550 5.1.1/5.4.1-> reject_recipient (dead mailbox / access denied = list hygiene)
550 ...SPAM -> reject_content (SpamAssassin)
The parser classifies each egress delivery (out0x/hcout/relay) by (sending_ip,
receiver, outcome, reason_code) and upserts ONE daily aggregate row per bucket
(idempotent ON CONFLICT), so a nightly cron over the rotated log gives a queryable
trend without re-parse double-counting. --alert prints a per-operator summary and
Telegram-alerts on regressions (>=10% reputation rejects, or Microsoft >=70%
throttled). Reads stdin ("-") so the host-owned /var/log/mail.log can be piped
into the DB-connected workers container.
Motivation: 2026-06-19 audit found ~80% of Microsoft sends were getting 451 4.7.500
throttles on the warming IPs -- this makes that trend visible as reputation recovers.
performancewest.net + send.performancewest.net both show Enrolled in the Yahoo
Sender Hub, reporting email fbl@. All three FBLs (Google Postmaster, MS SNDS+JMRP,
Yahoo CFL) now complete.
Added yahoo-verification-key TXT records via Hestia for performancewest.net
(apex) and send.performancewest.net; both propagated to all HE.net slaves +
public resolvers. Ready to click Verify in the Yahoo CFL form, complaint dest fbl@.
SNDS access requested/granted for 207.174.124.94 + .107; JMRP feeds registered
with complaint dest fbl@. Section marked complete. SNDS data populates in ~24-48h.
Note JMRP delivers ARF complaints to the signed-in MS account's email, not
automatically to fbl@; set a forward if that account isn't fbl@performancewest.net.
Use the legacy sendersupport.olc.protection.outlook.com/snds/ (308-redirects) or
the direct substrate.office.com/ip-domain-management-snds/SNDS app URL. Flag that
snds.microsoft.com has no DNS.
SNDS moved off sendersupport.olc.protection.outlook.com to
substrate.office.com/ip-domain-management-snds/. The old /snds/ and /pm/ links
308-redirect there. Document that the footer/help links going to microsoft.com
are boilerplate (not broken), and that you must Log in FIRST or the Request
Access / JMRP links bounce to login.microsoftonline.com (expected, not dead).
Add working direct links + canonical https://snds.microsoft.com entry point.
Records the Apple/iCloud addition, the builder-vs-list-based distinction, the
scrub_listmonk_consumer reconciliation tool + daily cron, and the 2026-06-19
first-run numbers (7,943 trucking + 21 HC stale consumer subs blocklisted).
Runs scrub_listmonk_consumer against both listmonk and listmonk_hc at 06:30 UTC,
before the campaign builders, so any ENABLED subscriber matching the authoritative
exclusion list is blocklisted retroactively. Keeps list-based campaigns (FCC
Direct Contacts, CRTC/USF, etc.) from leaking onto consumer mailboxes after a new
domain (e.g. Apple/iCloud) is added to the exclusion list. Installed to
/etc/cron.d/pw-listmonk-scrub on the host.
The fmcsa campaign builders already exclude gmail/yahoo/microsoft/etc. from NEW
audience selections, but two reputation leaks remained on the LIST-BASED side:
1. iCloud/Apple gap. icloud.com/me.com/mac.com were never in the exclusion set.
A 2026-06 Listmonk audit found 1,321 ENABLED iCloud subscribers on list 3
("FCC Carriers - Direct Contacts") -- the single largest enabled-consumer
bucket -- being cold-blasted with no exclusion at all. Add APPLE_CONSUMER_DOMAINS.
2. Stale already-imported consumer subs. List-based campaigns (e.g. the running
CRTC/USF blast on list 3) keep hitting consumer addresses imported BEFORE the
relevant domain joined the exclusion list. gmail.com was still the #1 bounce
domain via that campaign even though new selections exclude it. Add
scrub_listmonk_consumer.py: reconciles the live Listmonk subscriber table
against the authoritative exclusion list and blocklists any ENABLED subscriber
whose address is_blocked(). Idempotent; re-run whenever the exclusion grows so
it applies retroactively. Uses the same 'blocklisted' terminal state as the
bounce handler, so contacts are excluded from all current/future campaigns
without deleting history. Supports --dry-run and both listmonk / listmonk_hc.
Created postmaster@/abuse@/fbl@/dmarc@ as Carbonio DLs -> ops@ (they previously
REJECTED 5.1.1, which would have blocked SNDS verification AND was silently
dropping all DMARC aggregate reports). Verified accept-at-MX + delivered E2E.
Reframe Microsoft as the #1 monitoring priority (85% of audience), Yahoo as
lowest (<1%); add Carbonio admin access note; note DMARC parser now worth building.
- infra/ansible/roles/mail: refactor OpenDKIM to support multiple signing domains
via opendkim_signing_domains list (root + send.performancewest.net). Loops
keygen/ownership/keytable/signingtable so the live two-domain setup is
reproducible from ansible.
- infra/ansible group_vars: add bulk_mail_subdomain + campaign_from_* +
campaign_reply_to documentation vars (map to CAMPAIGN_FROM / HC_CAMPAIGN_FROM
env read by the builder scripts). smtp_from (transactional) stays on root.
- docs/deliverability.md: rewrite TL;DR with the carrierone-vs-performancewest
A/B proof (same server/IPs, different From domain -> Inbox vs Junk) and the
~85% Microsoft / 14% Google / <1% Yahoo audience mix; add the bulk-subdomain
section, SPF trim, rehab-disabled, and the Hestia DNS automation runbook.
Isolates bulk sending reputation onto a dedicated subdomain so the root domain
stays clean for transactional/verification mail (and recovers faster). Replies
still go to the root domain via Reply-To, so the customer-facing reply experience
is unchanged.
- build_trucking_campaigns.py: add env-overridable FROM_EMAIL
(noreply@send.performancewest.net); use it for both scheduled + test sends
instead of inheriting base["from_email"] from the DB base campaign.
- build_healthcare_campaigns_cron.py: FROM_EMAIL ->
compliance@send.performancewest.net (env-overridable).
- bounce-watcher.sh / hc-bounce-watcher.sh: track the new subdomain envelope
sender (keep legacy root-domain sender so the pre-cutover queue still drains;
HC also tracks by hcout transport regardless of sender).
Infra already live (separate, non-code): subdomain DNS (A/MX/SPF/DKIM
selector=send/DMARC p=reject) on the Hestia master, OpenDKIM signs
d=send.performancewest.net (verified end-to-end), egress .94/.107. Root SPF
trimmed to the real IPs; pointless IP-rehab cron disabled.
DNS is fully automatable: Hestia (cp.carrierone.com, zone owner = justin user)
is the DNS master, HE.net are slaves. Added google-site-verification TXT (id
14464) via v-add-dns-record as root; verified resolving on public resolvers +
HE.net slaves. Owner just clicks Verify in the Postmaster console. Documents the
v-add-dns-record path for future records.
Documents the 2026-06-18 reputation incident (snowshoe -> Gmail domain-rep
blocks, RBLs all clean), the single-IP-per-stream consolidation, and
fill-in-the-blanks setup steps for Google Postmaster Tools, Microsoft SNDS/JMRP,
and Yahoo CFL (all require owner account login + HE.net DNS). Plus ongoing
hygiene + how to re-expand IPs once reputation recovers.
The multi-IP rotation was built to spread risk while DKIM was broken (fixed
2026-06-17) and after the May 30-31 over-volume blast. With DKIM signing
correctly, spreading ~3k trucking msgs/day across 12 IPs (.94-.105) + ~1.2k
healthcare msgs/day across 3 IPs (.107-.109) gave each IP far too little
per-receiver volume to build reputation. Gmail/Outlook read it as snowshoe spam
and reputation-blocked ~200 msgs/day ("very low reputation of the sending
domain") -> 0 human clicks, 0 sales.
Consolidate to ONE IP per stream so each accrues real reputation:
- trucking: pw-mta-warmup ALL=(out05) -> randmap collapses to {out05:} = .94
- healthcare: listmonk-hc SMTP servers 2/3 (ports 2527/2528 -> .108/.109)
disabled in DB; all HC mail now egresses .107 (hcmta01). [applied live]
Applied live: transport_maps now randmap:{out05:}; listmonk-hc restarted.
To re-expand later: add transports back to ALL + re-enable the HC SMTP servers.
Checkout half proven against live Stripe (dry-run session created + expired,
zero charge), webhook subscription-id extraction + worker renewal fulfillment
covered by unit tests (31 + 13). Remaining gap: full E2E with a Stripe test
clock, which needs test-mode keys in the server .env (currently unset).
Runs the real _BaseNPIHandler.handle() with _create_todo monkeypatched (no DB /
ERPNext / email side effects) and asserts:
- first OIG/SAM screening has no [Monthly cycle] prefix / RECURRING banner
- a recurring_cycle order gets the [Monthly cycle] title prefix, the
"RECURRING MONTHLY CYCLE" banner, the invoice id, and the re-run-against-
CURRENT-data + issue-NEW-certificate instructions
- recurring_cycle works with and without an invoice id
- the bundle handler's first run is not flagged recurring
Verified passing both locally and inside the deployed workers container.
The 3 subscription-lifecycle events (invoice.paid, invoice.payment_failed,
customer.subscription.deleted) are now enabled on the live endpoint
we_1THBjyB46qMvF2jnYyN8IfkK (6 events total). Documents the unpinned-endpoint
api_version caveat (account default 2024-12-18.acacia, not the SDK's dahlia) and
why invoiceSubscriptionId() must read both invoice shapes. Notes that
charge.dispute.created / balance.available are handled in code but not yet
enabled on the endpoint.
The live Stripe webhook endpoint has NO pinned api_version, so it follows the
account default (currently 2024-12-18.acacia), which delivers the subscription
link as the top-level invoice.subscription. The code only read the new
2026-03-25.dahlia shape (invoice.parent.subscription_details.subscription), so
recurring renewal/payment-failed events would have returned a null subscription
id and silently failed to fulfill once the events were enabled.
invoiceSubscriptionId() now reads the modern shape first, then falls back to the
legacy top-level field. All other invoice fields used by the handlers
(amount_due, attempt_count, hosted_invoice_url, id) are stable across both
versions. +5 tests (legacy string/object, modern-preferred-over-legacy).
Convert OIG/SAM from one-time $299/yr to recurring $79/month (card+ACH only) -
the first real recurring-billing product in the system. Exclusion screening is
a *monthly* federal obligation, so recurring monitoring fits the requirement and
is the biggest valuation lever (vs a one-time annual run).
Catalog (single source of truth):
- service-catalog.ts: add billing_interval + allowed_methods to ComplianceService;
oig-sam-screening -> 7900c, billing_interval:"month", allowed_methods:[card,ach],
name "(Monthly Monitoring)".
- gen-service-catalog.py + check-service-catalog-drift.py: carry/guard the two new
fields; regenerate site catalog.
Checkout (api/src/routes/checkout.ts):
- mode:"subscription" with recurring price_data when billing_interval is set;
surcharge absorbed for recurring (clean $79/mo); server-side METHOD_NOT_ALLOWED
re-validation against allowed_methods.
- ensureColumns + migration 100: compliance_orders.stripe_subscription_id,
bundle_upsell_sent_at (+ subscription index).
Webhooks (api/src/routes/webhooks.ts):
- record stripe_subscription_id on checkout.session.completed (subscription mode).
- invoice.paid (subscription_cycle only) -> re-dispatch screening for the cycle;
invoice.payment_failed -> admin alert + first-failure customer nudge;
customer.subscription.deleted -> mark order cancelled. (API 2026-03-25 moved the
subscription link to invoice.parent.subscription_details.subscription.)
Fulfillment:
- job_server.py: pass recurring_cycle/invoice_id into the order.
- npi_provider.py: OIG handler labels renewal cycles "[Monthly cycle]" + re-screen
note; bundle action runs only the FIRST screening + flags the $79/mo upsell.
Bundle land-and-expand:
- Provider Compliance Bundle now includes only the first OIG/SAM screening (was
giving away $948/yr of monitoring inside an $899 bundle).
- new worker scripts/workers/bundle_upsell.py (+ pw-bundle-upsell timer): ~3 weeks
after a paid bundle, emails the customer to continue $79/mo monitoring; dedup via
bundle_upsell_sent_at; skips customers who already have an OIG/SAM order.
Surfaces updated to $79/mo: PaymentStep (filters methods, "Billed every month,
cancel anytime"), order pages, healthcare index, npi-compliance-check tool (also
fixed stale $699 bundle drift -> $899), hc_oig_screening + hc_compliance_bundle
emails.
Docs: billing.md gains a "Stripe-native Subscriptions" section + a reality-check
banner (Adyen/ERPNext-gateway model documented there is NOT live; Stripe is the
real rail). Fixed run-migrations.yml container name bug
(performancewest-postgres-1 -> performancewest-api-postgres-1, overridable).
Tests: api/tests/recurring-subscription.test.ts (28 assertions) covers catalog
gating, method validation, surcharge suppression, recurring line-item build,
invoiceSubscriptionId extraction, renewal-cycle gating. tsc clean; site build
clean; catalog drift OK.
Manual deploy step: enable invoice.paid, invoice.payment_failed,
customer.subscription.deleted on the Stripe webhook endpoint.
Email security gateways (Microsoft Defender Safe Links / ATP, Proofpoint,
Mimecast, Barracuda, etc.) auto-fetch and often render every link in a
campaign email to scan for malware. The advanced ones drive a real headless
browser, execute JS, and fire Umami pageviews/clicks that masquerade as human
visits -- inflating campaign click-through.
New site/public/js/pw-bot-filter.js queries multiple real-browser signals and
gates Umami via its official data-before-send hook (umamiBeforeSend), dropping
all events when the visitor is a bot. Signals (from empirical chromium probing):
decisive: navigator.webdriver, HeadlessChrome UA, known scanner UAs, zero/
collapsed screen|viewport|outer geometry, window LARGER than the
physical screen (impossible on real HW; uses outerW/H so page zoom
does not false-positive), software GPU rasterizer (SwiftShader/
llvmpipe/swrast via WebGL UNMASKED_RENDERER), zero logical CPUs.
soft (>=2 to trip): tiny screen, inner>screen, low color depth, empty
navigator.languages, no input device (no fine/coarse pointer + no
hover + 0 touch), no WebGL on a desktop UA.
Designed to FAIL OPEN: only strong/corroborated evidence suppresses, so real
visitors (incl. zoomed, privacy-tooled, remote-desktop, kiosk) still count.
Wired before the Umami tag in Base.astro (Astro pages) and all 86 static
public/**/*.html pages; both load with defer so order is guaranteed and the
hook is defined before Umami reads it.
Tested end-to-end with chromium (site/tests/bot-filter.test.sh, 4/4):
default headless-new, spoofed-Windows-UA + normal 1366x768 window, and
spoofed-UA + 1x1 window are all caught; hook returns null to drop the event.
Replaces the panic-era burner-domain verification plan with an in-house
automatic catch-all rollout in the trucking/IFTA/UCR builders. Root-cause
classification of the 75k pre-DKIM-fix bounces showed ~55% were reputation/
auth (now fixed by DKIM signing) and only ~29% genuinely-dead mailboxes;
catch-all domains accept at RCPT time so they do not user-unknown bounce at
send, making a controlled in-house bleed safer than warming a separate burner.
catch_all_enabled() adds catch-all results only when warmup_day >=
CAMPAIGN_CATCH_ALL_MIN_DAY (21) AND the recent 2-day live bounce rate is below
CAMPAIGN_CATCH_ALL_MAX_BOUNCE_PCT (8%) on a >=300-sent sample; auto-reverts to
the clean smtp_valid/send_confirmed pool on the next run if bounces spike.
Short window so a past disaster cannot block the rollout forever and a fresh
spike trips fast. CAMPAIGN_INCLUDE_CATCH_ALL=1/0 still hard-overrides.
USABLE_FILTER (static) -> usable_filter() (per-run, memoized, one DB probe).
IFTA/UCR SELECT_SQL -> _select_sql() so tc.usable_filter() resolves at call
time, not import. 13 logic unit tests pass; live dry-run decision = OFF
(day 15 < 21 and recent 2d bounce 42% from the aging-out Jun-16 disaster).
Reduce evasion optics that would draw FCC enforcement attention while keeping the
real value props:
- 'What they avoid by being Canadian' -> 'What the Canadian structure changes'
- Drop 'No US telecom taxes on invoices (15-40% saved)' -> Canadian tax treatment
on the Canadian entity's billing; 'No US FCC regulatory fees on the Canadian entity'
- '...avoid this by routing US traffic...' -> '...instead route US traffic through
US intermediaries who carry the 499-A obligation...'
- Add prominent 'Who this is for - and who it isn't' section: legitimate
conversational voice (UCaaS/PBX/business/residential/live-agent) yes;
short-duration/dialer/robocall-evasion no. States upstreams are fully
STIR/SHAKEN compliant and we don't onboard traffic designed to evade
caller-ID auth; notes Canadian carriers police ASR/ACD more strictly than
anywhere (a feature). HTML validated balanced.
Reframe away from 'escape the FCC' optics that would draw enforcement attention:
- Header/flagbar: 'Move your VoIP home to Canada' / 'US obligations ride on your
upstream' (was 'no FCC reporting, no USAC, no S/S to run')
- Recast claims to 'CRTC regulatory home, not FCC' and scope the no-USF/no-499/
no-RMD claims to the Canadian-jurisdiction traffic (accurate for US-number
traffic, which rides on the compliant US upstream)
- STIR/SHAKEN bullet now explicitly pro-compliance: 'we don't help anyone dodge
call-authentication; upstream partners are fully S/S compliant'
- Drop 'outside the FCC's reach'
- Add honest caveat: Canada is not for short-duration/dialer traffic; Canadian
carriers are more stringent on ACD/ASR than anywhere; this is for real
conversational voice (UCaaS/PBX/business/residential/live-agent)