new-site/scripts/build_healthcare_campaigns.py
justin 1acae2f20c healthcare: fix 4 bugs in segment-assignment + free-check email
Found during a bug-review pass of the one-email-per-provider work:

1. assign_all overwrite bug: an email on MULTIPLE rows (shared practice inbox /
   multiple NPIs -- 2,592 such emails, 299 with mixed status) was assigned by
   the LAST row, so a less-urgent row could clobber an urgent one (overdue ->
   free check). Now keeps the most-urgent assignment (lowest priority number).

2. warm_segment double-import + wrong-row render: all of an email's rows passed
   the candidate filter, so it could be imported twice (over-counting the slice)
   and attribs_for could render a sibling row's blank due-date in the overdue
   email. Now requires row_matches(seg) for the specific row AND dedupes by
   email (one row per email).

3. free-check email rendered broken text ('last updated on  -- about  years
   ago', 'Last updated  . ~ yrs ago') for any provider whose NPPES date isn't
   cached yet (the free check goes to everyone, and the fill is gradual). Wrapped
   the example sentence + official-record card in listmonk {{ if
   .nppes_last_updated }}...{{ else }}...{{ end }}; added a date-free else
   branch. altbody keeps the conditionals (listmonk evaluates body+altbody), and
   the test/preview renderer gained a minimal {{ if/else/end }} evaluator so
   previews match real sends. Verified both branches render with zero unfilled
   tokens.

4. cross-cron double-send: pw-hc-campaign (warmup file) and pw-hc-nppes (63k
   file) share state but tracked imports per-segment; 312 emails overlap both
   files, so a provider could get an urgent email from one cron AND the free
   check from the other. Added load_all_imported() global guard (union of all
   segment state) so each provider gets exactly one healthcare email overall.
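
The guard in item 4 amounts to taking the union of every segment's imported
emails before importing anything new. A minimal sketch of that idea follows.
The state shape (a dict of segment -> list of emails), the signature given to
load_all_imported here, the lower-casing of addresses, and the helper
filter_new are all assumptions for illustration; the real state format is not
shown in this commit.

```python
def load_all_imported(state):
    """Union of every segment's imported emails (assumed state shape)."""
    seen = set()
    for emails in state.values():
        seen.update(e.strip().lower() for e in emails)
    return seen


def filter_new(candidates, state):
    """Hypothetical helper: drop candidates already imported under ANY
    segment, and drop duplicates within the candidate batch itself."""
    seen = load_all_imported(state)
    fresh = []
    for email in candidates:
        key = email.strip().lower()
        if key not in seen:
            seen.add(key)
            fresh.append(email)
    return fresh


state = {
    "revalidation_overdue": ["a@example.com"],
    "nppes_outdated": ["b@example.com"],
}
fresh = filter_new(["a@example.com", "c@example.com", "c@example.com"], state)
print(fresh)
```

Checking against the union rather than a single segment's state is what stops
a second cron from re-sending to a provider the first cron already imported.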

All verified: assignment regression test (10 cases) + new dup-email/guard checks
pass; all 6 templates render clean.
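
The assignment rule from item 1 (most urgent row wins, whatever the row order)
can be sketched as below. This is an illustration only, not the cron's
assign_all: the row shape (email, segment), the two-entry PRIORITY table, the
lower-casing of addresses, and the name assign_most_urgent are assumptions.
Only the rule itself (LOWER priority number = more urgent, wins) comes from
this commit.

```python
# Hypothetical priority table following the registry's convention.
PRIORITY = {"revalidation_overdue": 20, "nppes_outdated": 100}


def assign_most_urgent(rows):
    """Map each email to the matching segment with the lowest priority number."""
    assigned = {}
    for row in rows:
        email = row["email"].strip().lower()
        seg = row["segment"]
        current = assigned.get(email)
        # Keep the existing assignment unless this row is strictly more urgent.
        if current is None or PRIORITY[seg] < PRIORITY[current]:
            assigned[email] = seg
    return assigned


rows = [
    {"email": "office@example.com", "segment": "revalidation_overdue"},
    # Same inbox on a later, less urgent row: must not clobber the first.
    {"email": "Office@example.com", "segment": "nppes_outdated"},
]
result = assign_most_urgent(rows)
print(result)
```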
2026-06-20 16:14:44 -05:00


#!/usr/bin/env python3
"""Healthcare (NPI/Medicare) marketing-email SEGMENT REGISTRY + test tooling.

SINGLE SOURCE OF TRUTH for the healthcare campaign segments. Each segment maps a
compliance problem to a real PW service, its order page, price, the listmonk
list/campaign it warms, and the canonical HTML template under data/hc_campaigns/.

The HTML bodies themselves are the hand-tuned, deployed templates in
data/hc_campaigns/hc_<seg>.html (teal header, per-segment "verify it yourself"
trust block, official-record card on revalidation, etc.). This module does NOT
regenerate them -- it READS them, so the files stay the one source of truth and
can't drift from a parallel generator. (An earlier version of this script kept a
divergent inline generator; that was removed.)

Consumers:
  * build_healthcare_campaigns_cron.py imports SEGMENTS to warm every segment.
  * `--send-test <email>` sends every segment as a real test through the
    healthcare HOT SMTP stream (host :2526 -> hcout1 -> .107) so you see exactly
    what a provider receives. Personalization tokens are filled with sample data.

Listmonk personalization tokens used on real sends (filled from subscriber
attribs by listmonk; filled from SAMPLE here for test sends):
  {{ .Subscriber.Name }}                    provider / practice name
  {{ .Subscriber.Attribs.npi }}             NPI
  {{ .Subscriber.Attribs.practice }}        practice / org name
  {{ .Subscriber.Attribs.detail }}          segment-specific detail (e.g. due date)
  {{ .Subscriber.Attribs.reval_due_date }} / .days_overdue  (revalidation card)
  {{ UnsubscribeURL }}                      listmonk per-subscriber unsubscribe
"""
from __future__ import annotations

# Note: `ssl` was imported here but never used in this module, so it is dropped.
import argparse
import os
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formataddr, make_msgid

SITE = "https://performancewest.net"
PHONE = "(888) 411-0383"
FROM_NAME = "Performance West Compliance"
FROM_EMAIL = "compliance@performancewest.net"
REPLY_TO = "info@performancewest.net"
OUT_DIR = os.path.join(os.path.dirname(__file__), "..", "data", "hc_campaigns")

# ── Per-segment registry ───────────────────────────────────────────────────
# Metadata only. The email body lives in OUT_DIR/<template>. Fields:
#   subject        listmonk campaign subject line
#   template       HTML file under data/hc_campaigns/ (the canonical body)
#   cta_path       order page the CTA links to (NPI appended as ?npi=)
#   price          reference price only (catalog in api/src/service-catalog.ts is
#                  the source of truth). NOT shown in the email anymore; price is
#                  revealed on the order page after the value is established.
#   list_name      listmonk-hc list this segment is warmed into
#   campaign_name  listmonk-hc campaign name prefix (dated per build)
#   selector       which warmup-CSV rows belong to this segment (see cron)
#   priority       URGENCY rank for one-email-per-provider assignment (LOWER =
#                  more urgent, wins). A provider is warmed into exactly ONE
#                  segment: the highest-priority (lowest number) active segment
#                  whose selector matches their row. So a provider who is BOTH
#                  revalidation-overdue and NPPES-stale gets the overdue email
#                  (more important + time-sensitive), not the generic free check.
#                  The free NPI check is the catch-all default at the highest
#                  number, so everyone with no more-urgent issue still gets it.
SEGMENTS = {
    "revalidation_overdue": {
        "subject": "Your Medicare revalidation is past due - let's get it filed",
        "template": "hc_revalidation_overdue_personal.html",
        "cta_path": "/order/npi-revalidation",
        "price": "$599",
        "list_name": "HC Warmup - Revalidation Overdue",
        "campaign_name": "HC Warmup - Medicare Revalidation",
        "selector": "reval_overdue",
        "priority": 20,
    },
    "revalidation_due_soon": {
        "subject": "Let's make sure your Medicare revalidation is handled in time",
        "template": "hc_revalidation_personal.html",
        "cta_path": "/order/npi-revalidation",
        "price": "$599",
        "list_name": "HC Warmup - Revalidation Due Soon",
        "campaign_name": "HC Warmup - Revalidation Due Soon",
        "selector": "reval_due_soon",
        "priority": 30,
    },
    "npi_reactivation": {
        "subject": "Your NPI / Medicare enrollment appears deactivated",
        "template": "hc_npi_reactivation.html",
        "cta_path": "/order/npi-reactivation",
        "price": "$449",
        "list_name": "HC Warmup - Reactivation",
        "campaign_name": "HC Warmup - NPI Reactivation",
        "selector": "leie_or_deactivated",
        "priority": 10,
    },
    "nppes_outdated": {
        "subject": "A free compliance check for your NPI",
        "template": "hc_nppes_outdated.html",
        "cta_path": "/tools/npi-compliance-check",
        "price": "$349",
        "list_name": "HC Warmup - Free NPI Check",
        "campaign_name": "HC Warmup - Free NPI Check",
        "selector": "institutional_default",
        "priority": 100,
    },
    "oig_screening": {
        "subject": "Are you screening for OIG / SAM exclusions?",
        "template": "hc_oig_screening.html",
        "cta_path": "/order/oig-sam-screening",
        "price": "$79/mo",
        "list_name": "HC Warmup - OIG Screening",
        "campaign_name": "HC Warmup - OIG Screening",
        "selector": "institutional_verified",
        "priority": 40,
    },
    "compliance_bundle": {
        "subject": "Get your provider compliance handled for the year",
        "template": "hc_compliance_bundle.html",
        "cta_path": "/order/provider-compliance-bundle",
        "price": "$899/yr",
        "list_name": "HC Warmup - Compliance Bundle",
        "campaign_name": "HC Warmup - Compliance Bundle",
        "selector": "optout_ending",
        "priority": 45,
    },
}

# Sample values for test sends (real sends use Listmonk subscriber attribs).
SAMPLE = {
    "name": "Dr. Sample Provider",
    "practice": "Riverbend Family Medicine",
    "npi": "1234567890",
    "detail": "06/30/2024 (706 days overdue)",
    "reval_due_date": "06/30/2024",
    "days_overdue": "706",
    "nppes_last_updated": "2012-02-08",
    "nppes_years_stale": "14",
    "nppes_enumeration": "2011-04-06",
}


def template_path(seg_key: str) -> str:
    return os.path.join(OUT_DIR, SEGMENTS[seg_key]["template"])


def _eval_conditionals(html: str, attribs: dict) -> str:
    """Minimal evaluator for the listmonk/Go `{{ if .Subscriber.Attribs.X }}A
    {{ else }}B{{ end }}` blocks used in the templates, so TEST/PREVIEW renders
    match what listmonk produces at send time (listmonk itself evaluates these
    server-side; this is only for the standalone preview/test-send path). Treats
    an attribute as truthy when it is present and non-empty. Supports an optional
    {{ else }} and is non-nested (which is all the templates use)."""
    import re

    pat = re.compile(
        r"\{\{\s*if\s+\.Subscriber\.Attribs\.(\w+)\s*\}\}(.*?)"
        r"(?:\{\{\s*else\s*\}\}(.*?))?\{\{\s*end\s*\}\}",
        re.DOTALL,
    )

    def repl(m: "re.Match") -> str:
        key, if_body, else_body = m.group(1), m.group(2), m.group(3) or ""
        return if_body if str(attribs.get(key, "")).strip() else else_body

    # Loop until stable so adjacent/multiple blocks all resolve.
    prev = None
    while prev != html:
        prev = html
        html = pat.sub(repl, html)
    return html


def render(seg_key: str, *, test: bool = False) -> tuple[str, str]:
    """Return (subject, html) for a segment. The html is the canonical
    data/hc_campaigns/<template> file -- the single source of truth. For test
    sends, listmonk tokens are filled with SAMPLE data so the email is viewable
    standalone."""
    s = SEGMENTS[seg_key]
    # Context manager closes the file promptly; the explicit encoding avoids
    # depending on the platform's default locale.
    with open(template_path(seg_key), encoding="utf-8") as fh:
        html = fh.read()
    if test:
        # Resolve {{ if .Subscriber.Attribs.X }} blocks first (listmonk does this
        # server-side on real sends), using SAMPLE as the attrib source.
        html = _eval_conditionals(html, SAMPLE)
        html = (html
                .replace("{{ .Subscriber.Name }}", SAMPLE["name"])
                .replace("{{ .Subscriber.Attribs.npi }}", SAMPLE["npi"])
                .replace("{{ .Subscriber.Attribs.practice }}", SAMPLE["practice"])
                .replace("{{ .Subscriber.Attribs.detail }}", SAMPLE["detail"])
                .replace("{{ .Subscriber.Attribs.reval_due_date }}", SAMPLE["reval_due_date"])
                .replace("{{ .Subscriber.Attribs.days_overdue }}", SAMPLE["days_overdue"])
                .replace("{{ .Subscriber.Attribs.nppes_last_updated }}", SAMPLE["nppes_last_updated"])
                .replace("{{ .Subscriber.Attribs.nppes_years_stale }}", SAMPLE["nppes_years_stale"])
                .replace("{{ .Subscriber.Attribs.nppes_enumeration }}", SAMPLE["nppes_enumeration"])
                .replace("{{ UnsubscribeURL }}", f"{SITE}/unsubscribe?test=1"))
    return s["subject"], html


def cmd_send_test(to_addr: str, host: str, port: int):
    """Send every segment as a [TEST] email to to_addr through host:port."""
    from email.utils import formatdate
    from urllib.parse import quote

    n = 0
    for key in SEGMENTS:
        subj, html = render(key, test=True)
        msg = MIMEMultipart("alternative")
        msg["Subject"] = f"[TEST] {subj}"
        msg["From"] = formataddr((FROM_NAME, FROM_EMAIL))
        msg["To"] = to_addr
        msg["Reply-To"] = REPLY_TO
        msg["Date"] = formatdate(localtime=True)
        msg["Message-ID"] = make_msgid(domain="performancewest.net")
        # Bulk-mail deliverability headers (Gmail/GMX strongly reward these).
        # The address is percent-encoded so characters such as "+" survive the
        # query string; "@" is left readable.
        msg["List-Unsubscribe"] = (
            "<mailto:unsubscribe@performancewest.net?subject=unsubscribe>, "
            f"<{SITE}/unsubscribe?e={quote(to_addr, safe='@')}>")
        msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
        msg["List-Id"] = "Performance West Healthcare Compliance <hc.performancewest.net>"
        msg["Precedence"] = "bulk"
        msg["X-Entity-Ref-ID"] = make_msgid(domain="performancewest.net")
        msg.attach(MIMEText("Please view this email in HTML.", "plain"))
        msg.attach(MIMEText(html, "html"))
        # One SMTP connection per message, matching the original flow.
        with smtplib.SMTP(host, port, timeout=15) as s:
            s.ehlo("hcmta01.performancewest.net")
            s.sendmail(FROM_EMAIL, [to_addr], msg.as_string())
        print(f" sent [{key}] -> {to_addr} ({subj})")
        n += 1
    print(f"done: {n} test emails sent via {host}:{port}")


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--send-test", metavar="EMAIL", help="send all segments as test to EMAIL")
    ap.add_argument("--list", action="store_true", help="list the segment registry")
    ap.add_argument("--smtp-host", default="127.0.0.1")
    ap.add_argument("--smtp-port", type=int, default=2526, help="hc HOT submission port")
    args = ap.parse_args()
    if args.list:
        for k, s in SEGMENTS.items():
            print(f"{k:22} {s['price']:>8} {s['template']:30} -> {s['campaign_name']}")
    if args.send_test:
        cmd_send_test(args.send_test, args.smtp_host, args.smtp_port)
    if not args.send_test and not args.list:
        ap.print_help()


if __name__ == "__main__":
    main()