healthcare: fix 4 bugs in segment-assignment + free-check email
Found during a bug-review pass of the one-email-per-provider work:
1. assign_all overwrite bug: an email on MULTIPLE rows (shared practice inbox /
multiple NPIs -- 2,592 such emails, 299 with mixed status) was assigned by
the LAST row, so a less-urgent row could clobber an urgent one (overdue ->
free check). Now keeps the most-urgent assignment (lowest priority number).
2. warm_segment double-import + wrong-row render: all of an email's rows passed
the candidate filter, so it could be imported twice (over-counting the slice)
and attribs_for could render a sibling row's blank due-date in the overdue
email. Now requires row_matches(seg) for the specific row AND dedupes by
email (one row per email).
3. free-check email rendered broken text ('last updated on -- about years
ago', 'Last updated . ~ yrs ago') for any provider whose NPPES date isn't
cached yet (the free check goes to everyone, and the fill is gradual). Wrapped
the example sentence + official-record card in listmonk {{ if
.nppes_last_updated }}...{{ else }}...{{ end }}; added a date-free else
branch. altbody keeps the conditionals (listmonk evaluates body+altbody), and
the test/preview renderer gained a minimal {{ if/else/end }} evaluator so
previews match real sends. Verified both branches render with zero unfilled
tokens.
4. cross-cron double-send: pw-hc-campaign (warmup file) and pw-hc-nppes (63k
file) share state but tracked imports per-segment; 312 emails overlap both
files, so a provider could get an urgent email from one cron AND the free
check from the other. Added load_all_imported() global guard (union of all
segment state) so each provider gets exactly one healthcare email overall.
All verified: assignment regression test (10 cases) + new dup-email/guard checks
pass; all 6 templates render clean.
parent 0320dc17ba
commit 1acae2f20c
3 changed files with 92 additions and 8 deletions
@@ -14,7 +14,7 @@
 <tr><td class="pw-pad" style="padding:28px;font-family:Inter,system-ui,sans-serif;color:#1f2937;">
 <p style="font-size:15px;margin:0 0 18px;line-height:1.5;">Hi {{ .Subscriber.Name }},</p>
 <h2 style="font-size:19px;margin:0 0 14px;color:#0f172a;line-height:1.3;">We pulled the public records for NPI {{ .Subscriber.Attribs.npi }} — here’s a free check</h2>
-<p style="font-size:14px;line-height:1.7;margin:0 0 18px;">As a quick example, the public NPPES NPI Registry shows the record for <strong>{{ .Subscriber.Attribs.practice }}</strong> was <strong>last updated on {{ .Subscriber.Attribs.nppes_last_updated }}</strong> — about <strong>{{ .Subscriber.Attribs.nppes_years_stale }} years ago</strong>. That’s usually fine, but it’s only one of several things payers and CMS check. Our free tool runs your NPI against the public government sources in one place — <strong>no signup, no cost</strong> — and tells you exactly where you stand.</p>
+<p style="font-size:14px;line-height:1.7;margin:0 0 18px;">{{ if .Subscriber.Attribs.nppes_last_updated }}As a quick example, the public NPPES NPI Registry shows the record for <strong>{{ .Subscriber.Attribs.practice }}</strong> was <strong>last updated on {{ .Subscriber.Attribs.nppes_last_updated }}</strong> — about <strong>{{ .Subscriber.Attribs.nppes_years_stale }} years ago</strong>. That’s usually fine, but it’s only one of several things payers and CMS check. {{ else }}Your NPI touches several public government records — NPPES, the Medicare revalidation list, and the federal exclusion lists — and any one of them being off can hold up your payments. {{ end }}Our free tool runs your NPI against those public sources in one place — <strong>no signup, no cost</strong> — and tells you exactly where you stand.</p>

 <table role="presentation" width="100%" cellpadding="0" cellspacing="0" style="margin:22px 0;"><tr><td style="background:#f0fdfa;border:1px solid #99f6e4;border-radius:10px;padding:18px;">
 <p style="margin:0 0 10px;font-size:14px;color:#0f766e;font-weight:700;">Your free check covers:</p>
@@ -31,7 +31,9 @@
 <div style="font-size:13px;color:#065f46;line-height:1.7;">Payers, clearinghouses, and CMS pull from NPPES. A stale address, taxonomy, or contact can cause <strong>claim denials, mail you never receive, and failed credentialing</strong>. CMS requires you to correct your NPPES record within 30 days of any change.</div>
 </td></tr></table>

-<!-- Official-record card: NPPES is fully public, so this mirrors the registry. -->
+<!-- Official-record card: NPPES is fully public, so this mirrors the registry.
+Only shown when we have the real Last Updated date for this NPI. -->
+{{ if .Subscriber.Attribs.nppes_last_updated }}
 <table role="presentation" width="100%" cellpadding="0" cellspacing="0" style="margin:22px 0;">
 <tr><td style="border:1px solid #cbd5e1;border-radius:10px;overflow:hidden;">
 <table role="presentation" width="100%" cellpadding="0" cellspacing="0">
@@ -49,6 +51,7 @@
 </table>
 </td></tr>
 </table>
+{{ end }}

 <!-- Free-first reassurance: the check is free; a fix is optional + flat-fee. -->
 <table role="presentation" width="100%" cellpadding="0" cellspacing="0" style="margin:18px 0;"><tr><td style="background:#f8fafc;border:1px solid #e5e7eb;border-radius:10px;padding:14px 18px;">
@@ -140,6 +140,32 @@ def template_path(seg_key: str) -> str:
     return os.path.join(OUT_DIR, SEGMENTS[seg_key]["template"])


+def _eval_conditionals(html: str, attribs: dict) -> str:
+    """Minimal evaluator for the listmonk/Go `{{ if .Subscriber.Attribs.X }}A
+    {{ else }}B{{ end }}` blocks used in the templates, so TEST/PREVIEW renders
+    match what listmonk produces at send time (listmonk itself evaluates these
+    server-side; this is only for the standalone preview/test-send path). Treats
+    an attribute as truthy when it is present and non-empty. Supports an optional
+    {{ else }} and is non-nested (which is all the templates use)."""
+    import re
+    pat = re.compile(
+        r"\{\{\s*if\s+\.Subscriber\.Attribs\.(\w+)\s*\}\}(.*?)"
+        r"(?:\{\{\s*else\s*\}\}(.*?))?\{\{\s*end\s*\}\}",
+        re.DOTALL,
+    )
+
+    def repl(m: "re.Match") -> str:
+        key, if_body, else_body = m.group(1), m.group(2), m.group(3) or ""
+        return if_body if str(attribs.get(key, "")).strip() else else_body
+
+    # Loop until stable so adjacent/multiple blocks all resolve.
+    prev = None
+    while prev != html:
+        prev = html
+        html = pat.sub(repl, html)
+    return html
+
+
 def render(seg_key: str, *, test: bool = False) -> tuple[str, str]:
     """Return (subject, html) for a segment. The html is the canonical
     data/hc_campaigns/<template> file -- the single source of truth. For test
@@ -148,6 +174,9 @@ def render(seg_key: str, *, test: bool = False) -> tuple[str, str]:
     s = SEGMENTS[seg_key]
     html = open(template_path(seg_key)).read()
     if test:
+        # Resolve {{ if .Subscriber.Attribs.X }} blocks first (listmonk does this
+        # server-side on real sends), using SAMPLE as the attrib source.
+        html = _eval_conditionals(html, SAMPLE)
         html = (html
                 .replace("{{ .Subscriber.Name }}", SAMPLE["name"])
                 .replace("{{ .Subscriber.Attribs.npi }}", SAMPLE["npi"])
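The preview evaluator added above can be exercised on its own to check both branches. The following is a minimal sketch: the function body is copied from the hunk, while `TEMPLATE` and the sample date are hypothetical values chosen for illustration, not taken from the real templates or from `SAMPLE`.

```python
import re

# Pattern and logic copied from _eval_conditionals in the hunk above,
# so this example runs without the rest of the script.
_COND = re.compile(
    r"\{\{\s*if\s+\.Subscriber\.Attribs\.(\w+)\s*\}\}(.*?)"
    r"(?:\{\{\s*else\s*\}\}(.*?))?\{\{\s*end\s*\}\}",
    re.DOTALL,
)


def eval_conditionals(html: str, attribs: dict) -> str:
    def repl(m: re.Match) -> str:
        key, if_body, else_body = m.group(1), m.group(2), m.group(3) or ""
        return if_body if str(attribs.get(key, "")).strip() else else_body

    prev = None
    while prev != html:  # repeat until no block is left to resolve
        prev = html
        html = _COND.sub(repl, html)
    return html


# Hypothetical one-line template with the same if/else/end shape.
TEMPLATE = (
    "{{ if .Subscriber.Attribs.nppes_last_updated }}"
    "last updated on {{ .Subscriber.Attribs.nppes_last_updated }}"
    "{{ else }}no date on file{{ end }}"
)

with_date = eval_conditionals(TEMPLATE, {"nppes_last_updated": "2019-04-02"})
without_date = eval_conditionals(TEMPLATE, {"nppes_last_updated": "  "})
none_value = eval_conditionals(TEMPLATE, {"nppes_last_updated": None})
```

Only the conditional is resolved here; the `{{ .Subscriber.Attribs.nppes_last_updated }}` token inside the kept branch is left for the later `.replace(...)` chain in `render`. One possible divergence from real sends: an attribute explicitly set to `None` stringifies to `"None"` and takes the `if` branch in this evaluator, whereas Go templates treat a nil value as false. If cached attribs can hold `None`, the preview and the real send may differ for that case.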
@@ -247,6 +247,24 @@ def save_imported(seg_key: str, emails: set[str]):
         f.write("\n".join(sorted(emails)) + "\n")


+def load_all_imported() -> set[str]:
+    """Union of EVERY segment's imported-emails state, i.e. everyone who has
+    already been emailed by ANY segment. Used as a cross-segment AND cross-cron
+    guard so a provider gets exactly one healthcare email overall: the two crons
+    (pw-hc-campaign on the small warmup file, pw-hc-nppes on the 63k institutional
+    file) share these state files, and ~312 emails overlap both files, so without
+    this a provider warmed as 'revalidation_overdue' by one cron could also be
+    warmed as the free 'nppes_outdated' check by the other. Reads all
+    hc_imported_*.txt plus the legacy single-segment file."""
+    seen: set[str] = set()
+    for key in SEGMENTS:
+        seen |= load_imported(key)
+    legacy = os.path.join(STATE_DIR, "hc_imported_emails.txt")
+    if os.path.exists(legacy):
+        seen |= {ln.strip().lower() for ln in open(legacy) if ln.strip()}
+    return seen
+
+
 def add_subscriber(list_id: int, email: str, name: str, attribs: dict) -> bool:
     try:
         lm("/subscribers", {
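The global guard boils down to a set union over per-segment state files. A minimal sketch follows, under stated assumptions: `load_imported` is not shown in this diff, so `load_imported_from` is a hypothetical stand-in, and the `hc_imported_<segment>.txt` naming is inferred from the docstring's `hc_imported_*.txt`. The email addresses are made up.

```python
import os
import tempfile


def load_imported_from(state_dir: str, seg_key: str) -> set[str]:
    """Hypothetical stand-in for load_imported(): one email per line."""
    path = os.path.join(state_dir, f"hc_imported_{seg_key}.txt")
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return {ln.strip().lower() for ln in f if ln.strip()}


def load_all_imported_from(state_dir: str, seg_keys: list[str]) -> set[str]:
    """Union of every segment's state: everyone emailed by any segment."""
    seen: set[str] = set()
    for key in seg_keys:
        seen |= load_imported_from(state_dir, key)
    return seen


with tempfile.TemporaryDirectory() as state_dir:
    # One cron already sent the overdue email to a@...; nothing else was sent.
    with open(os.path.join(state_dir, "hc_imported_revalidation_overdue.txt"), "w") as f:
        f.write("A@clinic.example\n")
    with open(os.path.join(state_dir, "hc_imported_nppes_outdated.txt"), "w") as f:
        f.write("b@clinic.example\n\n")

    already = load_all_imported_from(
        state_dir, ["revalidation_overdue", "nppes_outdated"]
    )

# A per-segment check for 'nppes_outdated' would miss a@clinic.example;
# the union catches it regardless of which cron wrote the state file.
blocked = "a@clinic.example" in already
```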
@@ -410,14 +428,25 @@ def assign_segment(r: dict, active_segments: list[str]) -> str | None:

 def assign_all(rows: list[dict], active_segments: list[str]) -> dict[str, str]:
     """Map email -> assigned segment across the whole list, so each segment's
-    importer can claim only its assigned providers. Computed once per run."""
+    importer can claim only its assigned providers. Computed once per run.
+
+    An email can appear on MULTIPLE rows (a shared practice inbox covering
+    several NPIs, e.g. a credentialing address) and those rows can carry
+    DIFFERENT statuses (one NPI overdue, another not on the list). We must keep
+    the MOST-URGENT assignment across all of that email's rows -- otherwise a
+    later, less-urgent row would clobber an earlier urgent one and the provider
+    would get the free check instead of the overdue email. So we compare
+    priorities and keep the winner (lower number = more urgent)."""
     out: dict[str, str] = {}
     for r in rows:
         email = (r.get("email") or "").strip().lower()
         if not email:
             continue
         seg = assign_segment(r, active_segments)
-        if seg is not None:
+        if seg is None:
+            continue
+        prev = out.get(email)
+        if prev is None or _seg_priority(seg) < _seg_priority(prev):
             out[email] = seg
     return out
@@ -472,24 +501,47 @@ def warm_segment(seg_key: str, rows: list[dict], slice_n: int,
     keeps working unchanged."""
     seg = SEGMENTS[seg_key]
     imported = load_imported(seg_key)
+    # Cross-segment + cross-cron guard: skip anyone already emailed by ANY
+    # segment so each provider gets exactly one healthcare email overall.
+    already_anywhere = load_all_imported()
     suppressed = load_suppressed()

     def _is_candidate(r: dict) -> bool:
         email = r.get("email", "").strip().lower()
-        if not email or email in imported or email in suppressed:
+        if not email or email in already_anywhere or email in suppressed:
             return False
         if _is_google_hosted(r):
             return False
         if assignment is not None:
-            return assignment.get(email) == seg_key
+            # The email must be assigned to THIS segment AND this specific row
+            # must be the one that earns it. An email can span several rows (a
+            # shared practice inbox over multiple NPIs); only the row whose own
+            # status matches this segment's selector should represent it, so the
+            # template renders that row's real data (e.g. the overdue NPI's due
+            # date, never a sibling 'not_on_list' row's blank one). This also
+            # dedupes: at most one row per email passes.
+            return assignment.get(email) == seg_key and row_matches(seg_key, r)
         return row_matches(seg_key, r)

-    candidates = [r for r in rows if _is_candidate(r)]
+    # Dedupe by email: an email can legitimately appear on multiple matching
+    # rows (e.g. two overdue NPIs share one inbox). Keep the first so the email
+    # is imported once and counted once against the slice budget.
+    candidates = []
+    seen_emails: set[str] = set()
+    for r in rows:
+        if not _is_candidate(r):
+            continue
+        email = r["email"].strip().lower()
+        if email in seen_emails:
+            continue
+        seen_emails.add(email)
+        candidates.append(r)
     # Spread the slice across MX operators so no single receiving system (e.g.
     # Microsoft 365) gets the whole batch. Caps ramp with the warmup day.
     todo = mx_throttled(candidates, slice_n, mx_daily_caps(warmup_day()))
     print(f"[hc-cron] {seg_key}: candidates={len(candidates)} "
-          f"already={len(imported)} to_import={len(todo)}")
+          f"in_segment={len(imported)} emailed_anywhere={len(already_anywhere)} "
+          f"to_import={len(todo)}")

     if dry_run:
         for r in todo[:3]:
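The first-row-wins dedupe in `warm_segment` is what keeps a shared inbox from being imported twice and counted twice against the slice. A minimal sketch of that step alone, assuming rows are plain dicts with `email` and `npi` keys (the NPI values here are made up):

```python
def first_row_per_email(rows: list[dict]) -> list[dict]:
    """Keep the first row seen for each normalized email, preserving order."""
    seen: set[str] = set()
    kept: list[dict] = []
    for r in rows:
        email = (r.get("email") or "").strip().lower()
        if not email or email in seen:
            continue
        seen.add(email)
        kept.append(r)
    return kept


# Two overdue NPIs share one inbox; a third row belongs to someone else.
rows = [
    {"email": "office@clinic.example", "npi": "1111111111"},
    {"email": "Office@Clinic.example ", "npi": "2222222222"},
    {"email": "dr@solo.example", "npi": "3333333333"},
]
candidates = first_row_per_email(rows)
kept_npis = [r["npi"] for r in candidates]
```

With the old list comprehension both `office@clinic.example` rows would have reached the throttling step, so the slice budget would have been spent on one recipient twice.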