new-site

Author	SHA1	Message	Date
justin	8e5590b492	mail: DMARC aggregate-report parser + dedicated dmarc@ mailbox ingestion Tool 2 of the deliverability monitoring pair (Tool 1 = mail_reputation_monitor). DMARC rua reports from dozens of operators (Google, Yahoo, Comcast, Cox, Bell, Mimecast, Cisco ESA, GMX, mail.com, ...) were landing in ops@ (dmarc@ was a DL), burying real mail and never parsed. Now ingested + queryable: - dmarc@performancewest.net converted DL -> dedicated Carbonio mailbox; isolated IMAP creds in server .env, surfaced to workers in docker-compose.yml (mirrors OPS_IMAP_*). 29 historical reports moved ops@ -> dmarc@ via IMAP. - scripts/dmarc_report_parser.py: IMAP fetch unseen -> decompress .gz/.zip/.xml (namespace-agnostic: classic + urn:ietf:params:xml:ns:dmarc-2.0 GMX/mail.com) -> parse aggregate XML -> upsert dmarc_report (keyed (org_name,report_id), no-op on re-parse) + dmarc_record per source IP. dmarc_pass = dkim_aligned OR spf_aligned. Marks \Seen. --dry-run/--all/--alert (7d per-IP summary + Telegram if one of OUR IPs <95% pass, or EXTERNAL IP sends >=20 failing msgs as us = spoofing under p=reject). psycopg2 imported lazily so --dry-run runs without the driver. - api/migrations/102_dmarc_aggregate.sql: dmarc_report + dmarc_record tables. - infra/cron/pw-dmarc-parser: 06:20 UTC daily --alert (after reputation, before scrub). - docs/deliverability.md: DMARC section DONE; query examples. Verified: dry-run --all parses all 28 reports (1 non-report test probe), 0 unknown after the namespace fix.	2026-06-19 08:50:20 -05:00
justin	b45332b5f7	infra(cron): nightly mail-reputation snapshot (pw-mail-reputation) Runs mail_reputation_monitor --alert at 06:10 UTC, piping the day's postfix log (sudo cat, same pattern as pw-warmup-tg-alert) into the DB-connected workers container. Builds the daily SNDS-equivalent reputation trend and Telegram-alerts on operator regressions. Installed to /etc/cron.d/pw-mail-reputation.	2026-06-19 08:38:35 -05:00
justin	08f651dc1e	feat(deliverability): mail reputation monitor (SNDS-equivalent from postfix logs) Adds scripts/mail_reputation_monitor.py + migration 101 (mail_reputation_daily). Sender reputation is judged by the RECEIVING operator (Microsoft/Google/Yahoo/Proofpoint), and the provider portals (SNDS/Postmaster/CFL) need a login and lag 24-48h. Our postfix logs already carry the ground truth in real time: every send records the receiving host + SMTP response, and the response classifies WHY: 250 -> accepted; 451 4.7.500 -> throttled (Microsoft rate-limiting a cold IP); 550 5.7.x -> reject_reputation (spam/reputation); 550 5.1.1/5.4.1 -> reject_recipient (dead mailbox / access denied = list hygiene); 550 ...SPAM -> reject_content (SpamAssassin). The parser classifies each egress delivery (out0x/hcout/relay) by (sending_ip, receiver, outcome, reason_code) and upserts ONE daily aggregate row per bucket (idempotent ON CONFLICT), so a nightly cron over the rotated log gives a queryable trend without re-parse double-counting. --alert prints a per-operator summary and Telegram-alerts on regressions (>=10% reputation rejects, or Microsoft >=70% throttled). Reads stdin ("-") so the host-owned /var/log/mail.log can be piped into the DB-connected workers container. Motivation: 2026-06-19 audit found ~80% of Microsoft sends were getting 451 4.7.500 throttles on the warming IPs -- this makes that trend visible as reputation recovers.	2026-06-19 08:35:45 -05:00
justin	bd7ba23841	docs(deliverability): Yahoo CFL ENROLLED for both domains (reporting fbl@) performancewest.net + send.performancewest.net both show Enrolled in the Yahoo Sender Hub, reporting email fbl@. All three FBLs (Google Postmaster, MS SNDS+JMRP, Yahoo CFL) now complete.	2026-06-19 08:29:12 -05:00
justin	b8b6444084	docs(deliverability): Yahoo CFL verification keys added for both domains Added yahoo-verification-key TXT records via Hestia for performancewest.net (apex) and send.performancewest.net; both propagated to all HE.net slaves + public resolvers. Ready to click Verify in the Yahoo CFL form, complaint dest fbl@.	2026-06-19 02:13:48 -05:00
justin	a9bbfbf59b	docs(deliverability): Microsoft MANUAL 2 fully DONE — SNDS access + JMRP both set SNDS access requested/granted for 207.174.124.94 + .107; JMRP feeds registered with complaint dest fbl@. Section marked complete. SNDS data populates in ~24-48h.	2026-06-19 02:03:30 -05:00
justin	f293466519	docs(deliverability): JMRP complaint dest set to fbl@performancewest.net Corrected: JMRP feed destination was set to fbl@ directly (no forward needed); ARF complaints route to ops@.	2026-06-19 01:00:16 -05:00
justin	60540f949d	docs(deliverability): JMRP done — both IPs registered (pw1/.94, pw2/.107) Note JMRP delivers ARF complaints to the signed-in MS account's email, not automatically to fbl@; set a forward if that account isn't fbl@performancewest.net.	2026-06-19 00:59:49 -05:00
justin	776817c727	docs(deliverability): correct SNDS entry URL (snds.microsoft.com does not resolve) Use the legacy sendersupport.olc.protection.outlook.com/snds/ (308-redirects) or the direct substrate.office.com/ip-domain-management-snds/SNDS app URL. Flag that snds.microsoft.com has no DNS.	2026-06-19 00:46:25 -05:00
justin	7828ee4587	docs(deliverability): fix SNDS/JMRP URLs for Microsoft's 2026 substrate migration SNDS moved off sendersupport.olc.protection.outlook.com to substrate.office.com/ip-domain-management-snds/. The old /snds/ and /pm/ links 308-redirect there. Document that the footer/help links going to microsoft.com are boilerplate (not broken), and that you must Log in FIRST or the Request Access / JMRP links bounce to login.microsoftonline.com (expected, not dead). Add working direct links + canonical https://snds.microsoft.com entry point.	2026-06-19 00:45:59 -05:00
justin	e18f23634a	docs(deliverability): document consumer-domain exclusion two-layer model + scrub Records the Apple/iCloud addition, the builder-vs-list-based distinction, the scrub_listmonk_consumer reconciliation tool + daily cron, and the 2026-06-19 first-run numbers (7,943 trucking + 21 HC stale consumer subs blocklisted).	2026-06-19 00:01:17 -05:00
justin	72c69a05c9	infra(cron): daily Listmonk consumer-domain reconciliation (pw-listmonk-scrub) Runs scrub_listmonk_consumer against both listmonk and listmonk_hc at 06:30 UTC, before the campaign builders, so any ENABLED subscriber matching the authoritative exclusion list is blocklisted retroactively. Keeps list-based campaigns (FCC Direct Contacts, CRTC/USF, etc.) from leaking onto consumer mailboxes after a new domain (e.g. Apple/iCloud) is added to the exclusion list. Installed to /etc/cron.d/pw-listmonk-scrub on the host.	2026-06-19 00:00:46 -05:00
justin	b40fc7ec36	feat(deliverability): exclude Apple consumer mail + scrub stale consumer subs from Listmonk The fmcsa campaign builders already exclude gmail/yahoo/microsoft/etc. from NEW audience selections, but two reputation leaks remained on the LIST-BASED side: 1. iCloud/Apple gap. icloud.com/me.com/mac.com were never in the exclusion set. A 2026-06 Listmonk audit found 1,321 ENABLED iCloud subscribers on list 3 ("FCC Carriers - Direct Contacts") -- the single largest enabled-consumer bucket -- being cold-blasted with no exclusion at all. Add APPLE_CONSUMER_DOMAINS. 2. Stale already-imported consumer subs. List-based campaigns (e.g. the running CRTC/USF blast on list 3) keep hitting consumer addresses imported BEFORE the relevant domain joined the exclusion list. gmail.com was still the #1 bounce domain via that campaign even though new selections exclude it. Add scrub_listmonk_consumer.py: reconciles the live Listmonk subscriber table against the authoritative exclusion list and blocklists any ENABLED subscriber whose address is_blocked(). Idempotent; re-run whenever the exclusion grows so it applies retroactively. Uses the same 'blocklisted' terminal state as the bounce handler, so contacts are excluded from all current/future campaigns without deleting history. Supports --dry-run and both listmonk / listmonk_hc.	2026-06-18 23:55:58 -05:00
justin	49842bddbb	docs(deliverability): Microsoft #1 priority + role mailboxes created (Carbonio) Created postmaster@/abuse@/fbl@/dmarc@ as Carbonio DLs -> ops@ (they previously REJECTED 5.1.1, which would have blocked SNDS verification AND was silently dropping all DMARC aggregate reports). Verified accept-at-MX + delivered E2E. Reframe Microsoft as the #1 monitoring priority (85% of audience), Yahoo as lowest (<1%); add Carbonio admin access note; note DMARC parser now worth building.	2026-06-18 23:31:20 -05:00
justin	3ca960aca5	docs+infra(deliverability): document bulk subdomain; ansible signs send.performancewest.net - infra/ansible/roles/mail: refactor OpenDKIM to support multiple signing domains via opendkim_signing_domains list (root + send.performancewest.net). Loops keygen/ownership/keytable/signingtable so the live two-domain setup is reproducible from ansible. - infra/ansible group_vars: add bulk_mail_subdomain + campaign_from_* + campaign_reply_to documentation vars (map to CAMPAIGN_FROM / HC_CAMPAIGN_FROM env read by the builder scripts). smtp_from (transactional) stays on root. - docs/deliverability.md: rewrite TL;DR with the carrierone-vs-performancewest A/B proof (same server/IPs, different From domain -> Inbox vs Junk) and the ~85% Microsoft / 14% Google / <1% Yahoo audience mix; add the bulk-subdomain section, SPF trim, rehab-disabled, and the Hestia DNS automation runbook.	2026-06-18 23:12:05 -05:00
justin	5c3b4291e7	feat(deliverability): send bulk campaigns from dedicated subdomain send.performancewest.net Isolates bulk sending reputation onto a dedicated subdomain so the root domain stays clean for transactional/verification mail (and recovers faster). Replies still go to the root domain via Reply-To, so the customer-facing reply experience is unchanged. - build_trucking_campaigns.py: add env-overridable FROM_EMAIL (noreply@send.performancewest.net); use it for both scheduled + test sends instead of inheriting base["from_email"] from the DB base campaign. - build_healthcare_campaigns_cron.py: FROM_EMAIL -> compliance@send.performancewest.net (env-overridable). - bounce-watcher.sh / hc-bounce-watcher.sh: track the new subdomain envelope sender (keep legacy root-domain sender so the pre-cutover queue still drains; HC also tracks by hcout transport regardless of sender). Infra already live (separate, non-code): subdomain DNS (A/MX/SPF/DKIM selector=send/DMARC p=reject) on the Hestia master, OpenDKIM signs d=send.performancewest.net (verified end-to-end), egress .94/.107. Root SPF trimmed to the real IPs; pointless IP-rehab cron disabled.	2026-06-18 23:07:23 -05:00
justin	1056705cf9	docs(deliverability): Google Postmaster TXT added+verified via Hestia DNS master DNS is fully automatable: Hestia (cp.carrierone.com, zone owner = justin user) is the DNS master, HE.net are slaves. Added google-site-verification TXT (id 14464) via v-add-dns-record as root; verified resolving on public resolvers + HE.net slaves. Owner just clicks Verify in the Postmaster console. Documents the v-add-dns-record path for future records.	2026-06-18 22:05:01 -05:00
justin	5253f16675	docs: deliverability runbook (incident, IP consolidation, monitoring setup) Documents the 2026-06-18 reputation incident (snowshoe -> Gmail domain-rep blocks, RBLs all clean), the single-IP-per-stream consolidation, and fill-in-the-blanks setup steps for Google Postmaster Tools, Microsoft SNDS/JMRP, and Yahoo CFL (all require owner account login + HE.net DNS). Plus ongoing hygiene + how to re-expand IPs once reputation recovers.	2026-06-18 17:46:28 -05:00
justin	545e6f7ed7	infra(mail): consolidate sending IPs (kill snowshoe) now that DKIM is fixed The multi-IP rotation was built to spread risk while DKIM was broken (fixed 2026-06-17) and after the May 30-31 over-volume blast. With DKIM signing correctly, spreading ~3k trucking msgs/day across 12 IPs (.94-.105) + ~1.2k healthcare msgs/day across 3 IPs (.107-.109) gave each IP far too little per-receiver volume to build reputation. Gmail/Outlook read it as snowshoe spam and reputation-blocked ~200 msgs/day ("very low reputation of the sending domain") -> 0 human clicks, 0 sales. Consolidate to ONE IP per stream so each accrues real reputation: - trucking: pw-mta-warmup ALL=(out05) -> randmap collapses to {out05:} = .94 - healthcare: listmonk-hc SMTP servers 2/3 (ports 2527/2528 -> .108/.109) disabled in DB; all HC mail now egresses .107 (hcmta01). [applied live] Applied live: transport_maps now randmap:{out05:}; listmonk-hc restarted. To re-expand later: add transports back to ALL + re-enable the HC SMTP servers.	2026-06-18 17:41:07 -05:00
justin	f43957882f	docs(billing): record OIG/SAM recurring validation status Checkout half proven against live Stripe (dry-run session created + expired, zero charge), webhook subscription-id extraction + worker renewal fulfillment covered by unit tests (31 + 13). Remaining gap: full E2E with a Stripe test clock, which needs test-mode keys in the server .env (currently unset).	2026-06-18 09:38:51 -05:00
justin	5c1f239307	test(workers): NPI recurring-cycle fulfillment path (13 assertions) Runs the real _BaseNPIHandler.handle() with _create_todo monkeypatched (no DB / ERPNext / email side effects) and asserts: - first OIG/SAM screening has no [Monthly cycle] prefix / RECURRING banner - a recurring_cycle order gets the [Monthly cycle] title prefix, the "RECURRING MONTHLY CYCLE" banner, the invoice id, and the re-run-against-CURRENT-data + issue-NEW-certificate instructions - recurring_cycle works with and without an invoice id - the bundle handler's first run is not flagged recurring Verified passing both locally and inside the deployed workers container.	2026-06-18 09:38:26 -05:00
justin	0083bc1354	docs(billing): record Stripe subscription webhook events as ENABLED + api-version caveat The 3 subscription-lifecycle events (invoice.paid, invoice.payment_failed, customer.subscription.deleted) are now enabled on the live endpoint we_1THBjyB46qMvF2jnYyN8IfkK (6 events total). Documents the unpinned-endpoint api_version caveat (account default 2024-12-18.acacia, not the SDK's dahlia) and why invoiceSubscriptionId() must read both invoice shapes. Notes that charge.dispute.created / balance.available are handled in code but not yet enabled on the endpoint.	2026-06-18 08:45:22 -05:00
justin	8af2685d07	fix(webhooks): read invoice.subscription in both API shapes (acacia + dahlia) The live Stripe webhook endpoint has NO pinned api_version, so it follows the account default (currently 2024-12-18.acacia), which delivers the subscription link as the top-level invoice.subscription. The code only read the new 2026-03-25.dahlia shape (invoice.parent.subscription_details.subscription), so recurring renewal/payment-failed events would have returned a null subscription id and silently failed to fulfill once the events were enabled. invoiceSubscriptionId() now reads the modern shape first, then falls back to the legacy top-level field. All other invoice fields used by the handlers (amount_due, attempt_count, hosted_invoice_url, id) are stable across both versions. +5 tests (legacy string/object, modern-preferred-over-legacy).	2026-06-18 08:42:29 -05:00
justin	cf021e2f91	feat(healthcare): OIG/SAM exclusion screening as $79/mo Stripe Subscription Convert OIG/SAM from one-time $299/yr to recurring $79/month (card+ACH only) - the first real recurring-billing product in the system. Exclusion screening is a monthly federal obligation, so recurring monitoring fits the requirement and is the biggest valuation lever (vs a one-time annual run). Catalog (single source of truth): - service-catalog.ts: add billing_interval + allowed_methods to ComplianceService; oig-sam-screening -> 7900c, billing_interval:"month", allowed_methods:[card,ach], name "(Monthly Monitoring)". - gen-service-catalog.py + check-service-catalog-drift.py: carry/guard the two new fields; regenerate site catalog. Checkout (api/src/routes/checkout.ts): - mode:"subscription" with recurring price_data when billing_interval is set; surcharge absorbed for recurring (clean $79/mo); server-side METHOD_NOT_ALLOWED re-validation against allowed_methods. - ensureColumns + migration 100: compliance_orders.stripe_subscription_id, bundle_upsell_sent_at (+ subscription index). Webhooks (api/src/routes/webhooks.ts): - record stripe_subscription_id on checkout.session.completed (subscription mode). - invoice.paid (subscription_cycle only) -> re-dispatch screening for the cycle; invoice.payment_failed -> admin alert + first-failure customer nudge; customer.subscription.deleted -> mark order cancelled. (API 2026-03-25 moved the subscription link to invoice.parent.subscription_details.subscription.) Fulfillment: - job_server.py: pass recurring_cycle/invoice_id into the order. - npi_provider.py: OIG handler labels renewal cycles "[Monthly cycle]" + re-screen note; bundle action runs only the FIRST screening + flags the $79/mo upsell. Bundle land-and-expand: - Provider Compliance Bundle now includes only the first OIG/SAM screening (was giving away $948/yr of monitoring inside an $899 bundle). - new worker scripts/workers/bundle_upsell.py (+ pw-bundle-upsell timer): ~3 weeks after a paid bundle, emails the customer to continue $79/mo monitoring; dedup via bundle_upsell_sent_at; skips customers who already have an OIG/SAM order. Surfaces updated to $79/mo: PaymentStep (filters methods, "Billed every month, cancel anytime"), order pages, healthcare index, npi-compliance-check tool (also fixed stale $699 bundle drift -> $899), hc_oig_screening + hc_compliance_bundle emails. Docs: billing.md gains a "Stripe-native Subscriptions" section + a reality-check banner (Adyen/ERPNext-gateway model documented there is NOT live; Stripe is the real rail). Fixed run-migrations.yml container name bug (performancewest-postgres-1 -> performancewest-api-postgres-1, overridable). Tests: api/tests/recurring-subscription.test.ts (28 assertions) covers catalog gating, method validation, surcharge suppression, recurring line-item build, invoiceSubscriptionId extraction, renewal-cycle gating. tsc clean; site build clean; catalog drift OK. Manual deploy step: enable invoice.paid, invoice.payment_failed, customer.subscription.deleted on the Stripe webhook endpoint.	2026-06-18 07:54:38 -05:00
justin	f481a1d13c	analytics: filter email-scanner / headless traffic out of Umami stats Email security gateways (Microsoft Defender Safe Links / ATP, Proofpoint, Mimecast, Barracuda, etc.) auto-fetch and often render every link in a campaign email to scan for malware. The advanced ones drive a real headless browser, execute JS, and fire Umami pageviews/clicks that masquerade as human visits -- inflating campaign click-through. New site/public/js/pw-bot-filter.js queries multiple real-browser signals and gates Umami via its official data-before-send hook (umamiBeforeSend), dropping all events when the visitor is a bot. Signals (from empirical chromium probing): decisive: navigator.webdriver, HeadlessChrome UA, known scanner UAs, zero/collapsed screen|viewport|outer geometry, window LARGER than the physical screen (impossible on real HW; uses outerW/H so page zoom does not false-positive), software GPU rasterizer (SwiftShader/llvmpipe/swrast via WebGL UNMASKED_RENDERER), zero logical CPUs. soft (>=2 to trip): tiny screen, inner>screen, low color depth, empty navigator.languages, no input device (no fine/coarse pointer + no hover + 0 touch), no WebGL on a desktop UA. Designed to FAIL OPEN: only strong/corroborated evidence suppresses, so real visitors (incl. zoomed, privacy-tooled, remote-desktop, kiosk) still count. Wired before the Umami tag in Base.astro (Astro pages) and all 86 static public/*/.html pages; both load with defer so order is guaranteed and the hook is defined before Umami reads it. Tested end-to-end with chromium (site/tests/bot-filter.test.sh, 4/4): default headless-new, spoofed-Windows-UA + normal 1366x768 window, and spoofed-UA + 1x1 window are all caught; hook returns null to drop the event.	2026-06-18 02:02:34 -05:00
justin	40da017b79	campaigns: auto-rollout catch-all pool gated by warmup day + live bounce rate Replaces the panic-era burner-domain verification plan with an in-house automatic catch-all rollout in the trucking/IFTA/UCR builders. Root-cause classification of the 75k pre-DKIM-fix bounces showed ~55% were reputation/ auth (now fixed by DKIM signing) and only ~29% genuinely-dead mailboxes; catch-all domains accept at RCPT time so they do not user-unknown bounce at send, making a controlled in-house bleed safer than warming a separate burner. catch_all_enabled() adds catch-all results only when warmup_day >= CAMPAIGN_CATCH_ALL_MIN_DAY (21) AND the recent 2-day live bounce rate is below CAMPAIGN_CATCH_ALL_MAX_BOUNCE_PCT (8%) on a >=300-sent sample; auto-reverts to the clean smtp_valid/send_confirmed pool on the next run if bounces spike. Short window so a past disaster cannot block the rollout forever and a fresh spike trips fast. CAMPAIGN_INCLUDE_CATCH_ALL=1/0 still hard-overrides. USABLE_FILTER (static) -> usable_filter() (per-run, memoized, one DB probe). IFTA/UCR SELECT_SQL -> _select_sql() so tc.usable_filter() resolves at call time, not import. 13 logic unit tests pass; live dry-run decision = OFF (day 15 < 21 and recent 2d bounce 42% from the aging-out Jun-16 disaster).	2026-06-18 01:39:09 -05:00
justin	c36ef07310	crtc site: defensible framing + 'who this is for' compliance posture Reduce evasion optics that would draw FCC enforcement attention while keeping the real value props: - 'What they avoid by being Canadian' -> 'What the Canadian structure changes' - Drop 'No US telecom taxes on invoices (15-40% saved)' -> Canadian tax treatment on the Canadian entity's billing; 'No US FCC regulatory fees on the Canadian entity' - '...avoid this by routing US traffic...' -> '...instead route US traffic through US intermediaries who carry the 499-A obligation...' - Add prominent 'Who this is for - and who it isn't' section: legitimate conversational voice (UCaaS/PBX/business/residential/live-agent) yes; short-duration/dialer/robocall-evasion no. States upstreams are fully STIR/SHAKEN compliant and we don't onboard traffic designed to evade caller-ID auth; notes Canadian carriers police ASR/ACD more strictly than anywhere (a feature). HTML validated balanced.	2026-06-18 00:22:58 -05:00
justin	720197095c	CRTC USF email: defensible framing + conversational-voice caveat Reframe away from 'escape the FCC' optics that would draw enforcement attention: - Header/flagbar: 'Move your VoIP home to Canada' / 'US obligations ride on your upstream' (was 'no FCC reporting, no USAC, no S/S to run') - Recast claims to 'CRTC regulatory home, not FCC' and scope the no-USF/no-499/no-RMD claims to the Canadian-jurisdiction traffic (accurate for US-number traffic, which rides on the compliant US upstream) - STIR/SHAKEN bullet now explicitly pro-compliance: 'we don't help anyone dodge call-authentication; upstream partners are fully S/S compliant' - Drop 'outside the FCC's reach' - Add honest caveat: Canada is not for short-duration/dialer traffic; Canadian carriers are more stringent on ACD/ASR than anywhere; this is for real conversational voice (UCaaS/PBX/business/residential/live-agent)	2026-06-18 00:20:44 -05:00
justin	a82b356921	CRTC USF email: reframe to 'run your whole VoIP as a Canadian carrier' Pivot from the hedge/second-entity framing to the consolidation pitch: one CRTC carrier as the home base, nexus in Canada, customers onboarded from anywhere. Lead value props with the three concrete reseller realities: - No FCC reporting (no 499-A/Q, no RMD recert) - No USAC/USF on your revenue (contribution sits upstream) - No STIR/SHAKEN to set up or run (reseller can't get a US token; upstream signs) Add: No FCC Section 214 / no ongoing 214 burden -- CRTC BITS is a cheap, low-burden notification by comparison. Header/subject reworked; keeps the honest US-termination + upstream-signing explanation.	2026-06-18 00:10:06 -05:00
justin	d9ecb94b27	CRTC USF email: add honest US-termination + STIR/SHAKEN section Address the two most common objections truthfully (researched against CRTC, FCC 2025 Third-Party Authentication Order, and STIR/SHAKEN cross-border docs): - US-based long-distance termination operators routinely accept traffic from Canadian carriers (cross-border voice is a standard interconnect). - STIR/SHAKEN: a Canadian reseller cannot get a US SPC token (US-carrier-only), so US-bound calls are signed by the upstream US-number provider that assigns the DIDs -- exactly how most small US carriers already rely on upstream signing. Canadian-origin traffic falls under the lighter CRTC regime, handled by the upstream Canadian carrier. Does NOT claim S/S disappears -- it moves to the upstream, off the carrier's day-to-day operation.	2026-06-18 00:03:31 -05:00
justin	8099afc5ab	CRTC USF email: note US DIDs available from Canadian carriers + point to guide Address the obvious 'but I need US numbers' objection: several Canadian wholesale carriers (Fibernetics, Iristel, VoIP.ms, Telnyx, Bandwidth, Twilio, Frontier) provision US DIDs to CRTC-registered carriers, so they can keep serving US customers from the Canadian entity. Adds a Canada-advantage bullet and updates the guide block to call out both US + Canadian DIDs.	2026-06-17 23:53:19 -05:00
justin	1c63e8f4b5	CRTC USF email: add FCC photo-ID KYC requirement to the burden list + Canada contrast The FCC's 2025 Robocall Mitigation Order (47 CFR 64.1200(n)(4), FCC 25-6) requires collecting + authenticating a government-issued photo ID for every new customer before turning up voice service. Add it to the US-carrier burden list and the matching 'does not apply in Canada' advantage.	2026-06-17 23:46:04 -05:00
justin	2611b5458b	CRTC USF campaign: shared campaign_helpers + Q3 38.8% USF email builder - campaign_helpers.py: extract the branded Listmonk HTML helpers (hdr/flagbar/stats/cta/footer/P/UL/etc.) + create_campaign() from create_campaigns.py into a side-effect-free shared module; create_campaign() now takes an altbody so every campaign ships a plaintext alternative (deliverability). - create_crtc_usf_campaign.py: build the one-off CRTC email hooked on the Q3 2026 USF factor (38.8%, +1.8pts, eff Jul 1), with a $200-off CANADA200 banner (expires Fri 23:59 ET, CTA links carry ?code= for auto-apply), the full US carrier burden vs Canada advantage, BC/ON incorporation, and a hosted carrier-guide PDF download. Creates a DRAFT only; sending stays manual.	2026-06-17 23:40:01 -05:00
justin	e379e2b10f	CRTC: ERPNext as portal source of truth + harden discount expiry + carrier guide PDF - checkout.ts: generalize ensureCompliancePortalUser -> ensurePortalUser and call it in the CRTC post-payment path so PayPal/crypto/webhook-confirmed CRTC orders always get an ERPNext Customer + Website User (the single source of truth for portal login/password), matching the compliance fix from the PayPal incident. Also flip portal_user_created for canada_crtc/formation. - canada-crtc.ts: enforce discount active+start/expiry windows, global usage limit and applies_to scope server-side at checkout (was active-only), so a promo like CANADA200 actually stops working after its expiry. - scripts/generate_canada_carrier_guide_pdf.py: render the public Canadian wholesale carrier/vendor guide PDF (reuses the canonical VENDORS list) to site/public/guides/canada-carrier-guide.pdf for the CRTC campaign lead magnet.	2026-06-17 23:34:13 -05:00
justin	eed5e4a258	campaigns: disable daily discount by default — test normal-price deals The daily 40%-off coupon was being merged into every trucking/UCR/IFTA/OTC send, but those discount sends were not actually being delivered (the DKIM-broken window). Now that deliverability is fixed, re-test whether normal-price offers convert before giving margin away. New CAMPAIGN_ENABLE_COUPON env flag (default OFF) gates daily-coupon minting in build_trucking_campaigns + the UCR/IFTA/OTC builders (which import it as tc.COUPON_ENABLED). With it off, no code is minted and an empty coupon_code is merged -> the campaign templates' existing {{ if .Subscriber.Attribs.coupon_code }} guard falls through to the normal-price {{ else }} branch and landing-page links carry no ?code=. No template or DB changes; fully reversible (set CAMPAIGN_ENABLE_COUPON=1). Verified: COUPON_ENABLED defaults False, coupon_attribs(None) -> empty, lp_link drops ?code= when no coupon, all 4 builders compile.	2026-06-17 22:51:28 -05:00
justin	a04ecf7df3	chore(email): decommission SMTP2GO references — local MTA only SMTP2GO is no longer used: Listmonk relays through the local Postfix MTA (172.18.0.1:25 from the Docker network), which DKIM-signs and delivers direct-to-recipient-MX; transactional mail goes through Carbonio. Verified zero smtp2go in any live container env + postfix has no external relayhost. Removed the stale references so a rebuild/new dev can't re-introduce it: - api/src/config.ts: SMTP_HOST default mail.smtp2go.com -> co.carrierone.com - scripts/workers/crypto_payment_worker.py: same default fix - infra/ansible all.yml: listmonk_smtp_* now 172.18.0.1:25, no auth (+comment) - app.env.j2 / email.ts / crm.md / go-live-todo.md / architecture.svg: docs	2026-06-17 22:46:59 -05:00
justin	eba525f83f	docs: runbook fix #8 — telecom/transactional HTML-only plaintext fix + campaign 407 finding	2026-06-17 21:17:06 -05:00
justin	b375385efd	fix(email): add text/plain part to every transactional + telecom email All transactional/worker senders built multipart/alternative (or mixed) messages with ONLY an HTML part. A single-part multipart/alternative is malformed and HTML-only mail is a spam-score signal -- the same class of deliverability bug that hurt the campaign pipeline, but on the telecom / filing / customer-transactional path (499-Q reminders, RMD/FCC filing review links, intake/completion/delivery emails, commissions, etc). - worker_email.send_worker_email: auto-derive plaintext from HTML when caller omits text= (fixes the shared helper for all current+future use) - 16 rolled-their-own senders in scripts/workers/** + scripts/formation/document_delivery.py: attach html_to_text(...) plaintext sibling before the HTML part (job_server + document_delivery wrap text+html in an alternative sub-part so PDFs still attach to the mixed root) - api/src/email.ts: add dependency-free htmlToText() and default sendEmail text to it (fixes checkout/webhook HTML-only sends) Verified: all py files compile + import at runtime, api tsc passes, htmlToText handles hrefs/lists/entities, 11 plaintext unit tests pass. Telecom campaign 407 (Jun 8) was HTML-only + sent in the DKIM-broken window -> 384 sent / 0 clicks (same junked-mail signature).	2026-06-17 21:07:40 -05:00
justin	899b880e7f	trucking: weekly FMCSA source refresh so new non-compliant carriers are caught The FMCSA census was a one-time snapshot (last loaded ~May 30) with NO refresh timer -- carriers newly falling out of MCS-150/UCR compliance were never picked up. New scripts/workers/fmcsa_source_refresh.py orchestrates the full pipeline (census download -> enrichment -> deficiency flag -> verify new emails -> MX-tag new) and runs weekly via cron pw-fmcsa-refresh (Sun 09:00 UTC), codified in the mail-pipeline Ansible role. Idempotent + incremental: the census upsert preserves email_verified / listmonk_sent_at / deficiency_flags, so existing carriers keep their send state and only census fields refresh; new DOTs flow into verification then campaigns. A carrier who refiled gets a fresh mcs150_parsed, so the builder's overdue WHERE clause stops targeting them automatically. Verify is capped per run (20k) so it never stalls on millions of rows. (Healthcare already auto-catches newly-revalidation-overdue providers within its 63k institutional pool via pw-hc-refresh Mon/Wed/Fri.)	2026-06-17 20:44:54 -05:00
justin	4171f48736	docs: record post-incident email hardening (7 fixes) in runbook	2026-06-17 20:30:59 -05:00
justin	466460112b	email: handle unquoted hrefs in plaintext converter + add tests The anchor regex only matched quoted hrefs; unquoted (href=URL) dropped the URL from the plaintext part. Now handles double/single/unquoted. Added scripts/test_email_plaintext.py (11 cases: link forms, mailto, template-tag preservation, tag stripping, entity unescape, blank-line collapse).	2026-06-17 20:28:15 -05:00
justin	4dc5690666	infra: codify the email-campaign pipeline in Ansible (new mail-pipeline role) The entire outbound campaign pipeline lived ONLY on the host and was never in IaC -- a fresh rebuild would have silently shipped NO campaigns, NO IP warmup/ramp, and NO bounce processing. New mail-pipeline role + deploy-mail-pipeline.yml playbook deploy it from the canonical repo copies: cron.d (infra/cron/): - pw-trucking-campaign-builder, pw-ifta-campaign, pw-ucr-campaign - pw-hc-campaign, pw-hc-nppes, pw-hc-refresh - pw-mta-warmup, pw-listmonk-rampcap, pw-hc-rampcap - pw-ip-rehab, pw-warmup-tg-alert helper scripts (-> /usr/local/bin): - pw-mta-warmup, pw-listmonk-rampcap, pw-hc-rampcap, pw-warmup-tg-alert - postfix-bounce-notify.sh, postfix-hc-bounce-notify.sh, listmonk-bounce-sync.py systemd services: - pw-bounce-watcher.service (was missing from repo), pw-hc-bounce-watcher.service Also creates the deploy-owned {{project_dir}}/logs dir (deploy can't write /var/log, so a missing dir made cron redirects fail). Added the 6 cron.d files that existed only on the host, the trucking bounce-watcher unit, and synced infra/cron/pw-hc-refresh to the live version (revalidation download + enrich steps). Role wired into site.yml after the mail (OpenDKIM) role. Part of the email-deliverability incident hardening.	2026-06-17 20:26:01 -05:00
justin	c183957939	email: suppress defunct/legacy/satellite ISP domains in cold sends Added DEAD_ISP_DOMAINS (52 domains) to BLOCKED_EMAIL_DOMAINS, so every campaign builder that imports the shared exclusions (trucking, UCR, IFTA via create_and_schedule_campaign, and the healthcare importer) stops cold-mailing them. Domains were identified from our own Listmonk bounce table (top bounced recipient domains) cross-checked against ISP status: defunct dial-up brands (earthlink, netzero, juno, mindspring...), Qwest/Embarq legacy, satellite (hughes, wildblue, dishmail), Altice/Suddenlink rural, WOW!/Knology, small rural ISPs (windstream, tds, iowatelecom...) and Alaska regional. Deliberately keeps still-active large consumer ISPs (comcast/charter/cox/centurylink) -- their bounces were the cold-IP/no-DKIM reputation problem (now fixed), not dead mailboxes, and they carry real prospects. Part of the email-deliverability incident hardening.	2026-06-17 20:16:00 -05:00
justin	a32a3b05a0	email: add plaintext MIME part + stable Message-ID hostname Two deliverability hardening fixes from the email audit: 1. Plaintext (altbody): all campaigns were HTML-only. Listmonk only emits multipart/alternative when altbody is set, and HTML-only bulk mail is a spam-score signal. New scripts/_email_plaintext.py renders a readable text/plain part from the HTML body (dependency-free; preserves Listmonk {{ .Subscriber }}/{{ UnsubscribeURL }} template tags, turns links into 'text (url)'). Wired into the trucking builder (and thus UCR + IFTA, which reuse create_and_schedule_campaign) and the healthcare builder. 2. Stable container hostname: Listmonk derived its Message-ID from the random docker container id -> @localhost.localdomain (spam-score signal). Pin both listmonk + listmonk-hc hostname to perfwest.performancewest.net, matching Listmonk's SMTP hello_hostname. Part of the email-deliverability incident hardening.	2026-06-17 20:09:02 -05:00
justin	2e4388a803	mail: add logrotate for Postfix mail.log (postlogd copytruncate) mail.log had no logrotate rule and grew unbounded to ~1GB (~150MB/day) since Jun 8. This host logs via Postfix's built-in postlogd (maillog_file mode), not rsyslog (no rsyslog.service exists), so postlogd holds the file open -- a plain rename+create would leave it writing to the stale inode. Use copytruncate (no daemon signal needed). Rotate daily, keep 14 days compressed. Applied live: forced first rotation, compressed the 1GB archive (->99MB), verified logging + bounce watchers + DKIM signing intact. Part of the email-deliverability incident hardening (follows DKIM fix `4d59019`).	2026-06-17 19:47:13 -05:00
justin	4d5901921e	mail: fix OpenDKIM not signing campaign mail (Docker-injected) + codify in Ansible Root cause of the Jun 2026 deliverability collapse / 'no new sales': opendkim.conf was in single-key mode with no InternalHosts, so it signed only 127.0.0.1. Transactional/cron mail (injected locally) was signed, but ALL campaign mail -- injected over the Docker bridge from the Listmonk containers (172.18.0.5 trucking, 172.18.0.25 healthcare) -- went out UNSIGNED. Gmail/Yahoo require DKIM on bulk mail since Feb 2024, so cold campaigns were junked/blocked (~23% delivery, 550-5.7.1). Proof: 2,620 campaign msgs that day, 0 DKIM sigs. The correct table files already existed on the server but were never wired into opendkim.conf. Fix points the daemon at key.table/signing.table and sets InternalHosts/ExternalIgnoreList to trusted.hosts (which includes 172.16.0.0/12, the Docker subnet). Fixes BOTH streams: HC submission ports 2526-2528 inherit the global smtpd_milters and *@performancewest.net covers compliance@. Verified by injecting from a Docker IP through port 25 and port 2526 -- both now get 'DKIM-Signature field added'. Codified as new Ansible role 'mail' so it can't silently regress (OpenDKIM was previously not in IaC at all).	2026-06-17 19:31:19 -05:00
justin	f7212b3969	scripts: one-off fresh password-set link for Paul Wilson (ERPNext auth)	2026-06-17 10:19:53 -05:00
justin	9c87759501	auth: make ERPNext the single source of truth for customer passwords Customer portal login previously checked a bcrypt customers.password_hash in Postgres, while portal.performancewest.net validated against ERPNext — two stores that drifted (the Paul Wilson lockout). Consolidate on ERPNext: - erpnext-client: add verifyWebsiteUserPassword() — delegates the credential check to Frappe /api/method/login (Host header = site name; 200=ok,401=bad). - portal-auth /login: verify against ERPNext, then mint the pw_customer cookie. - portal-auth /register: create+set the ERPNext password (authority) and upsert a password-less customers profile row; takeover guard still honors any legacy PG password until the column is dropped. - portal-auth /reset-password + /forgot-password: write the new password to ERPNext; forgot-password now also works for ERPNext-only users (creates the PG profile row on demand). - Legacy customers with only a PG bcrypt password reset via forgot-password. - checkout: refresh the stale comment (customers row is now a profile, no pw). Build + typecheck green.	2026-06-17 10:09:32 -05:00
justin	557b45f65d	fix(erpnext): self-heal outgoing Email Account password from SMTP_* env Root cause of recurring 'Password not found for Email Account Performance West Outgoing': the account was shipped as a fixture with awaiting_password=1 and no password. Email Account SMTP passwords are encrypted per-site and cannot live in a fixture, so every `bench migrate` reimported the fixture and re-broke outgoing mail (login notifications, password resets, welcome emails). - Remove the Email Account fixture (it cannot carry the encrypted secret). - Add email_account_sync.sync_outgoing_password: idempotent, exception-safe upsert that reconciles the account + password from SMTP_* env and clears awaiting_password. - Wire it to after_migrate (repairs at end of every deploy/migrate, right after fixtures import) and the daily scheduler (heals out-of-band restore/restart drift). - Pass SMTP_* into the erpnext + erpnext-scheduler containers so the sync has the secret (they previously had no SMTP env).	2026-06-17 09:48:28 -05:00
justin	1eb29f80be	fix(verifier): mx_unreachable was mislabeling live big-ISP mailboxes The verifier returned (True, 'mx_unreachable') when it couldn't complete a port-25 probe to ANY MX — marking 438,163 addresses email_verified=TRUE. But these are NOT dead: they're dominated by Comcast (13.7k), AT&T/SBCGlobal (13.5k), Verizon, Cox, Charter, Frontier, etc. — major ISPs that deliberately tarpit/refuse probes from unknown IPs. Confirmed from prod: comcast MX connects + returns 220. The probe failure ≠ undeliverable. Fix: return (False, 'mx_probe_blocked') — MX exists, deliverability UNKNOWN, must be confirmed by a real send. Excluded from PW campaigns; prime burner-verification target (burner_list_verify upgrades it to send_confirmed on delivery). Existing 438,163 mx_unreachable rows reclassified in prod to mx_probe_blocked / verified=FALSE.	2026-06-17 05:48:08 -05:00
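
The verifier fix in 1eb29f80be can be illustrated with a minimal sketch. This is not the repository's code: the function name `classify_mx_probe`, the injected `probe` callable, and the labels `no_mx`, `smtp_accepted`, and `smtp_rejected` are hypothetical. Only `mx_probe_blocked` and the (verified, reason) tuple shape come from the commit message. The point it shows is that failing to connect to every MX yields "unknown, not verified" instead of a verified result.

```python
from typing import Callable, Iterable, Tuple


def classify_mx_probe(
    mx_hosts: Iterable[str],
    probe: Callable[[str], bool],
) -> Tuple[bool, str]:
    """Return (email_verified, reason) for one address.

    `probe(host)` is a hypothetical callable: it returns True if the MX
    accepted the recipient, False if it rejected it, and raises OSError
    (timeouts and refused connections included) when the port-25
    conversation could not be completed.
    """
    hosts = list(mx_hosts)
    if not hosts:
        return (False, "no_mx")  # hypothetical label, not from the commit

    for host in hosts:
        try:
            accepted = probe(host)
        except OSError:
            # Tarpitted or refused: says nothing about the mailbox itself,
            # so try the next MX instead of drawing a conclusion.
            continue
        if accepted:
            return (True, "smtp_accepted")   # hypothetical label
        return (False, "smtp_rejected")      # hypothetical label

    # MX records exist but no probe completed: deliverability is UNKNOWN.
    # The pre-fix behaviour described in the commit returned
    # (True, "mx_unreachable") here.
    return (False, "mx_probe_blocked")
```

Passing the probe in as a callable keeps the classification testable without network access; a real probe would wrap an SMTP client and let connection errors propagate as OSError.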

754 commits