Companion to the worker MinIO-retry fix. Makes the worker auto-recover from
process death (crash, manual kill, missed boot trigger), not just MinIO outages.
- start_worker.bat: propagate Python's exit code (exit /b %rc%) so Task
Scheduler can actually detect a failed run (it previously always exited 0).
- reconfigure_task.ps1 (new): re-registers PW-DocserverWorker with
RestartCount=99 / 1-min restart interval, StartWhenAvailable, and two
triggers: AtStartup plus a 5-min repeating trigger. With
MultipleInstances=IgnoreNew, a dead worker relaunches within ~5 min and a
live one never double-runs. The script is idempotent.
- install.ps1: same self-healing settings for fresh installs.
- Verified on the box: killed the worker -> task relaunched it; firing the
task again while the worker was running left a single instance.

Docs updated to match reality:
- docserver/README.md: new 'Reliability / self-healing' section.
- document-generation.md: corrected the stale 'Flask DocServer :5050 / HTTP'
description to the actual MinIO outbound-only transport.
- e2e-test-plan.md: removed the outdated 'Word COM fails under SYSTEM / requires
RDP after every reboot' limitation; the worker now self-heals under SYSTEM in
session 0.
- infrastructure.md: fixed VM spec (Windows Server 2019, Word 16.0, Python 3.13,
SSH port 22422) and added a self-healing note.
- architecture.md / formation-system.md: trigger + self-healing details.

New diagrams:
- business-flow.svg: acquisition → check → order → filing → delivery
- technical-architecture.svg: full Docker stack, data tier, external services
- order-flow.svg: detailed worker pipeline with eSign gate and handler map

Updated docs:
- infrastructure.md: DocServer, email servers, backup server sections
- architecture.md: linked to new SVGs, updated date

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>