docserver: self-healing Task Scheduler config + docs
Companion to the worker MinIO-retry fix. Makes the worker auto-recover from process death (crash, manual kill, missed boot trigger), not just MinIO outages.

- start_worker.bat: propagate Python's exit code (exit /b %rc%) so Task Scheduler can actually detect a failed run (it previously always exited 0).
- reconfigure_task.ps1 (new): re-registers PW-DocserverWorker with RestartCount=99 / 1-min interval, StartWhenAvailable, and two triggers: AtStartup plus a 5-min repeating trigger with MultipleInstances=IgnoreNew, so a dead worker relaunches within ~5 min and never double-runs. Idempotent.
- install.ps1: same self-healing settings for fresh installs.
- Verified on the box: killed the worker -> task relaunched it; firing again while running stayed at one instance.

Docs updated to match reality:

- docserver/README.md: new 'Reliability / self-healing' section.
- document-generation.md: corrected the stale 'Flask DocServer :5050 / HTTP' description to the actual MinIO outbound-only transport.
- e2e-test-plan.md: removed the outdated 'Word COM fails under SYSTEM / requires RDP after every reboot' limitation; now self-healing under SYSTEM session 0.
- infrastructure.md: fixed VM spec (Win Server 2019, Word 16.0, Python 3.13, SSH port 22422) + self-healing note.
- architecture.md / formation-system.md: trigger + self-healing details.
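
The two worker-side behaviours this message relies on (retry on a storage outage instead of exiting, and a non-zero exit code on a real failure so Task Scheduler can see it) can be illustrated with a minimal sketch. This is not the actual worker code: `run_worker`, `poll_once`, `max_iterations` and `retry_delay_s` are hypothetical names, and `ConnectionError` stands in for whatever exception the real MinIO client raises during an outage.

```python
import sys
import time
from typing import Callable


def run_worker(
    poll_once: Callable[[], None],
    max_iterations: int,
    retry_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call poll_once repeatedly and return a process exit code.

    Hypothetical sketch: a real worker would loop indefinitely;
    max_iterations only exists to keep the example bounded.
    """
    for _ in range(max_iterations):
        try:
            poll_once()
        except ConnectionError:
            # Storage outage (assumed exception type): wait and retry
            # instead of exiting the process.
            sleep(retry_delay_s)
        except Exception as exc:
            # Anything else is treated as fatal. Returning non-zero lets
            # a wrapper script pass the failure on to the scheduler.
            print(f"fatal: {exc!r}", file=sys.stderr)
            return 1
    return 0

# Typical entry point (not executed here):
#     sys.exit(run_worker(poll_once=my_poll, max_iterations=10**9))
```

With an entry point like the commented one, the exit code from Python is what start_worker.bat forwards through `exit /b %rc%`.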
Parent: 7929413eeb
Commit: b48d0cb799
9 changed files with 150 additions and 24 deletions
@@ -238,7 +238,7 @@ Flags for support conversations (escalation, priority, category).
 - Converts via Word COM, drops PDF in `converted/` bucket
 - Heartbeat file at `minio://performancewest/docserver-heartbeat.json` (60s interval)
 - Atomic uploads via `.tmp_` prefix + `copy_object` rename
-- Task Scheduler: `PW-DocserverWorker` — auto-restart on failure
+- Task Scheduler: `PW-DocserverWorker` — self-healing: restarts on failure (99×/1 min) + AtStartup and a 5-min repeating trigger (relaunches within ~5 min if the process dies). The worker also retries MinIO on outage instead of exiting.
 - **Fallback:** LibreOffice headless (`soffice --headless --convert-to pdf`) auto-activates when DocServer heartbeat stale (>5 min)
 - **E2E tested:** 36KB DOCX → 82KB PDF in 12 seconds total round-trip
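
The heartbeat and fallback bullets in this hunk imply a staleness check on the consumer side: if the heartbeat is older than 5 minutes, conversion goes to LibreOffice instead of DocServer. A rough sketch of that decision follows. The heartbeat JSON schema is not shown in this commit, so the `timestamp` field (ISO 8601) and the function names `heartbeat_is_stale` and `choose_converter` are assumptions made for illustration.

```python
import json
from datetime import datetime, timedelta, timezone

# Threshold taken from the documented ">5 min" fallback rule.
STALE_AFTER = timedelta(minutes=5)


def heartbeat_is_stale(
    heartbeat_json: str,
    now: datetime,
    stale_after: timedelta = STALE_AFTER,
) -> bool:
    """Return True if the heartbeat is missing, unreadable, or too old.

    Assumes a payload like {"timestamp": "<ISO 8601>"}; the real
    schema may differ. `now` must be timezone-aware.
    """
    try:
        payload = json.loads(heartbeat_json)
        last_seen = datetime.fromisoformat(payload["timestamp"])
    except (ValueError, KeyError, TypeError):
        # An unparseable heartbeat is treated the same as a stale one.
        return True
    if last_seen.tzinfo is None:
        # Assumption: naive timestamps are UTC.
        last_seen = last_seen.replace(tzinfo=timezone.utc)
    return now - last_seen > stale_after


def choose_converter(heartbeat_json: str, now: datetime) -> str:
    """Pick the conversion backend based on heartbeat freshness."""
    if heartbeat_is_stale(heartbeat_json, now):
        return "libreoffice"
    return "docserver"
```

Treating a malformed heartbeat as stale is a deliberate choice in this sketch: it errs toward the fallback path rather than queueing work for a worker that may not be running.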