docserver: self-healing Task Scheduler config + docs
Companion to the worker MinIO-retry fix. Makes the worker auto-recover from process death (crash, manual kill, missed boot trigger), not just MinIO outages.

- start_worker.bat: propagate Python's exit code (exit /b %rc%) so Task Scheduler can actually detect a failed run (it previously always exited 0).
- reconfigure_task.ps1 (new): re-registers PW-DocserverWorker with RestartCount=99 / 1-min interval, StartWhenAvailable, and two triggers: AtStartup plus a 5-min repeating trigger with MultipleInstances=IgnoreNew, so a dead worker relaunches within ~5 min and never double-runs. Idempotent.
- install.ps1: same self-healing settings for fresh installs.
- Verified on the box: killed the worker -> task relaunched it; firing again while running stayed at one instance.

Docs updated to match reality:

- docserver/README.md: new 'Reliability / self-healing' section.
- document-generation.md: corrected the stale 'Flask DocServer :5050 / HTTP' description to the actual MinIO outbound-only transport.
- e2e-test-plan.md: removed the outdated 'Word COM fails under SYSTEM / requires RDP after every reboot' limitation; now self-healing under SYSTEM session 0.
- infrastructure.md: fixed VM spec (Win Server 2019, Word 16.0, Python 3.13, SSH port 22422) + self-healing note.
- architecture.md / formation-system.md: trigger + self-healing details.
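
Illustration only, not part of this commit: propagating the exit code from `start_worker.bat` only helps if the Python process itself exits non-zero when it fails. Below is a minimal sketch of that contract. The `run_and_report` wrapper and the worker callable are hypothetical names; the real `docserver_worker.py` entry point may be structured differently.

```python
import logging
from typing import Callable


def run_and_report(worker: Callable[[], None]) -> int:
    """Run worker() and map its outcome to a process exit code.

    Returns 0 on a clean return and 1 on any unhandled exception, so a
    wrapper script can forward the code to the scheduler.
    """
    try:
        worker()
    except Exception:
        # Log the traceback before reporting failure to the caller.
        logging.exception("worker crashed")
        return 1
    return 0
```

An entry point could then end with `sys.exit(run_and_report(main_loop))`, where `main_loop` is a placeholder for the worker's polling loop.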
parent 7929413eeb
commit b48d0cb799
9 changed files with 150 additions and 24 deletions
@@ -194,20 +194,36 @@ DOCX to PDF conversion uses a two-tier approach:
### PRIMARY: Windows DocServer (Microsoft Word COM)

-A Windows server runs a Flask-based DocServer at `:5050` that uses Microsoft Word via COM
-automation for pixel-perfect DOCX → PDF conversion. This produces the highest-fidelity
-output (exact font rendering, correct page breaks, proper table formatting).
+A Windows server runs `docserver_worker.py` that uses Microsoft Word via COM
+automation for pixel-perfect DOCX → PDF conversion. This produces the highest-
+fidelity output (exact font rendering, correct page breaks, proper table
+formatting).

+The transport is **MinIO, not HTTP** — the Windows VM only makes **outbound**
+connections to MinIO, so there are no open inbound ports / SSH tunnels and it
+works behind any NAT:

+```text
+pdf_converter.py (Linux)               MinIO (S3)           docserver_worker.py (Windows)
+PUT docx → to-convert/{id}.docx ─────────► │
+                                           │◄─ poll every 12s ───────┤
+                                           │                         ├─ Word.SaveAs → PDF
+GET pdf  ← converted/{id}.pdf   ◄──────────│◄─ PUT converted/{id}.pdf┘
+DEL docx / DEL pdf (cleanup)
+```

```python
-# pdf_converter.py — primary path
-response = requests.post(
-    f"http://{DOCSERVER_HOST}:5050/convert",
-    files={"file": open(docx_path, "rb")},
-    timeout=60,
-)
-pdf_bytes = response.content
+# pdf_converter.py — primary path (simplified)
+mc.put_object(bucket, f"to-convert/{job_id}.docx", docx_stream, length)
+# ...poll until converted/{job_id}.pdf appears (DOCSERVER_TIMEOUT, default 120s)...
+pdf_bytes = mc.get_object(bucket, f"converted/{job_id}.pdf").read()
```
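
Illustration only, not part of the diff: the elided "poll until the PDF appears" step in the simplified snippet could look roughly like the sketch below. The function name `wait_for_object`, the `fetch` callable, and the 2-second default interval are assumptions; only the 120-second default timeout comes from the comment above. `fetch` stands in for whatever MinIO lookup the real `pdf_converter.py` performs and is expected to return `None` while the object is still missing.

```python
import time
from typing import Callable, Optional


def wait_for_object(
    fetch: Callable[[], Optional[bytes]],
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call fetch() until it returns bytes, or raise TimeoutError.

    clock and sleep are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        data = fetch()
        if data is not None:
            return data
        if clock() >= deadline:
            raise TimeoutError(f"no converted object after {timeout_s}s")
        sleep(interval_s)
```

On `TimeoutError`, a caller can move on to the fallback converter instead of failing the whole job.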

+The Windows worker is **self-healing**: it retries MinIO with backoff instead of
+exiting on a transient outage, and its `PW-DocserverWorker` scheduled task
+restarts on failure plus re-fires every 5 minutes if the process dies. See
+`docserver/README.md` → "Reliability / self-healing".
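
Illustration only, not part of the diff: "retries MinIO with backoff" can be sketched as a capped exponential delay wrapped around any storage call. The names `backoff_delays` and `call_with_retry`, the 1 s / ×2 / 60 s parameters, and the choice of `OSError` as the retryable error class are all assumptions; the worker's actual values and exception types are not shown in this commit.

```python
import time
from typing import Callable, Iterator, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(
    base_s: float = 1.0, factor: float = 2.0, cap_s: float = 60.0
) -> Iterator[float]:
    """Yield an endless sequence of delays: base, base*factor, ... capped at cap_s."""
    delay = base_s
    while True:
        yield min(delay, cap_s)
        delay = min(delay * factor, cap_s)  # keep the value bounded


def call_with_retry(
    op: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
    base_s: float = 1.0,
    factor: float = 2.0,
    cap_s: float = 60.0,
) -> T:
    """Run op() until it succeeds, sleeping with capped backoff between failures.

    Only exceptions listed in retry_on are retried; anything else propagates.
    """
    delays = backoff_delays(base_s, factor, cap_s)
    while True:
        try:
            return op()
        except retry_on:
            sleep(next(delays))
```

Retrying forever keeps the process alive through a transient outage; the scheduled task then only has to cover the case where the process itself dies.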

### FALLBACK: LibreOffice Headless

If DocServer is unavailable (network error, timeout, Windows server down), the converter