Companion to the worker MinIO-retry fix. Makes the worker auto-recover from process death (crash, manual kill, missed boot trigger), not just MinIO outages.

- start_worker.bat: propagate Python's exit code (exit /b %rc%) so Task Scheduler can actually detect a failed run (it previously always exited 0).
- reconfigure_task.ps1 (new): re-registers PW-DocserverWorker with RestartCount=99 / 1-min interval, StartWhenAvailable, and two triggers: AtStartup plus a 5-min repeating trigger with MultipleInstances=IgnoreNew, so a dead worker relaunches within ~5 min and never double-runs. Idempotent.
- install.ps1: same self-healing settings for fresh installs.
- Verified on the box: killed the worker -> task relaunched it; firing again while running stayed at one instance.

Docs updated to match reality:

- docserver/README.md: new 'Reliability / self-healing' section.
- document-generation.md: corrected the stale 'Flask DocServer :5050 / HTTP' description to the actual MinIO outbound-only transport.
- e2e-test-plan.md: removed the outdated 'Word COM fails under SYSTEM / requires RDP after every reboot' limitation; now self-healing under SYSTEM session 0.
- infrastructure.md: fixed VM spec (Win Server 2019, Word 16.0, Python 3.13, SSH port 22422) + self-healing note.
- architecture.md / formation-system.md: trigger + self-healing details.

# Performance West — Document Conversion Worker

Converts DOCX files to pixel-perfect PDFs using Microsoft Word on a Windows VM.
No HTTP server, no open ports, no SSH tunnel needed.

## Architecture

The Windows VM connects **outbound** to MinIO only. No inbound access required.

```
Linux workers container   MinIO (S3)               Windows VM (any NAT)
│                         │                        │
├─ PUT docx ─────────────→│                        │
│  to-convert/{id}.docx   │←─ poll every 3s ───────┤
│                         │   list to-convert/     │
│                         │                        ├─ Word.SaveAs PDF
│                         │←─ PUT pdf ─────────────┤
│                         │   converted/{id}.pdf   │
│                         │←─ DELETE docx ──────────┤
│←─ GET pdf ──────────────┤                        │
│  converted/{id}.pdf     │                        │
└─ DELETE pdf ────────────┤                        │
```
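
The Windows-side half of the diagram can be sketched as a single poll cycle. This is a minimal illustration, not the code in `docserver_worker.py`: a plain `dict` stands in for the MinIO bucket, and `poll_once` and the `convert` callable (standing in for Word's save-as-PDF step) are hypothetical names.

```python
from typing import Callable, Dict


def poll_once(store: Dict[str, bytes], convert: Callable[[bytes], bytes]) -> int:
    """Run one poll cycle and return the number of documents converted.

    `store` is an in-memory stand-in for the bucket (object key -> bytes).
    `convert` is a stand-in for the Word conversion step.
    """
    # Snapshot the pending keys first, because the loop mutates the store.
    pending = sorted(
        key for key in store
        if key.startswith("to-convert/") and key.endswith(".docx")
    )
    for key in pending:
        doc_id = key[len("to-convert/"):-len(".docx")]
        pdf = convert(store[key])
        store[f"converted/{doc_id}.pdf"] = pdf  # PUT pdf
        del store[key]  # DELETE docx only after the PDF is stored
    return len(pending)
```

Deleting the DOCX only after the PDF has been written means a crash mid-conversion leaves the request in `to-convert/`, where the next cycle picks it up again.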

The `pdf_converter.py` on the Linux side uploads the DOCX and polls until
the PDF appears (up to `DOCSERVER_TIMEOUT` seconds, default 120).

If the Windows VM is unavailable or slow, conversion falls back automatically
to LibreOffice headless in the workers container (70-80% fidelity).

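
The upload, poll, and fallback flow described above might look roughly like the sketch below. It is not the actual `pdf_converter.py`: the `dict` stands in for the MinIO bucket, and `convert_via_docserver`, `docx_to_pdf`, the `fallback` callable, and the 1-second client poll interval are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional


def convert_via_docserver(
    store: Dict[str, bytes],
    doc_id: str,
    docx: bytes,
    timeout: float = 120.0,       # mirrors the DOCSERVER_TIMEOUT default
    poll_interval: float = 1.0,   # assumed; the README does not specify it
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[bytes]:
    """Upload a DOCX and wait for the PDF; return None on timeout."""
    docx_key = f"to-convert/{doc_id}.docx"
    pdf_key = f"converted/{doc_id}.pdf"
    store[docx_key] = docx  # PUT docx
    deadline = clock() + timeout
    while clock() < deadline:
        if pdf_key in store:
            return store.pop(pdf_key)  # GET pdf, then DELETE pdf
        sleep(poll_interval)
    # Assumption of this sketch: withdraw the request so a late worker
    # does not convert a document nobody is waiting for.
    store.pop(docx_key, None)
    return None


def docx_to_pdf(
    store: Dict[str, bytes],
    doc_id: str,
    docx: bytes,
    fallback: Callable[[bytes], bytes],
    **kwargs,
) -> bytes:
    """Prefer the Windows worker; use `fallback` (e.g. LibreOffice) on timeout."""
    pdf = convert_via_docserver(store, doc_id, docx, **kwargs)
    return pdf if pdf is not None else fallback(docx)
```

Injecting `sleep` and `clock` keeps the timeout path testable without waiting two real minutes.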
## Windows VM Requirements

- Windows 10/11 Pro or Windows Server 2019/2022
- Microsoft Word (Office 2021+ recommended)
- Python 3.12+ (from python.org; check "Add to PATH" during install)
- Outbound internet access to MinIO (HTTPS, no inbound ports needed)

## Setup

Run `install.ps1` as Administrator in PowerShell on the Windows VM:

```powershell
cd C:\path\to\docserver

.\install.ps1 `
    -MinioEndpoint "minio.performancewest.net" `
    -MinioPort 443 `
    -MinioSecure $true `
    -MinioAccessKey "your_access_key" `
    -MinioSecretKey "your_secret_key"
```

This will:
1. Verify Python and Word are installed
2. Install the `pywin32` and `minio` Python packages
3. Copy `docserver_worker.py` to `C:\docserver\`
4. Write `C:\docserver\docserver.env` with your MinIO credentials
5. Register a Task Scheduler task (`PW-DocserverWorker`) that starts at boot and re-fires every 5 minutes (see "Reliability / self-healing")
6. Start the worker immediately

The task no longer needs a **logged-in user**: it runs under the SYSTEM account
in session 0, so Word COM conversion resumes after a reboot without an RDP login.

## Reliability / self-healing

The worker is designed to recover from outages without manual intervention:

- **MinIO outages don't kill it.** The worker retries the MinIO connection
  indefinitely with capped exponential backoff (5s → 120s) instead of exiting,
  and each poll cycle is wrapped so a transient network error / 502 just
  rebuilds the client and keeps going. (Previously a single 502 made the worker
  `sys.exit(1)`, leaving it dead until a reboot.)
- **Crashes / kills are auto-recovered by Task Scheduler.** The
  `PW-DocserverWorker` task has:
  - `RestartCount=99`, `RestartInterval=1 min`: relaunch if the action fails,
  - **two triggers**: `AtStartup` plus a **repeating trigger every 5 minutes**
    with `MultipleInstances=IgnoreNew`, so if the process ever dies (crash,
    manual kill, or a missed boot trigger) it relaunches within ~5 min and
    never runs more than one instance,
  - `StartWhenAvailable` to catch up on a missed trigger.
- `start_worker.bat` **propagates Python's exit code** (`exit /b %rc%`) so
  Task Scheduler can actually detect a failed run.

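
The capped exponential backoff from the first bullet can be sketched as follows. This is an illustration under stated assumptions rather than the worker's code: the doubling factor is a guess (the README gives only the 5s start and 120s cap), and `backoff_delays` and `connect_forever` are hypothetical names.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(
    initial: float = 5.0, cap: float = 120.0, factor: float = 2.0
) -> Iterator[float]:
    """Yield retry delays that grow by `factor` and never exceed `cap`."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def connect_forever(
    connect: Callable[[], T], sleep: Callable[[float], None] = time.sleep
) -> T:
    """Call `connect` until it succeeds, sleeping between failed attempts."""
    for delay in backoff_delays():
        try:
            return connect()
        except Exception as exc:  # broad on purpose: never exit on an outage
            print(f"connect failed ({exc}); retrying in {delay:.0f}s")
            sleep(delay)
    raise AssertionError("unreachable: backoff_delays() is infinite")
```

With these defaults the delays run 5, 10, 20, 40, 80, then stay at 120 seconds, so a long outage costs at most one attempt every two minutes.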
To re-apply these task settings on an existing install, run as Administrator:

```powershell
powershell -ExecutionPolicy Bypass -File C:\docserver\reconfigure_task.ps1
```

## How to access MinIO externally

The Windows VM needs to reach MinIO. Options:

**A. MinIO exposed externally (simplest)**
Set `MINIO_ENDPOINT=minio.performancewest.net`, `MINIO_PORT=443`, `MINIO_SECURE=true`.
Add a MinIO nginx vhost on the Debian server that proxies port 443 → MinIO port 9000.

**B. VPN / WireGuard**
Connect the Windows VM to the same private network as the Debian server.
Use the internal IP `192.168.x.x:9000` and `MINIO_SECURE=false`.

**C. Cloudflare Tunnel**
Run a cloudflared tunnel on the Debian server and connect from Windows.

## Heartbeat monitoring

The worker writes `minio://{bucket}/docserver-heartbeat.json` every 60 seconds:

```json
{
  "status": "ok",
  "word_version": "16.0",
  "host": "WINVM-01",
  "ts": "2026-04-05T12:00:00+00:00"
}
```

Read this to check if the worker is alive. The `health_check()` function in
`pdf_converter.py` reads it automatically.

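
A liveness check on that heartbeat could be as simple as the sketch below. It is not the `health_check()` in `pdf_converter.py`: the `worker_is_alive` name and the 180-second staleness threshold (three missed 60-second beats) are assumptions made for illustration.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional


def worker_is_alive(
    heartbeat_json: str,
    now: Optional[datetime] = None,
    max_age: timedelta = timedelta(seconds=180),  # assumed threshold
) -> bool:
    """Return True if the heartbeat reports "ok" and is recent enough."""
    try:
        beat = json.loads(heartbeat_json)
        ts = datetime.fromisoformat(beat["ts"])
    except (ValueError, KeyError, TypeError):
        return False  # missing, malformed, or unparseable heartbeat
    if ts.tzinfo is None:
        return False  # a naive timestamp cannot be compared safely
    if now is None:
        now = datetime.now(timezone.utc)
    return beat.get("status") == "ok" and now - ts <= max_age
```

Treating an unreadable heartbeat as "not alive" errs toward the LibreOffice fallback rather than waiting on a worker that may be gone.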
## Manual test

Place a `.docx` file in `minio://{bucket}/to-convert/test.docx` and watch for
`minio://{bucket}/converted/test.pdf` to appear within a few seconds.

Using the MinIO web console (`http://server:9001`) or `mc` CLI:

```bash
mc cp mydoc.docx local/performancewest/to-convert/test.docx
# wait a few seconds...
mc ls local/performancewest/converted/
mc cp local/performancewest/converted/test.pdf ./test.pdf
```

## Logs

Worker logs: `C:\docserver\logs\worker.log`
Task Scheduler log: Event Viewer → Task Scheduler → `PW-DocserverWorker`