# Performance West: Document Conversion Worker

Converts DOCX files to pixel-perfect PDFs using Microsoft Word on a Windows VM.
No HTTP server, no open ports, no SSH tunnel needed.

## Architecture

The Windows VM connects **outbound** to MinIO only. No inbound access required.

```
Linux workers container   MinIO (S3)               Windows VM (any NAT)
│                         │                        │
├─ PUT docx ─────────────→│                        │
│  to-convert/{id}.docx   │←─ poll every 3s ───────┤
│                         │   list to-convert/     │
│                         │                        ├─ Word.SaveAs PDF
│                         │←─ PUT pdf ─────────────┤
│                         │   converted/{id}.pdf   │
│                         │←─ DELETE docx ─────────┤
│←─ GET pdf ──────────────┤                        │
│  converted/{id}.pdf     │                        │
└─ DELETE pdf ────────────┤                        │
```

The `pdf_converter.py` on the Linux side uploads the DOCX and polls until the PDF appears (up to `DOCSERVER_TIMEOUT` seconds, default 120). If the Windows VM is unavailable or slow, conversion falls back automatically to LibreOffice headless in the workers container (70-80% fidelity).

## Windows VM Requirements

- Windows 10/11 Pro or Windows Server 2022
- Microsoft Word (Office 2021+ recommended)
- Python 3.12+ (from python.org; check "Add to PATH")
- Outbound internet access to MinIO (HTTPS, no inbound ports needed)

## Setup

Run `install.ps1` as Administrator in PowerShell on the Windows VM:

```powershell
cd C:\path\to\docserver
.\install.ps1 `
    -MinioEndpoint "minio.performancewest.net" `
    -MinioPort 443 `
    -MinioSecure $true `
    -MinioAccessKey "your_access_key" `
    -MinioSecretKey "your_secret_key"
```

This will:

1. Verify Python and Word are installed
2. Install `pywin32` and `minio` Python packages
3. Copy `docserver_worker.py` to `C:\docserver\`
4. Write `C:\docserver\docserver.env` with your MinIO credentials
5. Register a Task Scheduler task (`PW-DocserverWorker`) that starts at login
6. Start the worker immediately

The worker must run as a **logged-in user**: Word COM requires an interactive Windows session and will fail under a system service account.

## How to access MinIO externally

The Windows VM needs to reach MinIO. Options:

**A. MinIO exposed externally (simplest)**
Set `MINIO_ENDPOINT=minio.performancewest.net`, `MINIO_PORT=443`, `MINIO_SECURE=true`.
Add a MinIO nginx vhost on the Debian server that proxies port 443 → MinIO port 9000.

**B. VPN / WireGuard**
Connect the Windows VM to the same private network as the Debian server.
Use the internal IP `192.168.x.x:9000` and `MINIO_SECURE=false`.

**C. Cloudflare Tunnel**
Run a cloudflared tunnel on the Debian server and connect from Windows.

## Heartbeat monitoring

The worker writes `minio://{bucket}/docserver-heartbeat.json` every 60 seconds:

```json
{
  "status": "ok",
  "word_version": "16.0",
  "host": "WINVM-01",
  "ts": "2026-04-05T12:00:00+00:00"
}
```

Read this to check if the worker is alive. The `health_check()` function in `pdf_converter.py` reads it automatically.

## Manual test

Place a `.docx` file in `minio://{bucket}/to-convert/test.docx` and watch for `minio://{bucket}/converted/test.pdf` to appear within a few seconds.

Using the MinIO web console (`http://server:9001`) or `mc` CLI:

```bash
mc cp mydoc.docx local/performancewest/to-convert/test.docx
# wait a few seconds...
mc ls local/performancewest/converted/
mc cp local/performancewest/converted/test.pdf ./test.pdf
```

## Logs

- Worker logs: `C:\docserver\logs\worker.log`
- Task Scheduler log: Event Viewer → Task Scheduler → `PW-DocserverWorker`
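
## Sketch: upload, poll, and fall back

The Linux-side flow described under Architecture (upload the DOCX, poll for the PDF until `DOCSERVER_TIMEOUT`, otherwise use LibreOffice) might look roughly like the following. This is a minimal sketch, not the actual `pdf_converter.py`: the function names `wait_for_pdf` and `convert`, the one-second polling interval, and the injected `upload`, `fetch_pdf`, and `fallback` callables are all assumptions made for illustration. Storage access is passed in as plain callables so the logic can be exercised without a MinIO client.

```python
import time
from typing import Callable, Optional


def wait_for_pdf(
    fetch_pdf: Callable[[], Optional[bytes]],
    timeout_s: float = 120.0,  # mirrors the documented DOCSERVER_TIMEOUT default
    interval_s: float = 1.0,   # assumed polling interval, not taken from the README
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Call fetch_pdf() until it returns bytes or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        pdf = fetch_pdf()
        if pdf is not None:
            return pdf
        if clock() >= deadline:
            return None  # the Windows worker was unavailable or too slow
        sleep(interval_s)


def convert(
    docx: bytes,
    upload: Callable[[bytes], None],              # hypothetical: PUT to-convert/{id}.docx
    fetch_pdf: Callable[[], Optional[bytes]],     # hypothetical: GET converted/{id}.pdf, None if absent
    fallback: Callable[[bytes], bytes],           # hypothetical: LibreOffice headless conversion
    **poll_kwargs,
) -> tuple[bytes, str]:
    """Return (pdf_bytes, engine), where engine is 'word' or 'libreoffice'."""
    upload(docx)
    pdf = wait_for_pdf(fetch_pdf, **poll_kwargs)
    if pdf is not None:
        return pdf, "word"
    return fallback(docx), "libreoffice"
```

Returning the engine name alongside the bytes lets a caller record which documents were rendered at lower fidelity. Cleanup of `converted/{id}.pdf` after a successful download, shown in the diagram, is left out of the sketch.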
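
## Sketch: judging heartbeat freshness

Because the heartbeat is rewritten every 60 seconds, a reader can treat the worker as down once the `ts` field is too old. The check below is one possible way to do that and is not the real `health_check()`, whose internals are not described here. The function name `heartbeat_is_fresh` and the 180-second threshold (three missed beats) are assumptions; the input is the raw content of `docserver-heartbeat.json`, however it was downloaded.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed threshold: three missed 60-second heartbeats.
MAX_AGE = timedelta(seconds=180)


def heartbeat_is_fresh(
    raw: bytes,
    now: Optional[datetime] = None,
    max_age: timedelta = MAX_AGE,
) -> bool:
    """Return True if the heartbeat reports status 'ok' and is recent enough."""
    now = now or datetime.now(timezone.utc)
    try:
        beat = json.loads(raw)
        ts = datetime.fromisoformat(beat["ts"])
    except (ValueError, KeyError, TypeError):
        return False  # malformed JSON, missing field, or unparseable timestamp
    if ts.tzinfo is None:
        return False  # refuse to guess the timezone of a naive timestamp
    # Timestamps slightly in the future (clock skew) still count as fresh.
    return beat.get("status") == "ok" and (now - ts) <= max_age
```

A stale or missing heartbeat could be used to skip the polling wait entirely and go straight to the LibreOffice fallback, though whether the real converter does this is not stated.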