Initial commit — Performance West telecom compliance platform
Includes: API (Express/TypeScript), Astro site, Python workers, document generators, FCC compliance tools, Canada CRTC formation, Ansible infrastructure, and deployment scripts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
commit f8cd37ac8c
1823 changed files with 145167 additions and 0 deletions
docserver/README.md
Normal file, 112 lines
@@ -0,0 +1,112 @@
# Performance West — Document Conversion Worker

Converts DOCX files to pixel-perfect PDFs using Microsoft Word on a Windows VM.
No HTTP server, no open ports, no SSH tunnel needed.

## Architecture

The Windows VM connects **outbound** to MinIO only. No inbound access required.

```
Linux workers container   MinIO (S3)                Windows VM (any NAT)
│                         │                         │
├─ PUT docx ─────────────→│                         │
│  to-convert/{id}.docx   │←─ poll every 3s ────────┤
│                         │   list to-convert/      │
│                         │                         ├─ Word.SaveAs PDF
│                         │←─ PUT pdf ──────────────┤
│                         │   converted/{id}.pdf    │
│                         │←─ DELETE docx ──────────┤
│←─ GET pdf ──────────────┤                         │
│   converted/{id}.pdf    │                         │
└─ DELETE pdf ────────────┤                         │
```

The `pdf_converter.py` on the Linux side uploads the DOCX and polls until
the PDF appears (up to `DOCSERVER_TIMEOUT` seconds, default 120).

If the Windows VM is unavailable or slow, conversion falls back automatically
to LibreOffice headless in the workers container (70-80% fidelity).
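
The Linux-side flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `pdf_converter.py`: the function name `convert_via_docserver`, the `fallback` callable, the one-second poll interval, and the decision to withdraw the job after a timeout are all hypothetical. The `client` is assumed to expose `fput_object`, `stat_object`, `fget_object`, and `remove_object` the way the `minio` Python client does.

```python
import time
from typing import Callable


def convert_via_docserver(
    client,                      # assumed: a minio.Minio-like object
    bucket: str,
    job_id: str,
    docx_path: str,
    pdf_path: str,
    fallback: Callable[[str, str], None],  # hypothetical LibreOffice fallback
    timeout: float = 120.0,      # mirrors the DOCSERVER_TIMEOUT default
    poll_interval: float = 1.0,  # assumed value, not taken from the source
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Upload a DOCX, wait for the PDF, and fall back if it never appears.

    Returns "word" when the Windows worker produced the PDF, else "fallback".
    """
    in_key = f"to-convert/{job_id}.docx"
    out_key = f"converted/{job_id}.pdf"
    client.fput_object(bucket, in_key, docx_path)

    deadline = clock() + timeout
    while clock() < deadline:
        try:
            client.stat_object(bucket, out_key)  # raises while the PDF is missing
        except Exception:
            sleep(poll_interval)
            continue
        client.fget_object(bucket, out_key, pdf_path)
        client.remove_object(bucket, out_key)
        return "word"

    # Timed out. Assumption: withdraw the job so the worker does not convert
    # it later, then convert locally instead.
    try:
        client.remove_object(bucket, in_key)
    except Exception:
        pass
    fallback(docx_path, pdf_path)
    return "fallback"
```

Injecting `sleep` and `clock` keeps the loop testable without a real MinIO server or real waiting; production code would simply use the defaults.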

## Windows VM Requirements

- Windows 10/11 Pro or Windows Server 2022
- Microsoft Word (Office 2021+ recommended)
- Python 3.12+ (from python.org; check "Add to PATH")
- Outbound internet access to MinIO (HTTPS, no inbound ports needed)

## Setup

Run `install.ps1` as Administrator in PowerShell on the Windows VM:

```powershell
cd C:\path\to\docserver

.\install.ps1 `
    -MinioEndpoint "minio.performancewest.net" `
    -MinioPort 443 `
    -MinioSecure $true `
    -MinioAccessKey "your_access_key" `
    -MinioSecretKey "your_secret_key"
```

This will:
1. Verify Python and Word are installed
2. Install `pywin32` and `minio` Python packages
3. Copy `docserver_worker.py` to `C:\docserver\`
4. Write `C:\docserver\docserver.env` with your MinIO credentials
5. Register a Task Scheduler task (`PW-DocserverWorker`) that starts at system boot under your account (the installer prompts for your password)
6. Start the worker immediately

The worker must run as a **logged-in user**: Word COM requires an interactive
Windows session and will fail under a system service account unless DCOM is
reconfigured (see `fix_dcom.bat`).
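
The `docserver.env` file written in step 4 is a plain `KEY=VALUE` list that a batch wrapper loads before starting the worker. For local debugging, the same file could be read in Python along these lines. `parse_env_file` is a hypothetical helper, not part of the shipped worker, and it only approximates the wrapper's rules (split on the first `=`, skip blank lines and lines starting with `#`).

```python
from pathlib import Path


def parse_env_file(path: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    env: dict[str, str] = {}
    # utf-8-sig tolerates a byte-order mark if the file was saved with one.
    for raw in Path(path).read_text(encoding="utf-8-sig").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip():
            # Everything after the first '=' belongs to the value.
            env[key.strip()] = value.strip()
    return env
```

Splitting on the first `=` only matters for secrets that themselves contain `=`, which is common for generated keys.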

## How to access MinIO externally

The Windows VM needs to reach MinIO. Options:

**A. MinIO exposed externally (simplest)**
Set `MINIO_ENDPOINT=minio.performancewest.net`, `MINIO_PORT=443`, `MINIO_SECURE=true`.
Add a MinIO nginx vhost on the Debian server that proxies port 443 → MinIO port 9000.

**B. VPN / WireGuard**
Connect the Windows VM to the same private network as the Debian server.
Set `MINIO_ENDPOINT` to the internal IP (`192.168.x.x`), `MINIO_PORT=9000`, and `MINIO_SECURE=false`.

**C. Cloudflare Tunnel**
Run a cloudflared tunnel on the Debian server and connect from Windows.
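
Whichever option is used, the worker ends up with three settings: `MINIO_ENDPOINT`, `MINIO_PORT`, and `MINIO_SECURE`. The sketch below shows one way they might be combined into the `host:port` string and TLS flag a MinIO client needs. `resolve_minio_target` is a hypothetical helper, and rejecting an endpoint that already contains a port is an added safeguard for illustration, not documented worker behaviour.

```python
from typing import Mapping


def resolve_minio_target(env: Mapping[str, str]) -> tuple[str, bool]:
    """Return ("host:port", secure) from MINIO_* settings."""
    host = env.get("MINIO_ENDPOINT", "").strip()
    if not host:
        raise ValueError("MINIO_ENDPOINT is required")
    if ":" in host:
        # Assumed convention: the port lives in MINIO_PORT, not in the host.
        raise ValueError("put the port in MINIO_PORT, not MINIO_ENDPOINT")
    port = int(env.get("MINIO_PORT", "9000"))
    secure = env.get("MINIO_SECURE", "false").strip().lower() == "true"
    return f"{host}:{port}", secure
```

For option A this would yield `("minio.performancewest.net:443", True)`; for option B, the internal IP with port 9000 and TLS off.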

## Heartbeat monitoring

The worker writes `minio://{bucket}/docserver-heartbeat.json` every 60 seconds:

```json
{
  "status": "ok",
  "word_version": "16.0",
  "host": "WINVM-01",
  "ts": "2026-04-05T12:00:00+00:00"
}
```

Read this to check if the worker is alive. The `health_check()` function in
`pdf_converter.py` reads it automatically.
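
A freshness check on that heartbeat might look like the sketch below. It is not the real `health_check()`; the name `heartbeat_is_fresh` and the 180-second tolerance (three missed beats) are assumptions chosen for illustration.

```python
import json
from datetime import datetime, timezone


def heartbeat_is_fresh(raw: bytes, now: datetime, max_age_s: float = 180.0) -> bool:
    """Return True if the heartbeat says "ok" and is at most max_age_s old.

    `now` must be a timezone-aware datetime.
    """
    try:
        beat = json.loads(raw)
        ts = datetime.fromisoformat(beat["ts"])
    except (ValueError, KeyError, TypeError):
        return False  # unreadable or incomplete heartbeat counts as stale
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    age = (now - ts).total_seconds()
    return beat.get("status") == "ok" and age <= max_age_s
```

A caller would fetch the object bytes from MinIO and pass `datetime.now(timezone.utc)` as `now`; treating a missing or malformed heartbeat as stale lets the LibreOffice fallback kick in early.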

## Manual test

Place a `.docx` file in `minio://{bucket}/to-convert/test.docx` and watch for
`minio://{bucket}/converted/test.pdf` to appear within a few seconds.

Use the MinIO web console (`http://server:9001`) or the `mc` CLI:

```bash
mc cp mydoc.docx local/performancewest/to-convert/test.docx
# wait a few seconds...
mc ls local/performancewest/converted/
mc cp local/performancewest/converted/test.pdf ./test.pdf
```

## Logs

Worker logs: `C:\docserver\logs\worker.log`
Task Scheduler log: Event Viewer → Task Scheduler → `PW-DocserverWorker`
docserver/docserver_worker.py
Normal file, 373 lines
@@ -0,0 +1,373 @@
r"""
Performance West -- Document Conversion Worker (Windows)

Polls a MinIO bucket for DOCX files, converts them to PDF using
Microsoft Word COM automation, and drops the PDF back into MinIO.

No HTTP server, no open ports, no SSH tunnel required.
The Windows VM only needs outbound HTTPS access to MinIO.

Protocol
--------
Input: minio://{bucket}/to-convert/{job_id}.docx
Output: minio://{bucket}/converted/{job_id}.pdf
Cleanup: deletes the input DOCX after successful conversion

The Linux pdf_converter.py polls converted/ until the PDF appears
(up to DOCSERVER_TIMEOUT seconds), then downloads and removes it.

Heartbeat
---------
Every 60 seconds this worker writes a tiny heartbeat object:
minio://{bucket}/docserver-heartbeat.json
Content: {"status":"ok","word_version":"...","ts":"...","host":"..."}
The health_check() in pdf_converter.py reads this to detect if the
worker is alive without needing a network round-trip to the VM.

Setup
-----
1. Copy this file + requirements.txt to C:\docserver\ on the Windows VM
2. pip install -r C:\docserver\requirements.txt
3. Set the MinIO env vars (see docserver.env or pass via Task Scheduler)
4. Run: python docserver_worker.py
Or let install.ps1 register it as a Task Scheduler task

Environment variables
---------------------
MINIO_ENDPOINT -- MinIO host name or IP without the port (e.g. minio.performancewest.net)
MINIO_PORT -- MinIO port, appended to MINIO_ENDPOINT (default 9000)
MINIO_ACCESS_KEY -- access key
MINIO_SECRET_KEY -- secret key
MINIO_BUCKET -- bucket (default: performancewest)
MINIO_SECURE -- true/false (default: false for internal; true for external)
POLL_INTERVAL -- seconds between polls (default: 12)
HEARTBEAT_INTERVAL -- seconds between heartbeats (default: 60)
"""

from __future__ import annotations

import json
import logging
import os
import platform
import shutil
import socket
import sys
import tempfile
import threading
import time
import uuid
from datetime import datetime, timezone
from pathlib import Path

# Create the log directory before the FileHandler below opens worker.log;
# otherwise importing this module fails when the directory does not exist yet.
_LOG_DIR = os.getenv("LOG_DIR", r"C:\docserver\logs")
os.makedirs(_LOG_DIR, exist_ok=True)

LOG = logging.getLogger("docserver_worker")
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    handlers=[
        logging.StreamHandler(sys.stdout),
        logging.FileHandler(
            os.path.join(_LOG_DIR, "worker.log"),
            encoding="utf-8",
        ),
    ],
)

# ── Configuration ─────────────────────────────────────────────────────────────

_ENDPOINT = os.getenv("MINIO_ENDPOINT", "minio.performancewest.net")
_PORT = int(os.getenv("MINIO_PORT", "9000"))
_ACCESS = os.getenv("MINIO_ACCESS_KEY", "")
_SECRET = os.getenv("MINIO_SECRET_KEY", "")
_BUCKET = os.getenv("MINIO_BUCKET", "performancewest")
_SECURE = os.getenv("MINIO_SECURE", "false").lower() == "true"

_PREFIX_IN = "to-convert"  # input: DOCX files from Linux
_PREFIX_OUT = "converted"  # output: PDF files for Linux to pick up

_POLL_INTERVAL = int(os.getenv("POLL_INTERVAL", "12"))
_HEARTBEAT_INTERVAL = int(os.getenv("HEARTBEAT_INTERVAL", "60"))

# Word COM constants
_WD_FORMAT_PDF = 17
_WD_DO_NOT_SAVE_CHANGES = 0

# ── Word COM singleton ────────────────────────────────────────────────────────

_word_app = None
_word_lock = threading.Lock()


def _get_word():
    """Return the Word COM application, creating it if necessary.

    Retries up to 3 times with increasing delays to handle DCOM startup latency
    when running under SYSTEM via Task Scheduler (Session 0 + DCOM RunAs).
    """
    global _word_app
    if _word_app is not None:
        try:
            _ = _word_app.Visible  # probe — raises if Word died
            return _word_app
        except Exception:
            LOG.warning("Word COM instance died — restarting...")
            _word_app = None

    import win32com.client  # type: ignore
    import pythoncom  # type: ignore

    max_retries = 3
    for attempt in range(1, max_retries + 1):
        try:
            pythoncom.CoInitialize()
            _word_app = win32com.client.DispatchEx("Word.Application")
            if _word_app is None:
                raise RuntimeError("DispatchEx returned None")
            _word_app.Visible = False
            _word_app.DisplayAlerts = False
            _word_app.AutomationSecurity = 3  # msoAutomationSecurityForceDisable
            LOG.info("Word COM started — version %s", _word_app.Version)
            return _word_app
        except Exception as e:
            LOG.warning("Word COM init attempt %d/%d failed: %s", attempt, max_retries, e)
            _word_app = None
            if attempt < max_retries:
                delay = attempt * 10  # 10s, 20s
                LOG.info(" Retrying in %ds...", delay)
                time.sleep(delay)
            else:
                LOG.error("Word COM failed after %d attempts. Is DCOM configured? "
                          "Run fix_dcom.bat as Administrator.", max_retries)
                raise


def _quit_word():
    global _word_app
    if _word_app:
        try:
            _word_app.Quit()
        except Exception:
            pass
        _word_app = None


def _convert_docx_to_pdf(docx_path: Path, pdf_path: Path) -> bool:
    """Convert one DOCX to PDF via Word COM. Serialised by _word_lock."""
    with _word_lock:
        word = _get_word()
        doc = None
        try:
            doc = word.Documents.Open(
                str(docx_path.resolve()),
                ReadOnly=True,
                AddToRecentFiles=False,
                Visible=False,
            )
            doc.SaveAs2(str(pdf_path.resolve()), FileFormat=_WD_FORMAT_PDF)
            size = pdf_path.stat().st_size if pdf_path.exists() else 0
            LOG.info("Converted: %s → %s (%d bytes)", docx_path.name, pdf_path.name, size)
            return pdf_path.exists() and size > 0
        except Exception as exc:
            LOG.error("Conversion failed for %s: %s", docx_path.name, exc)
            return False
        finally:
            if doc:
                try:
                    doc.Close(SaveChanges=_WD_DO_NOT_SAVE_CHANGES)
                except Exception:
                    pass


# ── MinIO helpers ─────────────────────────────────────────────────────────────

def _mc():
    from minio import Minio  # type: ignore
    return Minio(
        f"{_ENDPOINT}:{_PORT}",
        access_key=_ACCESS,
        secret_key=_SECRET,
        secure=_SECURE,
    )


def _ensure_bucket(mc) -> None:
    if not mc.bucket_exists(_BUCKET):
        mc.make_bucket(_BUCKET)
        LOG.info("Created bucket: %s", _BUCKET)


def _list_pending(mc) -> list[str]:
    """Return object names under to-convert/ that end in .docx.

    Ignores .tmp_ prefixed files — those are still being uploaded atomically
    by the Linux side and are not ready for processing yet.
    """
    try:
        objects = mc.list_objects(_BUCKET, prefix=f"{_PREFIX_IN}/", recursive=False)
        return [
            obj.object_name
            for obj in objects
            if obj.object_name.endswith(".docx")
            and "/.tmp_" not in obj.object_name
        ]
    except Exception as exc:
        LOG.error("Failed to list pending jobs: %s", exc)
        return []


# ── Main processing loop ──────────────────────────────────────────────────────

def _process_one(mc, in_key: str) -> None:
    """Download one DOCX from MinIO, convert, upload the PDF, delete the DOCX."""
    job_id = Path(in_key).stem  # e.g. "abc123"
    out_key = f"{_PREFIX_OUT}/{job_id}.pdf"

    # Skip if the PDF is already there (duplicate poll before delete completed)
    try:
        mc.stat_object(_BUCKET, out_key)
        LOG.info("Job %s already converted — skipping", job_id[:8])
        return
    except Exception:
        pass  # expected — PDF doesn't exist yet

    work_dir = Path(tempfile.mkdtemp(prefix=f"docserver_{job_id[:8]}_"))
    docx_path = work_dir / f"{job_id}.docx"
    pdf_path = work_dir / f"{job_id}.pdf"

    try:
        # 1. Download DOCX
        LOG.info("[%s] Downloading %s", job_id[:8], in_key)
        mc.fget_object(_BUCKET, in_key, str(docx_path))

        # 2. Convert
        LOG.info("[%s] Converting via Word...", job_id[:8])
        t0 = time.monotonic()
        success = _convert_docx_to_pdf(docx_path, pdf_path)
        elapsed = time.monotonic() - t0

        if not success:
            LOG.error("[%s] Conversion failed — leaving DOCX in to-convert/ for retry", job_id[:8])
            return

        LOG.info("[%s] Converted in %.1fs", job_id[:8], elapsed)

        # 3. Upload PDF to converted/ — atomic via tmp + rename
        # Upload to .tmp_ first, then server-side copy to final key.
        # Linux side polls stat_object(out_key) — it won't see the .tmp_.
        from minio.commonconfig import CopySource  # type: ignore
        tmp_out = f"{_PREFIX_OUT}/.tmp_{job_id}.pdf"
        mc.fput_object(
            _BUCKET, tmp_out, str(pdf_path),
            content_type="application/pdf",
            metadata={
                "x-amz-meta-job-id": job_id,
                "x-amz-meta-elapsed": f"{elapsed:.1f}s",
            },
        )
        mc.copy_object(_BUCKET, out_key, CopySource(_BUCKET, tmp_out))
        mc.remove_object(_BUCKET, tmp_out)
        LOG.info("[%s] Uploaded PDF → minio://%s/%s (atomic)", job_id[:8], _BUCKET, out_key)

        # 4. Delete the input DOCX so it doesn't get processed again
        mc.remove_object(_BUCKET, in_key)
        LOG.info("[%s] Removed input DOCX from to-convert/", job_id[:8])

    except Exception as exc:
        LOG.error("[%s] Unexpected error processing %s: %s", job_id[:8], in_key, exc)
    finally:
        shutil.rmtree(work_dir, ignore_errors=True)


def _heartbeat_loop(word_version: str) -> None:
    """Write a heartbeat object to MinIO every HEARTBEAT_INTERVAL seconds."""
    import io  # local import, matching the other lazy imports in this module

    mc = _mc()
    hostname = socket.gethostname()
    while True:
        try:
            payload = json.dumps({
                "status": "ok",
                "word_version": word_version,
                "host": hostname,
                "ts": datetime.now(timezone.utc).isoformat(),
            }).encode()
            mc.put_object(
                _BUCKET,
                "docserver-heartbeat.json",
                io.BytesIO(payload),
                length=len(payload),
                content_type="application/json",
            )
        except Exception as exc:
            LOG.warning("Heartbeat write failed: %s", exc)
        time.sleep(_HEARTBEAT_INTERVAL)


def main() -> None:
    LOG.info("Performance West Document Conversion Worker starting...")
    LOG.info(" Python: %s", sys.version.split()[0])
    LOG.info(" Platform: %s", platform.platform())
    LOG.info(" MinIO: %s:%d / bucket=%s", _ENDPOINT, _PORT, _BUCKET)

    # Log session/user info for debugging COM issues
    try:
        import getpass
        LOG.info(" User: %s", getpass.getuser())
        import ctypes
        session_id = ctypes.windll.kernel32.WTSGetActiveConsoleSessionId()
        LOG.info(" Session: %d (console)", session_id)
    except Exception:
        pass

    if not _ACCESS or not _SECRET:
        LOG.error("MINIO_ACCESS_KEY / MINIO_SECRET_KEY not set -- cannot start")
        sys.exit(1)

    # Verify Word is available before accepting work
    LOG.info("Initialising Word COM...")
    try:
        with _word_lock:
            word = _get_word()
            word_version = word.Version
        LOG.info("Word %s ready", word_version)
    except Exception as exc:
        LOG.error("Word COM failed to initialise: %s", exc)
        LOG.error("Fix: run fix_dcom.bat as Administrator, then reboot.")
        LOG.error("Or RDP in to create an interactive session, then the AtLogOn task will fire.")
        sys.exit(1)

    # Verify MinIO connectivity
    LOG.info("Connecting to MinIO...")
    try:
        mc = _mc()
        _ensure_bucket(mc)
        LOG.info("MinIO connected — bucket '%s' ready", _BUCKET)
    except Exception as exc:
        LOG.error("MinIO connection failed: %s", exc)
        sys.exit(1)

    # Start heartbeat background thread
    hb = threading.Thread(target=_heartbeat_loop, args=(word_version,), daemon=True)
    hb.start()
    LOG.info("Heartbeat thread started (interval=%ds)", _HEARTBEAT_INTERVAL)

    LOG.info("Polling to-convert/ every %ds — waiting for jobs...", _POLL_INTERVAL)

    try:
        while True:
            pending = _list_pending(mc)
            if pending:
                LOG.info("Found %d pending job(s)", len(pending))
                for key in pending:
                    _process_one(mc, key)
            time.sleep(_POLL_INTERVAL)
    except KeyboardInterrupt:
        LOG.info("Shutting down...")
    finally:
        _quit_word()
        LOG.info("Worker stopped.")


if __name__ == "__main__":
    # Ensure log directory exists
    log_dir = Path(os.getenv("LOG_DIR", r"C:\docserver\logs"))
    log_dir.mkdir(parents=True, exist_ok=True)
    main()
docserver/fix_dcom.bat
Normal file, 53 lines
@@ -0,0 +1,53 @@
@echo off
REM ============================================================
REM Fix Word COM for Session 0 (services/Task Scheduler)
REM
REM Problem: Word COM fails with 'NoneType' when run from
REM Task Scheduler "Run whether user is logged on or not"
REM because Session 0 has no interactive desktop.
REM
REM Solution: Configure DCOM to launch Word under the 'justin'
REM user context regardless of which session requests it.
REM This is the standard fix for Office COM automation from
REM Windows Services and Task Scheduler.
REM
REM Run this script ONCE as Administrator.
REM ============================================================

echo.
echo [1/4] Creating Desktop folders for SYSTEM and SysWOW64...
mkdir "C:\Windows\System32\config\systemprofile\Desktop" 2>nul
mkdir "C:\Windows\SysWOW64\config\systemprofile\Desktop" 2>nul
echo Done.

echo.
echo [2/4] Configuring Word DCOM to run as 'justin'...
REM Word 16.0 (Office 365) CLSID: {000209FF-0000-0000-C000-000000000046}
REM This sets the "RunAs" identity for the Word COM server
REM The account password is read from the RUNAS_PASSWORD environment variable
REM (a name chosen here) so that no credential is stored in this script.
if not defined RUNAS_PASSWORD (
    echo ERROR: set RUNAS_PASSWORD before running this script.
    exit /b 1
)
reg add "HKLM\SOFTWARE\Classes\AppID\{000209FF-0000-0000-C000-000000000046}" /v RunAs /t REG_SZ /d ".\justin" /f
reg add "HKLM\SOFTWARE\Classes\AppID\{000209FF-0000-0000-C000-000000000046}" /v RunAsPassword /t REG_SZ /d "%RUNAS_PASSWORD%" /f
echo Done.

echo.
echo [3/4] Setting DCOM default launch/access permissions...
REM Grant SYSTEM and justin full DCOM access
REM (This uses dcomcnfg equivalent registry settings)
REM The default permissions are usually sufficient, but we ensure
REM the AppID is registered for Word
reg add "HKLM\SOFTWARE\Classes\AppID\WINWORD.EXE" /v AppID /t REG_SZ /d "{000209FF-0000-0000-C000-000000000046}" /f
echo Done.

echo.
echo [4/4] Recreating Task Scheduler task as ONSTART/SYSTEM...
schtasks /delete /tn PW-DocserverWorker /f 2>nul
schtasks /create /tn PW-DocserverWorker /tr "cmd.exe /c C:\docserver\start_worker.bat" /sc ONSTART /ru SYSTEM /rl HIGHEST /f
echo Done.

echo.
echo ============================================================
echo DCOM fix applied. Word COM should now work from Session 0.
echo The task will run as SYSTEM at startup, but Word will launch
echo under the 'justin' user context via DCOM RunAs configuration.
echo.
echo Reboot to test: shutdown /r /t 5
echo ============================================================
docserver/install.ps1
Normal file, 227 lines
@@ -0,0 +1,227 @@
# Performance West Document Conversion Worker — Windows Installation
# Run as Administrator in PowerShell on the Windows VM
#
# What this does:
# 1. Verifies Python + Microsoft Word are installed
# 2. Installs Python dependencies (pywin32, minio)
# 3. Copies docserver_worker.py + config to C:\docserver\
# 4. Creates a Task Scheduler task that starts the worker at system boot
#    (runs as the installing user, "Run whether user is logged on or not"
#    — Word COM works in session 0 on Server 2019 with desktop interaction)
#
# Prerequisites:
# - Windows 10/11 Pro or Windows Server 2022
# - Microsoft Word installed (only Word needed, not full Office)
# - Python 3.12+ (python.org — check "Add to PATH")
# - Outbound HTTPS to minio.performancewest.net (or wherever MinIO lives)
# - No inbound ports required — the VM connects OUT to MinIO only
#
# Usage:
# .\install.ps1 -MinioEndpoint "minio.performancewest.net" `
#     -MinioPort 443 `
#     -MinioSecure $true `
#     -MinioAccessKey "your_access_key" `
#     -MinioSecretKey "your_secret_key"

param(
    [Parameter(Mandatory=$true)]
    [string]$MinioEndpoint,

    [int] $MinioPort = 9000,
    [bool] $MinioSecure = $false,
    [Parameter(Mandatory=$true)]
    [string]$MinioAccessKey,
    [Parameter(Mandatory=$true)]
    [string]$MinioSecretKey,
    [string]$MinioBucket = "performancewest",
    [int] $PollInterval = 3,
    [string]$AppDir = "C:\docserver"
)

$ErrorActionPreference = "Stop"

Write-Host ""
Write-Host "=== Performance West Document Conversion Worker Setup ===" -ForegroundColor Cyan
Write-Host ""

# ── 0. Admin check ────────────────────────────────────────────────────────────
if (-NOT ([Security.Principal.WindowsPrincipal][Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole]"Administrator")) {
    Write-Host "ERROR: Run this script as Administrator!" -ForegroundColor Red
    exit 1
}

# ── 1. Python check ───────────────────────────────────────────────────────────
Write-Host "Checking Python..." -ForegroundColor Yellow
$python = Get-Command python -ErrorAction SilentlyContinue
if (-not $python) {
    Write-Host "ERROR: Python not found." -ForegroundColor Red
    Write-Host " Download from https://python.org/downloads" -ForegroundColor Red
    Write-Host " Install with 'Add Python to PATH' checked, then re-run." -ForegroundColor Red
    exit 1
}
$pyVersion = python --version 2>&1
Write-Host " Found: $pyVersion" -ForegroundColor Green

# ── 2. Word check ─────────────────────────────────────────────────────────────
Write-Host "Checking Microsoft Word..." -ForegroundColor Yellow
try {
    $word = New-Object -ComObject Word.Application
    $wordVersion = $word.Version
    $word.Quit()
    [System.Runtime.Interopservices.Marshal]::ReleaseComObject($word) | Out-Null
    Write-Host " Found: Microsoft Word $wordVersion" -ForegroundColor Green
} catch {
    Write-Host "ERROR: Microsoft Word not found or COM registration broken." -ForegroundColor Red
    Write-Host " Install Microsoft Word and retry." -ForegroundColor Red
    exit 1
}

# ── 3. Create application directory ──────────────────────────────────────────
Write-Host "Creating $AppDir..." -ForegroundColor Yellow
New-Item -ItemType Directory -Path $AppDir -Force | Out-Null
New-Item -ItemType Directory -Path "$AppDir\logs" -Force | Out-Null
New-Item -ItemType Directory -Path "$AppDir\temp" -Force | Out-Null
Copy-Item -Path "$PSScriptRoot\docserver_worker.py" -Destination $AppDir -Force
Copy-Item -Path "$PSScriptRoot\requirements.txt" -Destination $AppDir -Force
Write-Host " Files copied." -ForegroundColor Green

# ── 4. Install Python dependencies ───────────────────────────────────────────
Write-Host "Installing Python dependencies..." -ForegroundColor Yellow
python -m pip install --upgrade pip --quiet
python -m pip install -r "$AppDir\requirements.txt" --quiet
# pywin32 post-install COM registration
$pyPrefix = python -c "import sys; print(sys.prefix)"
$postInstall = "$pyPrefix\Scripts\pywin32_postinstall.py"
if (Test-Path $postInstall) {
    python $postInstall -install 2>$null
    Write-Host " pywin32 COM registration done." -ForegroundColor Green
}
Write-Host " Dependencies installed." -ForegroundColor Green

# ── 5. Write environment config ──────────────────────────────────────────────
$envContent = @"
MINIO_ENDPOINT=$MinioEndpoint
MINIO_PORT=$MinioPort
MINIO_SECURE=$($MinioSecure.ToString().ToLower())
MINIO_ACCESS_KEY=$MinioAccessKey
MINIO_SECRET_KEY=$MinioSecretKey
MINIO_BUCKET=$MinioBucket
POLL_INTERVAL=$PollInterval
LOG_DIR=$AppDir\logs
TEMP_DIR=$AppDir\temp
"@
# ASCII avoids the UTF-8 byte-order mark that Windows PowerShell 5.1 writes with
# -Encoding UTF8; cmd's "for /f" would read the BOM as part of the first key.
Set-Content -Path "$AppDir\docserver.env" -Value $envContent -Encoding ASCII
Write-Host " Config written to $AppDir\docserver.env" -ForegroundColor Green

# ── 6. Write startup batch script ────────────────────────────────────────────
# Reads docserver.env, sets env vars, then starts the worker.
# The loader relies on delayed expansion. The here-string is single-quoted, so
# the paths inside it stay fixed at C:\docserver regardless of -AppDir.
$startScriptFull = @'
@echo off
setlocal enabledelayedexpansion
cd /d C:\docserver

echo [%date% %time%] Starting Performance West Docserver Worker...

for /f "usebackq tokens=1,* delims==" %%a in ("C:\docserver\docserver.env") do (
    set "ln=%%a"
    if not "!ln:~0,1!"=="#" (
        if not "%%a"=="" set "%%a=%%b"
    )
)

python C:\docserver\docserver_worker.py
echo [%date% %time%] Worker exited with code %errorlevel%.
endlocal
'@
Set-Content -Path "$AppDir\start_worker.bat" -Value $startScriptFull -Encoding ASCII

# ── 7. Register Task Scheduler task ──────────────────────────────────────────
Write-Host "Registering Task Scheduler task..." -ForegroundColor Yellow

$taskName = "PW-DocserverWorker"
$currentUser = [System.Security.Principal.WindowsIdentity]::GetCurrent().Name

Unregister-ScheduledTask -TaskName $taskName -Confirm:$false -ErrorAction SilentlyContinue

$action = New-ScheduledTaskAction `
    -Execute "cmd.exe" `
    -Argument "/c `"$AppDir\start_worker.bat`"" `
    -WorkingDirectory $AppDir

$trigger = New-ScheduledTaskTrigger -AtStartup

$settings = New-ScheduledTaskSettingsSet `
    -ExecutionTimeLimit (New-TimeSpan -Hours 0) `
    -RestartCount 10 `
    -RestartInterval (New-TimeSpan -Minutes 1) `
    -StartWhenAvailable `
    -MultipleInstances IgnoreNew `
    -AllowStartIfOnBatteries `
    -DontStopIfGoingOnBatteries

# Register as the current user with "Run whether user is logged on or not"
# This allows the task to start at boot without requiring an interactive login.
# Word COM works in session 0 on Windows Server 2019.
$password = Read-Host -Prompt "Enter password for $currentUser (needed for 'Run whether logged on or not')" -AsSecureString
$plainPassword = [System.Runtime.InteropServices.Marshal]::PtrToStringAuto(
    [System.Runtime.InteropServices.Marshal]::SecureStringToBSTR($password)
)

Register-ScheduledTask `
    -TaskName $taskName `
    -Action $action `
    -Trigger $trigger `
    -Settings $settings `
    -User $currentUser `
    -Password $plainPassword `
    -RunLevel Highest `
    -Description "Performance West DOCX-to-PDF worker (MinIO + Word COM)" | Out-Null

Write-Host " Task '$taskName' registered (runs at boot, restarts on failure)." -ForegroundColor Green

# ── 8. Start the task now ─────────────────────────────────────────────────────
Write-Host "Starting worker task..." -ForegroundColor Yellow
Start-ScheduledTask -TaskName $taskName
Start-Sleep -Seconds 5

# ── 9. Verify ────────────────────────────────────────────────────────────────
$taskInfo = Get-ScheduledTask -TaskName $taskName
Write-Host ""
Write-Host "═══════════════════════════════════════════════════════════" -ForegroundColor Cyan
Write-Host " Setup Complete" -ForegroundColor Green
Write-Host "═══════════════════════════════════════════════════════════" -ForegroundColor Cyan
Write-Host ""
Write-Host " Task state: $($taskInfo.State)"
Write-Host " App dir: $AppDir"
Write-Host " Logs: $AppDir\logs\worker.log"
Write-Host " Config: $AppDir\docserver.env"
Write-Host ""
Write-Host " MinIO endpoint: $MinioEndpoint`:$MinioPort"
Write-Host " MinIO bucket: $MinioBucket"
Write-Host " Poll interval: ${PollInterval}s"
Write-Host ""
Write-Host " The worker polls minio://$MinioBucket/to-convert/" -ForegroundColor White
Write-Host " Converted PDFs appear in minio://$MinioBucket/converted/" -ForegroundColor White
Write-Host ""
Write-Host " To verify manually, place a .docx in to-convert/ and watch" -ForegroundColor White
Write-Host " converted/ for the resulting .pdf (should appear within a few seconds)." -ForegroundColor White
Write-Host ""
Write-Host "═══════════════════════════════════════════════════════════" -ForegroundColor Cyan
docserver/requirements.txt
Normal file, 5 lines
@@ -0,0 +1,5 @@
# Performance West — Windows Docserver Worker
# pip install -r requirements.txt

pywin32>=306 # Word COM automation
minio>=7.2.0 # MinIO S3 client