Service Workers, Stale Subs, and a Local Translation Stack

A recurring-breakage session: every bot or service we touched had something that "every update seems to break." We patched the symptoms in each case, then went up one level and tried to ship the architecture or the tests that should keep them from breaking again.

Service Worker Auth for FreeChat

Image rendering, image download, and same-origin unfurl thumbnails were all broken, with one root cause behind all three. The uploads auth middleware (added under a recent security pass) requires a Bearer header on /uploads/files/*, but <img src> and <a href> loads can't attach one, and bare fetch() calls (no headers option) don't send one. Three options on the table: query-string tokens (token leaks to access logs and Referer headers, rejected), short-lived signed URLs (more backend code, doesn't fix encrypted-blob fetches), or a browser Service Worker. We shipped the Service Worker, the same pattern Slack and Discord web use. The SW intercepts same-origin /uploads/... requests and attaches the Bearer token from in-memory state, which the page broadcasts via postMessage on every login and refresh. E2E encryption is preserved: the SW only adds an auth header and never decrypts payloads. 113 lines of SW plus 84 lines of registration glue; 129 tests pass, with three previously skipped integration tests now flipped on.

Three adjacent issues that surfaced during the investigation got bundled into the same PR: Content-Disposition forcing image download instead of inline render, the preview cache poisoning failed-URL lookups for 24 hours, and a missing dev dep. The dev dep got reverted, because adding it triggered an 80-package vitest minor-version cascade we didn't want to land in a fix PR.
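For readers who haven't written one, the header-attaching part of a Service Worker like this can be sketched in a few functions. This is a minimal illustration, not the shipped worker: the message shape ({ type: "auth-token", token }), the function names, and the FetchEventLike type are all assumptions made so the logic can be read (and unit-tested) outside a browser.

```typescript
// Sketch only: names, message shape, and path prefix are assumptions.
const UPLOADS_PREFIX = "/uploads/";

// Token lives in memory only. It is lost when the worker restarts and is
// re-sent by the page on the next login or refresh broadcast.
let bearerToken: string | null = null;

function currentToken(): string | null {
  return bearerToken;
}

/** Handle a postMessage payload from the page (assumed shape: { type, token }). */
function handleTokenMessage(data: unknown): void {
  if (typeof data !== "object" || data === null) return;
  const msg = data as { type?: unknown; token?: unknown };
  if (msg.type !== "auth-token") return;
  bearerToken = typeof msg.token === "string" && msg.token !== "" ? msg.token : null;
}

/** Only same-origin requests under the uploads prefix get the header. */
function shouldAttachAuth(requestUrl: string, workerOrigin: string): boolean {
  const url = new URL(requestUrl);
  return url.origin === workerOrigin && url.pathname.startsWith(UPLOADS_PREFIX);
}

/** Copy the request headers and add Authorization. The body is never read. */
function withAuth(headers: Headers, token: string): Headers {
  const copy = new Headers(headers);
  copy.set("Authorization", `Bearer ${token}`);
  return copy;
}

// Minimal structural type so the sketch compiles without the webworker lib.
interface FetchEventLike {
  request: { url: string; method: string; headers: Headers };
  respondWith(response: Promise<Response>): void;
}

/** Fetch handler. Returning without respondWith lets the browser proceed normally. */
function onFetch(event: FetchEventLike, workerOrigin: string): void {
  const token = currentToken();
  const { url, method, headers } = event.request;
  if (token === null || method !== "GET" || !shouldAttachAuth(url, workerOrigin)) return;
  // A fresh fetch by URL avoids the "no-cors" mode of <img> requests,
  // which would silently drop a custom Authorization header.
  event.respondWith(fetch(url, { headers: withAuth(headers, token) }));
}
```

Inside a real worker these would be wired up with self.addEventListener("message", (e) => handleTokenMessage(e.data)) and self.addEventListener("fetch", (e) => onFetch(e, self.location.origin)). Keeping the decision logic in plain functions is one way to make it testable without a browser.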

Translation Triage Across Two Bots

VibeBot4000 was triple-translating every chat message: duplicate EventSub subscriptions, accumulated from reconnect loops without cleanup, were firing the same event multiple times. While we were in there, it was also treating "Indian food = great turmeric input" as Hindi and routing it to translation. The franc-min trigram detector is statistically weak on short text, and the AI confirm step followed the geographic anchor straight to a confident wrong answer. Phase 1 patches added a 3-word minimum after stripping punctuation, an 80% English-word-ratio short-circuit, an ACTION filter so emote messages stop getting routed, and a real-language allowlist of about 90 names so "Hindi" stops being a valid output for English text dotted with country names. The same patches landed on Jefebot. The VibeBot EventSub lifecycle got a separate review and fix: a graceful shutdown handler, stale-subscription cleanup on startup via the Helix listing endpoint, and a reconnect-transfer flag so we don't double-subscribe across welcome/reconnect state transitions.
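The Phase 1 gates are cheap checks that run before any detector or model call. A rough sketch of how they might compose is below. The word list and allowlist here are toy stand-ins, the function names are hypothetical, and the ACTION check assumes /me messages arrive in IRC CTCP form, which may not match how the bots actually receive them.

```typescript
// Toy word list for illustration; a real one would hold a few thousand common English words.
const COMMON_ENGLISH = new Set([
  "the", "a", "is", "and", "this", "that",
  "food", "great", "input", "indian", "turmeric", "stream",
]);

// Stand-in for the allowlist of about 90 language names.
const LANGUAGE_ALLOWLIST = new Set([
  "english", "spanish", "french", "german", "portuguese", "hindi", "japanese",
]);

type GateResult =
  | { translate: true }
  | { translate: false; reason: "action" | "too-short" | "mostly-english" };

const MIN_WORDS = 3;
const ENGLISH_RATIO_CUTOFF = 0.8;

/** Decide whether a chat message should reach language detection at all. */
function gateMessage(raw: string): GateResult {
  // Assumption: /me messages arrive as "\u0001ACTION ...\u0001".
  if (raw.startsWith("\u0001ACTION")) return { translate: false, reason: "action" };

  // Strip ASCII punctuation only, so non-Latin scripts survive for the detector.
  const words = raw
    .toLowerCase()
    .replace(/[^\w\s\u0080-\uFFFF]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0);

  if (words.length < MIN_WORDS) return { translate: false, reason: "too-short" };

  const englishCount = words.filter((w) => COMMON_ENGLISH.has(w)).length;
  if (englishCount / words.length >= ENGLISH_RATIO_CUTOFF) {
    return { translate: false, reason: "mostly-english" };
  }
  return { translate: true };
}

/** Reject detector or model output that is not a known language name. */
function isAllowedLanguage(name: string): boolean {
  return LANGUAGE_ALLOWLIST.has(name.trim().toLowerCase());
}
```

With this toy list, the "Indian food = great turmeric input" message is stopped by the English-ratio gate before any detector sees it, and a label such as "Indian" would fail the allowlist check. Ordering the gates cheapest-first keeps the common case (ordinary English chat) off the model path entirely.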

A Local Translation Service in Scaffolding

The deeper fix for the over-aggressive translate is a real translation service, not regex tuning. We evaluated NLLB-200, Madlad-400, and M2M100, along with the runtimes that could serve them on existing infrastructure. vLLM doesn't support encoder-decoder translation as of 2026 (RFC #7366) and llama.cpp is decoder-only, so both were off the table. Picked NLLB-200-distilled-600M served via CTranslate2 4.7.1 on the RTX PRO 6000, with lingua-py for detection (much stronger than franc on short text). Scaffolded jefe-translate as a FastAPI service on port 8001 with /translate, /detect, /languages, and /health. Bound to localhost behind Traefik. Not deployed yet: the model conversion sits in a benchmarks dir waiting to land at /opt/jefe/models/translate/ct2-nllb-600M, and Phase 2 will wire both bots to call this service instead of the existing OpenAI fallback path. Local LLMs first, OpenAI as fallback or consumer-tier opt-in.
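The "local first, OpenAI as fallback" policy could be expressed on the bot side as a small wrapper around two interchangeable translators. The sketch below is hypothetical: the Translator type, the 2-second default timeout, and the JSON request and response fields for /translate are assumptions, since the service's actual contract isn't shown here.

```typescript
// Any function that turns text into a translation for a target language.
type Translator = (text: string, targetLang: string) => Promise<string>;

/** Reject if the work does not settle within ms milliseconds. */
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

/** Try the local service first; use the hosted path only if it fails or stalls. */
function localFirst(local: Translator, hosted: Translator, timeoutMs = 2000): Translator {
  return async (text, targetLang) => {
    try {
      return await withTimeout(local(text, targetLang), timeoutMs);
    } catch {
      // Local service down, slow, or erroring: fall back to the hosted path.
      return hosted(text, targetLang);
    }
  };
}

/** Client for the scaffolded service. Request and response shapes are assumed. */
function jefeTranslateClient(baseUrl: string): Translator {
  return async (text, targetLang) => {
    const res = await fetch(`${baseUrl}/translate`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text, target: targetLang }), // assumed field names
    });
    if (!res.ok) throw new Error(`jefe-translate returned ${res.status}`);
    const body = (await res.json()) as { translation?: unknown }; // assumed field name
    if (typeof body.translation !== "string") throw new Error("unexpected response shape");
    return body.translation;
  };
}
```

A bot would then build one translator at startup, for example localFirst(jefeTranslateClient("http://127.0.0.1:8001"), openAiTranslator), where openAiTranslator stands for whatever the existing fallback path exposes. NLLB models identify languages with FLORES-200 codes such as eng_Latn, so the target argument would likely carry one of those.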

Infrastructure Cleanup in the Background

The Prometheus config on Titan and in git had drifted apart over weeks of independent edits. Backported Titan's network-topology header into the git copy, adopted the container-name targets (cadvisor, jefehome, db/redis exporters), and kept the disabled scrape jobs commented out per project preference. JefeHQ picked up two new epics: a Vault-driven rotation engine and a dynamic-secrets evaluation. The Option A pilot is the per-service rotation skill, with a --leak mode that never writes the previous value (storing the password you're rotating precisely because it leaked rather defeats the purpose) and a --routine mode that purges the previous key immediately on health-check success. The D&D / WOTC trademark scrub on jefeworks.com got a final site-wide pass after the first one missed five literals in the fallback response handler. Plus two small Jefebot fixes: the !filmtonight year parser now strips parens before matching, so "(2023)" parses as 2023, and the misleading twitchChatConnected heartbeat field got dropped because it was measuring the unused IRC fallback while EventSub was actually working.
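The difference between the two rotation modes comes down to what happens to the old value. The sketch below is one reading of that description, not the pilot's implementation: the SecretStore interface, the function name, and the choice to leave the previous value in place when a routine rotation fails its health check are all assumptions.

```typescript
type RotationMode = "leak" | "routine";

// Hypothetical storage interface; the real skill would talk to Vault.
interface SecretStore {
  writeCurrent(value: string): void;
  writePrevious(value: string): void;
  purgePrevious(): void;
}

/** Rotate a secret and report whether the service was healthy afterwards. */
async function rotateSecret(
  mode: RotationMode,
  oldValue: string,
  newValue: string,
  store: SecretStore,
  healthCheck: () => Promise<boolean>,
): Promise<boolean> {
  if (mode === "routine") {
    // Keep the old value only long enough to roll back if the new one is rejected.
    store.writePrevious(oldValue);
  }
  // In leak mode the old value is compromised, so it is never written anywhere.
  store.writeCurrent(newValue);

  const healthy = await healthCheck();
  if (healthy && mode === "routine") {
    store.purgePrevious();
  }
  return healthy;
}
```

Under these assumptions a routine rotation that fails its health check leaves the previous value available for rollback, while a leak rotation has nothing to roll back to by design.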

What's Next

Pick a name for jefe-translate and push to GitHub / Forgejo, copy the converted CT2 model into place, deploy on Titan
Phase 2: wire VibeBot and Jefebot to call jefe-translate instead of the OpenAI fallback
Pilot the Vault-driven rotation skill on a low-blast-radius service
Land @vitest/coverage-v8 in a focused PR with the full vitest minor bump cascade resolved
Phase 3: evaluate Madlad-400-3B for the commercial license path (Apache-2.0 vs NLLB's CC-BY-NC)