#11745: ui: add server-side TTS for web chat via gateway endpoint

by wjlgatech open 2026-02-08 07:42

Labels: docs, app: web-ui, gateway, stale

## Summary

- Add a `POST /api/tts/synthesize` gateway endpoint that uses the existing Edge TTS infrastructure (Microsoft neural voices; free, no API key needed) to synthesize speech server-side
- Update the web chat UI to try server-side TTS first via `fetch` → blob → `HTMLAudioElement`, falling back to the browser Speech API on any error
- Pass the gateway URL and token from app settings to the chat TTS module so it can reach the endpoint

## Test plan

- [ ] Start the gateway, open web chat, enable TTS (speaker button), send a message, and verify that audio plays with a natural Edge TTS voice
- [ ] Check the network tab for `POST /api/tts/synthesize` returning `audio/mpeg`
- [ ] Stop the gateway and verify that the browser Speech API fallback kicks in
- [ ] Verify that unauthenticated requests to `/api/tts/synthesize` return 401

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds a new authenticated gateway endpoint (`POST /api/tts/synthesize`) that synthesizes speech server-side using the existing Edge TTS infrastructure, returning an `audio/mpeg` payload. The web chat UI is updated to prefer this server-side TTS path (fetch → Blob → `HTMLAudioElement`) and to fall back to the browser Speech API on any error. The UI wiring also passes the gateway base URL and token from app settings into the chat TTS module, and includes minor supporting UI/CSS/icon updates.

This fits into the existing architecture by extending the gateway’s HTTP server routing with a TTS handler module, and by keeping client-side speech as a fallback for environments where the gateway is unavailable or blocked.

<h3>Confidence Score: 3/5</h3>

- This PR is close, but needs fixes in the new TTS streaming and browser playback path before it’s safe to merge.
- The gateway endpoint streams audio without handling read-stream errors, which can hang responses or throw unhandled errors. The endpoint also hardcodes `audio/mpeg`, which can be incorrect if the synthesizer returns a non-MP3 format. On the UI side, server-TTS playback creates blob URLs without revoking them, causing leaks during long or active chat sessions.
- `src/gateway/tts-http.ts`, `ui/src/ui/views/chat.ts`

**Context used:**

- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))
- Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13))
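
For readers who want to picture the "server first, browser fallback" flow described in the summary, here is a minimal sketch. It is not the PR's code: the `speak` function, the `SpeakDeps` shape, the JSON request body, and the `Bearer` auth header are all assumptions made for illustration. Only the `POST /api/tts/synthesize` path comes from the PR description.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface TtsResponseLike {
  ok: boolean;
  status: number;
  blob(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<TtsResponseLike>;

// Hypothetical dependency bundle; none of these names come from the PR.
interface SpeakDeps {
  gatewayUrl: string; // gateway base URL from app settings
  token: string; // gateway token from app settings
  fetchImpl: FetchLike; // e.g. the browser's fetch
  playAudio: (audio: unknown) => Promise<void>; // plays the returned blob
  speakInBrowser: (text: string) => void; // browser Speech API fallback
}

// Try the gateway first; on ANY failure, fall back to browser speech.
async function speak(text: string, deps: SpeakDeps): Promise<"server" | "browser"> {
  try {
    const res = await deps.fetchImpl(`${deps.gatewayUrl}/api/tts/synthesize`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${deps.token}`, // assumed auth scheme
      },
      body: JSON.stringify({ text }), // assumed request shape
    });
    if (!res.ok) {
      throw new Error(`TTS request failed with status ${res.status}`);
    }
    await deps.playAudio(await res.blob());
    return "server";
  } catch {
    // Network error, non-2xx status (including 401), or playback failure.
    deps.speakInBrowser(text);
    return "browser";
  }
}
```

Injecting `fetchImpl`, `playAudio`, and `speakInBrowser` keeps the decision logic testable without a browser; in the real UI these would presumably wrap `fetch`, `HTMLAudioElement`, and `speechSynthesis`.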
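
The blob URL leak flagged in the review could be addressed along the following lines. This is a sketch under assumptions: `playBlobOnce`, `ObjectUrlApi`, and the `playUrl` callback are hypothetical names, and the actual code in `ui/src/ui/views/chat.ts` may be structured differently. The idea is simply that every created object URL is revoked once playback finishes or fails.

```typescript
// Structural stand-in for the browser's URL.createObjectURL / revokeObjectURL.
interface ObjectUrlApi {
  createObjectURL(blob: unknown): string;
  revokeObjectURL(url: string): void;
}

// Create an object URL for the blob, play it, and always revoke the URL
// afterwards, whether playback completed or threw.
async function playBlobOnce(
  blob: unknown,
  urls: ObjectUrlApi,
  playUrl: (url: string) => Promise<void>, // hypothetical: resolves when playback ends
): Promise<void> {
  const url = urls.createObjectURL(blob);
  try {
    await playUrl(url);
  } finally {
    urls.revokeObjectURL(url);
  }
}
```

In a browser, `urls` would be a thin wrapper over the global `URL`, and `playUrl` would create an audio element for the URL and resolve when its `ended` event fires. The `finally` block matters because a rejected `play()` call would otherwise skip the cleanup.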
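
On the hardcoded `audio/mpeg` concern, one possible approach is to derive the `Content-Type` from the synthesized file's extension. The sketch below assumes the synthesizer exposes an output file path, which the PR text does not confirm; `contentTypeForAudioPath` and the set of extensions are illustrative, not taken from `src/gateway/tts-http.ts`.

```typescript
// Extension-to-MIME table for a few common audio containers (illustrative).
const AUDIO_CONTENT_TYPES = new Map<string, string>([
  ["mp3", "audio/mpeg"],
  ["ogg", "audio/ogg"],
  ["opus", "audio/ogg"], // .opus files are Opus audio in an Ogg container
  ["wav", "audio/wav"],
  ["webm", "audio/webm"],
]);

// Pick a Content-Type from the file extension, falling back to a generic
// binary type instead of mislabeling unknown formats as MP3.
function contentTypeForAudioPath(filePath: string): string {
  const dot = filePath.lastIndexOf(".");
  const ext = dot >= 0 ? filePath.slice(dot + 1).toLowerCase() : "";
  return AUDIO_CONTENT_TYPES.get(ext) ?? "application/octet-stream";
}
```

A `Map` is used rather than a plain object so that an extension such as `constructor` cannot accidentally match an inherited property.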
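
The unhandled read-stream error could be handled with Node's promise-based `pipeline`, sketched below. `streamAudio` and its `onError` callback are hypothetical names; the sketch assumes the handler has a readable audio source and a writable HTTP response, which is what "streams audio" in the review suggests but does not spell out.

```typescript
import { Readable, Writable } from "node:stream";
import { pipeline } from "node:stream/promises";

// Pipe the audio source into the response and report failures instead of
// letting an 'error' event go unhandled. Returns true on a clean finish.
async function streamAudio(
  source: Readable,
  res: Writable,
  onError: (err: unknown) => void, // hypothetical hook, e.g. for logging
): Promise<boolean> {
  try {
    // pipeline() forwards the data and, if either side fails, destroys
    // both streams and rejects, so the response cannot hang open.
    await pipeline(source, res);
    return true;
  } catch (err) {
    onError(err);
    return false;
  }
}
```

Because `pipeline` destroys the destination on failure, a handler that wants to send a proper error status would need to validate the source (for example, that the file exists) before writing headers and starting the stream.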