#11704: feat(tts): OpenAI TTS baseUrl support for local servers (Chatterbox, Coqui, LocalAI)
size: M
Cluster: Voice Call and TTS Improvements
## Summary
- Cherry-pick TTS `baseUrl` commit from #9736 (`00719bc`, thanks @divol89) and fix bugs identified in review
- Make local OpenAI-compatible TTS servers (Chatterbox, Coqui, LocalAI) work end-to-end without an OpenAI API key
- Fix media parser rejecting TTS tool `/tmp/` audio paths, which blocked voice delivery on all channels
Closes #9709
Ref: #9736
## What PR #9736 already did (commit `00719bc`)
- Added `baseUrl?: string` to `TtsConfig.openai` in `types.tts.ts`
- Added `baseUrl` to Zod schema in `zod-schema.core.ts`
- Resolved `baseUrl` in `resolveTtsConfig()` in `tts.ts`
- Passed `baseUrl` to `openaiTTS()` function
- Used config `baseUrl` with fallback to `getOpenAITtsBaseUrl()` (env var)
## Bugs fixed on top of the cherry-pick
1. **Missing type field** — `baseUrl` was not added to `ResolvedTtsConfig` type, causing type errors
2. **No URL normalization** — config `baseUrl` with trailing slash produced double-slash URLs (`http://host:4123/v1//audio/speech`). Now stripped via `.replace(/\/+$/, "")`
3. **`isCustomOpenAIEndpoint()` was env-only** — only checked `OPENAI_TTS_BASE_URL` env var, not the new config field. Model/voice validation wouldn't relax for config-based custom URLs. Now accepts optional `configBaseUrl` param, threaded through `isValidOpenAIModel()` and `isValidOpenAIVoice()`
4. **API key still required** — `resolveTtsApiKey()` returned `undefined` for openai without key, so the provider was skipped. Now returns `"local"` sentinel when `baseUrl` is set. `openaiTTS()` omits the `Authorization` header for the sentinel
5. **`isTtsProviderConfigured()` required API key** — openai with custom `baseUrl` but no key showed as unconfigured. Now treats openai as configured when `baseUrl` is set
6. **`baseUrl` not passed in `textToSpeech()` main path** — only passed in `textToSpeechTelephony()`. Added to the main provider loop
7. **Media parser rejected TTS audio paths** — `isValidMedia()` only accepted `./` relative paths and `https://` URLs. The TTS tool writes audio to `/tmp/tts-*/voice-*` and returns `MEDIA:/tmp/...`, which was rejected and sent as raw text instead of a voice attachment. Now allows `/tmp/` paths (no traversal). Other absolute paths remain blocked for LFI safety
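Fixes 2 and 4 can be illustrated with a minimal sketch. The helper names below (`normalizeBaseUrl`, `speechEndpoint`, `buildHeaders`) are hypothetical and are not taken from `tts.ts`; only the `.replace(/\/+$/, "")` normalization, the `"local"` sentinel, and the omitted `Authorization` header come from the description above.

```typescript
// Hypothetical helpers for illustration; the PR's actual code in tts.ts may differ.
const LOCAL_KEY_SENTINEL = "local";

/** Strip trailing slashes so appending "/audio/speech" never yields "//". */
function normalizeBaseUrl(baseUrl: string): string {
  return baseUrl.trim().replace(/\/+$/, "");
}

/** Build the speech endpoint from a possibly slash-terminated base URL. */
function speechEndpoint(baseUrl: string): string {
  return `${normalizeBaseUrl(baseUrl)}/audio/speech`;
}

/** Omit the Authorization header when the key is the local sentinel. */
function buildHeaders(apiKey: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey !== LOCAL_KEY_SENTINEL) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  return headers;
}
```

With this shape, `http://host:4123/v1/` and `http://host:4123/v1` resolve to the same endpoint, and a local server never receives a placeholder bearer token.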
## Tests added
### TTS tests (13 new in `tts.test.ts`)
- `resolveTtsConfig` resolves and trims `openai.baseUrl`
- `isCustomOpenAIEndpoint` returns true for config baseUrl and env var
- `resolveTtsApiKey` returns `"local"` sentinel when custom baseUrl set with no key; prefers real keys; returns undefined without either
- `isTtsProviderConfigured` returns true for openai with baseUrl, and false when neither an API key nor a baseUrl is set
- `isValidOpenAIModel` and `isValidOpenAIVoice` relax validation when config baseUrl is set
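The behaviors these tests cover can be sketched roughly as follows. This is an illustration under assumptions, not the code in `tts.ts`: the function names, the `OPENAI_API_KEY` lookup, the default base URL, and the short voice list are all stand-ins, and the environment is passed in as a plain object so the sketch stays self-contained.

```typescript
// Hypothetical sketch; names and env handling are assumptions, not the PR's code.
type Env = Record<string, string | undefined>;

interface OpenAiTtsSettings {
  apiKey?: string;
  baseUrl?: string;
}

const DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"; // assumed default
const LOCAL_KEY_SENTINEL = "local";
// Illustrative subset of hosted voices, not the full list.
const KNOWN_VOICES = ["alloy", "echo", "nova"];

function clean(url: string | undefined): string {
  return (url ?? "").trim().replace(/\/+$/, "");
}

/** Custom when either the config field or the env var points away from the default. */
function isCustomEndpoint(env: Env, configBaseUrl?: string): boolean {
  const url = clean(configBaseUrl) || clean(env["OPENAI_TTS_BASE_URL"]);
  return url !== "" && url !== DEFAULT_OPENAI_BASE_URL;
}

/** Real keys win; the sentinel is only used when a baseUrl is set without a key. */
function resolveApiKey(settings: OpenAiTtsSettings, env: Env): string | undefined {
  const realKey = settings.apiKey || env["OPENAI_API_KEY"];
  if (realKey) return realKey;
  return clean(settings.baseUrl) ? LOCAL_KEY_SENTINEL : undefined;
}

function isConfigured(settings: OpenAiTtsSettings, env: Env): boolean {
  return resolveApiKey(settings, env) !== undefined;
}

/** Any voice name is accepted for custom endpoints; otherwise check the known list. */
function isValidVoice(voice: string, env: Env, configBaseUrl?: string): boolean {
  if (isCustomEndpoint(env, configBaseUrl)) return true;
  return KNOWN_VOICES.indexOf(voice) !== -1;
}
```

Routing "configured" through the same key resolution keeps the two checks from drifting apart, which is the class of mismatch bugs 4 and 5 describe.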
### Media parser tests (2 new in `parse.test.ts`)
- Accepts `/tmp/` paths from internal tools (e.g. TTS)
- Rejects `/tmp/` paths with directory traversal
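The two cases above amount to a path check along these lines. This is a sketch under assumptions rather than the actual `isValidMedia()` implementation: the function name is hypothetical, and applying the traversal check to `./` paths as well is an assumption the description does not confirm.

```typescript
// Hypothetical sketch of the allow-list; not the real isValidMedia() from the media parser.
function isAllowedMediaSource(candidate: string): boolean {
  const source = candidate.trim();

  // Remote media: only https:// URLs are accepted.
  if (/^https:\/\//.test(source)) return true;

  // Reject any ".." path segment so an allowed prefix cannot be escaped.
  const hasTraversal = source.split("/").indexOf("..") !== -1;

  const isRelative = /^\.\//.test(source); // "./" relative paths
  const isTmp = /^\/tmp\//.test(source);   // "/tmp/" paths written by internal tools such as TTS

  // Every other absolute path stays blocked to avoid local file inclusion.
  return (isRelative || isTmp) && !hasTraversal;
}
```

Checking whole path segments rather than searching for the substring `..` avoids rejecting file names that merely contain two consecutive dots.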
## Usage
```yaml
messages:
  tts:
    provider: openai
    openai:
      baseUrl: "http://localhost:8880/v1"
      model: "chatterbox"
      voice: "default"
```
No `apiKey` is needed when `baseUrl` points at a local server.
## Validation
- `pnpm build` — passes
- `pnpm check` — lint/format/types pass
- `pnpm test -- --run src/tts/tts.test.ts` — 48 tests pass (13 new)
- `pnpm test -- --run src/media/parse.test.ts` — 11 tests pass (2 new)
## Test plan
- [ ] `pnpm build && pnpm check` passes
- [ ] `pnpm test` passes (all TTS and media parser tests)
- [ ] Manual: set `messages.tts.openai.baseUrl` to a local Chatterbox instance, verify voice generation works without an OpenAI API key
- [ ] Verify Telegram voice notes arrive as voice bubbles (opus format)
🤖 Generated with [Claude Code](https://claude.com/claude-code)