#16572: Feat/tts piper language routing

by akalypse open 2026-02-14 21:49

stale size: M

Cluster: Text-to-Speech Provider Enhancements

## Summary

- Problem: Piper core support lacked language-aware routing for voice and endpoint selection.
- Why it matters: Multilingual setups need automatic mapping to language-specific voices/services without manual provider switching.
- What changed: Added optional language detection and mapping (`voiceByLang`, `baseUrlByLang`) for Piper, plus tests and dependency updates.
- What did NOT change (scope boundary): Core Piper provider integration stays in PR2; no docker/infra compose changes are included.

## Change Type (select all)

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [ ] Gateway / orchestration
- [x] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [x] Integrations
- [x] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Closes #
- Related #
- Depends on: #16569 (feat/tts-piper-provider-core)

## User-visible / Behavior Changes

- Added optional Piper config:
  - `messages.tts.piper.voiceByLang`
  - `messages.tts.piper.baseUrlByLang`
- TTS can now auto-select Piper voice and/or endpoint based on detected text language.
- Existing single-endpoint/single-voice behavior remains unchanged when mappings are not provided.

## Security Impact (required)

- New permissions/capabilities? (No)
- Secrets/tokens handling changed? (No)
- New/changed network calls? (No) (same Piper HTTP call path as PR2; only selection logic changed)
- Command/tool execution surface changed? (No)
- Data access scope changed? (No)
- If any Yes, explain risk + mitigation:

## Repro + Verification

### Environment

- OS: macOS
- Runtime/container: local dev
- Model/provider: TTS with Piper selected
- Integration/channel (if any): TTS runtime path
- Relevant config (redacted):
  - `messages.tts.provider: "piper"`
  - `messages.tts.piper.voiceByLang: { "fr": "...", "de": "..." }`
  - `messages.tts.piper.baseUrlByLang: { "fr": "...", "de": "..." }`

### Steps

1. Configure Piper with `voiceByLang` and/or `baseUrlByLang`.
2. Send multilingual TTS text (e.g., French/German/Hindi).
3. Verify selected voice/base URL follows mapping rules.
4. Run targeted TTS tests.

### Expected

- Language detection drives mapped Piper voice/base URL where configured.
- Fallback to default voice/baseUrl when no mapping exists.

### Actual

- Matches expected after this change.

## Evidence

Attach at least one:

- [x] Failing test/log before + passing after
- [ ] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)

## Human Verification (required)

- Verified scenarios:
  - Added tests in `src/tts/tts.test.ts` for language detection + routing behavior.
  - Ran `pnpm test -- src/tts/tts.test.ts` (passing).
- Edge cases checked:
  - Mapping fallback when no language match.
  - Override voice interaction with endpoint mapping path.
- What you did not verify:
  - Full real-world language quality/accuracy across all languages/channels.

## Compatibility / Migration

- Backward compatible? (Yes)
- Config/env changes? (Optional)
- Migration needed? (No)
- If yes, exact upgrade steps:
  - Optional: define `voiceByLang`/`baseUrlByLang` under `messages.tts.piper`.

## Failure Recovery (if this breaks)

- How to disable/revert this change quickly:
  - Remove `voiceByLang`/`baseUrlByLang` from config, or revert this commit.
- Files/config to restore:
  - `src/tts/tts.ts`
  - `src/tts/tts.test.ts`
  - `src/config/types.tts.ts`
  - `src/config/zod-schema.core.ts`
  - `package.json`
  - `pnpm-lock.yaml`
- Known bad symptoms reviewers should watch for:
  - Wrong voice/endpoint chosen for multilingual text.
  - Unexpected fallback behavior despite mappings.

## Risks and Mitigations

- Risk: Language detection may misclassify short/ambiguous text.
  - Mitigation: conservative detection thresholds and fallback to default voice/base URL.
- Risk: Additional dependency (`franc-min`) increases maintenance surface.
  - Mitigation: small, focused dependency; behavior covered with targeted tests.
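
For reviewers, the routing rule stated under Expected (a configured mapping wins, otherwise the default voice or base URL applies) can be sketched roughly as below. This is an illustration only, not the code in `src/tts/tts.ts`: the `PiperRoutingConfig` shape, the `pick` and `resolvePiperTarget` helpers, and every voice name and URL are assumptions made for the example. Only `voiceByLang` and `baseUrlByLang` come from this PR.

```typescript
// Hypothetical config shape. Only voiceByLang and baseUrlByLang are named in
// the PR; the `voice` and `baseUrl` default fields are assumed for this sketch.
interface PiperRoutingConfig {
  voice: string;
  baseUrl: string;
  voiceByLang?: Record<string, string>;
  baseUrlByLang?: Record<string, string>;
}

interface PiperTarget {
  voice: string;
  baseUrl: string;
}

// Own-property lookup, so inherited keys such as "constructor" never match.
function pick(
  map: Record<string, string> | undefined,
  key: string | undefined,
): string | undefined {
  if (!map || key === undefined) return undefined;
  return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : undefined;
}

// Voice and endpoint are resolved independently: each falls back to its own
// default when the detected language has no entry or nothing was detected.
function resolvePiperTarget(config: PiperRoutingConfig, lang?: string): PiperTarget {
  const key = lang?.toLowerCase();
  return {
    voice: pick(config.voiceByLang, key) ?? config.voice,
    baseUrl: pick(config.baseUrlByLang, key) ?? config.baseUrl,
  };
}

// Placeholder values; substitute real voice names and endpoints.
const exampleConfig: PiperRoutingConfig = {
  voice: "default-voice",
  baseUrl: "http://localhost:5000",
  voiceByLang: { fr: "french-voice", de: "german-voice" },
  baseUrlByLang: { fr: "http://piper-fr.example:5000" },
};

console.log(resolvePiperTarget(exampleConfig, "fr"));
console.log(resolvePiperTarget(exampleConfig, "de"));
console.log(resolvePiperTarget(exampleConfig, "hi"));
```

In this sketch, French resolves to both a mapped voice and a mapped endpoint, German gets a mapped voice on the default endpoint, and Hindi falls back to the defaults for both because it has no entry in either map.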
<h3>Greptile Summary</h3>

This PR adds language-aware voice and endpoint routing for the Piper TTS provider. It introduces the `franc-min` dependency for language detection, adds `voiceByLang` and `baseUrlByLang` config maps, and wires Piper into the existing TTS provider infrastructure (config schema, provider ordering, commands, telephony exclusion).

- **Temp directory leak**: The Piper block in `textToSpeech` creates a temp directory before the async `piperTTS()` call but has no cleanup on failure, unlike the edge provider, which explicitly removes the temp dir in its catch blocks.
- **Incomplete ISO 639-3 → ISO 639-1 mapping**: `detectLanguage` only maps 4 languages (en, de, fr, hi). For any other language, `franc` returns a 3-letter code that won't match user-configured 2-letter keys in `voiceByLang`/`baseUrlByLang`, causing silent fallback to defaults.
- **Piper always appears as "configured"**: The hardcoded default `baseUrl` means `isTtsProviderConfigured` always returns `true` for Piper, adding it to every user's fallback chain even without a running Piper service.

<h3>Confidence Score: 2/5</h3>

- PR has a resource leak bug and a language detection gap that will cause silent misbehavior for most languages beyond the 4 currently mapped.
- Score of 2 reflects two concrete bugs: (1) a temp directory leak when `piperTTS` fails, which is a resource leak in a production code path, and (2) `detectLanguage` returning 3-letter ISO codes for unmapped languages, which silently breaks voice/URL routing for any language beyond en/de/fr/hi. The always-configured default URL is a design concern that adds unnecessary fallback latency for users without Piper.
- `src/tts/tts.ts` requires attention for the temp directory cleanup bug in the Piper block (lines 812-843), the incomplete `ISO3_TO_1` mapping (lines 564-571), and the always-true `isTtsProviderConfigured` check (lines 675-677).

<sub>Last reviewed commit: c41e208</sub>
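
One possible way to close the ISO 639-3 → ISO 639-1 gap described above is to try both the 2-letter and the 3-letter form of the detected code when looking up a mapping, so that languages missing from the conversion table can still be routed by users who configure 3-letter keys. The sketch below is an assumption-laden illustration, not the PR's `detectLanguage`: the `ISO3_TO_ISO1` table is a small illustrative subset, and `candidateKeys` and `lookupByLang` are hypothetical helpers. It takes the detector's output as a plain string and treats `und` (the code `franc` returns for undetermined input) as "no language".

```typescript
// Illustrative subset of ISO 639-3 -> ISO 639-1 pairs; a real table would be
// larger or generated from a maintained source.
const ISO3_TO_ISO1: Record<string, string> = {
  eng: "en",
  deu: "de",
  fra: "fr",
  hin: "hi",
  spa: "es",
  ita: "it",
  por: "pt",
  nld: "nl",
  rus: "ru",
  jpn: "ja",
};

// Hypothetical helper: keys to try, most specific user intent first.
// "und" means the detector could not decide, so no key is tried.
function candidateKeys(detected: string): string[] {
  const code = detected.trim().toLowerCase();
  if (code === "" || code === "und") return [];
  const short = Object.prototype.hasOwnProperty.call(ISO3_TO_ISO1, code)
    ? ISO3_TO_ISO1[code]
    : undefined;
  return short !== undefined ? [short, code] : [code];
}

// Hypothetical helper: returns the first configured value that matches any
// candidate key, or undefined so the caller can fall back to its default.
function lookupByLang(
  map: Record<string, string> | undefined,
  detected: string,
): string | undefined {
  if (!map) return undefined;
  for (const key of candidateKeys(detected)) {
    if (Object.prototype.hasOwnProperty.call(map, key)) return map[key];
  }
  return undefined;
}

// Placeholder voice names for demonstration.
const voiceByLang: Record<string, string> = {
  fr: "french-voice",
  swe: "swedish-voice", // 3-letter key for a language absent from the table
};

console.log(lookupByLang(voiceByLang, "fra"));
console.log(lookupByLang(voiceByLang, "swe"));
console.log(lookupByLang(voiceByLang, "und"));
```

With this approach an unmapped language no longer falls back silently as long as the user keys the mapping by the 3-letter code; whether to accept 3-letter keys in the config schema is a design decision for the PR author.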
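
The temp directory leak noted above could be addressed with a cleanup-on-failure pattern similar to what the summary says the edge provider does in its catch blocks. The following is a minimal sketch under stated assumptions: `SynthesizeToFile` is a hypothetical stand-in for `piperTTS()` (its real signature is not shown here), and `synthesizeIntoTempDir`, the `tts-piper-` prefix, and the `speech.wav` file name are invented for the example.

```typescript
import { mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical stand-in for the provider call; assumed to write audio to
// `outputPath` and to reject on any failure.
type SynthesizeToFile = (text: string, outputPath: string) => Promise<void>;

interface TempAudio {
  dir: string;
  audioPath: string;
}

async function synthesizeIntoTempDir(
  text: string,
  synthesize: SynthesizeToFile,
): Promise<TempAudio> {
  const dir = await mkdtemp(join(tmpdir(), "tts-piper-"));
  try {
    const audioPath = join(dir, "speech.wav");
    await synthesize(text, audioPath);
    // Success: the caller owns `dir` and removes it once the audio is consumed.
    return { dir, audioPath };
  } catch (err) {
    // Failure: nothing will consume the directory, so remove it, then rethrow
    // so the provider fallback chain still sees the original error.
    await rm(dir, { recursive: true, force: true });
    throw err;
  }
}
```

The directory is deliberately kept on success because the caller still needs the audio file; only the failure path cleans up. If the caller does not need the file on disk, a `finally` block that always removes the directory would be simpler.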