#8317: fix(tts): add dynamic timeout and retry logic for ElevenLabs TTS
stale
Cluster: Voice Call and TTS Improvements
## Summary
Addresses audio cutoff issues when generating TTS for longer text via ElevenLabs API.
- **Dynamic timeout scaling**: `MIN_TIMEOUT_MS (15s) + text_length × 30ms`, capped at `MAX_TIMEOUT_MS (120s)`
- **Retry logic**: Exponential backoff (3 attempts) for transient failures (timeouts, 5xx errors, network issues)
- **Smart retry skipping**: Doesn't retry auth/validation errors (401, 403, 422, invalid headers)
- **Audio buffer validation**: Checks minimum size (1KB) and validates MP3/OGG magic headers before returning
- **Enhanced diagnostic logging**: Logs latency, buffer size, and retry attempts for debugging
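
The timeout formula in the first bullet can be sketched as follows. This is a minimal illustration, not the code from the PR: the constant values and the name `calculateTtsTimeout` come from this description, while the function signature (taking a character count) and the constant name `PER_CHAR_TIMEOUT_MS` are assumptions.

```typescript
// Values taken from the PR summary.
const MIN_TIMEOUT_MS = 15_000;
const MAX_TIMEOUT_MS = 120_000;
// Hypothetical name for the 30 ms per-character factor.
const PER_CHAR_TIMEOUT_MS = 30;

// Assumed signature: accepts the character count of the text to synthesize.
function calculateTtsTimeout(textLength: number): number {
  // Treat negative lengths as zero so the result never drops below the minimum.
  const scaled = MIN_TIMEOUT_MS + Math.max(0, textLength) * PER_CHAR_TIMEOUT_MS;
  return Math.min(scaled, MAX_TIMEOUT_MS);
}
```

Under these values, a 1500-character input gets a 60-second timeout (15,000 + 1500 × 30 ms), and the 120-second cap is reached at 3500 characters.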
## Problem
ElevenLabs TTS can take 45-60+ seconds for longer text (1500+ chars), but the previous fixed 30-second timeout aborted such requests mid-generation. Combined with the lack of retries and output validation, this meant:
- Truncated or silent audio files were returned
- Transient network failures were not retried
- Nothing verified that the returned audio was complete
## Test plan
- [x] Unit tests for `calculateTtsTimeout()` - 4 test cases
- [x] Unit tests for `validateAudioBuffer()` - 6 test cases
- [x] All 43 TTS tests pass
- [ ] Manual testing with long text TTS generation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR improves ElevenLabs TTS robustness by adding a text-length-based timeout calculation, wrapping the ElevenLabs fetch in `retryAsync` with exponential backoff, and validating returned audio buffers (minimum size + basic MP3/OGG header checks). Unit tests were extended to cover the new timeout calculation and buffer validation helpers.
In the existing `textToSpeech` flow, ElevenLabs requests now dynamically scale their abort timeout to better handle long synthesis jobs, and retries aim to recover from transient failures while skipping common non-retriable auth/validation cases.
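
The retry behavior described above can be sketched roughly as below. The PR uses an existing `retryAsync` helper whose signature is not shown here, so this sketch defines its own hypothetical `retryWithBackoff`, `isRetriable`, `backoffDelayMs`, and `HttpStatusError`. The 500 ms base delay is an assumption, and treating every non-5xx HTTP status as non-retriable is a simplification of the 401/403/422 rule.

```typescript
// Hypothetical error type carrying an HTTP status; not taken from the PR.
class HttpStatusError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

// Retry 5xx responses and non-HTTP errors (timeouts, network failures).
// Other HTTP statuses, including 401, 403, and 422, are not retried.
function isRetriable(err: unknown): boolean {
  if (err instanceof HttpStatusError) {
    return err.status >= 500;
  }
  return true;
}

// Exponential backoff: base, 2x base, 4x base, ... (attempt is 1-based).
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (attempt - 1);
}

async function retryWithBackoff<T>(
  fn: (attempt: number) => Promise<T>,
  maxAttempts = 3, // attempt count from the PR summary
  baseDelayMs = 500, // assumed value; the PR does not state the base delay
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      // Stop immediately on non-retriable errors or after the final attempt.
      if (!isRetriable(err) || attempt === maxAttempts) break;
      const delayMs = backoffDelayMs(attempt, baseDelayMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

A caller would wrap the ElevenLabs request in `fn` and throw `HttpStatusError` for non-2xx responses, so that auth and validation failures surface immediately instead of consuming retry attempts.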
<h3>Confidence Score: 4/5</h3>
- This PR is reasonably safe to merge and should improve ElevenLabs TTS reliability, with a couple of edge cases worth tightening.
- Core change is localized to ElevenLabs request handling and is covered by new unit tests. Main concern is the buffer validation’s dependence on `outputFormat` being a non-empty string, which could throw or skip validation in edge cases and would interact poorly with retries/logging.
- src/tts/tts.ts
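
One way to address the `outputFormat` concern above is to normalize the format before branching, so that a missing or empty value neither throws nor silently skips validation. The sketch below is an assumption-laden illustration, not the PR's implementation: only the name `validateAudioBuffer`, the 1KB minimum, and the MP3/OGG header checks come from this page. The signature, the result shape, and the mapping from format-string prefixes (`mp3`, `opus`, `ogg`) to containers are hypothetical.

```typescript
const MIN_AUDIO_BYTES = 1024; // 1KB minimum from the PR summary

// Hypothetical result shape.
type ValidationResult = { valid: boolean; reason?: string };

function startsWithBytes(buf: Uint8Array, bytes: number[]): boolean {
  return bytes.every((b, i) => buf[i] === b);
}

// Assumed signature; outputFormat may be undefined or empty.
function validateAudioBuffer(buf: Uint8Array, outputFormat?: string): ValidationResult {
  if (buf.length < MIN_AUDIO_BYTES) {
    return { valid: false, reason: `buffer too small: ${buf.length} bytes` };
  }

  // Normalize so a missing or non-string format cannot throw below.
  const format = typeof outputFormat === "string" ? outputFormat.trim().toLowerCase() : "";

  const isId3 = startsWithBytes(buf, [0x49, 0x44, 0x33]); // "ID3" tag
  const isMpegSync = buf[0] === 0xff && (buf[1] & 0xe0) === 0xe0; // MPEG frame sync
  const isOgg = startsWithBytes(buf, [0x4f, 0x67, 0x67, 0x53]); // "OggS" capture pattern
  const looksLikeMp3 = isId3 || isMpegSync;

  if (format.startsWith("mp3")) {
    return looksLikeMp3 ? { valid: true } : { valid: false, reason: "missing MP3 header" };
  }
  if (format.startsWith("opus") || format.startsWith("ogg")) {
    return isOgg ? { valid: true } : { valid: false, reason: "missing OGG header" };
  }
  if (format === "") {
    // Unknown format: accept either known header rather than skipping the check.
    return looksLikeMp3 || isOgg
      ? { valid: true }
      : { valid: false, reason: "no recognizable MP3 or OGG header" };
  }
  // Other formats have no header check in this sketch; only the size check applies.
  return { valid: true };
}
```

Returning a result object instead of throwing keeps the decision about retrying or logging in the caller, which avoids a validation exception being mistaken for a transient failure.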
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #8922: feat(voice-call): Add ElevenLabs WebSocket streaming TTS · by mikiships · 2026-02-04 · 81.5%
- #16089: fix(tts): clarify directive syntax in prompts and strip malformed tags · by kmixter · 2026-02-14 · 79.4%
- #8339: fix(tts): validate ElevenLabs base URL against allowlist · by yubrew · 2026-02-03 · 79.1%
- #8103: fix(tts): sanitize API keys from error messages · by yubrew · 2026-02-03 · 78.2%
- #7965: feat(tts): add Speechify as TTS provider · by chaerla · 2026-02-03 · 76.1%
- #19489: fix(voice-call): add echo suppression for TTS playback · by kalichkin · 2026-02-17 · 76.0%
- #19073: feat(voice-call): streaming TTS, barge-in, silence filler, hangup, ... · by odrobnik · 2026-02-17 · 75.8%
- #21110: fix(tts): deliver audio via structured mediaUrl instead of MEDIA: t... · by hydro13 · 2026-02-19 · 75.7%
- #18957: fix(tts): replace deprecated ElevenLabs eleven_monolingual_v1 model · by BinHPdev · 2026-02-17 · 75.3%
- #20992: fix(tts): apply TTS processing to agentCommand outbound delivery path · by mmyyfirstb · 2026-02-19 · 75.3%