#14553: feat(llm): Add automatic retry mechanism for TPM/RPM rate limits
agents
size: S
Cluster: Error Resilience and Retry Logic
### Summary
Add automatic retry mechanism for LLM API calls with exponential backoff to handle TPM/RPM rate limit errors.
Changes:
- Add retry config support under `models.providers[].retry`
- Detect TPM-specific error patterns (tpm limit, tokens per minute, etc.)
- Create a `prompt-retry.ts` utility with configurable retry:
  - Default of 10 attempts
  - Exponential backoff with jitter
  - Automatic parsing of `retry_after` from the error response
- Wrap `activeSession.prompt()` calls with the retry wrapper
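
The changes listed above can be pictured with a minimal sketch. This is not the PR's `prompt-retry.ts`: the names `RetryOptions`, `parseRetryAfterMs`, and `retryWithBackoff` are hypothetical, and the doubling factor, the symmetric jitter, and the `retry_after` message format are assumptions made for illustration.

```typescript
// Illustrative sketch only: names and option shapes are assumptions,
// not the contents of the PR's prompt-retry.ts.
interface RetryOptions {
  attempts: number;
  minDelayMs: number;
  maxDelayMs: number;
  jitter: number; // fraction of the delay, e.g. 0.2 for +/-20%
  shouldRetry: (err: unknown) => boolean;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
  random?: () => number; // injectable source of randomness in [0, 1)
}

// Pull a server-suggested wait out of an error message, e.g.
// "retry_after: 3" or "Retry after 2.5s". A bare number is read as seconds.
function parseRetryAfterMs(err: unknown): number | undefined {
  const text = err instanceof Error ? err.message : String(err);
  const match = /retry[_\-\s]?after["':=\s]*(\d+(?:\.\d+)?)\s*(ms|s)?/i.exec(text);
  if (!match) return undefined;
  const value = Number(match[1]);
  return match[2]?.toLowerCase() === "ms" ? value : value * 1000;
}

async function retryWithBackoff<T>(fn: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const random = opts.random ?? Math.random;
  const attempts = Math.max(1, opts.attempts);
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on the last attempt or on errors the caller marks as final.
      if (attempt >= attempts || !opts.shouldRetry(err)) throw err;
      // Prefer the server's hint; otherwise double the delay on each retry.
      const base = parseRetryAfterMs(err) ?? opts.minDelayMs * 2 ** (attempt - 1);
      const jittered = base * (1 + (random() * 2 - 1) * opts.jitter);
      const delay = Math.min(opts.maxDelayMs, Math.max(opts.minDelayMs, Math.round(jittered)));
      await sleep(delay);
    }
  }
}
```

In a setting like the one described here, `fn` would presumably be a closure around `activeSession.prompt()`, and `shouldRetry` would be the rate-limit classifier. Injecting `sleep` and `random` keeps the loop deterministic under test.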
### Config example:
```yaml
models:
  providers:
    openai:
      retry:
        attempts: 10
        minDelayMs: 1000
        maxDelayMs: 60000
        jitter: 0.2
```
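
To see what these four fields could mean in practice, here is a rough sketch that turns the example values into a wait schedule. It rests on assumptions the PR text does not confirm: that the delay doubles on each retry, that `jitter` is a symmetric fraction of the delay, and that results are clamped to `[minDelayMs, maxDelayMs]`. `RetryConfig`, `EXAMPLE_RETRY`, `resolveRetryConfig`, and `backoffWindows` are hypothetical names.

```typescript
// Field names mirror the YAML example; the doubling factor, the symmetric
// jitter, and treating the example values as fallbacks are all assumptions.
interface RetryConfig {
  attempts: number;
  minDelayMs: number;
  maxDelayMs: number;
  jitter: number;
}

const EXAMPLE_RETRY: RetryConfig = {
  attempts: 10,
  minDelayMs: 1000,
  maxDelayMs: 60000,
  jitter: 0.2,
};

// Merge a partial per-provider block over the example values.
function resolveRetryConfig(partial: Partial<RetryConfig> = {}): RetryConfig {
  return { ...EXAMPLE_RETRY, ...partial };
}

interface BackoffWindow {
  retry: number; // 1-based index of the wait
  baseMs: number; // delay before jitter
  lowMs: number; // smallest delay jitter can produce
  highMs: number; // largest delay jitter can produce
}

// One window per wait between attempts (attempts - 1 waits in total).
function backoffWindows(cfg: RetryConfig): BackoffWindow[] {
  const windows: BackoffWindow[] = [];
  for (let retry = 1; retry < cfg.attempts; retry++) {
    const baseMs = Math.min(cfg.maxDelayMs, cfg.minDelayMs * 2 ** (retry - 1));
    const lowMs = Math.max(cfg.minDelayMs, Math.round(baseMs * (1 - cfg.jitter)));
    const highMs = Math.min(cfg.maxDelayMs, Math.round(baseMs * (1 + cfg.jitter)));
    windows.push({ retry, baseMs, lowMs, highMs });
  }
  return windows;
}

const totalBaseMs = backoffWindows(resolveRetryConfig()).reduce((sum, w) => sum + w.baseMs, 0);
```

Under these assumptions the nominal waits are 1 s, 2 s, 4 s, 8 s, 16 s, and 32 s, followed by three waits capped at 60 s, so a call that fails all 10 attempts spends about 243 s (roughly four minutes) waiting before the error surfaces.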
### Retryable errors:
- TPM/RPM rate limits
- 429 Too Many Requests
- Quota exceeded
- Resource exhausted
### Non-retryable:
- Authentication errors
- Context overflow
- Validation errors
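
The split between the two lists could be expressed as a small classifier. The sketch below is illustrative only: `isRetryableError` and both pattern tables are hypothetical, and apart from the TPM phrasings named in the summary ("tpm limit", "tokens per minute") the regular expressions are guesses at typical provider messages, not the patterns the PR adds.

```typescript
// Hypothetical pattern tables; only "tpm limit" and "tokens per minute"
// are phrasings named in the PR description.
const RETRYABLE_PATTERNS: RegExp[] = [
  /\b429\b/,
  /too many requests/i,
  /rate[ _-]?limit/i,
  /tpm limit/i,
  /\b(tokens|requests) per minute\b/i,
  /quota exceeded/i,
  /resource[ _]exhausted/i,
];

const NON_RETRYABLE_PATTERNS: RegExp[] = [
  /\b(401|403)\b/,
  /unauthorized|invalid api key|authentication/i,
  /context (length|window)/i,
  /validation error|invalid request/i,
];

function isRetryableError(err: unknown): boolean {
  const text = err instanceof Error ? err.message : String(err);
  // Check the deny list first so an ambiguous message is never retried.
  if (NON_RETRYABLE_PATTERNS.some((pattern) => pattern.test(text))) return false;
  // Anything that matches neither list is treated as final (an assumption).
  return RETRYABLE_PATTERNS.some((pattern) => pattern.test(text));
}
```

Checking the non-retryable patterns first is a deliberate choice in this sketch: retrying an authentication or context-overflow failure 10 times only delays an error that cannot succeed.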
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This change introduces a retry wrapper around `activeSession.prompt()` in the embedded runner to automatically retry on rate-limit style failures (TPM/RPM/429/quota/resource exhausted), using `retryAsync`’s exponential backoff with jitter. It also adds an optional per-provider retry config (`models.providers.<provider>.retry`) to the config types and Zod schema, and expands rate-limit error pattern matching to include TPM phrasing.
<h3>Confidence Score: 5/5</h3>
- This PR appears safe to merge; changes are localized and align with existing retry infrastructure.
- I reviewed all changed files and verified the new module import patterns match existing repo conventions (TS sources importing .js specifiers), the config schema/type additions are consistent, and the retry wrapper delegates to the existing `retryAsync` implementation without altering core request logic beyond adding retries on clearly rate-limit-related errors.
- No files require special attention
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #9025: Fix/automatic exponential backoff for LLM rate limits (by fotorpics · 2026-02-04 · 63.4%)
- #23152: feat(plugin): add retry-backoff extension (by cintia09 · 2026-02-22 · 63.3%)
- #13686: Add opt-in rate limiting and token-based budgets for external API c... (by ShresthSamyak · 2026-02-10 · 58.2%)
- #12995: feat(infra): Add retry with exponential backoff for transient failures (by trevorgordon981 · 2026-02-10 · 58.1%)
- #16195: feat(infra): add unified retry utility with exponential backoff (by bianbiandashen · 2026-02-14 · 57.1%)
- #8256: feat: Add rate limit strategy configuration (by revenuestack · 2026-02-03 · 56.1%)
- #22368: fix: first-token timeout + provider-level skip for model fallback (by 88plug · 2026-02-21 · 55.4%)
- #9482: feat: add cloud code assist retry logic and parsing for rate limit ... (by mrcha033 · 2026-02-05 · 55.3%)
- #23497: feat(retry): add retryHttpAsync utility with comprehensive coverage (by thinstripe · 2026-02-22 · 54.8%)
- #16239: fix: retry on transient API errors (overloaded, rate-limit, timeout) (by zerone0x · 2026-02-14 · 54.8%)