#21843: fix: add retry/backoff to Gemini embedding batch API calls
size: S
Cluster:
Gemini API Enhancements
## Summary
Wraps all Gemini batch embedding fetch calls with `retryAsync`, the same retry infrastructure that the OpenAI and Voyage paths already use.
### What changed
All 4 raw `fetch()` calls in `batch-gemini.ts` (file upload, batch create, status polling, file download) now use `retryAsync` with:
- **3 attempts** with exponential backoff (300ms → 2s)
- **0.2 jitter** to avoid thundering herd
- **Retry on** 429 (rate limit) and 5xx (server errors)
- **Skip retry on** 404 for batch create (the endpoint doesn't exist for this model, so retrying cannot succeed)
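
The parameters above can be sketched as follows. This is a minimal illustration, not the project's `retryAsync`: the names `BackoffOptions`, `backoffDelayMs`, and `isRetryableStatus` are hypothetical, and it assumes that "300ms → 2s" means a 300 ms starting delay that doubles up to a 2 s cap, with jitter applied symmetrically around the nominal delay.

```typescript
// Illustrative only: these names are hypothetical and not the project's `retryAsync` API.
interface BackoffOptions {
  attempts: number;   // total tries, including the first call
  minDelayMs: number; // nominal delay before the first retry
  maxDelayMs: number; // upper bound on any single delay
  jitter: number;     // fraction of the delay to randomize (0.2 = up to ±20%)
}

const GEMINI_BATCH_BACKOFF: BackoffOptions = {
  attempts: 3,
  minDelayMs: 300,
  maxDelayMs: 2000,
  jitter: 0.2,
};

// Delay before the Nth retry (1-based): doubles each time, is capped, then jittered.
// Assumption: doubling growth and symmetric jitter; the real helper may differ.
function backoffDelayMs(
  retryIndex: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  const base = Math.min(opts.maxDelayMs, opts.minDelayMs * Math.pow(2, retryIndex - 1));
  const offset = (random() * 2 - 1) * opts.jitter; // uniform in [-jitter, +jitter)
  return Math.min(opts.maxDelayMs, Math.round(base * (1 + offset)));
}

// Retry only rate limits and server errors; any other status is not retried.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Nominal schedule: passing random = 0.5 pins the jitter offset to zero.
const schedule: number[] = [];
for (let retry = 1; retry < GEMINI_BATCH_BACKOFF.attempts; retry++) {
  schedule.push(backoffDelayMs(retry, GEMINI_BATCH_BACKOFF, () => 0.5));
}
console.log(schedule); // [ 300, 600 ]
```

Under these assumptions, 3 attempts means at most 2 waits per call, so a call that keeps failing adds roughly 0.9 s of nominal delay (just under 1.08 s at maximum jitter) before the error surfaces.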
### Why
The Gemini batch path was the only embedding provider path without retry/backoff. When Gemini hit its daily quota (429), the error was thrown unhandled and recurred on every session start/heartbeat, potentially destabilizing the gateway. The OpenAI and Voyage paths already handled this.
### Testing
- TypeScript compiles clean (`pnpm tsgo --noEmit`)
- No existing Gemini batch tests to break
- Follows the exact same pattern as `batch-http.ts` (`postJsonWithRetry`)
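
Since there are no Gemini batch tests yet, the retry behavior could be exercised with a stubbed operation along these lines. This is a sketch under stated assumptions: `callWithRetry`, `scriptedOp`, and `retryOn429And5xx` are hypothetical names, each call is modeled as returning a status rather than throwing, and the sleep function is injected so the check runs without real delays. It does not reproduce `retryAsync` or `postJsonWithRetry`.

```typescript
// Hypothetical sketch: names below are illustrative, not taken from batch-gemini.ts.
type StatusResult = { status: number };

// Calls `op`, then retries once per entry in `delaysMs` while `shouldRetry` approves.
async function callWithRetry<T extends StatusResult>(
  op: () => Promise<T>,
  shouldRetry: (status: number) => boolean,
  delaysMs: number[],
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  let result = await op();
  for (const delay of delaysMs) {
    if (!shouldRetry(result.status)) break; // success or non-retryable status
    await sleep(delay);
    result = await op();
  }
  return result;
}

// Stub that replays a scripted list of statuses (repeating the last) and counts calls.
function scriptedOp(statuses: number[]) {
  let calls = 0;
  const op = async (): Promise<StatusResult> => {
    const index = Math.min(calls, statuses.length - 1);
    const status = statuses[index] ?? 0;
    calls += 1;
    return { status };
  };
  return { op, callCount: () => calls };
}

const retryOn429And5xx = (s: number): boolean => s === 429 || (s >= 500 && s <= 599);
const noSleep = async (_ms: number): Promise<void> => {};

async function demo(): Promise<void> {
  // Two transient failures, then success: three calls in total.
  const flaky = scriptedOp([429, 503, 200]);
  const ok = await callWithRetry(flaky.op, retryOn429And5xx, [300, 600], noSleep);
  console.log(ok.status, flaky.callCount()); // 200 3

  // A 404 (e.g. batch create on an unsupported model) is returned after one call.
  const missing = scriptedOp([404]);
  const notFound = await callWithRetry(missing.op, retryOn429And5xx, [300, 600], noSleep);
  console.log(notFound.status, missing.callCount()); // 404 1
}

demo();
```

Injecting `sleep` and the retry predicate keeps the check deterministic and lets the batch create case use a stricter predicate than the other calls.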
### AI Disclosure
🤖 AI-assisted (Claude via OpenClaw). Code reviewed and understood by contributor.
Closes #15546
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Adds retry/backoff infrastructure to all Gemini batch embedding API calls. The implementation wraps 4 fetch operations (file upload, batch create, status polling, and file download) with `retryAsync`, using exponential backoff (300ms → 2s, 3 attempts, 0.2 jitter) that retries on 429 and 5xx errors. The batch create call includes a custom retry predicate that skips 404 errors (endpoint doesn't exist for the model). This brings Gemini batch handling in line with OpenAI/Voyage patterns that already use retry infrastructure, addressing unhandled errors when hitting daily quota limits.
<h3>Confidence Score: 4/5</h3>
- Safe to merge with one minor architectural inconsistency to consider
- The implementation correctly uses the existing retry infrastructure with appropriate parameters (exponential backoff, jitter, status-based retry logic). However, this PR adds retry to status polling and file download operations that OpenAI/Voyage implementations don't retry, which could theoretically extend timeout windows but is unlikely to cause issues in practice. The 404 special-case handling for batch create is well-implemented.
- No files require special attention
<sub>Last reviewed commit: 84c26e4</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#15585: fix: add retry/backoff for Gemini embedding API calls
by WalterSumbon · 2026-02-13
95.1%
#8675: fix: Gemini batch embeddings state path, enum values, and download URL
by seasalim · 2026-02-04
80.4%
#23497: feat(retry): add retryHttpAsync utility with comprehensive coverage
by thinstripe · 2026-02-22
76.6%
#8309: fix: add emb_ prefix to batch embedding custom_id for OpenAI compli...
by vishaltandale00 · 2026-02-03
75.9%
#14314: fix(agent-runner): auto-recover from Gemini INVALID_ARGUMENT errors
by thebtf · 2026-02-11
74.4%
#11472: fix: retry media fetch on transient network errors
by openclaw-quenio · 2026-02-07
73.7%
#15301: Feat/gemini overflow and tags
by divisonofficer · 2026-02-13
73.7%
#12995: feat(infra): Add retry with exponential backoff for transient failures
by trevorgordon981 · 2026-02-10
73.5%
#16786: fix: support google-antigravity OAuth for Gemini embeddings
by outsourc-e · 2026-02-15
73.4%
#16239: fix: retry on transient API errors (overloaded, rate-limit, timeout)
by zerone0x · 2026-02-14
73.1%