
#21843: fix: add retry/backoff to Gemini embedding batch API calls

by slegarraga open 2026-02-20 12:56
size: S
## Summary

Wraps all Gemini batch embedding fetch calls with `retryAsync`, the same retry infrastructure that the OpenAI and Voyage paths already use.

### What changed

All 4 raw `fetch()` calls in `batch-gemini.ts` now use `retryAsync` with:

- **3 attempts** with exponential backoff (300ms → 2s)
- **0.2 jitter** to avoid thundering herd
- **Retry on** 429 (rate limit) and 5xx (server errors)
- **Skip retry on** 404 for batch create (endpoint doesn't exist for this model)

### Why

The Gemini batch path was the only embedding provider path without retry/backoff. When Gemini hits its daily quota (429), errors would throw unhandled and spam on every session start/heartbeat, potentially destabilizing the gateway. OpenAI and Voyage already had this handled.

### Testing

- TypeScript compiles clean (`pnpm tsgo --noEmit`)
- No existing Gemini batch tests to break
- Follows exact same pattern as `batch-http.ts` (`postJsonWithRetry`)

### AI Disclosure

🤖 AI-assisted (Claude via OpenClaw). Code reviewed and understood by contributor.

Closes #15546

<h3>Greptile Summary</h3>

Adds retry/backoff infrastructure to all Gemini batch embedding API calls. The implementation wraps 4 fetch operations (file upload, batch create, status polling, and file download) with `retryAsync`, using exponential backoff (300ms → 2s, 3 attempts, 0.2 jitter) that retries on 429 and 5xx errors. The batch create call includes a custom retry predicate that skips 404 errors (endpoint doesn't exist for the model). This brings Gemini batch handling in line with OpenAI/Voyage patterns that already use retry infrastructure, addressing unhandled errors when hitting daily quota limits.

<h3>Confidence Score: 4/5</h3>

- Safe to merge with one minor architectural inconsistency to consider
- The implementation correctly uses the existing retry infrastructure with appropriate parameters (exponential backoff, jitter, status-based retry logic). However, this PR adds retry to status polling and file download operations that OpenAI/Voyage implementations don't retry, which could theoretically extend timeout windows but is unlikely to cause issues in practice. The 404 special-case handling for batch create is well-implemented.
- No files require special attention

<sub>Last reviewed commit: 84c26e4</sub>
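
For readers unfamiliar with the pattern, the retry policy listed under "What changed" (3 attempts, 300ms → 2s backoff, 0.2 jitter, retry on 429 and 5xx) can be sketched roughly as follows. This is a minimal illustration and not the code in `batch-gemini.ts`: `RetryOptions`, `HttpStatusError`, `isRetryableStatus`, `backoffDelayMs`, `retryWithBackoff`, and `fetchOkWithRetry` are hypothetical names, the real `retryAsync` signature may differ, and doubling the delay on each retry is an assumption about how the 300ms → 2s range is covered.

```typescript
// Hypothetical sketch of the retry policy described in this PR.
// None of these names come from the repository.

interface RetryOptions {
  attempts: number;   // total tries, including the first call
  minDelayMs: number; // nominal delay before the first retry
  maxDelayMs: number; // upper bound on any single delay
  jitter: number;     // fraction of the delay to randomize (0.2 = +/-20%)
}

// Values taken from the PR description.
const GEMINI_BATCH_RETRY_SKETCH: RetryOptions = {
  attempts: 3,
  minDelayMs: 300,
  maxDelayMs: 2000,
  jitter: 0.2,
};

// Error type that carries the HTTP status so a predicate can inspect it.
class HttpStatusError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.name = "HttpStatusError";
    this.status = status;
  }
}

// Retry on 429 (rate limit) and 5xx (server errors); everything else,
// including 404, fails immediately.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Delay before retry number `retryIndex` (0-based). Assumes the delay doubles
// on each retry, is spread by +/- jitter, and is clamped to [0, maxDelayMs].
function backoffDelayMs(
  retryIndex: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const base = Math.min(opts.maxDelayMs, opts.minDelayMs * 2 ** retryIndex);
  const offset = base * opts.jitter * (random() * 2 - 1);
  return Math.min(opts.maxDelayMs, Math.max(0, Math.round(base + offset)));
}

// Calls `fn` until it succeeds, the error is not retryable, or attempts run out.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  opts: RetryOptions,
  shouldRetry: (err: unknown) => boolean,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const isLastAttempt = attempt >= opts.attempts - 1;
      if (isLastAttempt || !shouldRetry(err)) throw err;
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
}

// Example wrapper around the global fetch (Node 18+ or a browser).
// Network-level failures are not retried in this sketch, only HTTP statuses.
async function fetchOkWithRetry(url: string, init?: RequestInit): Promise<Response> {
  return retryWithBackoff(
    async () => {
      const res = await fetch(url, init);
      if (!res.ok) {
        throw new HttpStatusError(res.status, `HTTP ${res.status} from ${url}`);
      }
      return res;
    },
    GEMINI_BATCH_RETRY_SKETCH,
    (err) => err instanceof HttpStatusError && isRetryableStatus(err.status),
  );
}
```

Under the doubling assumption, three attempts mean at most two waits of roughly 300ms and 600ms (each ±20%), so the 2s cap would only come into play if the attempt count were raised. Because 404 is outside the retryable set here, it fails on the first attempt, which matches the intent of the batch-create special case described above.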
