← Back to PRs

#16195: feat(infra): add unified retry utility with exponential backoff

by bianbiandashen open 2026-02-14 13:01 View on GitHub →
stale size: M
## Summary Add a reusable `withRetry<T>()` function that provides a unified retry strategy across the codebase. **Key features:** - Generic async retry wrapper with configurable max attempts - Exponential backoff with jitter using existing `computeBackoff()` from backoff.ts - Abort signal support for early cancellation - Customizable `shouldRetry` predicate for fine-grained control - `onRetry` callback for logging/metrics integration - `RetryExhaustedError` for clear error handling **Included retry predicates:** - `retryPredicates.networkErrors` - Matches ECONNRESET, ETIMEDOUT, DNS failures, etc. - `retryPredicates.serverErrors` - Matches HTTP 5xx and 429 (rate limit) - `retryPredicates.any()` - Combines multiple predicates with OR logic ## Motivation Currently the codebase has `computeBackoff` and `sleepWithAbort` as building blocks, but each caller must implement their own retry loop. This leads to: - Inconsistent retry behavior across modules - Duplicated error handling logic - Easy to miss edge cases (abort handling, max attempts, etc.) This utility provides a single, well-tested implementation that reuses existing backoff infrastructure. ## Example Usage ```typescript import { withRetry, retryPredicates } from "./infra/retry.js"; const result = await withRetry( () => fetch(url), { maxAttempts: 5, shouldRetry: retryPredicates.any( retryPredicates.networkErrors, retryPredicates.serverErrors ), onRetry: (err, attempt, delay) => { logger.warn(`Attempt ${attempt} failed, retrying in ${delay}ms`); }, } ); ``` ## Test Plan - [x] Unit tests for success on first attempt - [x] Unit tests for retry and eventual success - [x] Unit tests for RetryExhaustedError when all attempts fail - [x] Unit tests for shouldRetry predicate behavior - [x] Unit tests for onRetry callback invocation - [x] Unit tests for abort signal handling - [x] Unit tests for retry predicates (networkErrors, serverErrors, any) <!-- greptile_comment --> <h3>Greptile Summary</h3> This PR replaces the existing `retryAsync` retry utility with a new `withRetry` function that provides exponential backoff with jitter, abort signal support, and helpful retry predicates. The implementation is clean and well-tested with comprehensive unit tests. **Critical issue:** - Complete removal of the old API (`retryAsync`, `RetryConfig`, `RetryInfo`, `resolveRetryConfig`) that is actively used in 15+ files across the codebase will cause build failures **Key changes:** - New `withRetry<T>()` function with `RetryOptions` configuration - Added `RetryExhaustedError` for clear error handling - Reuses existing `computeBackoff()` and `sleepWithAbort()` from `backoff.ts` - Provides common retry predicates: `networkErrors`, `serverErrors`, and `any()` - Includes comprehensive unit tests covering success, retry, exhaustion, abort, and predicate behavior <h3>Confidence Score: 0/5</h3> - This PR cannot be merged due to breaking API changes that will cause widespread build failures - The complete removal of `retryAsync`, `RetryConfig`, `RetryInfo`, and `resolveRetryConfig` breaks 15+ files including `retry-policy.ts`, `batch-openai.ts`, `batch-voyage.ts`, `discord/api.ts`, and various Discord/Telegram send files. While the new code is well-implemented and tested, merging without migration will break the build. - All files using the old retry API need migration before this PR can merge, particularly `src/infra/retry-policy.ts` which provides retry runners for Discord and Telegram <sub>Last reviewed commit: f2baf88</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment -->

Most Similar PRs