
#23497: feat(retry): add retryHttpAsync utility with comprehensive coverage

by thinstripe · open · 2026-02-22 10:57
Labels: channel: slack, cli, commands, agents, size: M
This PR introduces a robust retry mechanism for HTTP fetch operations across the OpenClaw codebase.

**Changes:**

1. **Infrastructure**: adds the `retryHttpAsync` utility:
   - New dependency for readable status constants
   - Implements a wrapper with automatic response validation
   - Supports retry for transient errors: network failures, rate limits (429), server errors (5xx), Cloudflare 522/524
   - Typed return value for type safety
   - Extracted helper functions
2. **Application**: wraps all unprotected fetch calls:
   - 13+ locations updated with retry protection
   - Uses a helper directly for custom return types (e.g., in web-fetch.ts)

**Benefits:**

- Improved resilience against transient network issues
- Consistent backoff and retry behavior across services
- Better observability with labeled retry attempts

**Testing:** All changes are isolated to retry logic; existing functionality preserved.

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

Added `retryHttpAsync` utility to wrap fetch calls with automatic retry logic for transient HTTP failures (429, 5xx, network errors). Applied across 13+ locations including embeddings, OAuth flows, and media fetches.

**Critical issue**: `retryHttpAsync` calls `validateResponseOk` after retries complete (retry-http.ts:79), which throws on non-OK responses. However, 14 call sites still check `if (!res.ok)` afterward; these checks are now unreachable dead code because `validateResponseOk` has already thrown.

**Impact**: The redundant checks never execute, which creates confusion and changes error-handling behavior. Some locations had custom error messages for specific status codes (e.g., qwen-portal 400 handling) that are now bypassed.

- Removed import in tts-core.ts (lines 10-16) appears unrelated to this PR
- No tests added for the new retry-http module

<h3>Confidence Score: 2/5</h3>

- Unsafe to merge: contains logic errors where error-handling code becomes unreachable
- 14 instances of unreachable error-handling code due to `validateResponseOk` throwing before the checks. This changes behavior and loses custom error messages (e.g., Qwen OAuth 400 handling). The pattern is systematically broken across all usage sites.
- All files with `retryHttpAsync` calls need attention: signal-install.ts, nodes-camera.ts, client-fetch.ts, batch-upload.ts, batch-voyage.ts (3 locations), embeddings-gemini.ts, embeddings-remote-fetch.ts, github-copilot-auth.ts (2 locations), qwen-portal-oauth.ts, tts-core.ts (2 locations)

<sub>Last reviewed commit: 5f11943</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
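
To make the flagged pattern concrete, here is a minimal sketch of a retry wrapper that validates the response after the retry loop, followed by a call site whose `if (!res.ok)` branch can no longer run. It is not the PR's code: `retryHttpSketch`, `validateResponseOkSketch`, `HttpStatusError`, `ResponseLike`, `exchangeTokenSketch`, the option names, and the defaults (3 attempts, 100 ms doubling backoff) are all assumptions made for illustration. Only the retryable statuses and the validate-after-retry ordering come from the description and review above.

```typescript
// Illustrative sketch only: every name here is hypothetical, not the PR's actual code.

/** Minimal response shape, so the sketch runs without a real fetch. */
interface ResponseLike {
  ok: boolean;
  status: number;
}

interface RetrySketchOptions {
  attempts?: number; // total tries, including the first (assumed default: 3)
  baseDelayMs?: number; // delay before the second try, doubling afterwards (assumed default: 100)
}

/** Error thrown by the final validation step for a non-OK response. */
class HttpStatusError extends Error {
  readonly status: number;
  constructor(label: string, status: number) {
    super(`${label}: HTTP ${status}`);
    this.name = "HttpStatusError";
    this.status = status;
  }
}

/** Transient statuses named in the PR description: 429 and 5xx (522/524 fall inside 5xx). */
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

/** Throws on any non-OK response, mirroring what the review attributes to validateResponseOk. */
function validateResponseOkSketch(res: ResponseLike, label: string): void {
  if (!res.ok) throw new HttpStatusError(label, res.status);
}

async function retryHttpSketch<T extends ResponseLike>(
  label: string,
  doFetch: () => Promise<T>,
  options: RetrySketchOptions = {},
): Promise<T> {
  const attempts = Math.max(1, options.attempts ?? 3);
  const baseDelayMs = options.baseDelayMs ?? 100;
  let res: T | undefined;
  let lastError: unknown;

  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      res = await doFetch();
      if (!isRetryableStatus(res.status)) break; // success or a non-transient failure
    } catch (err) {
      // A rejected fetch is treated as a transient network failure.
      res = undefined;
      lastError = err;
    }
    if (attempt < attempts) await sleep(baseDelayMs * 2 ** (attempt - 1));
  }

  if (res === undefined) throw lastError; // the final attempt failed at the network level
  validateResponseOkSketch(res, label); // runs after the retries, as the review describes
  return res;
}

/** A call site in the shape the review flags. */
async function exchangeTokenSketch(doFetch: () => Promise<ResponseLike>): Promise<string> {
  const res = await retryHttpSketch("token-exchange", doFetch, { baseDelayMs: 0 });
  if (!res.ok) {
    // Unreachable: retryHttpSketch already threw for every non-OK response,
    // so a status-specific message written here is never produced.
    throw new Error(res.status === 400 ? "invalid or expired code" : "token exchange failed");
  }
  return "ok";
}
```

In this sketch a 400 response surfaces as the generic `HttpStatusError` rather than the call site's own message, which is the behavior change the review describes. A call site that needs status-specific messages would have to catch the thrown error, or use a variant of the wrapper that skips the final validation.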

Most Similar PRs