
#21049: fix(failover): treat HTTP 5xx as rate-limit for model fallback

by maximalmargin open 2026-02-19 16:02
agents size: XS
## Summary

Treat HTTP 502/503/504 as failover-eligible (`rate_limit` reason) so configured model fallbacks trigger when the primary provider is overloaded or temporarily unavailable.

## Changes

- Added handling for status codes 502, 503, 504 in `resolveFailoverReasonFromError()`
- Treats these as `rate_limit` failures to enable existing fallback/cooldown behavior

## Fixes

Closes #20999

<h3>Greptile Summary</h3>

Adds handling for HTTP 502 (Bad Gateway), 503 (Service Unavailable), and 504 (Gateway Timeout) status codes in `resolveFailoverReasonFromError()`, treating them as `rate_limit` failures to enable model fallback when the primary provider is overloaded or temporarily unavailable.

- Maps 502/503/504 errors to the `rate_limit` reason, which triggers the existing failover/cooldown behavior in `runWithModelFallback()`
- Aligns with the existing `isTransientHttpError()` logic in `pi-embedded-helpers/errors.ts`, which treats 500, 502, 503 and Cloudflare 5xx codes as transient failures (mapped to timeout)
- Simple, focused change that enables configured model fallbacks to activate on server-side failures

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk
- The change is a straightforward, logical extension to error classification that properly handles server-side errors. It maps 502/503/504 status codes to the `rate_limit` reason, which is semantically appropriate for temporary unavailability. The code follows existing patterns, has clear comments, and integrates seamlessly with the existing failover infrastructure. The implementation is simple and doesn't introduce new dependencies or complex logic.
- No files require special attention

<sub>Last reviewed commit: b09e1b0</sub>
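For readers without the diff at hand, the classification described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `sketchResolveFailoverReason`, the `HttpLikeError` shape, and the `null` return for unclassified errors are assumptions made for the example. The real `resolveFailoverReasonFromError()` may take a different argument and handle other cases.

```typescript
// Hypothetical, simplified stand-in for the classification this PR describes.
// Only the 502/503/504 -> "rate_limit" mapping comes from the PR text.
type FailoverReason = "rate_limit";

// Assumed error shape for illustration; the real error object may differ.
interface HttpLikeError {
  status?: number;
}

// Status codes the PR description names as failover-eligible.
const GATEWAY_FAILURE_STATUSES: ReadonlySet<number> = new Set([502, 503, 504]);

function sketchResolveFailoverReason(err: HttpLikeError): FailoverReason | null {
  // Bad Gateway, Service Unavailable, and Gateway Timeout are reported as
  // rate_limit so that a caller's fallback/cooldown path can engage.
  if (err.status !== undefined && GATEWAY_FAILURE_STATUSES.has(err.status)) {
    return "rate_limit";
  }
  // Everything else is left unclassified in this sketch.
  return null;
}
```

Note that the sketch deliberately leaves 500 unclassified, matching the three codes listed in the Changes section rather than the broader "5xx" in the title.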
