
#14914: fix: resolve actual failure reason for cooldown-skipped providers

by mcaxtr open 2026-02-12 19:26
agents size: S trusted-contributor experienced-contributor
Fixes #13909

## Summary

- When all auth profiles for a provider are in cooldown, the fallback loop hardcoded `reason: "rate_limit"` regardless of the actual failure that caused the cooldown (e.g. OAuth 403 → "auth", billing 402 → "billing")
- Added `resolveDominantCooldownReason()` that inspects the `failureCounts` stored in profile usage stats and returns the most representative failure reason
- Falls back to `"rate_limit"` when no failure data is recorded (backward-compatible default)

## Test plan

- [x] New test: profile in cooldown with `failureCounts: { auth: 1 }` → attempt reason is `"auth"`
- [x] New test: profile in cooldown with `failureCounts: { billing: 1 }` → attempt reason is `"billing"`
- [x] Existing test: profile in cooldown with no `failureCounts` → reason remains `"rate_limit"` (backward compat)
- [x] All 22 tests pass; 2 new tests fail before fix, pass after

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR changes model fallback behavior when a provider is skipped because *all* of its auth profiles are in cooldown/disabled. Previously, the skip attempt hardcoded `reason: "rate_limit"`; it now calls `resolveDominantCooldownReason()`, which aggregates `usageStats[profileId].failureCounts` across the provider's profiles and returns the highest-count `AuthProfileFailureReason`, defaulting to `"rate_limit"` when no usable failure data exists.

Two new tests cover the new behavior for cooldowns driven by `auth` and `billing` failures, and the existing backward-compatible case (no `failureCounts`) remains `"rate_limit"`.

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk.
- Change is narrow (only affects the cooldown-skip branch), reasons remain within the existing `FailoverReason`/`AuthProfileFailureReason` unions, and new tests cover the new behavior while preserving the prior default when `failureCounts` is absent/invalid.
- No files require special attention

<sub>Last reviewed commit: 05811a8</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
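The aggregation described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the `ProfileUsageStats` shape, the function signature, the three-member reason union, and the first-seen tie-break are all assumptions made for illustration. Only the names `resolveDominantCooldownReason`, `failureCounts`, `usageStats`, and `AuthProfileFailureReason` come from the description.

```typescript
// Assumed subset of the real union; the actual type may have more members.
type AuthProfileFailureReason = "auth" | "billing" | "rate_limit";

// Hypothetical shape of one profile's usage stats.
interface ProfileUsageStats {
  failureCounts?: Partial<Record<AuthProfileFailureReason, number>>;
}

// Sum failureCounts across a provider's profiles and return the reason
// with the highest total. Defaults to "rate_limit" when nothing usable
// is recorded, matching the backward-compatible behavior described above.
function resolveDominantCooldownReason(
  usageStats: Record<string, ProfileUsageStats | undefined>,
  profileIds: string[],
): AuthProfileFailureReason {
  const totals: Partial<Record<AuthProfileFailureReason, number>> = {};

  for (const id of profileIds) {
    const counts = usageStats[id]?.failureCounts;
    if (!counts) continue;
    for (const [reason, count] of Object.entries(counts)) {
      // Skip absent or invalid entries instead of letting them skew totals.
      if (typeof count !== "number" || !Number.isFinite(count) || count <= 0) continue;
      const key = reason as AuthProfileFailureReason;
      totals[key] = (totals[key] ?? 0) + count;
    }
  }

  let best: AuthProfileFailureReason = "rate_limit";
  let bestCount = 0;
  for (const [reason, count] of Object.entries(totals)) {
    // Strict ">" means ties go to the first reason seen (an assumption).
    if (typeof count === "number" && count > bestCount) {
      best = reason as AuthProfileFailureReason;
      bestCount = count;
    }
  }
  return best;
}
```

Under these assumptions, a provider whose only profile has `failureCounts: { auth: 1 }` resolves to `"auth"`, and a provider with no recorded failures resolves to `"rate_limit"`, mirroring the cases in the test plan.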
