← Back to PRs

#20388: fix(failover): don't skip same-provider fallback models when cooldown is rate-limit only

by Limitless2023 open 2026-02-18 21:12 View on GitHub →
agents size: S
## Problem When a model fails due to a rate limit or timeout, the auth profile for its provider enters a short cooldown. Previously, this cooldown caused **ALL models on that provider to be skipped** in `runWithModelFallback` — including fallback models that may be completely unaffected. ### Affected scenarios **Claude Max Plan (per-model token pools):** ``` primary: anthropic/claude-opus-4-6 fallbacks: [anthropic/claude-sonnet-4-6, ollama/qwen2.5-coder:14b] ``` Opus hits its monthly token pool limit → `anthropic:default` enters cooldown → Sonnet is skipped (same provider!) → falls through to Ollama which can't handle the full system prompt → silent failure / "How can I assist you today?" loop **Multi-provider failover:** Codex rate-limited → provider marked in cooldown → all Codex models AND configured Anthropic backups skipped simultaneously → no capable model available ## Root Cause `runWithModelFallback` used a single `isProfileInCooldown` check to gate **all** candidates for a provider. But there are two fundamentally different reasons an auth profile may be unavailable: 1. **Hard-disabled** (`disabledUntil` set) — billing or auth failure that means the credentials themselves are broken → skip ALL models (correct behavior) 2. **Soft cooldown** (`cooldownUntil` set) — rate limit or timeout, often model-specific → other models on the same provider may work fine ## Fix Introduce `isProfileHardDisabled()` that returns `true` only when `disabledUntil` is set. Update `runWithModelFallback` to: - Hard-disabled → skip all models for this provider (unchanged behavior) - Soft cooldown on **primary** model (i === 0) → existing probe logic unchanged - Soft cooldown on **non-primary fallback** (i > 0) → **allow the attempt**; the inner runner handles profile rotation from there ## Test scenario that now works ``` primary: anthropic/claude-opus-4-6 # exhausts Max Plan pool → rate_limit cooldown fallbacks: [anthropic/claude-sonnet-4-6] # same provider, different model pool → ✅ now tried ``` Fixes #20316 <!-- greptile_comment --> <h3>Greptile Summary</h3> This PR introduces a distinction between "hard-disabled" (billing/auth failure) and "soft-cooldown" (rate limit/timeout) auth profiles to prevent `runWithModelFallback` from skipping same-provider fallback models unnecessarily. It also fixes a separate bug in `sessions-list-tool.ts` where transcript path resolution failed when `storePath` was absent or contained a non-path placeholder like `"(multiple)"`. **Key changes:** - New `isProfileHardDisabled()` function that checks only `disabledUntil` (billing failures), distinct from `isProfileInCooldown()` which checks both `cooldownUntil` and `disabledUntil` - `runWithModelFallback` now skips all models on a provider only for hard-disabled profiles; non-primary fallback models bypass soft cooldown checks - `sessions-list-tool.ts` no longer requires `storePath` to resolve transcript paths and validates it is a real absolute path before using it **Issues found:** - The existing probe test (`model-fallback.probe.test.ts`) does not mock `isProfileHardDisabled`, which will cause `TypeError` failures since the mock factory replaces the entire module - The `.some()` check for hard-disabled profiles may over-aggressively skip models in multi-profile setups where only one profile has a billing failure <h3>Confidence Score: 2/5</h3> - This PR will break existing tests and has a potential logic issue in the core failover path - The missing `isProfileHardDisabled` mock in `model-fallback.probe.test.ts` will cause test failures (TypeError at runtime). The `.some()` vs `.every()` semantics in the hard-disabled check could cause incorrect provider skipping in multi-profile setups. The core approach is sound but needs these issues resolved before merging. - Pay close attention to `src/agents/model-fallback.ts` (core failover logic changes) and `src/agents/model-fallback.probe.test.ts` (broken mock) <sub>Last reviewed commit: 6522b6c</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment -->

Most Similar PRs