
#13077: fix: prevent cooldown pollution across different models on the same provider

by magendary open 2026-02-10 03:15
agents stale
## Summary

When model fallback iterates candidates on the same provider but different models (e.g. Claude → Gemini on google-antigravity), cooldown set by the first model incorrectly blocks attempts with subsequent models.

This happens because cooldown is tracked per-profile (not per-model), and the pre-flight check skips **all** candidates on a provider once any cooldown is active, even though different models may use entirely independent quota pools.

## Problem

```
Claude 4.6 rate-limited → profile enters cooldown
  → fallback to Gemini Pro (same provider)
  → cooldown check sees same profile in cooldown → skips
  → Gemini Pro quota is full but never tried
```

The same issue affects any provider with multiple model tiers (e.g. Flash ↔ Pro fallbacks).

## Fix

Track which providers have already been cooldown-checked during the fallback loop. Only perform the cooldown pre-check on the **first** candidate for each provider. Subsequent candidates on the same provider (with a different model) skip the pre-check and proceed to attempt execution directly.

This is the minimal, conservative fix: it doesn't alter cooldown state or profile tracking, it just prevents the pre-flight check from blocking unrelated models.

## Test

Added test `"does not skip different model on same provider due to cooldown from first model"`, which verifies that when `provider/model-a` profiles are in cooldown, a fallback to `provider/model-b` is still attempted and succeeds.

All existing cooldown tests continue to pass (the cross-provider cooldown skip behavior is unchanged).

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR changes `runWithModelFallback` to avoid “cooldown pollution” when iterating multiple model candidates on the same provider by only performing the provider-wide cooldown pre-check once per provider, rather than once per candidate. It also adds a regression test to ensure a fallback to a different model on the same provider is still attempted.

The change fits into the existing model-fallback flow by adjusting the early “skip provider if all profiles are in cooldown” gate, without changing how failover errors are normalized/recorded, and without changing auth store semantics.

<h3>Confidence Score: 3/5</h3>

- This PR is directionally correct but may change runtime behavior in cases where a provider has no available profiles.
- The new provider-level cooldown check suppression can cause `params.run` to be called for models on a provider even when all profiles are still in cooldown, which can regress previous skip behavior depending on how credentials are selected. The added test currently doesn’t exercise real credential availability, so it may not protect against that regression.
- src/agents/model-fallback.ts, src/agents/model-fallback.test.ts

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
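The once-per-provider pre-check described under "Fix" can be sketched roughly as follows. This is a simplified, synchronous illustration, not the code in `src/agents/model-fallback.ts`: the names `Candidate`, `FallbackResult`, `runWithFallbackSketch`, and `allProfilesInCooldown` are hypothetical, and the real `runWithModelFallback` may differ in its signature and error handling.

```typescript
// Hypothetical types for illustration only; not taken from the PR diff.
type Candidate = { provider: string; model: string };

type FallbackResult<T> = {
  result: T;
  candidate: Candidate;
  skipped: Candidate[];
};

// Simplified, synchronous sketch of a fallback loop that performs the
// cooldown pre-check only on the first candidate seen for each provider.
function runWithFallbackSketch<T>(
  candidates: Candidate[],
  // Assumed helper: true when every profile for the provider is in cooldown.
  allProfilesInCooldown: (provider: string) => boolean,
  run: (candidate: Candidate) => T,
): FallbackResult<T> {
  const cooldownChecked = new Set<string>();
  const skipped: Candidate[] = [];
  let lastError: unknown;

  for (const candidate of candidates) {
    const firstForProvider = !cooldownChecked.has(candidate.provider);
    cooldownChecked.add(candidate.provider);

    // Later candidates on the same provider bypass the pre-check entirely.
    if (firstForProvider && allProfilesInCooldown(candidate.provider)) {
      skipped.push(candidate);
      continue;
    }

    try {
      return { result: run(candidate), candidate, skipped };
    } catch (err) {
      // Remember the failure and move on to the next candidate.
      lastError = err;
    }
  }

  throw lastError ?? new Error("all candidates were skipped or failed");
}
```

In this sketch, a provider whose profiles are all in cooldown still has `run` invoked for its second candidate, which is the behavior change the Greptile review flags: whether that call succeeds depends on how credentials are selected inside `run`.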
