#20946: fix: skip auth cooldown on timeout (not an auth failure)
agents
size: XS
Cluster: Rate Limit Management Enhancements
## Summary
- Problem: When a provider request times out, the auth-profile usage tracker applies a cooldown as if it were an auth failure, blocking retries on a working provider.
- Why it matters: Timeouts are typically transient network issues, not credential problems, so the auth cooldown should not apply to them.
- What changed: Added a guard in `usage.ts` to skip cooldown when `reason` is `"timeout"`.
- What did NOT change (scope boundary): No other auth-profile logic, no provider changes.
Split out from #17333. This is a 3-line fix.
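
For readers without the diff at hand, the guard might look roughly like the sketch below. It is not the actual change: `FailureReason`, `UsageStats`, `applyFailure`, and the 60-second `COOLDOWN_MS` are hypothetical names and values chosen for illustration. Only the `reason === "timeout"` check and the `cooldownUntil` field come from this PR's description.

```typescript
// Hypothetical types; the real definitions live in src/agents/auth-profiles/usage.ts.
type FailureReason = "auth" | "rate_limit" | "timeout" | "unknown";

interface UsageStats {
  cooldownUntil?: number; // epoch ms until which the profile is skipped
}

const COOLDOWN_MS = 60_000; // assumed value, for illustration only

function applyFailure(stats: UsageStats, reason: FailureReason, now: number): UsageStats {
  // A timeout is treated as a transient network issue, not a credential
  // problem, so the stats are returned without a new cooldown.
  if (reason === "timeout") {
    return { ...stats };
  }
  // Every other reason in this sketch puts the profile on cooldown.
  return { ...stats, cooldownUntil: now + COOLDOWN_MS };
}
```

Under these assumptions, a profile that just timed out stays eligible for the next attempt, while an auth failure still sets `cooldownUntil`.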
## Change Type (select all)
- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [ ] Gateway / orchestration
- [ ] Skills / tool execution
- [x] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Related #17333
## User-visible / Behavior Changes
Providers that time out will no longer be temporarily blocked by the auth cooldown system, so retries can proceed immediately.
## Security Impact (required)
- New permissions/capabilities? No
- Secrets/tokens handling changed? No
- New/changed network calls? No
- Command/tool execution surface changed? No
- Data access scope changed? No
## Evidence
- 3-line change in `src/agents/auth-profiles/usage.ts`
## Compatibility / Migration
- Backward compatible? Yes
- Config/env changes? No
- Migration needed? No
## Risks and Mitigations
None — strictly additive guard on an existing code path.
## Failure Recovery (if this breaks)
- Revert this commit; timeouts will again trigger auth cooldowns (previous behavior).
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR correctly fixes timeout handling in the auth-profile cooldown system by skipping cooldown when `reason === "timeout"`. The 3-line change adds a guard in `computeNextProfileUsageStats` to treat timeouts as transient network issues rather than credential failures.
- The fix aligns with existing timeout handling logic in `pi-embedded-runner/run.ts:817`, which already skips `markAuthProfileFailure` for timeouts
- The change properly preserves the failure count tracking (line 274) while preventing cooldown application
- The conditional logic now has three branches: billing failures get `disabledUntil`, timeouts get neither cooldown nor disable, and all other failures get `cooldownUntil`
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with no risk
- The change is a minimal, well-scoped bug fix with clear intent and proper implementation. The logic correctly distinguishes between transient timeouts and actual auth failures. The existing test coverage for the cooldown system provides good regression protection, and the change aligns with existing timeout handling patterns elsewhere in the codebase.
- No files require special attention
<sub>Last reviewed commit: 2c73ed6</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
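
The three branches described in the Greptile summary can be sketched as follows. This is an illustrative approximation, not the body of `computeNextProfileUsageStats`: the `Reason` union, the `ProfileStats` shape, the `nextStats` helper, and both durations are assumptions. The review text supplies only the three-way split, the `disabledUntil` and `cooldownUntil` fields, and the fact that the failure count is still tracked.

```typescript
// Hypothetical shapes and durations, for illustration only.
type Reason = "billing" | "timeout" | "auth" | "rate_limit" | "unknown";

interface ProfileStats {
  errorCount: number;
  cooldownUntil?: number; // epoch ms
  disabledUntil?: number; // epoch ms
}

const ASSUMED_COOLDOWN_MS = 60_000;
const ASSUMED_BILLING_DISABLE_MS = 5 * 60 * 60_000;

function nextStats(prev: ProfileStats, reason: Reason, now: number): ProfileStats {
  // The failure is counted regardless of the reason.
  const next: ProfileStats = { ...prev, errorCount: prev.errorCount + 1 };

  if (reason === "billing") {
    // Branch 1: billing failures disable the profile.
    next.disabledUntil = now + ASSUMED_BILLING_DISABLE_MS;
  } else if (reason !== "timeout") {
    // Branch 3: all other non-timeout failures get a cooldown.
    next.cooldownUntil = now + ASSUMED_COOLDOWN_MS;
  }
  // Branch 2: timeouts fall through with neither field set.
  return next;
}
```

Keeping the counter increment outside the branches is what lets a timeout remain visible in the stats without blocking the profile.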
## Most Similar PRs
- #14824: fix: do not trigger provider cooldown on LLM request timeouts (by CyberSinister · 2026-02-12 · 85.0%)
- #14914: fix: resolve actual failure reason for cooldown-skipped providers (by mcaxtr · 2026-02-12 · 84.5%)
- #23210: fix: avoid cooldown on timeout/unknown failovers (by nydamon · 2026-02-22 · 84.5%)
- #18902: fix: exempt format errors from auth profile cooldown (by tag-assistant · 2026-02-17 · 79.7%)
- #11371: Auth: cap rate-limit cooldown at 5 minutes; add maxCooldownMinutes ... (by lailoo · 2026-02-07 · 79.3%)
- #14574: fix: gentler rate-limit cooldown backoff + clear stale cooldowns on... (by JamesEBall · 2026-02-12 · 78.9%)
- #19267: fix: derive failover reason from timedOut flag to prevent unknown c... (by austenstone · 2026-02-17 · 78.8%)
- #14368: fix: skip auth profile cooldown on format errors to prevent provide... (by koatora20 · 2026-02-12 · 77.7%)
- #11874: fix: handle fetch rejections in provider usage withTimeout (by Zjianru · 2026-02-08 · 76.8%)
- #22359: fix(agents): classify overloaded service errors as timeout (by AIflow-Labs · 2026-02-21 · 75.5%)