#15881: fix(models): probe-safe cooldown handling and compatible fallback attribution
Labels: docs, channel: telegram, commands, agents, size: XL
Cluster: Model Cooldown Management
## Problems
- Fallback failure summaries could be interpreted as model-level failures even when the provider had no usable auth profiles, and grouped formatting risked breaking tooling that parses `provider/model: ...` segments.
- `models status --probe` could mutate auth cooldown metadata, which polluted runtime profile availability after diagnostics.
## Fixes
- Preserve fallback summary compatibility by keeping per-model `provider/model: ...` entries and marking provider-cooldown skips explicitly (`skipped (provider cooldown: all auth profiles in cooldown)`), so the summary format remains stable for tooling that parses it.
- Add explicit `probeMode` plumbing for probe runs, with backward compatibility (`probeMode ?? sessionId.startsWith("probe-")`), and suppress cooldown persistence in probe mode without changing rotation behavior.
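
A minimal sketch of the probe-mode resolution and the persistence guard described above. Only the expression `probeMode ?? sessionId.startsWith("probe-")` and the field names `cooldownUntil` / `lastFailureAt` come from this PR; the `RunParams` and `ProfileState` types, the `resolveProbeMode` and `recordFailure` helpers, and the in-memory `Map` store are hypothetical stand-ins for illustration.

```typescript
// Hypothetical parameter shape; the real embedded-run params are not shown here.
interface RunParams {
  sessionId: string;
  probeMode?: boolean;
}

// Hypothetical per-profile cooldown metadata, using the field names from the PR.
interface ProfileState {
  cooldownUntil?: number;
  lastFailureAt?: number;
}

// The explicit flag wins when it is set (including `false`);
// otherwise fall back to the legacy "probe-" session-id prefix.
function resolveProbeMode(params: RunParams): boolean {
  return params.probeMode ?? params.sessionId.startsWith("probe-");
}

// Records a failure for a profile, but skips the write entirely in probe mode
// so diagnostics do not affect runtime profile availability.
function recordFailure(
  store: Map<string, ProfileState>,
  profileId: string,
  params: RunParams,
  now: number,
  cooldownMs: number,
): void {
  if (resolveProbeMode(params)) return;
  store.set(profileId, { cooldownUntil: now + cooldownMs, lastFailureAt: now });
}
```

Because `??` only falls through on `null` or `undefined`, a caller that passes `probeMode: false` keeps normal cooldown writes even if its session id happens to start with `probe-`.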
## Notes
- Format preserved: `provider/model: ...`
- Probe semantics: explicit `probeMode` flag, with the `sessionId` prefix check kept as a backward-compatible fallback
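
To illustrate why the per-model `provider/model: ...` shape matters for downstream parsing, here is a rough sketch of a formatter and a matching parser. The `Attempt` type, both function names, and the ` | ` separator between segments are assumptions made for this example; only the segment shape and the skip message text are taken from this PR.

```typescript
// Hypothetical record of one fallback candidate and what happened to it.
interface Attempt {
  provider: string;
  model: string;
  detail: string;
}

// Skip message quoted from the PR description.
const PROVIDER_COOLDOWN_SKIP =
  "skipped (provider cooldown: all auth profiles in cooldown)";

// One `provider/model: detail` segment per candidate; the " | " separator is assumed.
function formatFallbackSummary(attempts: Attempt[]): string {
  return attempts
    .map((a) => `${a.provider}/${a.model}: ${a.detail}`)
    .join(" | ");
}

// Splits each segment at the first ": " and the key at the first "/".
// A detail that itself contains " | " would break this simple parser.
function parseFallbackSummary(summary: string): Attempt[] {
  return summary.split(" | ").map((segment) => {
    const colon = segment.indexOf(": ");
    const key = segment.slice(0, colon);
    const slash = key.indexOf("/");
    return {
      provider: key.slice(0, slash),
      model: key.slice(slash + 1),
      detail: segment.slice(colon + 2),
    };
  });
}
```

Because a provider-cooldown skip is emitted as an ordinary per-model segment, a parser of this kind keeps working and can tell the skip apart from a model failure by inspecting `detail`.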
## Tests
- Added probe rotation coverage for 2-profile timeout scenario: probe mode rotates to the next profile and completes deterministically.
- Added assertions that probe mode does **not** persist `cooldownUntil` / `lastFailureAt`, while `probeMode: false` explicitly preserves normal cooldown writes.
- Kept fallback summary compatibility assertions to ensure provider-cooldown skips still appear as per-model `provider/model:` segments.
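
The two-profile timeout scenario can be pictured with a simplified, synchronous rotation loop. This is a sketch under stated assumptions rather than the project's actual test or runner: `runWithRotation`, `AttemptResult`, the `opts` shape, and the `Map` store are all hypothetical, and only the `cooldownUntil` / `lastFailureAt` field names come from the PR.

```typescript
type AttemptResult = "ok" | "timeout";

// Hypothetical per-profile cooldown metadata.
interface ProfileState {
  cooldownUntil?: number;
  lastFailureAt?: number;
}

// Tries each profile in order and returns the id of the first one that succeeds.
// Rotation proceeds the same way in both modes; only the cooldown write is
// skipped when probeMode is true.
function runWithRotation(
  profiles: string[],
  attempt: (profileId: string) => AttemptResult,
  opts: { probeMode: boolean; now: number; cooldownMs: number },
  store: Map<string, ProfileState>,
): string | undefined {
  for (const id of profiles) {
    if (attempt(id) === "ok") return id;
    if (!opts.probeMode) {
      store.set(id, {
        cooldownUntil: opts.now + opts.cooldownMs,
        lastFailureAt: opts.now,
      });
    }
  }
  return undefined; // every profile failed
}

// Scenario: the first profile times out, the second succeeds.
const attemptFn = (id: string): AttemptResult => (id === "p1" ? "timeout" : "ok");

const probeStore = new Map<string, ProfileState>();
const probeWinner = runWithRotation(
  ["p1", "p2"],
  attemptFn,
  { probeMode: true, now: 1000, cooldownMs: 500 },
  probeStore,
);

const normalStore = new Map<string, ProfileState>();
const normalWinner = runWithRotation(
  ["p1", "p2"],
  attemptFn,
  { probeMode: false, now: 1000, cooldownMs: 500 },
  normalStore,
);
```

In both runs the loop lands on `p2`; the difference is that `probeStore` stays empty while `normalStore` holds a cooldown entry for `p1`.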
## Context (optional)
- Issue draft and handoff are available in `docs/debug/` on this branch.
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR refines two model/runtime behaviors:
- **Fallback attribution stability**: `runWithModelFallback` now records provider-wide cooldown skips as per-candidate entries with a stable `provider/model: ...` format, using the explicit message `skipped (provider cooldown: all auth profiles in cooldown)` to avoid misattributing the condition as a model failure and to preserve downstream parsing expectations.
- **Probe-safe auth rotation**: Adds an explicit `probeMode?: boolean` flag to embedded-run parameters and threads it through `runEmbeddedPiAgent` → `runEmbeddedAttempt` and embedded run tracking. In probe mode, cooldown metadata (`cooldownUntil` / `lastFailureAt`) is no longer persisted while keeping rotation behavior intact. `models status --probe` now passes `probeMode: true`, with backward-compatible heuristics (`probeMode ?? sessionId.startsWith("probe-")`).
Also includes targeted tests covering probe rotation determinism and asserting that probe runs do not write cooldown penalties, while non-probe runs still do.
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk.
- Changes are localized, maintain backward-compatible probe detection, preserve existing fallback summary structure, and add targeted tests to cover the new probe-mode semantics. No definite regressions were found in the updated call chains.
- No files require special attention
<sub>Last reviewed commit: fd26ee6</sub>
<!-- /greptile_comment -->
Most Similar PRs
- #13077: fix: prevent cooldown pollution across different models on the same... (by magendary · 2026-02-10 · 81.6%)
- #19267: fix: derive failover reason from timedOut flag to prevent unknown c... (by austenstone · 2026-02-17 · 80.1%)
- #14824: fix: do not trigger provider cooldown on LLM request timeouts (by CyberSinister · 2026-02-12 · 79.9%)
- #20388: fix(failover): don't skip same-provider fallback models when cooldo... (by Limitless2023 · 2026-02-18 · 79.8%)
- #14914: fix: resolve actual failure reason for cooldown-skipped providers (by mcaxtr · 2026-02-12 · 79.1%)
- #23816: fix(agents): model fallback skipped during session overrides and pr... (by ramezgaberiel · 2026-02-22 · 78.8%)
- #14574: fix: gentler rate-limit cooldown backoff + clear stale cooldowns on... (by JamesEBall · 2026-02-12 · 77.7%)
- #18902: fix: exempt format errors from auth profile cooldown (by tag-assistant · 2026-02-17 · 77.4%)
- #16797: fix(auth-profiles): implement per-model rate limit cooldown tracking (by mulhamna · 2026-02-15 · 77.3%)
- #4462: fix: prevent gateway crash when all auth profiles are in cooldown (by garnetlyx · 2026-01-30 · 77.2%)