#15881: fix(models): probe-safe cooldown handling and compatible fallback attribution

by wboudy open 2026-02-14 00:48

docs channel: telegram commands agents size: XL

## Problems

- Fallback failure summaries could be interpreted as model-level failures even when the provider had no usable auth profiles, and grouped formatting risked breaking tooling that parses `provider/model: ...` segments.
- `models status --probe` could mutate auth cooldown metadata, which polluted runtime profile availability after diagnostics.

## Fixes

- Preserve fallback summary compatibility by keeping per-model `provider/model: ...` entries and marking provider-cooldown skips explicitly (`skipped (provider cooldown: all auth profiles in cooldown)`), so format remains stable.
- Add explicit `probeMode` plumbing for probe runs, with backward compatibility (`probeMode ?? sessionId.startsWith("probe-")`), and suppress cooldown persistence in probe mode without changing rotation behavior.

## Notes

- Format preserved: `provider/model: ...`
- Probe semantics: explicit `probeMode` flag + backward compat fallback

## Tests

- Added probe rotation coverage for 2-profile timeout scenario: probe mode rotates to the next profile and completes deterministically.
- Added assertions that probe mode does **not** persist `cooldownUntil` / `lastFailureAt`, while `probeMode: false` explicitly preserves normal cooldown writes.
- Kept fallback summary compatibility assertions to ensure provider-cooldown skips still appear as per-model `provider/model:` segments.

## Context (optional)

- Issue draft and handoff are available in `docs/debug/` on this branch.

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR refines two model/runtime behaviors:

- **Fallback attribution stability**: `runWithModelFallback` now records provider-wide cooldown skips as per-candidate entries with a stable `provider/model: ...` format, using the explicit message `skipped (provider cooldown: all auth profiles in cooldown)` to avoid misattributing the condition as a model failure and to preserve downstream parsing expectations.
- **Probe-safe auth rotation**: Adds an explicit `probeMode?: boolean` flag to embedded-run parameters and threads it through `runEmbeddedPiAgent` → `runEmbeddedAttempt` and embedded run tracking. In probe mode, cooldown metadata (`cooldownUntil` / `lastFailureAt`) is no longer persisted while keeping rotation behavior intact. `models status --probe` now passes `probeMode: true`, with backward-compatible heuristics (`probeMode ?? sessionId.startsWith("probe-")`).

Also includes targeted tests covering probe rotation determinism and asserting that probe runs do not write cooldown penalties, while non-probe runs still do.

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk.
- Changes are localized, maintain backward-compatible probe detection, preserve existing fallback summary structure, and add targeted tests to cover the new probe-mode semantics. No definite regressions were found in the updated call chains.
- No files require special attention

<sub>Last reviewed commit: fd26ee6</sub>

**Context used:**

- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))
- Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13))
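The probe semantics described above (explicit `probeMode` flag, legacy `probe-` session-id fallback, and no cooldown writes during probes) can be illustrated with a minimal sketch. This is not the PR's implementation: `RunParams`, `ProfileState`, `resolveProbeMode`, `recordFailure`, and `runWithRotation` are hypothetical names, the run is synchronous for brevity, and only `probeMode`, `sessionId`, `cooldownUntil`, and `lastFailureAt` come from the description.

```typescript
// Hypothetical shapes for illustration; not the project's real types.
interface RunParams {
  sessionId: string;
  probeMode?: boolean;
}

interface ProfileState {
  cooldownUntil?: number;
  lastFailureAt?: number;
}

// An explicit flag wins; only when it is undefined does the legacy
// "probe-" session-id prefix decide.
function resolveProbeMode(params: RunParams): boolean {
  return params.probeMode ?? params.sessionId.startsWith("probe-");
}

// Persist cooldown metadata for a failed profile, unless this is a probe run.
function recordFailure(
  store: Map<string, ProfileState>,
  profileId: string,
  params: RunParams,
  now: number,
  cooldownMs: number,
): void {
  if (resolveProbeMode(params)) return; // probes leave runtime state untouched
  store.set(profileId, { cooldownUntil: now + cooldownMs, lastFailureAt: now });
}

// Try profiles in order. Rotation to the next profile happens in both modes;
// only the persistence of the failure differs.
function runWithRotation(
  profiles: string[],
  attempt: (profileId: string) => boolean,
  store: Map<string, ProfileState>,
  params: RunParams,
  now: number,
): string | undefined {
  for (const profileId of profiles) {
    if (attempt(profileId)) return profileId;
    recordFailure(store, profileId, params, now, 60000); // assumed 60 s cooldown
  }
  return undefined;
}
```

Because `??` only falls through on `null` or `undefined`, passing `probeMode: false` with a `probe-` session id still performs normal cooldown writes, which matches the "`probeMode: false` explicitly preserves normal cooldown writes" assertion in the Tests section.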
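The fallback attribution point can be sketched the same way: every candidate keeps its own `provider/model: ...` segment, and a provider-wide cooldown is reported as an explicit skip on each candidate instead of as a grouped entry. The sketch below is an assumption-laden illustration, not the code in `runWithModelFallback`: the `FallbackAttempt` shape, the helper names, the ` | ` separator, and the `provider-a` / `model-x` identifiers are all hypothetical. Only the segment format and the skip message come from the description.

```typescript
// Hypothetical record of one fallback candidate's outcome.
interface FallbackAttempt {
  provider: string;
  model: string;
  detail: string;
}

// Skip message quoted from the PR description.
const PROVIDER_COOLDOWN_SKIP =
  "skipped (provider cooldown: all auth profiles in cooldown)";

// Assumed separator between segments; the real summary may use another.
const SEGMENT_SEPARATOR = " | ";

// One `provider/model: detail` segment per candidate, never grouped.
function formatSummary(attempts: FallbackAttempt[]): string {
  return attempts
    .map((a) => `${a.provider}/${a.model}: ${a.detail}`)
    .join(SEGMENT_SEPARATOR);
}

// What downstream tooling might do: split segments and recover the fields.
// Assumes model names contain no ":" and details contain no separator.
function parseSummary(summary: string): FallbackAttempt[] {
  return summary.split(SEGMENT_SEPARATOR).map((segment) => {
    const match = /^([^/]+)\/([^:]+): (.*)$/.exec(segment);
    if (!match) throw new Error(`unparseable segment: ${segment}`);
    return { provider: match[1], model: match[2], detail: match[3] };
  });
}

// Distinguish a provider-level skip from a genuine model failure.
function isProviderCooldownSkip(attempt: FallbackAttempt): boolean {
  return attempt.detail === PROVIDER_COOLDOWN_SKIP;
}
```

A parser written against per-model segments keeps working under this scheme, and it can additionally separate "provider unavailable" from "model failed" by checking the detail text.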