#11693: Model Provider Failover for Default and Session Model When Rate Limiting or Other Errors Occur
agents
stale
Cluster: Model Management Enhancements
The README documents two new Node.js failover utilities added under `update/`:
- `default-model-failover.js` probes a prioritized list of candidate models and updates the global OpenClaw config's `agents.defaults.model.primary` (preserving existing keys like `fallbacks`), with automatic timestamped backups before any modification.
- `session-failover.js` performs the same probe-and-switch logic for individual session entries in a sessions JSON store, mirroring OpenClaw's `applyModelOverrideToSessionEntry` behavior, including auth profile cleanup.

The README covers usage examples for both scripts, explains the lightweight provider-specific probing strategy (HTTP checks for OpenAI/Anthropic, env var heuristics for others), documents environment variable caveats for non-standard provider key names, and includes a safety checklist for operators before running or automating the utilities.
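As a rough illustration of the probe-and-pick step described above, the sketch below walks a prioritized candidate list and returns the first entry that a probe accepts. It covers only the env var heuristic, and the HTTP checks are left out. The helper names (`guessEnvKeyName`, `envHeuristicProbe`, `pickFirstAvailable`) and the `<PROVIDER>_API_KEY` naming pattern are assumptions for illustration, not the code in this PR.

```javascript
// Hypothetical helper: guess the conventional env var name for a provider.
// The "<PROVIDER>_API_KEY" pattern is an assumption; it fails for providers
// with non-standard key names, which is the caveat the README calls out.
function guessEnvKeyName(provider) {
  return provider.toUpperCase().replace(/[^A-Z0-9]/g, "_") + "_API_KEY";
}

// Heuristic probe: treat a provider as available when its key is set and non-empty.
// This says nothing about rate limits or whether the key is actually valid.
function envHeuristicProbe(provider, env = process.env) {
  const value = env[guessEnvKeyName(provider)];
  return typeof value === "string" && value.trim() !== "";
}

// Walk "provider/model" candidates in priority order and return the first
// one the probe accepts, or null when none pass.
function pickFirstAvailable(candidates, probe) {
  for (const candidate of candidates) {
    const slash = candidate.indexOf("/");
    if (slash <= 0) continue; // skip malformed entries
    const provider = candidate.slice(0, slash);
    const model = candidate.slice(slash + 1);
    if (probe(provider, model)) return { provider, model };
  }
  return null;
}
```

Passing the probe in as a function keeps the selection logic independent of how availability is checked, so an HTTP-based probe could be swapped in for specific providers.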
[README.md](https://github.com/user-attachments/files/25159062/README.md)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds two Node.js utilities under `update/` for operational model failover:
- `update/default-model-failover.js` probes a prioritized list of provider/model pairs and updates the global OpenClaw config’s `agents.defaults.model.primary`, creating a timestamped backup before writing.
- `update/session-failover.js` probes candidates and writes per-session `providerOverride`/`modelOverride` into a sessions JSON store, attempting to mirror core session override behavior.
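The implementation follows existing config/session key conventions in the repo (e.g., `agents.defaults.model.primary`, session override fields), but two correctness mismatches with documented and core behavior should be addressed before merging: `session-failover.js` leaves stale auth override metadata in place, and the README's backup guarantee does not hold when the config file doesn't exist.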
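The config update step for `agents.defaults.model.primary` could look roughly like the sketch below. It assumes the config file is plain JSON, and the function names and the `.bak-<timestamp>` suffix are hypothetical, not taken from the PR. The backup helper returns `null` when no file exists yet, because there is nothing to copy in that case.

```javascript
const fs = require("node:fs");

function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Return a new config with agents.defaults.model.primary replaced.
// Sibling keys such as `fallbacks` are carried over untouched.
function setPrimaryModel(config, primary) {
  const agents = isPlainObject(config.agents) ? config.agents : {};
  const defaults = isPlainObject(agents.defaults) ? agents.defaults : {};
  const model = isPlainObject(defaults.model) ? defaults.model : {};
  return {
    ...config,
    agents: { ...agents, defaults: { ...defaults, model: { ...model, primary } } },
  };
}

// Copy the existing file to "<path>.bak-<timestamp>" and return the backup path.
// Returns null when the file does not exist, so callers can report that
// no backup was made instead of implying one.
function backupIfExists(filePath, now = new Date()) {
  if (!fs.existsSync(filePath)) return null;
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  const backupPath = `${filePath}.bak-${stamp}`;
  fs.copyFileSync(filePath, backupPath);
  return backupPath;
}

// Back up first, then write the updated config.
function writePrimaryModel(filePath, primary) {
  const current = fs.existsSync(filePath)
    ? JSON.parse(fs.readFileSync(filePath, "utf8"))
    : {};
  const backupPath = backupIfExists(filePath);
  const updated = setPrimaryModel(current, primary);
  fs.writeFileSync(filePath, JSON.stringify(updated, null, 2) + "\n");
  return { backupPath };
}
```

Returning `backupPath` makes the "no backup was created" case visible to the caller, which is the situation the review flags in the README's wording.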
<h3>Confidence Score: 4/5</h3>
- Mostly safe to merge, but fix session override cleanup parity and correct the README claim about backups.
- Changes are isolated to new `update/` utilities and documentation, with no runtime impact on the main app. Two risks remain. First, `session-failover.js` claims to mirror core behavior but currently leaves stale auth override metadata, which can alter effective auth/profile selection after failover. Second, the README makes a concrete behavior guarantee that does not hold when the config file doesn't exist.
- Files needing attention: `update/session-failover.js`, `update/README.md`
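To make the session override concern concrete, here is a minimal sketch of writing `providerOverride`/`modelOverride` onto a session entry while dropping auth override metadata. The auth field names (`authProfileOverride`, `authProfileOverrideSource`) and the function name are assumptions for illustration and should be checked against the core `applyModelOverrideToSessionEntry` implementation. Clearing them on every failover write is also an assumption about the intended behavior.

```javascript
// Assumed names for the auth override metadata on a session entry.
// Verify these against the core session entry type before relying on them.
const AUTH_OVERRIDE_FIELDS = ["authProfileOverride", "authProfileOverrideSource"];

// Return a copy of the entry with the new provider/model override applied
// and any auth override metadata removed, so a profile pinned for the old
// provider cannot leak into auth selection for the new one.
function applyFailoverToSessionEntry(entry, selection) {
  const next = {
    ...entry,
    providerOverride: selection.provider,
    modelOverride: selection.model,
  };
  for (const field of AUTH_OVERRIDE_FIELDS) {
    delete next[field];
  }
  return next;
}
```

Working on a copy leaves the original entry untouched, which makes it easier to compare before and after states or to skip the write when nothing changed.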
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#9822: fix: allow local/custom model providers for sub-agent inference
by stammtobias91 · 2026-02-05
78.1%
#9739: #9291 fix(models): preserve existing models in models.json when mer...
by ximzzzzz · 2026-02-05
77.4%
#13658: fix: silent model failover with fallback notification
by taw0002 · 2026-02-10
77.0%
#12059: feat(agents): Add Azure AI Foundry credential support
by lisanyambere · 2026-02-08
76.6%
#8256: feat: Add rate limit strategy configuration
by revenuestack · 2026-02-03
76.4%
#16838: fix: include configured fallbacks in model allowlist
by taw0002 · 2026-02-15
76.4%
#16797: fix(auth-profiles): implement per-model rate limit cooldown tracking
by mulhamna · 2026-02-15
76.3%
#8390: feat: notify user when fallback model is used (#8182)
by Glucksberg · 2026-02-04
76.1%
#7113: feat(providers): add CommonStack provider support
by flhoildy · 2026-02-02
75.9%
#21963: fix(cli): models fallbacks add now includes primary model in allowlist
by ashiabbott · 2026-02-20
75.8%