#16289: feat: heartbeat model fallback chain support
stale
size: M
Cluster: Model Fallbacks and Rate Limiting
## Summary
Extends heartbeat configuration to support a primary model plus an ordered fallback chain that is used when the primary fails.
## Config formats
**Backward compatible** (existing format still works):
```json
"heartbeat": { "model": "openrouter/tngtech/tng-r1t-chimera:free" }
```
**New format with fallbacks:**
```json
"heartbeat": {
  "primary": "openrouter/tngtech/tng-r1t-chimera:free",
  "fallbacks": [
    "openrouter/google/gemma-2-2b-it:free",
    "kimi-coding/k2p5",
    "anthropic/claude-sonnet-4-5",
    "openrouter/openrouter/auto"
  ],
  "fallbackMode": "immediate"
}
```
## Fallback modes
- **`immediate`** (default): When the primary fails, try the next fallback right away within the same heartbeat poll.
- **`next_heartbeat`**: On failure, save the index of the next model and try that model on the next poll interval.
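The difference between the two modes can be sketched as follows. This is a minimal illustration, not the code from this PR: `pollOnce`, `PollResult`, and `tryModel` are hypothetical names, the attempt is modelled as a synchronous boolean for brevity (a real heartbeat run would be asynchronous and would only fall through on failover-type errors), and wrapping back to index 0 after the last fallback is an assumption.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the PR's API.
type FallbackMode = "immediate" | "next_heartbeat";

interface PollResult {
  ok: boolean;
  modelUsed?: string;
  // Index to start from on the next poll (only meaningful for next_heartbeat).
  nextIndex: number;
}

// `tryModel` stands in for one heartbeat attempt against a single model.
function pollOnce(
  chain: string[],
  mode: FallbackMode,
  startIndex: number,
  tryModel: (model: string) => boolean,
): PollResult {
  if (chain.length === 0) return { ok: false, nextIndex: 0 };

  if (mode === "immediate") {
    // Walk the whole chain inside this one poll until a model succeeds.
    for (const model of chain) {
      if (tryModel(model)) return { ok: true, modelUsed: model, nextIndex: 0 };
    }
    return { ok: false, nextIndex: 0 };
  }

  // next_heartbeat: a single attempt per poll; on failure, remember where to resume.
  const index = startIndex % chain.length;
  const model = chain[index];
  if (tryModel(model)) return { ok: true, modelUsed: model, nextIndex: 0 };
  // Assumption: wrap around to the primary after the last fallback.
  return { ok: false, nextIndex: (index + 1) % chain.length };
}
```

In this sketch a success in either mode returns `nextIndex: 0`, which mirrors the "reset to primary on success" behaviour described in the PR.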
## State persistence
Per-agent state is persisted in `~/.openclaw/heartbeat-state.json`. It tracks which model to use next (for `next_heartbeat` mode) and is reset to the primary on success.
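One way to picture the state handling is the sketch below. The PR names `HeartbeatModelState` and `HeartbeatState` but does not show their fields, so the `nextModelIndex` field, the agent-keyed file layout, and the helper names `parseStateFile` and `recordOutcome` are all assumptions made for illustration. File I/O is left out so the transitions stay easy to follow.

```typescript
// Assumed shape: one entry per agent ID, each holding the next model index.
interface AgentModelState {
  nextModelIndex: number;
}
type StateFile = Record<string, AgentModelState>;

// Parse file contents defensively: missing or corrupt content yields empty state.
function parseStateFile(raw: string | undefined): StateFile {
  if (!raw) return {};
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) return {};
    const state: StateFile = {};
    for (const [agentId, value] of Object.entries(parsed as Record<string, unknown>)) {
      const idx = (value as { nextModelIndex?: unknown } | null)?.nextModelIndex;
      // Keep only entries with a non-negative integer index.
      if (typeof idx === "number" && Number.isInteger(idx) && idx >= 0) {
        state[agentId] = { nextModelIndex: idx };
      }
    }
    return state;
  } catch {
    return {};
  }
}

// Return a new state: reset to the primary (index 0) on success, advance on failure.
function recordOutcome(
  state: StateFile,
  agentId: string,
  succeeded: boolean,
  chainLength: number,
): StateFile {
  const current = state[agentId]?.nextModelIndex ?? 0;
  const next = succeeded || chainLength <= 0 ? 0 : (current + 1) % chainLength;
  return { ...state, [agentId]: { nextModelIndex: next } };
}
```

Treating unreadable content as empty state means a damaged file falls back to the primary model rather than blocking the heartbeat, which is the conservative choice assumed here.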
## Files changed
- `src/config/zod-schema.agent-runtime.ts`: adds `primary`, `fallbacks`, `fallbackMode` to HeartbeatSchema
- `src/config/types.agent-defaults.ts`: TypeScript types for the new fields
- `src/infra/heartbeat-runner.ts`: core fallback logic implementation
  - `HeartbeatModelState` / `HeartbeatState` types
  - `resolveHeartbeatModelChain()`: builds the ordered model list
  - `getHeartbeatModelState()` / `updateHeartbeatModelState()`: state persistence
  - Updated `runHeartbeatOnce()` with the model fallback loop
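As a rough illustration of how an ordered model list could be built from either config format, consider the sketch below. It is not the body of `resolveHeartbeatModelChain()`: the name `buildModelChain`, the `HeartbeatModelConfig` interface, and the trimming and de-duplication steps are assumptions. The only behaviour taken from this PR is that `primary` takes precedence over the legacy `model` field and that fallbacks follow in order.

```typescript
// Hypothetical config shape covering both the legacy and the new format.
interface HeartbeatModelConfig {
  model?: string; // legacy single-model field
  primary?: string;
  fallbacks?: string[];
}

// Build the ordered chain: primary (or legacy model) first, then fallbacks.
function buildModelChain(cfg: HeartbeatModelConfig): string[] {
  const head = cfg.primary ?? cfg.model;
  const ordered = [head, ...(cfg.fallbacks ?? [])];
  const seen = new Set<string>();
  const chain: string[] = [];
  for (const entry of ordered) {
    const name = entry?.trim();
    // Assumption: skip empty entries and repeats so no model is tried twice.
    if (!name || seen.has(name)) continue;
    seen.add(name);
    chain.push(name);
  }
  return chain;
}
```

With the legacy format, `buildModelChain({ model: "openrouter/tngtech/tng-r1t-chimera:free" })` yields a one-element chain, so existing configs behave as a chain with no fallbacks.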
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Adds fallback chain support to heartbeat model configuration, allowing the system to automatically try alternative models when the primary fails. The implementation supports both the backward-compatible legacy `model` field and the new `primary`/`fallbacks` fields, with two fallback modes: `immediate` (retry within the same poll) and `next_heartbeat` (rotate to the next model on the subsequent poll). Per-agent state is persisted in `heartbeat-state.json`.
- Extends `HeartbeatSchema` and types with `primary`, `fallbacks`, and `fallbackMode` fields
- Implements model chain resolution logic that prefers `primary` over legacy `model`
- Adds state persistence functions for tracking fallback progression in `next_heartbeat` mode
- Implements immediate-mode fallback loop that tries each model sequentially until success
- Implements next-heartbeat-mode rotation that advances through models across polling intervals
- Properly handles the `suppressToolErrorWarnings` flag in both the `immediate` and `next_heartbeat` branches
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge with low risk
- The implementation is well-structured and properly handles edge cases (state corruption, missing files, non-failover errors). Previous review threads show that important issues (missing imports, directory creation, dead fields) were addressed in commits 717524b and ef6101d. Backward compatibility is preserved, state management is defensive with proper error handling, and the fallback logic correctly uses `isFailoverError` to distinguish retriable failures.
- No files require special attention
<sub>Last reviewed commit: 717524b</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #9486: feat(heartbeat): support primary/fallbacks model config (by sauerdaniel · 2026-02-05 · 81.1%)
- #23738: feat(fallback): first-class transition visibility + low-noise autom... (by SmithLabsLLC · 2026-02-22 · 78.8%)
- #8390: feat: notify user when fallback model is used (#8182) (by Glucksberg · 2026-02-04 · 77.2%)
- #23318: feat(agents): emit model:fallback hook event for fallback visibilit... (by anillBhoi · 2026-02-22 · 76.1%)
- #21963: fix(cli): models fallbacks add now includes primary model in allowlist (by ashiabbott · 2026-02-20 · 76.1%)
- #20275: fix(cli): include primary model in allowlist when adding fallbacks (by MFS-code · 2026-02-18 · 75.7%)
- #19252: fix(agents): continue model fallback on failover text payloads (by mahsumaktas · 2026-02-17 · 75.3%)
- #9429: fix: skip session model override for heartbeat runs (by dbottme · 2026-02-05 · 75.2%)
- #21615: fix(tui): preserve main session model during heartbeat model override (by lailoo · 2026-02-20 · 74.1%)
- #22064: fix(failover): bypass models allowlist for configured fallback models (by winston-bepresent · 2026-02-20 · 74.1%)