#23318: feat(agents): emit model:fallback hook event for fallback visibility (#22805)

by anillBhoi open 2026-02-22 06:15

commands agents size: S

Cluster: Model Fallbacks and Rate Limiting

## Problem

When OpenClaw fell back to a secondary model due to rate limits, timeouts, or other failures, it did so silently. Users had no visibility into when fallbacks occurred, which model was used, or why the primary failed.

## Solution

Implements Option B from the feature request: a structured hook event.

### New hook event: `model:fallback`

Fires whenever a model fallback attempt fails. Payload includes:

- `primaryModel`: the model that failed
- `fallbackModel`: next candidate being tried
- `reason`: rate_limit, timeout, etc.
- `attempt` / `total`: position in fallback chain
- `error`: full error message
- `label`: which runner triggered it

### Changes

- Added `"model"` to `InternalHookEventType`
- Added `ModelFallbackHookContext` and `ModelFallbackHookEvent` types
- Added `createModelFallbackLogger()` helper in `model-fallback.ts`
- Wired as `onError` at all 5 call sites:
  - `agent-runner-execution.ts`
  - `agent-runner-memory.ts`
  - `followup-runner.ts`
  - `commands/agent.ts`
  - `cron/isolated-agent/run.ts`

## Tests

- 11 existing probe/fallback tests pass with no regressions

## Closes

Closes #22805

<h3>Greptile Summary</h3>

This PR adds a `model:fallback` hook event that fires whenever a model fallback attempt fails, providing visibility into fallback behavior. The implementation correctly wires the hook into all 5 runner call sites and includes structured context about the failure.

Key changes:

- Added `"model"` to `InternalHookEventType` enum
- Created `ModelFallbackHookContext` and `ModelFallbackHookEvent` types
- Implemented `createModelFallbackLogger()` helper in `model-fallback.ts`
- Wired hook to all 5 runners: agent-runner-execution, agent-runner-memory, followup-runner, agent command, and cron isolated-agent

The implementation is sound with proper error handling and follows existing hook patterns. One minor documentation inconsistency was noted regarding the `fallbackModel` field.
<h3>Confidence Score: 4/5</h3>

- This PR is safe to merge with minimal risk
- The implementation follows existing patterns, adds proper type definitions, and doesn't modify any core logic - only adds observability. The hook is called after errors are already handled and logged, so it cannot break existing behavior. The PR states 11 existing tests pass with no regressions. Minor style suggestion regarding type documentation consistency, but no functional issues.
- No files require special attention

<sub>Last reviewed commit: 592fbda</sub>
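For readers who want a concrete picture of the payload described in this PR, here is a minimal, self-contained sketch of how a `model:fallback` event could be built and consumed. It is not the PR's code. The interface names, the `type`/`action` split, the in-memory registry (`onModelFallback`), the factory `createFallbackLoggerSketch`, and the shape of its `failure` argument are all hypothetical; only the payload field names come from the description above.

```typescript
// Hypothetical shapes inferred from the payload fields listed in the PR
// description. The real ModelFallbackHookContext may differ.
interface ModelFallbackContext {
  primaryModel: string;              // the model that failed
  fallbackModel: string | undefined; // next candidate, if one remains (assumption)
  reason: string;                    // e.g. "rate_limit", "timeout"
  attempt: number;                   // 1-based position in the fallback chain (assumption)
  total: number;                     // length of the fallback chain
  error: string;                     // full error message
  label: string;                     // which runner triggered the fallback
}

interface ModelFallbackEvent {
  type: "model";      // assumed to mirror the new "model" event type
  action: "fallback"; // assumed split of "model:fallback"
  context: ModelFallbackContext;
}

type HookHandler = (event: ModelFallbackEvent) => void;

// Minimal in-memory registry, a stand-in for the project's hook system.
const handlers: HookHandler[] = [];
function onModelFallback(handler: HookHandler): void {
  handlers.push(handler);
}

// Sketch of a logger factory: returns an onError-style callback that
// emits one event per failed attempt.
function createFallbackLoggerSketch(label: string, candidates: string[]) {
  return (failure: { model: string; reason: string; error: string; attempt: number }): void => {
    const event: ModelFallbackEvent = {
      type: "model",
      action: "fallback",
      context: {
        primaryModel: failure.model,
        // attempt is 1-based, so this index points at the next candidate.
        fallbackModel: candidates[failure.attempt],
        reason: failure.reason,
        attempt: failure.attempt,
        total: candidates.length,
        error: failure.error,
        label,
      },
    };
    for (const handler of handlers) {
      try {
        handler(event);
      } catch {
        // A faulty handler must not interfere with the fallback itself.
      }
    }
  };
}

// Usage: register a handler, then simulate one failed attempt.
const seen: string[] = [];
onModelFallback((event) => {
  const c = event.context;
  seen.push(
    `[${c.label}] ${c.primaryModel} -> ${c.fallbackModel ?? "none"} (${c.reason}, ${c.attempt}/${c.total})`
  );
});

const onError = createFallbackLoggerSketch("followup-runner", ["model-a", "model-b"]);
onError({ model: "model-a", reason: "rate_limit", error: "429 Too Many Requests", attempt: 1 });
```

Swallowing handler exceptions in the sketch reflects the review's point that the hook only adds observability and should not alter fallback behavior. Whether the actual helper does the same is not stated in the PR.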