#13658: fix: silent model failover with fallback notification

by taw0002 open 2026-02-10 20:26

stale

Cluster: Model Fallbacks and Rate Limiting

## Problem

When model failover occurs (e.g., Anthropic billing/rate limit → OpenAI fallback), the user sees the raw billing error message as a standalone reply, even when the fallback model succeeds and completes the request normally. This creates a confusing UX where users get scary billing warnings followed by a normal response.

Example of current behavior:

> ⚠️ API provider returned a billing error — your API key has run out of credits or has an insufficient balance. Check your provider's billing dashboard and top up or switch to a different API key.

...immediately followed by a normal response from the fallback model.

## Solution

### 1. Subtle fallback notification on success

When the primary model fails and a fallback model succeeds, prepend a brief note instead of showing the raw error:

> 🔄 Fell back to openai/gpt-5.2 (primary model unavailable).

This tells the user what happened without being alarming. The actual response follows normally.

### 2. Cleaner error messages when ALL models fail

When every configured model fails, the error messages are now classified:

- **Billing errors**: `⚠️ All configured models failed due to billing limits. Check your provider billing dashboards and top up credits.`
- **Rate limit errors**: `⚠️ All configured models are currently rate-limited. Please wait a moment and try again.`
- **Other errors**: Unchanged (`⚠️ Agent failed before reply: <message>`)

## Changes

- `src/auto-reply/reply/agent-runner.ts`: Detect provider mismatch after successful fallback, prepend subtle fallback notification
- `src/auto-reply/reply/agent-runner-execution.ts`: Classify billing/rate-limit errors in the catch-all block for cleaner user-facing messages

## Testing

- All 45 existing agent-runner tests pass
- All 19 model-fallback tests pass
- Full typecheck + lint pass (`pnpm check`)

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR improves UX for model failover scenarios by adding subtle success notifications and clearer error messages.

**Changes:**

- When primary model fails but fallback succeeds, users now see a brief `🔄 Fell back to <provider>/<model>` note instead of scary billing warnings
- When all models fail, error messages are now classified by type (billing limits vs rate limits vs other) for clarity
- The fallback notification is prepended to the response payload only when the provider actually changed from the original

**Key improvements:**

- Eliminates confusing UX where billing errors appeared alongside successful responses
- Provides clearer guidance when failures occur (billing dashboard vs wait-and-retry)
- Maintains existing error handling for context overflow and role ordering conflicts

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk
- The changes are well-scoped improvements to user-facing messages with no modifications to core logic. The fallback detection uses existing variables that are already set by the model fallback mechanism, and the error classification leverages existing helper functions. All tests pass (45 agent-runner tests + 19 model-fallback tests), and the changes align with the repository's code style guidelines.
- No files require special attention
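
For readers who want a concrete picture of the two behaviors described under Solution, here is a minimal TypeScript sketch. It is not the code from `agent-runner.ts` or `agent-runner-execution.ts`: the names `ModelRef`, `classifyFailure`, `withFallbackNotice`, and `userFacingFailure` are hypothetical, and the keyword-based classification is an assumption, since the PR description only says it reuses existing helper functions without showing them. Only the user-facing message strings are taken from the description.

```typescript
// Hypothetical sketch; none of these names come from the PR's source files.
type ModelRef = { provider: string; model: string };
type FailureKind = "billing" | "rate_limit" | "other";

// Assumption: classify by keywords in the provider error message.
// The PR relies on existing helpers whose logic is not shown.
function classifyFailure(message: string): FailureKind {
  const text = message.toLowerCase();
  if (/billing|insufficient balance|out of credits/.test(text)) return "billing";
  if (/rate.?limit|too many requests/.test(text)) return "rate_limit";
  return "other";
}

// Prepend the fallback note only when the provider that answered
// differs from the provider that was originally requested.
function withFallbackNotice(requested: ModelRef, used: ModelRef, reply: string): string {
  if (requested.provider === used.provider) return reply;
  const note = `🔄 Fell back to ${used.provider}/${used.model} (primary model unavailable).`;
  return `${note}\n\n${reply}`;
}

// Message shown when every configured model has failed.
function userFacingFailure(message: string): string {
  switch (classifyFailure(message)) {
    case "billing":
      return "⚠️ All configured models failed due to billing limits. Check your provider billing dashboards and top up credits.";
    case "rate_limit":
      return "⚠️ All configured models are currently rate-limited. Please wait a moment and try again.";
    default:
      return `⚠️ Agent failed before reply: ${message}`;
  }
}

// Usage: the primary provider failed and a different provider answered.
const primary: ModelRef = { provider: "anthropic", model: "primary-model" }; // placeholder model name
const fallback: ModelRef = { provider: "openai", model: "gpt-5.2" };
console.log(withFallbackNotice(primary, fallback, "Here is your answer."));
console.log(userFacingFailure("429 Too Many Requests"));
```

Comparing providers rather than full model identifiers mirrors the summary's statement that the note appears only when the provider actually changed; whether a same-provider model switch should also be announced is left open here.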