#8256: feat: Add rate limit strategy configuration

by revenuestack open 2026-02-03 20:11

agents stale

Cluster: Model Fallbacks and Rate Limiting

Add configurable rate limit handling that allows users to choose between:

- "switch": Immediately try fallback model (default, existing behavior)
- "wait": Parse Retry-After header and wait, then retry same model
- "ask": Prompt user to choose between wait or switch

This addresses issues where HTTP 429 errors don't trigger optimal recovery. The feature parses Retry-After headers from various AI providers (OpenAI, Anthropic) and respects configurable max wait times (default: 60s).

Configuration:

```json
{
  "agents": {
    "defaults": {
      "rateLimitStrategy": {
        "strategy": "wait",
        "maxWaitSeconds": 60,
        "backupModel": "openai/gpt-4"
      }
    }
  }
}
```

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds a configurable rate-limit handling layer to the model fallback path. It introduces a `rate-limit-handler` module that detects 429/402 (and message-based patterns), extracts retry timing from provider-specific headers (Retry-After, OpenAI x-ratelimit-reset-*, Anthropic *-reset), and decides whether to switch models, wait and retry once, or ask the caller for a decision.

Config types + zod schema + UI hints are extended under `agents.defaults.rateLimitStrategy`. The main integration is in `runWithModelFallback`, which now optionally sleeps and retries the same provider/model once on rate limit depending on the configured strategy, otherwise proceeding through the existing candidate fallback list.

<h3>Confidence Score: 3/5</h3>

- Mostly safe to merge, but one key config knob appears non-functional and a couple of edge cases can cause surprising behavior.
- Core logic is straightforward and covered by unit tests for parsing/decision-making, but the `backupModel` path isn’t wired into candidate selection (so the feature as documented won’t work), and rate-limit detection/headers extraction may misbehave for some real error shapes.
- src/agents/model-fallback.ts, src/agents/rate-limit-handler.ts
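
For readers who want a concrete picture of the three strategies, the following is a minimal sketch of how a Retry-After value might be parsed and turned into a switch/wait/ask decision. It is not the PR's `rate-limit-handler` code: `parseRetryAfterMs`, `decide`, `RateLimitConfig`, and `Decision` are hypothetical names, and the choice to fall back to switching when the hint is missing or exceeds `maxWaitSeconds` is an assumption, since the description only says the max wait is respected.

```typescript
// Illustrative sketch only. Every name below is hypothetical and is not taken
// from the PR's rate-limit-handler module.
type RateLimitStrategy = "switch" | "wait" | "ask";

interface RateLimitConfig {
  strategy: RateLimitStrategy;
  maxWaitSeconds: number; // the PR description gives 60 as the default
}

type Decision =
  | { action: "switch" }
  | { action: "wait"; waitMs: number }
  | { action: "ask"; suggestedWaitMs: number | undefined };

/**
 * Parse a Retry-After header value into milliseconds.
 * Handles the two standard forms: a number of seconds, or an HTTP-date.
 * Returns undefined when the header is absent, empty, or rejected by Date.parse.
 */
function parseRetryAfterMs(
  value: string | undefined,
  nowMs: number = Date.now(),
): number | undefined {
  if (value === undefined) return undefined;
  const trimmed = value.trim();
  if (trimmed === "") return undefined;

  // Numeric form. Fractional seconds are accepted as a leniency (assumption).
  if (/^\d+(\.\d+)?$/.test(trimmed)) {
    return Math.round(Number(trimmed) * 1000);
  }

  // HTTP-date form: wait until that instant, never a negative duration.
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs);
}

/** Map the configured strategy and the parsed hint to a recovery action. */
function decide(config: RateLimitConfig, retryAfterMs: number | undefined): Decision {
  const maxWaitMs = config.maxWaitSeconds * 1000;
  switch (config.strategy) {
    case "switch":
      return { action: "switch" };
    case "ask":
      // Hand the hint to the caller so it can show the user how long a wait would be.
      return { action: "ask", suggestedWaitMs: retryAfterMs };
    case "wait":
      // Assumption: no usable hint, or a hint above the cap, falls back to switching.
      if (retryAfterMs === undefined || retryAfterMs > maxWaitMs) {
        return { action: "switch" };
      }
      return { action: "wait", waitMs: retryAfterMs };
  }
}
```

With `{ strategy: "wait", maxWaitSeconds: 60 }`, a `Retry-After: 30` header would lead to a 30-second wait before retrying the same model, while `Retry-After: 120` would exceed the cap and, under the assumption above, move on to the fallback list.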