#19483: feat: enhance rate-limiting with provider-specific backoff and OpenRouter prompt caching

by nickf1227 open 2026-02-17 21:18 View on GitHub →

agents size: L

This PR introduces two significant improvements to OpenClaw's provider handling, motivated by high-load production scenarios (TrueNAS middleware benchmarking/vSinister orchestrator). ### 1. **Provider-Specific Retry Logic** Optimized backoff and retry strategies for Anthropic and OpenRouter. - **Cloudflare Protection:** Detects 1020 'Access Denied' blocks and applies a mandatory 60s minimum backoff. - **Provider Heuristics:** Properly parses `retry_after` from Anthropic and OpenRouter payloads. ### 2. **OpenRouter Prompt Caching** Automatic injection of `cache_control` headers for supported models (Gemini, Claude) via OpenRouter. - **Efficiency Gains:** Verified locally to achieve **100% cache efficiency** in long-running agent sessions. - **Cost/Latency Reduction:** Reduces processing overhead by caching system prompts and 90% of user history. ### Production Success Story Prior to these changes, heavy context 'churning' (vSinister project) resulted in frequent 429 Rate Limit cooldowns and 'API noise.' **After applying these patches, we achieved a 100% cache hit rate on Gemini-3-Flash, eliminating rate-limit failures even under deep context load.**