#20428: feat: capture Anthropic rate-limit response headers to disk

by AndrewArto open 2026-02-18 22:20

agents size: L

## Problem

For users on **Claude Max** (subscription via OAuth / Claude Code), there is **no programmatic API** to query remaining usage quota. The Anthropic `/usage` page shows session limits, weekly caps, and extra-usage budget, but this data is not available via any API endpoint.

The **only machine-readable signal** about rate limits comes from HTTP response headers that Anthropic returns on every API call:

```
anthropic-ratelimit-unified-limit
anthropic-ratelimit-unified-remaining
anthropic-ratelimit-unified-reset
anthropic-ratelimit-unified-tokens-limit
anthropic-ratelimit-unified-tokens-remaining
anthropic-ratelimit-unified-tokens-reset
```

These headers are currently **discarded**: the Anthropic Node SDK does not surface response headers to callers, and pi-ai (the streaming layer) does not expose them either.

## Solution

Add a new module (`anthropic-ratelimit.ts`) that captures these headers using a scoped `fetch` wrapper:

1. **Before** each Anthropic streaming API call, a lightweight fetch hook is installed
2. The hook intercepts the HTTP response and extracts any `anthropic-ratelimit-*` / `retry-after` / `x-ratelimit-*` headers
3. **After** the stream ends (done/error/break), the hook is uninstalled
4. Captured headers are atomically written to `<stateDir>/anthropic-ratelimit.json`

### Output format

```json
{
  "ts": "2025-02-18T14:30:00.000Z",
  "headers": {
    "anthropic-ratelimit-unified-limit": "1000",
    "anthropic-ratelimit-unified-remaining": "750",
    "anthropic-ratelimit-unified-reset": "2025-02-18T15:00:00Z",
    "anthropic-ratelimit-unified-tokens-limit": "100000",
    "anthropic-ratelimit-unified-tokens-remaining": "85000",
    "anthropic-ratelimit-unified-tokens-reset": "2025-02-18T14:35:00Z"
  },
  "sessionKey": "main:telegram:...",
  "modelId": "claude-opus-4-6"
}
```

### Design choices

- **Always active** for Anthropic models: no env-var opt-in. The overhead is minimal (one `JSON.stringify` + one atomic file write per API call)
- **Scoped fetch wrapper**: only installed for the duration of a single streaming call, not globally
- **Separate from payload logging**: `OPENCLAW_ANTHROPIC_PAYLOAD_LOG` controls detailed request/response logging; rate-limit capture is independent
- **Atomic writes**: uses write-to-temp + rename to prevent corrupt reads
- **Read helper**: `readRatelimitSnapshot()` exported for internal use

## Use cases

- **Budget dashboards**: menu-bar apps or CLI tools can read `anthropic-ratelimit.json` to show remaining quota
- **Monitoring**: alert when approaching rate limits before hitting them
- **Usage tracking**: external tools can correlate rate-limit data with session JSONL token counts
- **Auto-throttling**: future OpenClaw features could use this to pace requests

## Changes

| File | Change |
|------|--------|
| `src/agents/anthropic-ratelimit.ts` | New module: fetch hook, snapshot writer, reader |
| `src/agents/anthropic-ratelimit.test.ts` | 5 unit tests |
| `src/agents/anthropic-payload-log.ts` | Export `createRatelimitStreamWrapper` + import |
| `src/agents/pi-embedded-runner/run/attempt.ts` | Wire up ratelimit wrapper for all Anthropic calls |

## Testing

- 5 unit tests covering: header capture, non-Anthropic URL filtering, fetch restoration, no-header graceful handling, idempotent install/uninstall
- Full TypeScript compilation passes with zero errors
- Existing test suite unaffected

## References

- [claude-code#19385](https://github.com/anthropics/claude-code/issues/19385): no programmatic access to ratelimit data
- [Anthropic rate limits docs](https://docs.anthropic.com/en/api/rate-limits): header format specification

<h3>Greptile Summary</h3>

Adds a new `anthropic-ratelimit.ts` module that captures Anthropic rate-limit response headers (`anthropic-ratelimit-*`) by temporarily wrapping `globalThis.fetch` during streaming API calls, and writes snapshots to `<stateDir>/anthropic-ratelimit.json` for external consumption by dashboards and CLI tools. The wrapper is wired into `attempt.ts` unconditionally for all Anthropic model calls.

- **`isAnthropicUrl` false positives**: The URL check `href.includes("anthropic")` matches third-party Anthropic-compatible providers (MiniMax, Xiaomi, Cloudflare AI Gateway) configured in `models-config.providers.ts` that use `api: "anthropic-messages"`. Their rate-limit headers would overwrite the snapshot with non-Anthropic data.
- **Concurrent session race condition**: The `globalThis.fetch` wrapping has no reference counting or stack mechanism. When multiple sessions stream concurrently (possible across different lanes), install/uninstall ordering can corrupt the fetch chain, leaving stale hooks installed or silently removing active hooks.
- **Synchronous file I/O on hot path**: Uses `writeFileSync`/`renameSync` inside the async fetch wrapper, unlike the existing payload logger which uses an async `QueuedFileWriter`. This blocks the event loop on every API call.

<h3>Confidence Score: 2/5</h3>

- This PR has two logical issues (URL matching and concurrency) that should be addressed before merging.
- Score of 2 reflects: (1) the `isAnthropicUrl` function will match third-party Anthropic-compatible providers that exist in this codebase, writing incorrect rate-limit data; (2) the `globalThis.fetch` wrapping is not safe under concurrent sessions, which can corrupt the fetch chain; (3) synchronous file I/O is a less critical but notable deviation from established patterns. The core feature concept is sound but the implementation needs fixes.
- `src/agents/anthropic-ratelimit.ts` needs the most attention: both the URL matching logic and the fetch wrapping concurrency model need to be fixed.

<sub>Last reviewed commit: d8d8664</sub>
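
One possible way to address the `isAnthropicUrl` false positives noted in the review is to compare the parsed hostname against an allowlist instead of substring-matching the full URL. The sketch below is illustrative only: the names `isAnthropicApiUrl`, `pickRatelimitHeaders`, and `ANTHROPIC_HOSTS` are hypothetical and not taken from the PR, and the single-host allowlist (`api.anthropic.com`) is an assumption. The header filter accepts anything iterable over name/value pairs, such as a `Headers` object or a `Map`.

```typescript
// Hypothetical helpers, not the PR's actual code.
// ASSUMPTION: only this host should count as "Anthropic" for snapshot purposes.
const ANTHROPIC_HOSTS = new Set(["api.anthropic.com"]);

function isAnthropicApiUrl(input: string | URL): boolean {
  try {
    const url = typeof input === "string" ? new URL(input) : input;
    // Match on the exact hostname, so "anthropic" in a path or in a
    // third-party gateway URL does not qualify.
    return url.protocol === "https:" && ANTHROPIC_HOSTS.has(url.hostname);
  } catch {
    // Relative or malformed URLs cannot be Anthropic API calls.
    return false;
  }
}

// Header families named in the PR description.
const CAPTURED_PREFIXES = ["anthropic-ratelimit-", "x-ratelimit-"];

function pickRatelimitHeaders(
  headers: Iterable<[string, string]>,
): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const [rawName, value] of headers) {
    const name = rawName.toLowerCase();
    const wanted =
      name === "retry-after" ||
      CAPTURED_PREFIXES.some((prefix) => name.startsWith(prefix));
    if (wanted) {
      picked[name] = value;
    }
  }
  return picked;
}
```

With this check, a compatible-provider URL such as `https://gateway.example.com/anthropic/v1/messages` (a made-up address) would be rejected even though it contains the string `anthropic`.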
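
The concurrency concern in the review (no reference counting around the `globalThis.fetch` wrapper) could be approached with a single shared hook that is installed when the first session registers and removed when the last one releases. This is a minimal sketch under that assumption, not the PR's implementation; `installRatelimitHook` and `HeaderListener` are hypothetical names.

```typescript
// Hypothetical ref-counted hook; names and shape are assumptions.
type FetchFn = typeof globalThis.fetch;
type HeaderListener = (url: string, headers: Headers) => void;

let originalFetch: FetchFn | undefined;
let hookedFetch: FetchFn | undefined;
let activeUsers = 0;
// Each registration gets its own entry object, so registering the same
// function twice still yields two independent registrations.
const entries = new Set<{ listener: HeaderListener }>();

function installRatelimitHook(listener: HeaderListener): () => void {
  const entry = { listener };
  entries.add(entry);

  if (activeUsers++ === 0) {
    // First user: wrap fetch exactly once.
    const base = globalThis.fetch;
    originalFetch = base;
    hookedFetch = async (input, init) => {
      const response = await base(input, init);
      const url =
        typeof input === "string"
          ? input
          : input instanceof URL
            ? input.href
            : input.url;
      for (const e of entries) {
        try {
          e.listener(url, response.headers);
        } catch {
          // A failing listener must never break the API call itself.
        }
      }
      return response;
    };
    globalThis.fetch = hookedFetch;
  }

  let released = false;
  return () => {
    if (released) return; // release is idempotent
    released = true;
    entries.delete(entry);
    if (--activeUsers === 0) {
      // Last user: restore only if nobody wrapped fetch on top of us.
      if (globalThis.fetch === hookedFetch && originalFetch) {
        globalThis.fetch = originalFetch;
      }
      originalFetch = undefined;
      hookedFetch = undefined;
    }
  };
}
```

Each streaming call would hold on to the release function it receives and call it when the stream ends, so the order in which concurrent sessions finish no longer matters.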
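
For the dashboard and monitoring use cases, a consumer might turn a parsed snapshot into a "fraction remaining" figure. The sketch below assumes the snapshot shape shown under "Output format"; the `RatelimitSnapshot` interface and `remainingFraction` helper are hypothetical and not part of the PR.

```typescript
// Hypothetical consumer-side helper; mirrors the JSON shape in "Output format".
interface RatelimitSnapshot {
  ts: string;
  headers: Record<string, string>;
  sessionKey?: string;
  modelId?: string;
}

// Returns remaining/limit clamped to [0, 1], or undefined when the
// relevant headers are absent or not numeric.
function remainingFraction(
  snapshot: RatelimitSnapshot,
  kind: "unified" | "tokens" = "unified",
): number | undefined {
  const prefix =
    kind === "tokens"
      ? "anthropic-ratelimit-unified-tokens-"
      : "anthropic-ratelimit-unified-";
  const limit = Number(snapshot.headers[`${prefix}limit`]);
  const remaining = Number(snapshot.headers[`${prefix}remaining`]);
  if (!Number.isFinite(limit) || !Number.isFinite(remaining) || limit <= 0) {
    return undefined;
  }
  return Math.min(1, Math.max(0, remaining / limit));
}
```

Applied to the example snapshot in the PR description, this yields 0.75 for the unified limit (750 of 1000) and 0.85 for tokens (85000 of 100000).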