
#19655: Fix/context window provider keying

by pharasyte · open · 2026-02-18 02:02
Labels: gateway, commands, agents · size: S
## Summary

- **Problem:** The model context-window cache used bare model IDs as keys, so the same model ID from multiple providers (e.g. `claude-sonnet-4-5` via both Anthropic and a custom provider) would collide, causing one provider's configured limit to silently overwrite another's.
- **Why it matters:** An incorrect token budget causes premature context compaction (budget too small) or context overflow errors (budget too large), with no warning surfaced to the user. This affected any deployment using custom providers or OpenRouter alongside native providers with overlapping model IDs.
- **What changed:** Cache keys are now `provider/modelId` (two-tier lookup: qualified key first, bare fallback). The provider is threaded through to all `lookupContextTokens` call sites. A `_lookupFromCache` pure helper was extracted for testability. Tests were updated to match the new key format and to cover the two-tier logic.
- **What did NOT change:** The fallback to bare model-ID keys is preserved for callers without provider context. `DEFAULT_CONTEXT_TOKENS` fallback behavior is unchanged. No config schema changes, no API surface changes, no behavior change for single-provider deployments.

## Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Related #19608

## User-visible / Behavior Changes

None for single-provider deployments. For multi-provider deployments with overlapping model IDs, context window sizing now correctly uses the per-provider configured value rather than whichever provider happened to write to the cache last.

## Security Impact (required)

- New permissions/capabilities? `No`
- Secrets/tokens handling changed? `No`
- New/changed network calls? `No`
- Command/tool execution surface changed? `No`
- Data access scope changed? `No`

## Repro + Verification

### Environment

- OS: All (logic is platform-independent)
- Runtime/container: Node 22+
- Model/provider: Any deployment with multiple providers sharing a model ID (e.g. custom provider + Anthropic both exposing `claude-sonnet-4-5`)
- Integration/channel: Any
- Relevant config (redacted): `models.json` with a provider entry overriding `contextWindow` for a model also discoverable from another provider

### Steps

1. Configure a custom provider in `models.json` with `claude-sonnet-4-5` and `contextWindow: 65536`
2. Also enable the native Anthropic provider (which discovers the same model at 200k)
3. Start an agent session on the custom provider
4. Observe the context window used for budget calculations (previously: 200k or 64k depending on load order; now: 64k as configured)

### Expected

- The custom provider's 64k limit is used for sessions on that provider
- Anthropic's 200k limit is used for sessions on Anthropic

### Actual (before fix)

- A cache collision caused one provider's limit to overwrite the other's, depending on load order

## Evidence

### Config

```json
"models": {
  "mode": "merge",
  "providers": {
    "jabberwocky-anthropic": {
      "baseUrl": "https://api.anthropic.com",
      "apiKey": "",
      "api": "anthropic-messages",
      "models": [
        {
          "id": "claude-sonnet-4-5",
          "name": "Claude Sonnet 4.5 (96k limit)",
          "reasoning": true,
          "input": ["text", "image"],
          "cost": { "input": 3, "output": 15, "cacheRead": 0.3, "cacheWrite": 3.75 },
          "contextWindow": 96000,
          "maxTokens": 8192
        }
      ]
    },
    "main-anthropic": {
      "baseUrl": "https://api.anthropic.com",
      "apiKey": "",
      "api": "anthropic-messages",
      "models": [
        {
          "id": "claude-sonnet-4-5",
          "name": "Claude Sonnet 4.5 (200k)",
          "reasoning": true,
          "input": ["text", "image"],
          "cost": { "input": 3, "output": 15, "cacheRead": 0.3, "cacheWrite": 3.75 },
          "contextWindow": 200000,
          "maxTokens": 8192
        }
      ]
    },
    "butterfly-anthropic": {
      "baseUrl": "https://api.anthropic.com",
      "apiKey": "",
      "api": "anthropic-messages",
      "models": [
        {
          "id": "claude-haiku-4-5",
          "name": "Claude Haiku 4.5 (64k limit)",
          "reasoning": false,
          "input": ["text", "image"],
          "cost": { "input": 0.8, "output": 4, "cacheRead": 0.08, "cacheWrite": 1 },
          "contextWindow": 64000,
          "maxTokens": 8192
        }
      ]
    }
  }
}
```

### Gateway status (before)

```
agent:main:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 39m ago | model claude-sonnet-4-5 | tokens 69k/200k (131k left, 35%) | flags: system, id:31f9dda9-35f6-49e5-b21b-adcf0faffff7
agent:jabberwocky:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 2h ago | model claude-sonnet-4-5 | tokens 12k/200k (188k left, 6%) | flags: system, id:36b50386-2cb9-493d-a3da-b72a2ef36c55
agent:butterfly:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 2h ago | model claude-haiku-4-5 | tokens 16k/200k (184k left, 8%) | flags: system, id:c443e762-f5ca-48f4-8292-78483de08f78
```

### Gateway status (after)

```
agent:main:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 58m ago | model claude-sonnet-4-5 | tokens 69k/200k (131k left, 35%) | flags: system, id:31f9dda9-35f6-49e5-b21b-adcf0faffff7
agent:jabberwocky:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 2h ago | model claude-sonnet-4-5 | tokens 12k/96k (84k left, 12%) | flags: system, id:36b50386-2cb9-493d-a3da-b72a2ef36c55
agent:butterfly:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 2h ago | model claude-haiku-4-5 | tokens 16k/64k (48k left, 25%) | flags: system, id:c443e762-f5ca-48f4-8292-78483de08f78
```

## Human Verification (required)

- Verified scenarios: the updated function calls behave correctly in unit tests.
- Context window size is correctly reported by `status` commands and used correctly for compaction.
- *Note*: There is a separate issue with the tag in the agent's system prompt that tells it its context window size. This is probably not a big deal, and it appears to originate upstream rather than in the OpenClaw code, possibly specific to the providers.
- *Also Note*: There is another issue that causes the model name displayed in the TUI to be incorrect. I have not done any troubleshooting on it, but it is not directly related to this fix.

## Compatibility / Migration

- Backward compatible? `Yes`
  - *Note*: The change requires a reset of the agent's session and a hard reboot to take effect.
- Config/env changes? `No`
- Migration needed? `No`

## Failure Recovery (if this breaks)

- How to disable/revert this change quickly: revert commits `560c3fc8`, `a39eaf1f`, `9fa5d93a`, `4eae006d`, `5d0a3961` (or revert the PR merge commit)
- Known bad symptoms to watch for: the context window unexpectedly reverting to `DEFAULT_CONTEXT_TOKENS` for a model that was previously resolved correctly (this would indicate a provider string is not being passed through at a call site)

### Files/config to restore

**Core:**
- context.ts
- context.test.ts

**Auto-reply (8 files):**
- agent-runner.ts
- agent-runner-memory.ts
- directive-handling.persist.ts
- followup-runner.ts
- get-reply-directives.ts
- memory-flush.ts
- model-selection.ts
- status.ts

**Commands (3 files):**
- session-store.ts
- sessions.ts
- status.summary.ts

**Gateway/Cron (2 files):**
- session-utils.ts
- run.ts

## Risks and Mitigations

- Risk: a call site that uses `lookupContextTokens` from agents/context.ts was not updated to pass the new params argument:

```typescript
// Old:
export function lookupContextTokens(modelId: string | undefined): number | undefined
// New:
export function lookupContextTokens(params: { provider?: string; modelId?: string }): number | undefined
```

I have made a concerted effort to locate and update all call sites. In testing on my live gateway server I saw no errors, so I believe they are all covered.
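For reviewers, here is a minimal sketch of the two-tier lookup described above. The cache contents and helper bodies are illustrative assumptions; only the names `lookupContextTokens`, `_lookupFromCache`, and `DEFAULT_CONTEXT_TOKENS` come from this PR, and the real implementation (including how the cache is populated) lives in context.ts.

```typescript
// Illustrative sketch only: cache keyed by "provider/modelId", with bare
// "modelId" entries (e.g. from model discovery) retained as a fallback tier.
const DEFAULT_CONTEXT_TOKENS = 200_000;

const contextTokensCache = new Map<string, number>([
  ["jabberwocky-anthropic/claude-sonnet-4-5", 96_000],
  ["main-anthropic/claude-sonnet-4-5", 200_000],
  ["claude-sonnet-4-5", 200_000], // bare key written by discovery
]);

// Pure helper extracted for testability: try the qualified key first,
// then fall back to the bare model ID for callers without provider context.
function _lookupFromCache(
  cache: Map<string, number>,
  params: { provider?: string; modelId?: string },
): number | undefined {
  const { provider, modelId } = params;
  if (!modelId) return undefined;
  if (provider) {
    const qualified = cache.get(`${provider}/${modelId}`);
    if (qualified !== undefined) return qualified;
  }
  return cache.get(modelId);
}

function lookupContextTokens(params: {
  provider?: string;
  modelId?: string;
}): number | undefined {
  return _lookupFromCache(contextTokensCache, params);
}

// Qualified key wins when the provider is known:
console.log(lookupContextTokens({ provider: "jabberwocky-anthropic", modelId: "claude-sonnet-4-5" })); // 96000
// Bare fallback when no provider is passed:
console.log(lookupContextTokens({ modelId: "claude-sonnet-4-5" })); // 200000
// Callers still apply DEFAULT_CONTEXT_TOKENS when nothing resolves (unchanged behavior):
console.log(lookupContextTokens({ modelId: "unknown-model" }) ?? DEFAULT_CONTEXT_TOKENS); // 200000
```

Note the ordering is what prevents the collision: a provider-qualified entry can never be shadowed by another provider's write, while bare-key callers keep their pre-PR behavior.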
<h3>Greptile Summary</h3>

Changed context window cache keying from bare model IDs to the `provider/modelId` format, preventing cache collisions when multiple providers expose the same model ID with different context window limits. Threaded the provider parameter through 15 files to all `lookupContextTokens` call sites. Added a two-tier lookup (qualified key first, bare fallback) to maintain compatibility with discovered models. Tests verify the separation logic and fallback behavior.

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk
- Well-structured refactor that solves a real collision bug. All 15 call sites are properly updated to the new signature. Comprehensive test coverage for the two-tier lookup and provider separation. Backward compatible via the bare model-ID fallback. No breaking API changes.
- No files require special attention

<sub>Last reviewed commit: fbbb2b7</sub>
