#19655: Fix/context window provider keying
**Labels:** gateway, commands, agents, size: S
**Cluster:** Model Configuration Fixes
## Summary
- **Problem:** The model context-window cache used bare model IDs as keys, so the same model ID from multiple providers (e.g. `claude-sonnet-4-5` via both Anthropic and a custom provider) would collide, causing one provider's configured limit to silently overwrite another's.
- **Why it matters:** Incorrect token budgets cause premature context compaction (too-small) or context overflow errors (too-large), with no warning surfaced to the user. Affected any deployment using custom providers or OpenRouter alongside native providers with overlapping model IDs.
- **What changed:** Cache keys are now `provider/modelId` (two-tier lookup: qualified first, bare fallback). Provider is threaded through to all `lookupContextTokens` call sites. A `_lookupFromCache` pure helper was extracted for testability. Tests were updated to match new key format and cover the two-tier logic.
- **What did NOT change:** The fallback to bare model-ID keys is preserved for callers without provider context. `DEFAULT_CONTEXT_TOKENS` fallback behavior is unchanged. No config schema changes, no API surface changes, no behavior change for single-provider deployments.
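The two-tier lookup described above can be sketched as follows. This is a minimal illustration: the `_lookupFromCache` and `MODEL_CACHE` names follow this description, the cache contents are hypothetical, and the real helper in `agents/context.ts` may differ in shape.

```typescript
// Sketch of the two-tier cache lookup (illustrative, not the actual code).
const MODEL_CACHE = new Map<string, number>([
  // Provider-qualified keys (new format):
  ["jabberwocky-anthropic/claude-sonnet-4-5", 96000],
  ["main-anthropic/claude-sonnet-4-5", 200000],
  // Bare key, preserved for callers without provider context:
  ["claude-sonnet-4-5", 200000],
]);

function _lookupFromCache(params: {
  provider?: string;
  modelId?: string;
}): number | undefined {
  const { provider, modelId } = params;
  if (!modelId) return undefined;
  // Tier 1: the provider-qualified key wins when a provider is known.
  if (provider) {
    const qualified = MODEL_CACHE.get(`${provider}/${modelId}`);
    if (qualified !== undefined) return qualified;
  }
  // Tier 2: fall back to the bare model-ID key.
  return MODEL_CACHE.get(modelId);
}
```

Note that an unknown provider falls through to the bare key, which is what keeps single-provider deployments unchanged.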
## Change Type (select all)
- [x] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Related #19608
## User-visible / Behavior Changes
None for single-provider deployments. For multi-provider deployments with overlapping model IDs: context window sizing will now correctly use the per-provider configured value rather than whichever provider happened to write to the cache last.
## Security Impact (required)
- New permissions/capabilities? `No`
- Secrets/tokens handling changed? `No`
- New/changed network calls? `No`
- Command/tool execution surface changed? `No`
- Data access scope changed? `No`
## Repro + Verification
### Environment
- OS: All (logic is platform-independent)
- Runtime/container: Node 22+
- Model/provider: Any deployment with multiple providers sharing a model ID (e.g. custom provider + Anthropic both exposing `claude-sonnet-4-5`)
- Integration/channel: Any
- Relevant config (redacted): `models.json` with a provider entry overriding `contextWindow` for a model also discoverable from another provider
### Steps
1. Configure a custom provider in `models.json` with `claude-sonnet-4-5` and `contextWindow: 65536`
2. Also have native Anthropic provider enabled (discovers the same model at 200k)
3. Start an agent session on the custom provider
4. Observe the context window used for budget calculations (previously: 200k or 64k depending on load order; now: 64k as configured)
### Expected
- Custom provider's 64k limit is used for sessions on that provider
- Anthropic's 200k limit is used for sessions on Anthropic
### Actual (before fix)
- Cache collision caused one provider's limit to overwrite the other's depending on load order
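The pre-fix failure mode amounts to last-writer-wins on a cache keyed by bare model ID, as in this minimal sketch (values are illustrative):

```typescript
// Pre-fix behavior: keys are bare model IDs, so whichever provider
// populates the cache last silently wins for every provider.
const cache = new Map<string, number>();

cache.set("claude-sonnet-4-5", 200000); // main-anthropic loads first
cache.set("claude-sonnet-4-5", 96000);  // jabberwocky-anthropic overwrites it

// Every session now budgets against 96000, regardless of provider.
console.log(cache.get("claude-sonnet-4-5")); // 96000
```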
## Evidence
### Config
"models": {
"mode": "merge",
"providers": {
"jabberwocky-anthropic": {
"baseUrl": "https://api.anthropic.com",
"apiKey": "",
"api": "anthropic-messages",
"models": [
{
"id": "claude-sonnet-4-5",
"name": "Claude Sonnet 4.5 (96k limit)",
"reasoning": true,
"input": [
"text",
"image"
],
"cost": {
"input": 3,
"output": 15,
"cacheRead": 0.3,
"cacheWrite": 3.75
},
"contextWindow": 96000,
"maxTokens": 8192
}
]
},
"main-anthropic": {
"baseUrl": "https://api.anthropic.com",
"apiKey": "",
"api": "anthropic-messages",
"models": [
{
"id": "claude-sonnet-4-5",
"name": "Claude Sonnet 4.5 (200k)",
"reasoning": true,
"input": [
"text",
"image"
],
"cost": {
"input": 3,
"output": 15,
"cacheRead": 0.3,
"cacheWrite": 3.75
},
"contextWindow": 200000,
"maxTokens": 8192
}
]
},
"butterfly-anthropic": {
"baseUrl": "https://api.anthropic.com",
"apiKey": "",
"api": "anthropic-messages",
"models": [
{
"id": "claude-haiku-4-5",
"name": "Claude Haiku 4.5 (64k limit)",
"reasoning": false,
"input": [
"text",
"image"
],
"cost": {
"input": 0.8,
"output": 4,
"cacheRead": 0.08,
"cacheWrite": 1
},
"contextWindow": 64000,
"maxTokens": 8192
}
]
}
}
}
### Gateway status (before)
```text
- agent:main:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 39m ago | model claude-sonnet-4-5 | tokens 69k/200k (131k left, 35%) | flags: system, id:31f9dda9-35f6-49e5-b21b-adcf0faffff7
- agent:jabberwocky:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 2h ago | model claude-sonnet-4-5 | tokens 12k/200k (188k left, 6%) | flags: system, id:36b50386-2cb9-493d-a3da-b72a2ef36c55
- agent:butterfly:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 2h ago | model claude-haiku-4-5 | tokens 16k/200k (184k left, 8%) | flags: system, id:c443e762-f5ca-48f4-8292-78483de08f78
```
### Gateway status (after)
```text
- agent:main:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 58m ago | model claude-sonnet-4-5 | tokens 69k/200k (131k left, 35%) | flags: system, id:31f9dda9-35f6-49e5-b21b-adcf0faffff7
- agent:jabberwocky:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 2h ago | model claude-sonnet-4-5 | tokens 12k/96k (84k left, 12%) | flags: system, id:36b50386-2cb9-493d-a3da-b72a2ef36c55
- agent:butterfly:matrix:channel:!uqnevfivnomrmhtcsp:example.com [group] | 2h ago | model claude-haiku-4-5 | tokens 16k/64k (48k left, 25%) | flags: system, id:c443e762-f5ca-48f4-8292-78483de08f78
```
## Human Verification (required)
- Verified scenarios: unit tests pass and exercise the updated lookup paths; live-gateway testing showed calls resolving correctly.
- Context window size is correctly reported by `status` commands and used correctly for compaction.
- *Note*: There is a separate issue with the tag in the agent's system prompt that reports its context window size. It appears to originate upstream (possibly provider-specific) rather than in OpenClaw code, and is likely low impact.
- *Also note*: There is another issue where the model name displayed in the TUI is incorrect. It has not been triaged, but it is not directly related to this fix.
## Compatibility / Migration
- Backward compatible? `Yes`
- *Note*: The change requires resetting the agent's session and fully restarting the gateway to take effect.
- Config/env changes? `No`
- Migration needed? `No`
## Failure Recovery (if this breaks)
- How to disable/revert this change quickly: Revert commits `560c3fc8`, `a39eaf1f`, `9fa5d93a`, `4eae006d`, `5d0a3961` (or revert the PR merge commit)
- Known bad symptoms to watch for: Context window unexpectedly reverting to `DEFAULT_CONTEXT_TOKENS` for a model that was previously being resolved correctly (would indicate a provider string is not being passed through at a call site)
### Files/config to restore:
**Core:**
- context.ts
- context.test.ts
**Auto-reply (8 files):**
- agent-runner.ts
- agent-runner-memory.ts
- directive-handling.persist.ts
- followup-runner.ts
- get-reply-directives.ts
- memory-flush.ts
- model-selection.ts
- status.ts
**Commands (3 files):**
- session-store.ts
- sessions.ts
- status.summary.ts
**Gateway/Cron (2 files):**
- session-utils.ts
- run.ts
## Risks and Mitigations
- Risk: a call site using `lookupContextTokens` from `agents/context.ts` may have been missed and still passes the old positional argument:
```typescript
// Old:
export function lookupContextTokens(modelId: string | undefined): number | undefined
// New:
export function lookupContextTokens(params: {
provider?: string;
modelId?: string;
}): number | undefined
```
I have made a concerted effort to locate and update every call site, and testing on my live gateway surfaced no errors, so I believe they are all covered. Because the parameter changed from a positional string to an object, any missed call site should also fail TypeScript compilation (assuming strict type checking in CI) rather than misbehave silently at runtime.
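A rough audit for stale call sites can be done with `grep`. This is illustrative only: the file below is a stand-in for the real sources, and the pattern merely approximates the old positional-argument shape (it flags any call whose first argument is not an object literal).

```shell
# Create a stand-in file with one old-shape and one new-shape call.
cat > /tmp/example.ts <<'EOF'
lookupContextTokens(modelId);               // old shape: should be flagged
lookupContextTokens({ provider, modelId }); // new shape: ok
EOF

# Flag calls whose first argument is not an object literal.
grep -n 'lookupContextTokens([^{]' /tmp/example.ts
```

Only the line-1 (old-shape) call is printed; the object-literal call on line 2 is not matched.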
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Changed context window cache keying from bare model IDs to `provider/modelId` format, preventing cache collisions when multiple providers expose the same model ID with different context window limits. Threaded provider parameter through 15 files to all `lookupContextTokens` call sites. Added two-tier lookup (qualified key first, bare fallback) to maintain compatibility with discovered models. Tests verify the separation logic and fallback behavior.
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk
- Well-structured refactor that solves a real collision bug. All 15 call sites properly updated to new signature. Comprehensive test coverage for two-tier lookup and provider separation. Backward compatible via bare model-id fallback. No breaking API changes.
- No files require special attention
<sub>Last reviewed commit: fbbb2b7</sub>
<!-- /greptile_comment -->