#14744: fix(context): key MODEL_CACHE by provider/modelId to prevent collision (#14708)

by lailoo open 2026-02-12 15:35

gateway commands agents stale size: S trusted-contributor

## Summary

- **Bug**: `MODEL_CACHE` in `context.ts` keys entries by bare model ID only. When two providers register the same model ID (e.g. `claude-opus-4-6` on both `anthropic` and a proxy), the last-loaded provider silently overwrites the first, returning the wrong context window.
- **Root cause**: `MODEL_CACHE.set(m.id, m.contextWindow)` uses `m.id` as key without provider dimension. `lookupContextTokens(modelId)` has no `provider` parameter, so callers cannot disambiguate.
- **Fix**: Key cache entries by `provider/modelId`, add optional `provider` parameter to `lookupContextTokens`, and pass provider from all 12 call sites where available.

Fixes #14708

## Problem

`context.ts` builds a `MODEL_CACHE` map keyed by bare model ID:

```typescript
MODEL_CACHE.set(m.id, m.contextWindow); // no provider dimension
```

When `discoverModels().getAll()` returns models with the same ID from different providers (e.g. `anthropic/claude-opus-4-6` with 200k and `my-proxy/claude-opus-4-6` with 128k), the second write silently overwrites the first. `lookupContextTokens` only accepts `modelId`, so callers have no way to get the correct value for a specific provider.

This affects session context management (`agent-runner.ts`, `followup-runner.ts`, `cron/run.ts`, etc.): a wrong context window leads to incorrect compaction thresholds and token budgets.

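The collision can be reproduced in isolation. The snippet below is a minimal sketch, not the code in `context.ts`: the `ModelEntry` type, the `discovered` array, and the `bareCache` name are hypothetical stand-ins for whatever `discoverModels().getAll()` returns and for `MODEL_CACHE`.

```typescript
// Hypothetical shape of a discovered model; the real type may differ.
type ModelEntry = { provider: string; id: string; contextWindow: number };

// Sample data mirroring the example above: same model ID, two providers.
const discovered: ModelEntry[] = [
  { provider: "anthropic", id: "claude-opus-4-6", contextWindow: 200_000 },
  { provider: "my-proxy", id: "claude-opus-4-6", contextWindow: 128_000 },
];

// Keyed by bare model ID, as on main.
const bareCache = new Map<string, number>();
for (const m of discovered) {
  bareCache.set(m.id, m.contextWindow); // the second write replaces the first
}

// Only one entry survives, and it belongs to the last-loaded provider.
const collided = bareCache.get("claude-opus-4-6");
```

With this load order, `collided` is `128_000` and `bareCache.size` is `1`; the `anthropic` value is no longer reachable.
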
**Before fix (reproduced on main via integration test):**

```
pnpm vitest run src/agents/context.test.ts

✓ lookupContextTokens returns last-writer-wins for same model ID across providers
  → lookupContextTokens("claude-opus-4-6") = 128_000 (my-proxy's value)
  → anthropic's 200_000 is lost
✓ lookupContextTokens has no provider parameter (API limitation)
  → function.length ≤ 1

Test Files  1 passed (1)
Tests  2 passed (2)
```

## Changes

- `src/agents/context.ts`: Key `MODEL_CACHE` by `provider/modelId`; add bare model ID as first-writer-wins fallback; add optional `provider` param to `lookupContextTokens`
- `src/auto-reply/reply/agent-runner.ts`: Pass `providerUsed` to `lookupContextTokens`
- `src/auto-reply/reply/agent-runner-memory.ts`: Pass `provider` to `resolveMemoryFlushContextWindowTokens`
- `src/auto-reply/reply/directive-handling.persist.ts`: Pass `provider` to `lookupContextTokens`
- `src/auto-reply/reply/followup-runner.ts`: Pass `provider` to `lookupContextTokens`
- `src/auto-reply/reply/memory-flush.ts`: Add optional `provider` param to `resolveMemoryFlushContextWindowTokens`
- `src/auto-reply/reply/model-selection.ts`: Add optional `provider` param to `resolveContextTokens`
- `src/auto-reply/reply/get-reply-directives.ts`: Pass `provider` to `resolveContextTokens`
- `src/auto-reply/status.ts`: Pass `provider` to both `lookupContextTokens` calls
- `src/commands/sessions.ts`: Pass `resolved.provider` where available
- `src/commands/status.summary.ts`: Pass `resolved.provider` where available
- `src/commands/agent/session-store.ts`: Pass `providerUsed` to `lookupContextTokens`
- `src/cron/isolated-agent/run.ts`: Pass `providerUsed` to `lookupContextTokens`
- `src/gateway/session-utils.ts`: Pass `resolved.provider` to `lookupContextTokens`
- `src/agents/context.test.ts`: New regression tests
- `CHANGELOG.md`: Add fix entry

**After fix (verified on fix branch):**

```
pnpm vitest run src/agents/context.test.ts

✓ returns correct context window per provider for same model ID
  → lookupContextTokens("claude-opus-4-6", "anthropic") = 200_000
  → lookupContextTokens("claude-opus-4-6", "my-proxy") = 128_000
✓ bare model ID fallback uses first-writer-wins
  → lookupContextTokens("claude-opus-4-6") = 200_000 (first provider wins)
✓ lookupContextTokens accepts optional provider parameter

Test Files  1 passed (1)
Tests  3 passed (3)
```

## Design decisions

- **Backward compatible**: `provider` param is optional. Callers without provider info still get a result via bare model ID fallback (first-writer-wins).
- **First-writer-wins for bare ID**: When no provider is given, the first provider's value is used. This is better than last-writer-wins because it's deterministic and preserves the "primary" provider's value.
- **Not all call sites have provider**: `sessions.ts` row iteration and `status.summary.ts` session rows don't have provider info. These fall back to bare model ID lookup, which is still better than the old behavior.

## Test plan

- [x] Bug reproduced on main: last-writer-wins collision, no provider param (integration test)
- [x] Fix verified: provider-qualified lookups return correct values per provider
- [x] Fix verified: bare model ID fallback uses first-writer-wins
- [x] Fix verified: backward compatible (calling without provider still works)
- [x] All 10 existing memory-flush tests pass
- [x] TypeScript type check passes (`pnpm tsgo`: no errors in changed files)
- [x] Lint passes (`pnpm lint`)

## Effect on User Experience

**Before:** When two providers register the same model ID with different context windows, the wrong context window is silently used. This causes incorrect compaction thresholds, potentially truncating conversation history too aggressively or not aggressively enough.

**After:** Each provider's context window is correctly preserved. Callers with provider info get the exact value; callers without provider info get a deterministic first-writer-wins fallback.

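The keying scheme described under Changes and Design decisions can be sketched roughly as below. This is an illustration under assumptions, not the diff itself: `ModelEntry` and `loadModels` are hypothetical names, and falling back to the bare ID when a provider-qualified key is missing is a guess about how the optional `provider` parameter behaves.

```typescript
// Hypothetical shape of a discovered model; the real type may differ.
type ModelEntry = { provider: string; id: string; contextWindow: number };

const MODEL_CACHE = new Map<string, number>();

// Hypothetical loader: writes a provider-qualified key for every model,
// plus a bare-ID key only if none exists yet (first-writer-wins).
function loadModels(models: ModelEntry[]): void {
  for (const m of models) {
    MODEL_CACHE.set(`${m.provider}/${m.id}`, m.contextWindow);
    if (!MODEL_CACHE.has(m.id)) {
      MODEL_CACHE.set(m.id, m.contextWindow);
    }
  }
}

// `provider` is optional, so existing single-argument callers keep working.
function lookupContextTokens(modelId?: string, provider?: string): number | undefined {
  if (!modelId) return undefined;
  if (provider) {
    const qualified = MODEL_CACHE.get(`${provider}/${modelId}`);
    if (qualified !== undefined) return qualified;
  }
  // Assumption: an unknown or missing provider falls back to the bare ID.
  return MODEL_CACHE.get(modelId);
}

loadModels([
  { provider: "anthropic", id: "claude-opus-4-6", contextWindow: 200_000 },
  { provider: "my-proxy", id: "claude-opus-4-6", contextWindow: 128_000 },
]);

const viaAnthropic = lookupContextTokens("claude-opus-4-6", "anthropic");
const viaProxy = lookupContextTokens("claude-opus-4-6", "my-proxy");
const bareLookup = lookupContextTokens("claude-opus-4-6");
```

Under these assumptions the three lookups yield `200_000`, `128_000`, and `200_000` respectively, matching the "After fix" test output. Note that the bare-ID result depends on load order, so "first" here means first in whatever order the models are discovered.
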
<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This change fixes collisions in `MODEL_CACHE` by qualifying cache keys with `provider/modelId` and extends `lookupContextTokens` to accept an optional `provider` so callers can request the correct context window when different providers expose the same model ID. Call sites across auto-reply/session/cron/gateway paths are updated to pass provider when available, and a new regression test is added to cover provider-qualified lookups and the bare-ID fallback behavior.

<h3>Confidence Score: 4/5</h3>

- This PR is mostly safe to merge, but the new regression test is likely to be flaky due to timing-based initialization.
- Core change is localized (cache keying + optional provider parameter) and call sites were updated consistently. The main concern is `src/agents/context.test.ts` using a fixed 50ms sleep to wait for async module initialization, which can fail nondeterministically in CI and block merges.
- src/agents/context.test.ts
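
One possible way to address the fixed-sleep concern is to await a readiness promise instead of a timer. The sketch below assumes the module could expose such a promise; `cacheReady`, `cache`, and `lookupWhenReady` are hypothetical names and do not exist in `context.ts` as far as this PR shows.

```typescript
// Hypothetical stand-in for a cache that is filled asynchronously on import.
const cache = new Map<string, number>();

// Hypothetical: a promise that resolves once the asynchronous load has finished.
const cacheReady: Promise<void> = (async () => {
  await Promise.resolve(); // stands in for async model discovery
  cache.set("anthropic/claude-opus-4-6", 200_000);
})();

// Awaiting the promise is deterministic; no fixed delay is involved.
async function lookupWhenReady(key: string): Promise<number | undefined> {
  await cacheReady;
  return cache.get(key);
}
```

A test would then `await lookupWhenReady(...)` (or await the readiness promise directly) before asserting, so it no longer depends on how long initialization takes on a given CI machine.
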