#19407: fix(agents): strip thinking blocks on cross-provider model switch (#19295)

by lailoo open 2026-02-17 19:26 View on GitHub →

agents size: M experienced-contributor

## Summary - **Bug**: Anthropic thinking blocks survive session history sanitization when failing over to a different model provider, causing 400 errors. - **Root cause**: `sanitizeSessionHistory` in `src/agents/pi-embedded-runner/google.ts` had `downgradeOpenAIReasoningBlocks` for OpenAI reasoning blocks but no equivalent stripping for Anthropic `type: "thinking"` blocks on model change. - **Fix**: Add `stripNonNativeThinkingBlocks` that removes all `type: "thinking"` blocks from assistant messages, wired into the sanitization pipeline when `modelChanged` is true. Fixes #19295 ## Problem When the primary model (e.g. Anthropic Claude) rate-limits and the system fails over to a fallback provider (e.g. OpenAI), Anthropic-specific `type: "thinking"` blocks remain in the session history. The fallback provider rejects these with 400 errors because it doesn't understand unsigned thinking blocks. The existing `downgradeOpenAIReasoningBlocks` only handles OpenAI `rs_`-signed reasoning blocks. There was no equivalent for Anthropic thinking blocks. **Before fix:** ``` Input: Session history with Anthropic thinking blocks + model switch to OpenAI Output: 400 error — thinking blocks sent to OpenAI as-is ``` ## Changes - `src/agents/pi-embedded-helpers/openai.ts` — Add `stripNonNativeThinkingBlocks` function that strips all `type: "thinking"` blocks from assistant messages, dropping empty assistant messages entirely. - `src/agents/pi-embedded-helpers.ts` — Export the new function from the barrel. - `src/agents/pi-embedded-runner/google.ts` — Wire `stripNonNativeThinkingBlocks` into `sanitizeSessionHistory` when `modelChanged` is true, after the existing OpenAI reasoning block downgrade. **After fix:** ``` Input: Session history with Anthropic thinking blocks + model switch to OpenAI Output: Thinking blocks stripped, only text/tool_use blocks remain — no 400 error ``` ## Test plan - [x] New tests: 5 unit tests for `stripNonNativeThinkingBlocks` (strip, drop empty, preserve non-assistant, preserve no-thinking, preserve string content) - [x] New tests: 3 integration tests for `sanitizeSessionHistory` (Anthropic→OpenAI strip, round-trip Anthropic→OpenAI→Anthropic strip, same-model preserve) - [x] All 8 regression tests pass - [x] Verified with real Anthropic API (haiku with extended thinking) — thinking blocks produced and correctly stripped on model switch - [x] Lint passes (0 errors) ## Effect on User Experience **Before:** When the primary Anthropic model rate-limits and the system fails over to OpenAI (or another provider), users see 400 errors and the agent session breaks because stale thinking blocks are rejected by the fallback provider. **After:** Model failover works seamlessly — thinking blocks from the previous provider are automatically stripped from session history before sending to the new provider.  <h3>Greptile Summary</h3> This PR fixes a bug (#19295) where Anthropic `type: "thinking"` blocks survived session history sanitization during model provider failover (e.g. Anthropic → OpenAI), causing 400 errors from the fallback provider. The fix adds a `stripNonNativeThinkingBlocks` helper that removes all `type: "thinking"` blocks from assistant messages, wired into `sanitizeSessionHistory` in `google.ts` when `modelChanged` is true — complementing the existing `downgradeOpenAIReasoningBlocks` (which handles a different but related OpenAI Responses API edge case on same-provider reuse). - The core fix logic in `google.ts` is correct: both return paths are updated to use `sanitizedThinkingCrossProvider`, and the guard condition (`modelChanged`) is appropriate. - The new function is correctly placed after `downgradeOpenAIReasoningBlocks` in the pipeline; since `downgradeOpenAIReasoningBlocks` only runs for `openai-responses` / `openai-codex-responses` APIs, and `stripNonNativeThinkingBlocks` runs for any cross-provider switch, they handle non-overlapping cases cleanly. - Test coverage is solid for the stated scenarios. The round-trip test (Anthropic → OpenAI → Anthropic) uses the same static `history` object on every call rather than the progressively built history, but this still exercises the `modelChanged` guard correctly. - One style issue: the docstring for `stripNonNativeThinkingBlocks` claims it "strips blocks that do NOT carry an OpenAI reasoning signature" (implying `rs_`-signed blocks are preserved), but the implementation strips **all** `type: "thinking"` blocks unconditionally — there is no signature check. The function name also suggests selective behavior ("NonNative") that isn't implemented. The behavior is correct given the call-site guard (`modelChanged`), but the docstring is misleading for future readers. <h3>Confidence Score: 4/5</h3> - This PR is safe to merge; the fix correctly addresses the 400-error regression during cross-provider failover with minimal blast radius. - The logic is correct and well-tested. The only issue is a misleading docstring and function name that don't reflect the actual (unconditional) stripping behavior, which could cause confusion for future contributors but doesn't affect runtime correctness. The call-site guard (`modelChanged`) ensures the function is only applied in the appropriate cross-provider context. - src/agents/pi-embedded-helpers/openai.ts — the docstring for `stripNonNativeThinkingBlocks` should be updated to accurately describe the unconditional stripping behavior. <sub>Last reviewed commit: 6793915</sub>