#12487: fix(agents): strip orphaned tool_result when tool_use is sanitized on retry

by skylarkoo7 open 2026-02-09 09:14

agents stale

## Summary

When the Anthropic API rejects a request with a "Cloud Code Assist format error", the retry path calls `sanitizeToolCallInputs()` to strip malformed `tool_use` blocks (those missing `input`/`arguments`). However, the corresponding `tool_result` messages, which reference the now-removed `tool_use` IDs, were left in the transcript. Every subsequent API call is then permanently rejected with:

```
unexpected tool_use_id found in tool_result blocks: <id>. Each tool_result block must have a corresponding tool_use block in the previous message.
```

The only recovery was `/new` or `/reset`.

## Changes

- **`src/agents/session-transcript-repair.ts`**: `repairToolCallInputs()` now tracks the IDs of stripped `tool_use` blocks and runs a second pass to remove any `tool_result` messages that reference those IDs
- **`src/agents/pi-embedded-runner/google.ts`**: `sanitizeSessionHistory()` now always calls `sanitizeToolUseResultPairing()` after `repairToolCallInputs()` when tool calls were dropped, even if the provider policy does not normally require pairing repair. This is a safety net to catch any edge cases the first pass might miss.
- **`src/agents/session-transcript-repair.test.ts`**: Two new test cases:
  - Verifies that orphaned `tool_result` messages are removed when their `tool_use` is stripped (partial strip)
  - Verifies that all `tool_result` messages are removed when the entire assistant message is dropped (full strip)

## Test plan

- All 8 existing + new tests in `session-transcript-repair.test.ts` pass
- After a Cloud Code Assist format error, retries no longer accumulate orphaned `tool_result` blocks
- Sessions no longer become permanently corrupted after format error retries
- Normal tool_use/tool_result pairing is unaffected (existing tests verify this)

Fixes #12392

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR fixes transcript corruption on retry after a tool-call format error by ensuring that when malformed `tool_use` / `toolCall` blocks are stripped, any corresponding `tool_result` messages that reference the removed IDs are also removed. Concretely, `repairToolCallInputs()` now tracks stripped tool-call IDs and runs a second pass to drop matching `toolResult` entries, preventing strict providers from rejecting subsequent requests due to “unexpected tool_use_id”.

It also strengthens the Google embedded runner’s session sanitization by running tool-use/tool-result pairing repair whenever any tool calls were dropped, even if the provider policy wouldn’t normally require pairing repair, acting as a safety net for edge cases.

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk.
- Changes are narrowly scoped to transcript repair/sanitization logic, add targeted tests for the reported corruption scenario, and do not introduce new behavioral complexity beyond removing invalid tool_result entries that would otherwise make strict providers reject requests.
- No files require special attention
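For readers without the diff open, the two-pass repair described under Changes can be sketched roughly as below. This is an illustration only, not the code in `session-transcript-repair.ts`: the message shapes, the `toolResult` role and `toolUseId` field, and the name `repairSketch` are all assumptions made for the example, and the real transcript types may differ.

```typescript
// Hypothetical, simplified message shapes. The actual transcript types in the
// repository are not shown in this PR description and may look different.
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input?: unknown };
type TextBlock = { type: "text"; text: string };
type AssistantMessage = { role: "assistant"; content: Array<ToolUseBlock | TextBlock> };
type ToolResultMessage = { role: "toolResult"; toolUseId: string; content: string };
type UserMessage = { role: "user"; content: string };
type Message = AssistantMessage | ToolResultMessage | UserMessage;

interface RepairOutcome {
  messages: Message[];
  droppedToolUseIds: string[];
}

// Illustrative stand-in for the behavior the PR describes; not the real function.
function repairSketch(messages: Message[]): RepairOutcome {
  const dropped = new Set<string>();

  // Pass 1: strip tool_use blocks that have no input, remembering their IDs.
  const firstPass: Message[] = [];
  for (const msg of messages) {
    if (msg.role !== "assistant") {
      firstPass.push(msg);
      continue;
    }
    const kept = msg.content.filter((block) => {
      if (block.type === "tool_use" && block.input === undefined) {
        dropped.add(block.id);
        return false;
      }
      return true;
    });
    // If every block was stripped, drop the whole assistant message.
    if (kept.length > 0) {
      firstPass.push({ ...msg, content: kept });
    }
  }

  // Pass 2: remove tool results that reference a stripped tool_use ID, so no
  // orphaned result is sent back to the provider.
  const secondPass = firstPass.filter(
    (msg) => !(msg.role === "toolResult" && dropped.has(msg.toolUseId)),
  );

  return { messages: secondPass, droppedToolUseIds: Array.from(dropped) };
}
```

Returning the dropped IDs alongside the repaired messages is one way a caller could decide whether to run an additional pairing-repair pass, in the spirit of the safety net described for `sanitizeSessionHistory()`; whether the real function exposes this information in the same form is not stated here.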