
#14328: fix: strip incomplete tool_use blocks from errored/aborted messages to prevent permanent 400 loops

by Kropiunig open 2026-02-12 00:01
agents size: S
#### Summary

Fixes a critical session-poisoning bug where an interrupted streaming response permanently breaks the session with a 400 error loop.

When a tool call is interrupted mid-stream (network error, timeout, user abort), the assistant message contains incomplete `tool_use` blocks (with `partialJson: true`). The previous fix (#4597) correctly avoided creating synthetic `tool_result` entries for these, but still left the malformed `tool_use` blocks in the transcript. On every subsequent API call, Anthropic rejects the request:

```
400 messages.244.content.1: unexpected tool_use_id found in tool_result blocks: toolu_01PfLjsziXFMs7pAQCtBLn1f
```

The error is baked into the session history, so every message hits the same 400. Only `/new` or `/reset` recovers.

lobster-biscuit

#### Repro Steps

1. Start a long session with many tool calls
2. Have a tool call interrupted mid-stream (network issue, timeout, abort)
3. OpenClaw persists the assistant message with `stopReason: "error"` and incomplete `tool_use` blocks
4. Send any new message -> permanent 400 loop

#### Root Cause

Three interconnected issues:

1. **`repairToolUseResultPairing()` passed errored messages through unchanged** (lines 224-227 before this fix): the incomplete `tool_use` blocks remained in the transcript with no matching `tool_result`, causing the 400
2. **`hasToolCallInput()` didn't detect `partialJson: true`**: blocks flagged as partial (interrupted mid-stream) with an empty `input: {}` passed the completeness check
3. **No user-friendly error** for the specific 400 pattern "unexpected tool_use_id found in tool_result blocks"

#### Behavior Changes

- **Errored/aborted assistant messages**: `tool_use`/`toolCall`/`functionCall` blocks are now stripped from the content array. Text and thinking blocks are preserved (partial reasoning is still valuable). If no content remains after stripping, the entire message is dropped.
- **`partialJson: true` tool calls**: Now detected as incomplete by `hasToolCallInput()` and dropped during the `sanitizeToolCallInputs` pass, even when `input` is present.
- **User-facing error**: If the 400 still reaches the user (e.g. from an older session file), `formatAssistantErrorText()` now returns a clear message instead of raw JSON.

#### Codebase and GitHub Search

- Searched for `partialJson`, `unexpected tool_use_id`, `stopReason.*error`, `tool_result blocks` across the codebase
- Found the existing partial fix from #4597 and understood why it was insufficient
- Verified `sanitizeToolCallInputs` runs before `repairToolUseResultPairing` in the sanitization pipeline (`google.ts:352-354`)
- Confirmed compaction already calls `repairToolUseResultPairing` after dropping chunks (`compaction.ts:343`)

#### Tests

All existing tests updated + 3 new tests added:

- `strips tool_use blocks from errored assistant messages to prevent 400 loops`: verifies tool-only errored message is dropped
- `strips tool_use blocks from aborted assistant messages to prevent 400 loops`: same for aborted
- `preserves text content from errored assistant messages while stripping tool_use`: verifies text/thinking blocks survive
- `drops tool calls with partialJson: true even when input is present`: verifies the `hasToolCallInput()` fix
- `full scenario: interrupted stream does not poison session permanently`: end-to-end test reproducing the exact #14322 scenario through both sanitization passes

```
pnpm vitest run src/agents/session-transcript-repair.test.ts                          # 13/13 pass
pnpm vitest run src/agents/session-tool-result-guard.test.ts                          # 10/10 pass
pnpm vitest run src/agents/pi-embedded-runner.sanitize-session-history.test.ts        # 9/9 pass
pnpm vitest run src/agents/compaction.test.ts                                         # 10/10 pass
pnpm vitest run src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts       # 10/10 pass
pnpm check   # format + typecheck + lint all pass
pnpm build   # clean build
```

**Sign-Off**

- Models used: Claude Opus 4.6
- Submitter effort: Deep codebase analysis, traced the full session history -> context assembly -> API call pipeline, identified three interconnected root causes, implemented multi-layer defense
- Agent notes: AI-assisted PR. The fix is surgical: 3 files, ~160 lines added, all focused on the bug. No unrelated changes.

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

Fixes a session-poisoning bug where interrupted streaming responses (network errors, timeouts, user aborts) permanently break sessions with 400 error loops. The root cause: incomplete `tool_use` blocks with `partialJson: true` were left in the transcript, causing every subsequent API call to be rejected by Anthropic.

- **`hasToolCallInput()`** now detects `partialJson: true` blocks as incomplete, so `sanitizeToolCallInputs` drops them during the first sanitization pass
- **`repairToolUseResultPairing()`** now strips tool call blocks from errored/aborted assistant messages instead of passing them through unchanged; text and thinking content is preserved
- **`formatAssistantErrorText()`** adds a user-friendly error message for the specific "unexpected tool_use_id found in tool_result blocks" 400 pattern
- Tests updated and expanded with 3 new test cases plus an end-to-end scenario reproducing the exact issue

<h3>Confidence Score: 4/5</h3>

- This PR is safe to merge: it fixes a critical session-poisoning bug with well-scoped, defensive changes and thorough test coverage.
- The fix is surgical and well-reasoned: 3 files changed with ~160 lines added, all directly targeting the bug. The multi-layer defense (`sanitizeToolCallInputs` catches `partialJson`, `repairToolUseResultPairing` strips remaining tool blocks from errored messages, `formatAssistantErrorText` provides a fallback user message) is appropriate. Tests cover the key scenarios including an end-to-end reproduction. The only minor gap is the lack of a direct unit test for `isCorruptedToolUsePairingError` and its integration into `formatAssistantErrorText`, though the existing test suite for that function passes.
- No files require special attention. All changes are focused and correct.

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
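For readers without the diff open, the three behavior changes described in this PR can be roughly sketched as below. This is an illustrative sketch only, not the code from the PR: the `Block` and `AssistantMessage` shapes are simplified assumptions, the helper names `hasCompleteInput`, `stripInterruptedToolCalls`, and `isToolPairing400` are hypothetical stand-ins for `hasToolCallInput()`, `repairToolUseResultPairing()`, and the check behind `formatAssistantErrorText()`, and the `"aborted"` stop-reason value is assumed from the wording "errored/aborted".

```typescript
// Hypothetical, simplified shapes: the real OpenClaw types are not shown in this PR description.
interface Block {
  type: string;
  text?: string;
  id?: string;
  name?: string;
  input?: unknown;
  partialJson?: boolean;
}

interface AssistantMessage {
  role: "assistant";
  stopReason?: string;
  content: Block[];
}

// Block types treated as tool calls, per the "Behavior Changes" list.
const TOOL_CALL_TYPES = new Set(["tool_use", "toolCall", "functionCall"]);

// A tool call counts as complete only if it has input AND was not cut off mid-stream.
// Assumption: a block flagged partialJson is incomplete even when `input` is present.
function hasCompleteInput(block: Block): boolean {
  if (block.partialJson === true) return false;
  return block.input !== undefined && block.input !== null;
}

// For errored/aborted messages, drop every tool call block but keep text and thinking.
// Returns null when nothing is left, signalling that the whole message should be dropped.
// Assumption: "aborted" is the stop-reason string used for user aborts.
function stripInterruptedToolCalls(msg: AssistantMessage): AssistantMessage | null {
  const interrupted = msg.stopReason === "error" || msg.stopReason === "aborted";
  if (!interrupted) return msg;
  const kept = msg.content.filter((b) => !TOOL_CALL_TYPES.has(b.type));
  return kept.length > 0 ? { ...msg, content: kept } : null;
}

// Recognise the specific 400 so a caller can show a readable message instead of raw JSON.
function isToolPairing400(raw: string): boolean {
  return /unexpected tool_use_id found in tool_result blocks/i.test(raw);
}
```

In this sketch, an errored message holding one text block and one partial `tool_use` block comes back with only the text block, a tool-only errored message comes back as `null`, and messages with any other stop reason pass through untouched.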
