#3362: fix: auto-repair and retry on orphan tool_result errors
## Problem
The Anthropic API can reject a request with an error like this:
```
LLM request rejected: messages.60.content.1: unexpected tool_use_id found in tool_result blocks: toolu_01KTTwhMaCYW8oiwZMxx6WDt. Each tool_result block must have a corresponding tool_use block in the previous message.
```
This indicates corrupted session history where a `tool_result` references a `tool_use` that doesn't exist in the previous message.
### Root causes
- Race conditions in message processing
- Partial saves during crashes
- History truncation that removes `tool_use` but keeps `tool_result`
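To make the failure mode concrete, here is a minimal sketch of how truncation can leave an orphan behind. The message shapes are simplified from the Anthropic Messages API, and `findOrphanToolResultIds` is a hypothetical helper written for illustration; it is not part of this PR.

```typescript
// Simplified content blocks, modeled on the Anthropic Messages API.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// Hypothetical helper: collects the tool_use_id of every tool_result whose
// matching tool_use is missing from the immediately preceding assistant message.
function findOrphanToolResultIds(messages: Message[]): string[] {
  const orphans: string[] = [];
  messages.forEach((message, index) => {
    const previous = index > 0 ? messages[index - 1] : undefined;
    const availableIds = new Set<string>();
    if (previous && previous.role === "assistant") {
      for (const block of previous.content) {
        if (block.type === "tool_use") availableIds.add(block.id);
      }
    }
    for (const block of message.content) {
      if (block.type === "tool_result" && !availableIds.has(block.tool_use_id)) {
        orphans.push(block.tool_use_id);
      }
    }
  });
  return orphans;
}

// A well-formed exchange: the tool_result answers the tool_use right before it.
const healthy: Message[] = [
  { role: "user", content: [{ type: "text", text: "List the files." }] },
  { role: "assistant", content: [{ type: "tool_use", id: "toolu_A", name: "ls", input: {} }] },
  { role: "user", content: [{ type: "tool_result", tool_use_id: "toolu_A", content: "a.txt" }] },
];

// Simulated truncation: the assistant message is dropped, the tool_result stays.
const truncated: Message[] = healthy.filter((_, index) => index !== 1);

console.log(findOrphanToolResultIds(healthy));   // []
console.log(findOrphanToolResultIds(truncated)); // [ 'toolu_A' ]
```

A history shaped like `truncated` is the kind of transcript the API rejects with the error quoted above.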
## Solution
1. **Add `isOrphanToolResultError()`** - Detects this specific error pattern
2. **Add retry logic in `runEmbeddedAttempt`** - When the error is caught:
   - Repair the transcript using the existing `repairToolUseResultPairing()`
   - Log repair details (orphans dropped, duplicates dropped, synthetic results added)
   - Retry once before failing
This allows sessions to self-recover from corrupted history without requiring a manual session reset.
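The detect, repair, and retry-once flow can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `isOrphanToolResultErrorSketch` and `sendWithOneRepair` are hypothetical names, the regular expression is derived only from the error text quoted above, and the `send` and `repair` callbacks stand in for the prompt call and `repairToolUseResultPairing()`, whose real signatures are not shown here.

```typescript
// Assumption: the pattern is taken from the error message quoted in the Problem
// section. The real isOrphanToolResultError() may match differently.
const ORPHAN_TOOL_RESULT_PATTERN = /unexpected tool_use_id found in tool_result blocks/i;

function isOrphanToolResultErrorSketch(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return ORPHAN_TOOL_RESULT_PATTERN.test(message);
}

// Hypothetical wrapper: `send` stands in for the prompt call and `repair` for
// the transcript repair step. Only thrown errors are inspected.
async function sendWithOneRepair<T>(
  send: () => Promise<T>,
  repair: () => void,
): Promise<T> {
  try {
    return await send();
  } catch (error) {
    // Unrelated failures are rethrown untouched.
    if (!isOrphanToolResultErrorSketch(error)) throw error;
    repair();
    // Single retry: if this call fails too, the error propagates to the caller.
    return await send();
  }
}
```

Limiting the retry to one attempt keeps a transcript that the repair cannot fix from looping indefinitely. The sketch matches on a narrow phrase on purpose, because a broader pattern could trigger repair and retry on unrelated failures.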
## Changes
- `src/agents/pi-embedded-helpers/errors.ts` - Added `isOrphanToolResultError()`
- `src/agents/pi-embedded-helpers.ts` - Exported new function
- `src/agents/pi-embedded-runner/run/attempt.ts` - Added retry logic
- `src/agents/pi-embedded-helpers.isorphantoolresulterror.test.ts` - Test coverage
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds detection and self-healing for a specific Anthropic invalid-request failure where a `tool_result` references a `tool_use_id` not present in the previous message (corrupted transcript). It introduces `isOrphanToolResultError()` in `src/agents/pi-embedded-helpers/errors.ts`, re-exports it via `src/agents/pi-embedded-helpers.ts`, and updates `runEmbeddedAttempt` (`src/agents/pi-embedded-runner/run/attempt.ts`) to repair the active session messages via `repairToolUseResultPairing()` and retry the prompt once when this error is detected. A dedicated vitest file adds pattern-based coverage for the new detector.
Overall this fits the existing embedded runner flow by keeping the repair localized to the prompt execution path and relying on the existing transcript repair utility instead of introducing new transcript mutation logic.
<h3>Confidence Score: 3/5</h3>
- Reasonably safe to merge, but the new detection/repair+retry path may trigger in broader cases than intended and could mask certain prompt failures depending on how `activeSession.prompt()` reports errors.
- Core approach is straightforward and uses an existing repair helper, but `isOrphanToolResultError` has a very broad match and the retry wrapper only captures thrown errors, not error-as-message outcomes. These edge cases could lead to unnecessary transcript mutation/retries or falsely successful attempts.
- src/agents/pi-embedded-runner/run/attempt.ts; src/agents/pi-embedded-helpers/errors.ts
<!-- /greptile_comment -->
## Most Similar PRs
- #12487: fix(agents): strip orphaned tool_result when tool_use is sanitized ... (by skylarkoo7, 2026-02-09, 86.6% similar)
- #4844: fix(agents): skip error/aborted assistant messages in transcript re... (by lailoo, 2026-01-30, 86.6% similar)
- #9416: fix: drop errored/aborted assistant tool pairs in transcript repair (by xandorklein, 2026-02-05, 85.2% similar)
- #8270: fix: support snake_case 'tool_use' in transcript repair (#8264) (by heliosarchitect, 2026-02-03, 84.9% similar)
- #20538: fix: handle orphaned tool_result errors gracefully instead of leaki... (by echoVic, 2026-02-19, 84.6% similar)
- #9085: fix: improve stability for terminated responses and telegram retries (by vladdick88, 2026-02-04, 84.3% similar)
- #7525: Agents: skip errored tool calls during pairing (by justinhuangcode, 2026-02-02, 84.1% similar)
- #21195: fix: suppress orphaned tool_use/tool_result errors after session co... (by ruslansychov-git, 2026-02-19, 84.1% similar)
- #4700: fix: deduplicate tool_use IDs and enable sanitization for Anthropic (by marcelomar21, 2026-01-30, 83.8% similar)
- #15050: fix: transcript corruption resilience — strip aborted tool_use bloc... (by yashchitneni, 2026-02-12, 83.7% similar)