#9011: fix(session): auto-recovery for corrupted tool responses [AI-assisted]
agents
size: L
Cluster: Error Handling in Agent Tools
## Summary
Fixes #8946: the session should auto-recover when a corrupted tool response makes the history invalid.
### Problem
When a tool response is corrupted (JSON parse error, truncated output, invalid UTF-8), the session history becomes invalid and the entire session fails.
### Solution
1. **Pre-persist validation** - Validate tool responses before storing them:
   - JSON structure validation
   - UTF-8 encoding validation
2. **Graceful degradation** - When validation fails:
   - Store a sanitized placeholder: `[Tool output corrupted - original truncated]`
   - Log the original output to `~/.openclaw/debug/corrupted-tool-results/`
   - Continue the session with a warning
3. **Integration** - Applied in `session-tool-result-guard-wrapper`:
   - Before plugin hooks (ensures plugins get valid input)
   - After plugin transformations (catches plugin-introduced corruption)
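
For reviewers, here is a minimal sketch of what the validation and placeholder steps could look like. It is not the code in `session-tool-result-validation.ts`: the names `validateToolOutput`, `hasLoneSurrogate`, and `ValidationResult` are hypothetical, and the debug-log write is left out. Only the placeholder string is taken from this description.

```typescript
// Illustrative sketch only; all names below are assumptions, not the PR's actual API.
const CORRUPTED_PLACEHOLDER = "[Tool output corrupted - original truncated]";

type ValidationResult =
  | { ok: true; value: unknown }
  | { ok: false; value: string; reason: string };

// A lone surrogate code unit cannot be encoded as well-formed UTF-8.
function hasLoneSurrogate(text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code >= 0xd800 && code <= 0xdbff) {
      const next = text.charCodeAt(i + 1); // NaN past the end, which fails the range check
      if (!(next >= 0xdc00 && next <= 0xdfff)) return true;
      i++; // skip the matching low surrogate
    } else if (code >= 0xdc00 && code <= 0xdfff) {
      return true; // low surrogate with no preceding high surrogate
    }
  }
  return false;
}

function corrupted(reason: string): ValidationResult {
  return { ok: false, value: CORRUPTED_PLACEHOLDER, reason };
}

function validateToolOutput(output: unknown): ValidationResult {
  const problems: string[] = [];
  let serialized: string | undefined;
  try {
    // The replacer visits every key and value, so strings are checked in the same pass.
    serialized = JSON.stringify(output, (key: string, value: unknown) => {
      if (hasLoneSurrogate(key)) problems.push("malformed key");
      if (typeof value === "string" && hasLoneSurrogate(value)) {
        problems.push("malformed string value");
      }
      return value;
    });
  } catch {
    // JSON.stringify throws on circular references and BigInt values.
    return corrupted("not JSON-serializable");
  }
  if (serialized === undefined) return corrupted("no JSON representation");
  if (problems.length > 0) return corrupted(problems[0]);
  return { ok: true, value: output };
}
```

A caller would store `result.value` in either case, and on `ok: false` additionally write the original output to the debug directory and emit a warning.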
### Files Changed
- `src/agents/session-tool-result-validation.ts` (new)
- `src/agents/session-tool-result-validation.test.ts` (new) - 12 tests
- `src/agents/session-tool-result-guard-wrapper.ts` (modified)
- `src/agents/session-tool-result-guard.validation-integration.test.ts` (new) - 5 tests
### Impact
- Sessions survive corrupted tool responses instead of failing
- Corrupted outputs logged for investigation
- Conversation context preserved
### AI Disclosure
- [x] AI-assisted (Claude Code via Clawdbot)
- [x] Tests included
- [x] I understand what the code does
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds a pre-persist validation/sanitization layer for `toolResult` messages to keep sessions running when tool outputs are corrupted (non-JSON-serializable structures, invalid UTF-8, etc.). It introduces `validateAndSanitizeToolResult()` plus unit + integration tests, and wires the validator into `guardSessionManager` so tool results are validated before persistence, then again after `tool_result_persist` plugin hooks.
It also updates the Telegram monitor to enforce single-instance startup via in-memory instance state (running/starting + debounce) and adds tests for duplicate start prevention/debounce behavior.
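
The two validation passes around the `tool_result_persist` hooks, as described in this summary, could be sketched roughly as follows. This is an illustration under assumptions: `ToolResult`, `PersistHook`, `sanitize`, and `persistGuarded` are hypothetical names, and the real wiring in `guardSessionManager` may differ.

```typescript
// Illustrative sketch only; the types and function names are assumptions.
type ToolResult = { toolName: string; content: unknown };
type PersistHook = (result: ToolResult) => ToolResult;

const PLACEHOLDER = "[Tool output corrupted - original truncated]";

// Simplified check: content must have a JSON representation.
function sanitize(result: ToolResult): ToolResult {
  try {
    if (JSON.stringify(result.content) !== undefined) return result;
  } catch {
    // Circular structures and BigInt values end up here.
  }
  return { ...result, content: PLACEHOLDER };
}

function persistGuarded(
  result: ToolResult,
  hooks: PersistHook[],
  persist: (result: ToolResult) => void,
): ToolResult {
  // Pass 1: plugins only ever see valid input.
  let current = sanitize(result);
  for (const hook of hooks) {
    current = hook(current);
  }
  // Pass 2: catch corruption introduced by a plugin transformation.
  current = sanitize(current);
  persist(current);
  return current;
}
```

Running the same check twice is cheap compared with a failed session, and it keeps the persisted history valid regardless of which side introduced the corruption.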
<h3>Confidence Score: 3/5</h3>
- This PR is close to mergeable but has a real runtime handler-leak bug that should be fixed first.
- Most changes are additive and well-tested, and the tool-result sanitization approach is straightforward. However, `monitorTelegramProvider` currently installs a process-level unhandled-rejection handler and can return early (debounce/duplicate) without unregistering it, which will accumulate handlers and duplicate logging. There’s also some test env leakage and a potential debug-log overwrite edge case.
- src/telegram/monitor.ts; src/agents/session-tool-result-guard.validation-integration.test.ts; src/agents/session-tool-result-validation.ts
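
One way to avoid the handler leak described above is to decide whether to start before registering anything, and to unregister in a `finally` block. The sketch below is not the code in `src/telegram/monitor.ts`: `createMonitorStarter`, `RejectionSource`, and the injected `log` and `nowMs` parameters are hypothetical, chosen so the pattern can be exercised without a real process object.

```typescript
// Illustrative sketch only; all names are assumptions, not the monitor's actual API.
type RejectionHandler = (reason: unknown) => void;

// Minimal shape of an event source with paired on/off registration.
interface RejectionSource {
  on(event: "unhandledRejection", handler: RejectionHandler): unknown;
  off(event: "unhandledRejection", handler: RejectionHandler): unknown;
}

function createMonitorStarter(
  source: RejectionSource,
  debounceMs: number,
  log: (message: string) => void,
) {
  // In-memory instance state: running flag plus last start time for debounce.
  let running = false;
  let lastStartMs = Number.NEGATIVE_INFINITY;

  return async function start(
    run: () => Promise<void>,
    nowMs: number = Date.now(),
  ): Promise<"started" | "skipped"> {
    // Early returns happen BEFORE the handler is installed, so nothing can leak.
    if (running || nowMs - lastStartMs < debounceMs) return "skipped";
    running = true;
    lastStartMs = nowMs;

    const onRejection: RejectionHandler = (reason) => {
      log(`unhandled rejection: ${String(reason)}`);
    };
    source.on("unhandledRejection", onRejection);
    try {
      await run();
      return "started";
    } finally {
      // Always unregister, whether run() resolved or threw.
      source.off("unhandledRejection", onRejection);
      running = false;
    }
  };
}
```

In Node, an object like `process` would presumably be passed as `source`; with this ordering, duplicate or debounced calls never touch the handler list.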
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #3647: fix: sanitize tool arguments in session history (by nhangen · 2026-01-29, 85.4%)
- #15050: fix: transcript corruption resilience — strip aborted tool_use bloc... (by yashchitneni · 2026-02-12, 84.6%)
- #12487: fix(agents): strip orphaned tool_result when tool_use is sanitized ... (by skylarkoo7 · 2026-02-09, 83.2%)
- #15649: fix: sanitize tool_use IDs on session write path (by aldoeliacim · 2026-02-13, 83.1%)
- #14328: fix: strip incomplete tool_use blocks from errored/aborted messages... (by Kropiunig · 2026-02-12, 82.7%)
- #19094: Fix empty tool_call_id and function names in provider transcript pa... (by yxshee · 2026-02-17, 82.4%)
- #4844: fix(agents): skip error/aborted assistant messages in transcript re... (by lailoo · 2026-01-30, 82.3%)
- #19024: fix: Fix normalise toolid (by chetaniitbhilai · 2026-02-17, 82.3%)
- #21195: fix: suppress orphaned tool_use/tool_result errors after session co... (by ruslansychov-git · 2026-02-19, 82.3%)
- #3622: fix(agents): drop orphan tool results (by mickobizzle · 2026-01-28, 81.9%)