#14024: feat(agents): add structured tool reflection for error recovery
agents
stale
Cluster: Error Handling in Agent Tools
## Summary
Implements the **"Reflect, then Call"** pattern for tool error recovery, inspired by [Failure Makes the Agent Stronger (arxiv:2509.18847)](https://arxiv.org/abs/2509.18847).
When a tool call fails, this module replaces blind retries with structured diagnostic context: it classifies the failure type, suggests corrective actions, and tracks repeated failures to prevent infinite loops.
## What it does
### Error Classification (`classifyToolError`)
Categorizes tool errors into 11 types:
- `permission_denied`, `not_found`, `invalid_params`, `timeout`, `rate_limit`
- `format_error`, `size_limit`, `connection_error`, `auth_error`, `conflict`, `unknown`
Uses regex-based pattern matching against common error messages and codes (ENOENT, EACCES, ETIMEDOUT, etc.).
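As a rough illustration of the approach, the sketch below matches an error message against an ordered rule table and falls back to `unknown`. The function name `classifyToolErrorSketch`, the rule table, and the specific patterns are assumptions made for this example; the actual patterns in `tool-reflection.ts` may differ, and only a subset of the 11 categories is covered here.

```typescript
// Hypothetical sketch: names and patterns are illustrative, not the PR's actual code.
type ToolErrorCategory =
  | "permission_denied" | "not_found" | "invalid_params" | "timeout" | "rate_limit"
  | "format_error" | "size_limit" | "connection_error" | "auth_error" | "conflict"
  | "unknown";

// Ordered rules: the first pattern that matches wins, so order encodes priority.
const CATEGORY_RULES: ReadonlyArray<{ category: ToolErrorCategory; pattern: RegExp }> = [
  { category: "not_found", pattern: /\bENOENT\b|no such file|not found|\b404\b/i },
  { category: "permission_denied", pattern: /\bEACCES\b|\bEPERM\b|permission denied/i },
  { category: "timeout", pattern: /\bETIMEDOUT\b|timed? ?out/i },
  { category: "rate_limit", pattern: /rate limit|too many requests|\b429\b/i },
  { category: "auth_error", pattern: /unauthorized|\b401\b|invalid (api )?key/i },
  { category: "connection_error", pattern: /\bECONNREFUSED\b|\bECONNRESET\b/i },
];

function classifyToolErrorSketch(message: string): ToolErrorCategory {
  for (const rule of CATEGORY_RULES) {
    // The patterns carry no `g` flag, so `test` keeps no state between calls.
    if (rule.pattern.test(message)) return rule.category;
  }
  return "unknown";
}
```

Keeping the rules in an ordered array makes the priority between overlapping patterns explicit, and anything unmatched degrades to `unknown` instead of throwing.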
### Failure Tracking (`ToolFailureTracker`)
Lightweight in-memory tracker per session that:
- Fingerprints failures by `toolName:category`
- Detects repeated error patterns within a configurable time window
- Escalates suggestions from the third occurrence of a pattern onward ("STOP and try a fundamentally different approach")
- Auto-evicts old records to stay bounded
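The tracker behaviour listed above could be sketched roughly as follows. The class name `FailureTrackerSketch`, the method names, the default window, and the eviction policy (drop the least recently failed fingerprint) are assumptions for illustration and not taken from the PR.

```typescript
// Hypothetical sketch of a bounded, per-session failure tracker.
class FailureTrackerSketch {
  // fingerprint -> timestamps (ms) of failures still inside the window
  private readonly records = new Map<string, number[]>();
  private readonly windowMs: number;
  private readonly maxFingerprints: number;

  constructor(windowMs: number = 300000, maxFingerprints: number = 100) {
    this.windowMs = windowMs;
    this.maxFingerprints = maxFingerprints;
  }

  /** Records one failure and returns the occurrence count within the time window. */
  recordFailure(toolName: string, category: string, now: number = Date.now()): number {
    const fingerprint = `${toolName}:${category}`;
    const recent = (this.records.get(fingerprint) ?? []).filter(
      (t) => now - t <= this.windowMs,
    );
    recent.push(now);
    // Re-insert so Map insertion order doubles as "least recently failed" order.
    this.records.delete(fingerprint);
    this.records.set(fingerprint, recent);
    // Evict the oldest fingerprint once the size bound is exceeded.
    if (this.records.size > this.maxFingerprints) {
      const oldest = this.records.keys().next().value;
      if (oldest !== undefined) this.records.delete(oldest);
    }
    return recent.length;
  }

  /** Escalate from the third occurrence of the same pattern onward. */
  shouldEscalate(repeatCount: number): boolean {
    return repeatCount >= 3;
  }
}
```

Passing `now` explicitly keeps the sketch deterministic in tests. A caller would typically feed the returned count into the suggestion builder.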
### Structured Annotation (`annotateToolResultWithReflection`)
Appends a structured reflection block to error tool results:
```
───── Structured Reflection ─────
📋 Error Category: not_found
🔍 Diagnosis: Tool 'read' failed — Resource Not Found.
⚠️ Repeat: #3 occurrence of this error pattern
💡 Suggested Actions:
• ⚠️ This is attempt #3 with the same error pattern. STOP and try a fundamentally different approach.
• Verify the path or resource name — check for typos
• Use 'read' or 'exec(ls)' to list available files in the directory
─────────────────────────────────
```
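A block like the one above could be assembled from a small data structure. The following is a minimal sketch, assuming a hypothetical `ReflectionSketch` shape and assuming the repeat line is only emitted from the second occurrence onward; neither detail is confirmed by the PR.

```typescript
// Hypothetical sketch: the interface and the formatting rules are assumptions.
interface ReflectionSketch {
  category: string;
  diagnosis: string;
  repeatCount: number;
  suggestions: string[];
}

const REFLECTION_HEADER = "───── Structured Reflection ─────";
const REFLECTION_FOOTER = "─────────────────────────────────";

function formatReflectionSketch(r: ReflectionSketch): string {
  const lines: string[] = [
    REFLECTION_HEADER,
    `📋 Error Category: ${r.category}`,
    `🔍 Diagnosis: ${r.diagnosis}`,
  ];
  // Assumption: the repeat line appears only when the pattern has been seen before.
  if (r.repeatCount > 1) {
    lines.push(`⚠️ Repeat: #${r.repeatCount} occurrence of this error pattern`);
  }
  lines.push("💡 Suggested Actions:");
  // From the third occurrence onward, the escalation message goes first.
  const escalation =
    r.repeatCount >= 3
      ? [`⚠️ This is attempt #${r.repeatCount} with the same error pattern. STOP and try a fundamentally different approach.`]
      : [];
  for (const suggestion of [...escalation, ...r.suggestions]) {
    lines.push(`• ${suggestion}`);
  }
  lines.push(REFLECTION_FOOTER);
  return lines.join("\n");
}
```

Putting the escalation message ahead of the category-specific suggestions mirrors the ordering in the example output.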
### Safety
- Does not modify existing code — purely additive (2 new files)
- Idempotent: won't double-annotate already-reflected messages
- Only annotates error results (checked via `isError` flag + text pattern heuristics)
- Bounded memory: tracker evicts old records, configurable max size
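The idempotency and error-only guarantees might look roughly like the sketch below. The message shape (`ToolResultSketch`), the marker string check, and the text heuristic are assumptions for illustration. As a simplification, the reflection is added as a separate trailing text block without mutating the input, which may differ from how the PR attaches it.

```typescript
// Hypothetical sketch: message shape and heuristics are illustrative assumptions.
interface TextBlockSketch {
  type: "text";
  text: string;
}

interface ToolResultSketch {
  toolName: string;
  isError?: boolean;
  content: TextBlockSketch[];
}

const REFLECTION_MARKER = "Structured Reflection";
// Assumed text heuristic: output that starts with "error" or "failed".
const ERROR_TEXT_HINT = /^\s*(error|failed)\b/i;

function isErrorResultSketch(toolResult: ToolResultSketch): boolean {
  if (toolResult.isError === true) return true;
  return toolResult.content.some((block) => ERROR_TEXT_HINT.test(block.text));
}

function annotateOnceSketch(toolResult: ToolResultSketch, reflection: string): ToolResultSketch {
  const alreadyAnnotated = toolResult.content.some(
    (block) => block.text.indexOf(REFLECTION_MARKER) !== -1,
  );
  // Non-errors and already-reflected results pass through unchanged.
  if (!isErrorResultSketch(toolResult) || alreadyAnnotated) return toolResult;
  const reflectionBlock: TextBlockSketch = { type: "text", text: reflection };
  // Return a copy so the caller's message is never mutated.
  return { ...toolResult, content: [...toolResult.content, reflectionBlock] };
}
```

Returning the original object when nothing changes lets callers detect a no-op with a simple identity check.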
## Scores
- **Relevance:** 9/10 — directly addresses agent autonomy gap (no structured error recovery existed)
- **Feasibility:** 7/10 — self-contained module, clean integration path via existing `tool_result_persist` hook
- **Impact:** 8/10 — reduces redundant tool calls and prevents infinite retry loops
- **Total:** 24/30
## Source
- Paper: [Failure Makes the Agent Stronger: Structured Reflection for Tool Interactions](https://arxiv.org/abs/2509.18847)
## Test Results
- **46 tests**, all passing ✅
- Covers: error classification, failure tracking, reflection building, annotation, edge cases
- Existing agent tests unaffected (no code modifications)
## Files Changed
- `src/agents/tool-reflection.ts` — core module (585 lines)
- `src/agents/tool-reflection.test.ts` — tests (388 lines)
## Integration Path
The `annotateToolResultWithReflection` function is designed to be used via the existing `tool_result_persist` plugin hook, which can transform tool result messages before they are persisted to the session transcript. A follow-up PR can wire this into the hook pipeline.
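For the follow-up wiring, one possible shape is a fail-safe wrapper around the annotator. Everything in this sketch is hypothetical: the actual `tool_result_persist` hook signature is not shown in this PR, so `PersistedToolResultSketch`, `PersistTransformSketch`, and `makeReflectionTransformSketch` are stand-ins for illustration only.

```typescript
// Hypothetical stand-ins: the real hook's message type and signature may differ.
interface PersistedToolResultSketch {
  toolName: string;
  isError?: boolean;
  text: string;
}

type PersistTransformSketch = (
  message: PersistedToolResultSketch,
) => PersistedToolResultSketch;

// Wraps an annotator so that a bug in reflection can never block persistence.
function makeReflectionTransformSketch(
  annotate: PersistTransformSketch,
): PersistTransformSketch {
  return (message) => {
    try {
      return annotate(message);
    } catch {
      // Fall back to the unmodified message if annotation throws.
      return message;
    }
  };
}
```

The design point is that annotation is best-effort: the transcript should still receive the original tool result when the reflection step fails.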
---
*Generated by the OpenClaw self-improvement loop*
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
Adds a new `src/agents/tool-reflection.ts` utility module that classifies tool-call failures, tracks repeated failure patterns per session, and can append a structured “reflection” block onto toolResult error messages. Includes a comprehensive Vitest suite covering classification, tracking, formatting, and idempotent annotation behavior.
The module is intended for future integration via the existing `tool_result_persist` transform hook, so tool results can be annotated before being persisted into the transcript.
<h3>Confidence Score: 4/5</h3>
- Mostly safe to merge, but reflection annotation placement and error-detection mismatch should be fixed first.
- The change is additive and well-tested, but there are two functional issues: (1) reflection may be appended to the wrong text block in multi-block tool results, and (2) `isToolResultError()` documentation claims `error`-field support that isn’t implemented, which can cause missed annotations for certain message shapes.
- File needing attention: `src/agents/tool-reflection.ts`
<!-- /greptile_comment -->
## Most Similar PRs
- #3362: fix: auto-repair and retry on orphan tool_result errors (by samhotchkiss · 2026-01-28 · 77.6%)
- #11825: fix: keep tool_use/tool_result pairs together during session compac... (by C31gordon · 2026-02-08 · 76.8%)
- #14328: fix: strip incomplete tool_use blocks from errored/aborted messages... (by Kropiunig · 2026-02-12 · 76.6%)
- #7525: Agents: skip errored tool calls during pairing (by justinhuangcode · 2026-02-02 · 76.6%)
- #8312: fix: add logging and markers for tool result repair (by ekson73 · 2026-02-03 · 75.9%)
- #22516: fix: add resilient tool registration with per-tool error isolation (by white-rm · 2026-02-21 · 75.8%)
- #9011: fix(session): auto-recovery for corrupted tool responses [AI-assisted] (by cheenu1092-oss · 2026-02-04 · 75.5%)
- #12487: fix(agents): strip orphaned tool_result when tool_use is sanitized ... (by skylarkoo7 · 2026-02-09 · 75.4%)
- #21195: fix: suppress orphaned tool_use/tool_result errors after session co... (by ruslansychov-git · 2026-02-19 · 75.0%)
- #3647: fix: sanitize tool arguments in session history (by nhangen · 2026-01-29 · 74.6%)