#16261: feat(agents): add two-tier tool output truncation and excludeFromContext support

by ProgramCaiCai open 2026-02-14 14:55

app: macos app: web-ui gateway agents stale size: XL

Cluster: Context Management Enhancements

Closes #16784

#### Summary

Prevent tool outputs from bloating session context and causing compaction failures. Adds two-tier truncation (hard byte/line caps + configurable head+tail pruning), `excludeFromContext` support for exec/read/web-fetch tools, context-safe limits for gateway/nodes/canvas/browser/sessions tools, and a `safeLimit` parameter for chat history.

lobster-biscuit

#### Use Cases

- Long-running sessions hitting context limits due to large tool outputs (exec stdout, web-fetch pages, gateway config dumps)
- Preventing compaction failures caused by oversized tool results persisted in session history
- Allowing tools to write full output to artifact files while keeping only a preview in context

#### Behavior Changes

- Tool outputs are now hard-capped at 12KB / 400 lines (6KB / 200 lines for exec) before being emitted as agent events
- `excludeFromContext: true` parameter available on exec, read, and web-fetch tools: writes full output to an artifact file and returns a preview in context
- Gateway tool returns compact summaries for `config.get` by default; `config.schema` output is capped
- Nodes tool truncates oversized run/invoke payloads at 16K chars with a `...(truncated)...` marker
- Canvas eval results are truncated when oversized
- Sessions list/history tools enforce safe limits on message counts
- Chat history accepts a `safeLimit` parameter to cap returned messages
- Context pruning defaults updated: `maxToolResultChars` lowered from 12000 to 8000

#### Existing Functionality Check

- [x] I searched the codebase for existing functionality.

Searches performed:

- Searched for existing truncation in `tool-result-truncation.ts`: refactored and unified
- Searched for `excludeFromContext` patterns: new capability, no prior implementation
- Searched for hard cap constants: new `tool-output-hard-cap.ts` module

#### Tests

- 18 new test files covering all new functionality (6069+ lines of test code)
- `bash-tools.test.ts`: exec backgrounding, excludeFromContext artifacts
- `openclaw-gateway-tool.test.ts`: compact summaries, config.schema caps
- `openclaw-tools.camera.test.ts`: camera snap, run/invoke truncation
- `openclaw-tools.canvas.test.ts`: eval result truncation
- `openclaw-tools.sessions.test.ts`: list/history safe limits
- `pi-embedded-subscribe.handlers.tools.hard-cap.test.ts`: event emission hard caps
- `context-pruning.test.ts`: pruning settings and defaults
- `pi-tools.read.exclude-from-context.test.ts`: read tool artifact output
- `session-tool-result-guard.test.ts`: guard truncation and persist hooks
- `tool-output-hard-cap.test.ts`: hard cap unit tests
- `tool-output-hard-truncate.test.ts`: head+tail truncation
- `browser-tool.test.ts`: browser tool context limits
- `web-fetch.exclude-from-context.test.ts`: web-fetch artifact output
- `web-tools.fetch.test.ts`: fetch tool integration
- `chat.history.safe-limit.test.ts`: chat history safeLimit
- `config.pruning-defaults.test.ts`: updated default assertions
- All ctx-safe tests pass (822 test files, 6075 passed, 2 skipped; lobster timeout is pre-existing/env)

**Sign-Off**

- Models used: claude-opus-4-6
- Submitter effort: high. Fixed 3 test failures (hard caps not applied in event handlers, nodes run missing bounded result, web-fetch artifact size assertion misaligned with `HARD_FETCH_MAX_CHARS_CAP`) and regenerated the protocol schema for the `safeLimit` param
- Agent notes: `lobster-tool.test.ts` failures are pre-existing on origin/main (subprocess timeout), unrelated to this PR

<h3>Greptile Summary</h3>

Implements two-tier tool output truncation with hard byte/line caps (12KB/400 lines default, 6KB/200 lines for exec) applied before event emission, plus `excludeFromContext` support for exec, read, and web-fetch tools that writes full output to artifact files while returning preview-only content. Gateway/nodes/canvas/browser/sessions tools now enforce context-safe limits, and chat history accepts a `safeLimit` parameter (defaults to 10/50 vs 200/1000). Context pruning default `maxToolResultChars` lowered from 12000 to 8000.

**Key implementation details:**

- `tool-output-hard-cap.ts`: Head+tail truncation algorithm with iterative budget scaling preserves actionable content (headers + error tails) within fixed limits
- `tool-output-hard-truncate.ts`: Legacy wrapper for backward compatibility during transition
- Event handlers (`pi-embedded-subscribe.handlers.tools.ts`) apply `hardCapToolOutput` to all tool results/updates before emission
- Session persistence guard (`session-tool-result-guard.ts`) applies caps before and after hook transforms to enforce limits consistently
- Artifact output (`tool-output-artifacts.ts`) writes to `.openclaw/artifacts/{tool}/` or temp dir with 4KB preview in context
- Gateway tool returns compact summaries for `config.get` by default; `config.schema` output capped at 20KB
- Web-fetch hard cap set to 5KB (down from 50KB) to prevent context bloat
- Protocol schema regenerated for `safeLimit` parameter across TypeScript and Swift

**Test coverage (6069+ lines):** 18 new test files cover hard cap unit tests, event emission caps, artifact file creation, tool-specific truncation (gateway config summaries, nodes payload bounds, canvas eval truncation), session guard behavior, context pruning defaults, and excludeFromContext for all supported tools.
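
For readers unfamiliar with head+tail truncation under combined byte and line caps, the idea can be sketched roughly as below. This is not the code from `tool-output-hard-cap.ts`: the names `headTailCap`, `HardCapLimits`, and `utf8Bytes`, the 0.8 shrink factor, and the reading of "12KB" as 12 * 1024 bytes are all assumptions made for illustration. Only the 12KB/400 and 6KB/200 figures and the `...(truncated)...` marker text come from the description above.

```typescript
// Illustrative sketch only; every name here is hypothetical, not the PR's API.
interface HardCapLimits {
  maxBytes: number; // assumed large enough to hold the marker
  maxLines: number; // assumed >= 3
}

const DEFAULT_LIMITS: HardCapLimits = { maxBytes: 12 * 1024, maxLines: 400 };
const EXEC_LIMITS: HardCapLimits = { maxBytes: 6 * 1024, maxLines: 200 };
const MARKER = "...(truncated)...";

const encoder = new TextEncoder();
const utf8Bytes = (s: string): number => encoder.encode(s).length;

function headTailCap(text: string, limits: HardCapLimits = DEFAULT_LIMITS): string {
  const lines = text.split("\n");
  if (lines.length <= limits.maxLines && utf8Bytes(text) <= limits.maxBytes) {
    return text; // already within both caps
  }

  // Line budget, reserving one line for the marker.
  let keep = Math.min(lines.length, limits.maxLines) - 1;
  while (keep > 0) {
    const headCount = Math.ceil(keep / 2);
    const tailCount = keep - headCount;
    const candidate = [
      ...lines.slice(0, headCount),
      MARKER,
      ...lines.slice(lines.length - tailCount),
    ].join("\n");
    if (utf8Bytes(candidate) <= limits.maxBytes) return candidate;
    keep = Math.floor(keep * 0.8); // scale the budget down and retry
  }

  // Fallback for very long lines: keep the start of the first line and the
  // end of the last one. A UTF-16 code unit never needs more than 3 UTF-8
  // bytes, so dividing the byte budget by 3 is a safe upper bound.
  const units = Math.max(0, Math.floor((limits.maxBytes - utf8Bytes(MARKER) - 2) / 3));
  const headUnits = Math.ceil(units / 2);
  const tailUnits = units - headUnits;
  const first = lines[0] ?? "";
  const last = lines[lines.length - 1] ?? "";
  return [
    first.slice(0, headUnits),
    MARKER,
    last.slice(Math.max(0, last.length - tailUnits)),
  ].join("\n");
}

// Usage: 1000 short lines squeezed into the exec-sized budget.
const noisy = Array.from({ length: 1000 }, (_, i) => `line ${i}`).join("\n");
console.log(headTailCap(noisy, EXEC_LIMITS).split("\n").length); // 200
```

Keeping both ends reflects the rationale stated above: headers tend to sit at the top of tool output and error messages at the bottom, so dropping the middle loses the least actionable content.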
<h3>Confidence Score: 4/5</h3>

- This PR is safe to merge with low risk: the implementation is well-architected with comprehensive test coverage
- Score reflects thorough implementation with 18 dedicated test files, systematic application of caps at both event emission and persistence layers, and 3 bug fixes applied during development (hard caps in handlers, nodes bounded results, web-fetch artifact assertions). The two-tier approach (hard caps + configurable pruning) provides defense in depth against context bloat. Minor risk from the complexity of the head+tail truncation algorithm, mitigated by unit tests validating edge cases.
- Pay close attention to `tool-output-hard-cap.ts` (iterative budget scaling could have edge cases) and `session-tool-result-guard.ts` (double-capping logic before/after hooks)

<sub>Last reviewed commit: 50b27de</sub>
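
The double-capping flagged for `session-tool-result-guard.ts` may be easier to review with a small model of it in mind. The sketch below is an assumption-laden illustration, not the PR's guard: `ToolResult`, `PersistHook`, `capChars`, and `guardToolResult` are hypothetical names, and the cap is expressed in characters purely for brevity.

```typescript
// Illustrative sketch only; types and function names are hypothetical.
type ToolResult = { toolName: string; text: string };
type PersistHook = (result: ToolResult) => ToolResult;

const GUARD_MARKER = "\n...(truncated)...\n";

// Simple head+tail cap measured in characters.
function capChars(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  const budget = maxChars - GUARD_MARKER.length;
  if (budget <= 0) return text.slice(0, maxChars);
  const head = Math.ceil(budget / 2);
  const tail = budget - head;
  return text.slice(0, head) + GUARD_MARKER + text.slice(text.length - tail);
}

function guardToolResult(
  result: ToolResult,
  hooks: PersistHook[],
  maxChars: number,
): ToolResult {
  // First cap: hooks never receive oversized input.
  const capped = { ...result, text: capChars(result.text, maxChars) };
  const transformed = hooks.reduce((acc, hook) => hook(acc), capped);
  // Second cap: a hook that grows the text cannot push it past the limit.
  return { ...transformed, text: capChars(transformed.text, maxChars) };
}

// Usage: a hook that appends a large footer is still bounded afterwards.
const demoHook: PersistHook = (r) => ({ ...r, text: r.text + "y".repeat(5000) });
const demoResult = guardToolResult(
  { toolName: "exec", text: "x".repeat(10000) },
  [demoHook],
  1000, // arbitrary limit chosen for the demo
);
console.log(demoResult.text.length); // 1000
```

Because `capChars` leaves already-compliant text untouched, the second pass is a no-op whenever hooks do not grow the result, which is the property worth checking in the real guard.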