#5057: fix: add per-tool-result hard cap to prevent context overflow

by hanxiao open 2026-01-31 00:52

agents

Cluster: Context Management Enhancements

## Problem

Individual tool results can be extremely large and blow out the context window before the ratio-based pruner gets a chance to act:

- `exec`: 200,000 chars (~50K tokens)
- `browser snapshot`: 80,000 chars (~20K tokens)
- `web_fetch`: 50,000 chars

The existing context pruning (`pruneContextMessages`) is TTL-gated (5-minute cooldown) and ratio-based, meaning it only kicks in *after* the context is already 30%+ full. Multiple large tool results in a single turn can overflow the context before pruning runs.

This mirrors how Claude Code handles it: hard-truncate tool results at the boundary where they enter the context.

## Changes

### 1. Per-tool-result hard cap (`capToolResultMessages`)

New function that caps each individual tool result text to `maxToolResultChars` (default: 50,000 chars). Runs on **every context event**, independent of cache-TTL gating. Uses 60/40 head/tail split to preserve both the beginning and end of tool output.

### 2. Extension restructured

The context pruning extension now:

1. **Always** applies per-result cap (not TTL-gated)
2. **Then** applies ratio-based pruning (TTL-gated, as before)

### 3. New config option

```json
{
  "agents": {
    "defaults": {
      "contextPruning": {
        "maxToolResultChars": 50000
      }
    }
  }
}
```

Set to `0` to disable.

### 4. Reduced exec DEFAULT_MAX_OUTPUT

From 200K to 80K chars (still overridable via `PI_BASH_MAX_OUTPUT_CHARS` env var).
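
For illustration, the per-result cap from section 1 could look roughly like the sketch below. It is not the code in this PR: the message shape, the `capText` helper, the truncation marker text, and the `capToolResultMessagesSketch` name are all assumptions made for the example. Only the 60/40 head/tail split, the `0`-disables rule, and skipping image-containing results come from the description above. The sketch also assumes the cap applies per text block.

```typescript
// Hypothetical message shape; the real types in the codebase may differ.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; data: string };

interface Message {
  role: "user" | "assistant" | "toolResult";
  content: ContentBlock[];
}

// Keep the first 60% and last 40% of the character budget (integer math
// avoids floating-point rounding surprises).
function capText(text: string, maxChars: number): string {
  // A cap of 0 (or less) disables truncation.
  if (maxChars <= 0 || text.length <= maxChars) return text;
  const headLen = Math.floor((maxChars * 6) / 10);
  const tailLen = maxChars - headLen;
  const omitted = text.length - maxChars;
  const head = text.slice(0, headLen);
  const tail = text.slice(text.length - tailLen);
  // The marker wording is an assumption; note it adds a few chars beyond the cap.
  return `${head}\n[... ${omitted} chars truncated ...]\n${tail}`;
}

function capToolResultMessagesSketch(
  messages: Message[],
  maxChars: number,
): Message[] {
  if (maxChars <= 0) return messages; // disabled
  return messages.map((msg) => {
    if (msg.role !== "toolResult") return msg;
    // Results that contain an image are left untouched.
    if (msg.content.some((b) => b.type === "image")) return msg;
    return {
      ...msg,
      content: msg.content.map((b) =>
        b.type === "text" ? { ...b, text: capText(b.text, maxChars) } : b,
      ),
    };
  });
}
```

Because the marker is appended on top of the head and tail, the capped text in this sketch ends up slightly longer than `maxChars`; an implementation that needs a strict bound would subtract the marker length from the budget first.
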
## Tests

All 17 existing tests pass + 6 new tests for `capToolResultMessages`:

- No-op when results are under the cap
- Truncates oversized results with head/tail preservation
- Disabled when maxChars = 0
- Skips image-containing tool results
- Truncates multiple oversized results in one pass
- Extension applies cap even when TTL has not expired

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds a per-tool-result hard cap (`capToolResultMessages`) to truncate oversized tool outputs before they enter the context window, then keeps the existing ratio-based pruning (TTL-gated) as a second stage. It also reduces the default bash `exec` max output from 200k to 80k chars and introduces a new `maxToolResultChars` setting (0 disables) with tests covering truncation behavior and TTL interaction.

The changes fit into the existing context-pruning extension by adding an always-on "boundary truncation" step in `src/agents/pi-extensions/context-pruning/extension.ts`, ensuring large tool results can't overflow context before the ratio-based pruner has a chance to run.

<h3>Confidence Score: 4/5</h3>

- This PR is generally safe to merge and is unlikely to introduce runtime failures.
- The change is localized to context pruning behavior with accompanying unit tests; main concern is behavioral: the new hard cap ignores existing tool allow/deny pruning rules, which could surprise configurations relying on those semantics.
- src/agents/pi-extensions/context-pruning/pruner.ts and src/agents/pi-extensions/context-pruning/extension.ts
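
The two-stage ordering summarized above (always cap, then prune only when the TTL gate allows) can be sketched as a small composition. This is a minimal illustration, not the extension's actual wiring: `makeContextHandler`, `Stage`, and the `ttlExpired` predicate are hypothetical names, and the real TTL bookkeeping is deliberately left behind that predicate.

```typescript
// A stage takes the current messages and returns the (possibly reduced) list.
type Stage<T> = (messages: T[]) => T[];

// Hypothetical composition of the two stages described in this PR.
function makeContextHandler<T>(opts: {
  cap: Stage<T>; // always-on per-result cap
  prune: Stage<T>; // ratio-based pruning
  ttlExpired: () => boolean; // assumed gate; real TTL logic is not modeled here
}): Stage<T> {
  return (messages) => {
    // Stage 1 runs on every context event, regardless of the TTL gate.
    const capped = opts.cap(messages);
    // Stage 2 runs only when the gate is open, and sees the capped messages.
    return opts.ttlExpired() ? opts.prune(capped) : capped;
  };
}
```

Keeping the cap outside the gate is what lets a single oversized result be truncated even during the pruner's cooldown, which matches the last test case listed under Tests.
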