
#5360: fix(compaction): add emergency pruning for context overflow

by sgwannabe · open · 2026-01-31 11:46
## Problem

When a session's context significantly exceeds the model's token limit (>20% over), the `safeguard` compaction mode fails to recover and falls back to "Summary unavailable".

Example case:

- Model: kimi-code/kimi-for-coding (limit: 262,144 tokens)
- Request: 321,052 tokens (22% over limit)
- Result: auto-compaction triggered but returned "Summary unavailable"

## Root Cause

The `pruneHistoryForContextShare()` function only prunes based on `maxHistoryShare` (default 50%), not the actual context window limit. When the entire context is already over the limit, this pruning is insufficient.

## Solution

Add **emergency pruning** that:

1. Detects when total tokens exceed the context window
2. Aggressively drops the oldest messages until the history is under the limit
3. Targets 85% of the context window to leave room for summary generation

## Changes

- Added `emergencyPruneToFitContext()` function in `compaction-safeguard.ts`
- Integrated emergency pruning into the compaction flow
- Added comprehensive tests for the new function

## Testing

Added a test suite with 7 test cases covering:

- Messages under the target (no pruning)
- Messages over the target (pruning occurs)
- Default 85% target ratio
- Custom target ratios
- Empty message arrays
- Oldest-first dropping behavior

Fixes #5357

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds an "emergency pruning" path to the compaction safeguard extension to handle sessions whose message history exceeds the model's context window by a large margin. It introduces `emergencyPruneToFitContext()` (exported via `__testing`) and wires it into the `session_before_compact` flow to drop the oldest messages until the history fits a target fraction of the context window (85% by default), then continues with staged summarization. A new test suite exercises the helper's basic behaviors (no-op under target, dropping oldest first, custom ratios, empty inputs).

<h3>Confidence Score: 3/5</h3>

- This PR is reasonably safe to merge, but a couple of edge-case logic issues could still allow context overflow in extreme scenarios.
- The core approach is straightforward and localized, but the emergency-prune loop mixes token estimators and the overflow calculation omits turn-prefix tokens; both can undermine the guarantee that the post-prune prompt fits the context window.
- Files of concern: `src/agents/pi-extensions/compaction-safeguard.ts` (emergency pruning logic and overflow calculations); `src/agents/pi-extensions/compaction-safeguard.test.ts` (token-estimation-dependent assertions).
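The oldest-first pruning loop described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual implementation: the `Message` shape, the function signature, and the crude 4-characters-per-token estimator are all assumptions made for the sketch.

```typescript
interface Message {
  role: string;
  content: string;
}

// Crude token estimate (~4 characters per token); an assumption for this
// sketch — the real extension may use a different estimator.
function estimateTokens(messages: Message[]): number {
  return messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
}

// Hypothetical sketch of emergencyPruneToFitContext(): drop the oldest
// messages until the history fits within targetRatio of the context window.
function emergencyPruneToFitContext(
  messages: Message[],
  contextWindow: number,
  targetRatio = 0.85, // leave ~15% headroom for summary generation
): Message[] {
  const target = Math.floor(contextWindow * targetRatio);
  const pruned = [...messages];
  // Oldest-first: shift from the front until under the target.
  // Note: if a single message exceeds the target, this drops everything.
  while (pruned.length > 0 && estimateTokens(pruned) > target) {
    pruned.shift();
  }
  return pruned;
}
```

With a 262,144-token window and the default 0.85 ratio, the sketch would prune until the estimated history size is at or below roughly 222,822 tokens, leaving room for the summary the safeguard generates afterward.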
