#15322: feat: post-compaction target token trimming + fallback strategy

by echoVic open 2026-02-13 08:46

agents size: M

## Summary

Adds two new optional fields to `AgentCompactionConfig`:

- **`targetTokens`**: target token count after compaction. If the compacted session still exceeds this limit, older messages are trimmed while preserving the compaction summary and newest messages. Default: `contextTokens * 0.25`.
- **`fallbackRetainPercent`**: when the compaction LLM call fails entirely, retain the newest N% of messages instead of returning an error. Default: `0.2` (20%).

## Problem

1. After LLM-based compaction, the resulting session size is unpredictable: it depends entirely on how much the LLM summarizes. There's no upper bound.
2. When the compaction LLM call fails (network error, rate limit, etc.), the session returns an error with no recovery, leaving the context bloated.

## Changes

### `src/config/types.agent-defaults.ts`

- Added `targetTokens?` and `fallbackRetainPercent?` to `AgentCompactionConfig`

### `src/agents/pi-embedded-runner/compaction-trimming.ts` (new)

- `trimToTargetTokens(messages, targetTokens)`: trims oldest messages (preserving index 0 / compaction summary) until within budget, then repairs orphaned tool pairs
- `fallbackCompact(messages, retainPercent)`: retains newest N% of messages with tool pair repair
- Exports `DEFAULT_COMPACTION_TARGET_RATIO` (0.25) and `DEFAULT_FALLBACK_RETAIN_PERCENT` (0.2)

### `src/agents/pi-embedded-runner/compact.ts`

- After `session.compact()` succeeds: checks if `tokensAfter > targetTokens` and applies `trimToTargetTokens` if needed
- Wraps `session.compact()` in try/catch: on failure, applies `fallbackCompact` instead of propagating the error

### `src/agents/pi-embedded-runner/compaction-trimming.test.ts` (new)

- 16 unit tests covering trimming, fallback, edge cases, and constants

## Backward Compatibility

- Both new config fields are optional with sensible defaults
- No change to existing behavior when `targetTokens` is not set and `contextTokens` is not configured
- Existing compaction modes (default/safeguard) are unchanged

Signed-off-by: echoVic <nicepeng@foxmail.com>

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds two new compaction tuning knobs (`targetTokens`, `fallbackRetainPercent`) and introduces post-compaction trimming logic plus a fallback path when the LLM compaction call throws.

Main functional flow changes are in `src/agents/pi-embedded-runner/compact.ts`: after `session.compact()` succeeds it estimates the token count and (optionally) trims older messages down to a configured target; if `session.compact()` fails it retains the newest N% of messages instead of surfacing an error.

Issue to fix before merge:

- The fallback path reports `ok: true, compacted: true` and returns placeholder compaction metadata (`firstKeptEntryId: ""`, `tokensBefore: 0`) even though the code only trims/repairs messages and does not create a real compaction entry. Downstream logic increments `compactionCount` based on `compacted: true`, and other parts of the system treat compaction metadata as reflecting an actual compaction event, so this makes session state/metadata inconsistent on LLM-compaction failure.

<h3>Confidence Score: 3/5</h3>

- This PR is moderately safe but has a correctness issue in the fallback compaction reporting/metadata.
- Core trimming/fallback logic is straightforward and covered by unit tests, but the new fallback path returns `compacted: true` with placeholder compaction metadata even though it doesn’t create a real compaction entry, which can make session metadata inconsistent and mislead downstream accounting/UI.
- `src/agents/pi-embedded-runner/compact.ts`

<sub>Last reviewed commit: f2e71ab</sub>
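
For readers who want a concrete picture of the two helpers described under Changes, here is a minimal sketch of how trimming and fallback retention could work. It is not the PR's implementation: the `Message` shape, the 4-characters-per-token `estimateTokens` heuristic, and the `repairToolPairs` helper are all assumptions made for illustration; only the function names `trimToTargetTokens` and `fallbackCompact` and the two default constants come from the description above.

```typescript
// Hypothetical message shape; the real session message type is not shown in the PR text.
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  toolCallId?: string; // assumed: set on an assistant tool call and on its matching tool result
}

const DEFAULT_COMPACTION_TARGET_RATIO = 0.25;
const DEFAULT_FALLBACK_RETAIN_PERCENT = 0.2;

// Assumption: roughly 4 characters per token. Real code would use a proper estimator.
function estimateTokens(messages: Message[]): number {
  return messages.reduce((sum, m) => sum + Math.ceil(m.content.length / 4), 0);
}

// Assumed repair rule: drop tool results whose call was trimmed, and tool calls
// whose result was trimmed, so no half of a pair is left behind.
function repairToolPairs(messages: Message[]): Message[] {
  const callIds = new Set(
    messages.filter((m) => m.role === "assistant" && m.toolCallId).map((m) => m.toolCallId),
  );
  const resultIds = new Set(
    messages.filter((m) => m.role === "tool" && m.toolCallId).map((m) => m.toolCallId),
  );
  return messages.filter((m) => {
    if (!m.toolCallId) return true;
    if (m.role === "tool") return callIds.has(m.toolCallId);
    if (m.role === "assistant") return resultIds.has(m.toolCallId);
    return true;
  });
}

// Keep index 0 (the compaction summary) and drop the oldest remaining messages
// until the estimate fits the budget; the newest message is always kept.
function trimToTargetTokens(messages: Message[], targetTokens: number): Message[] {
  if (messages.length <= 1 || estimateTokens(messages) <= targetTokens) return messages;
  const head = messages.slice(0, 1);
  const rest = messages.slice(1);
  let total = estimateTokens(messages);
  let start = 0;
  while (start < rest.length - 1 && total > targetTokens) {
    total -= estimateTokens(rest.slice(start, start + 1));
    start++;
  }
  return repairToolPairs([...head, ...rest.slice(start)]);
}

// Keep the newest N% of messages (at least one), then repair tool pairs.
function fallbackCompact(messages: Message[], retainPercent: number): Message[] {
  if (messages.length === 0) return [];
  const keep = Math.max(1, Math.ceil(messages.length * retainPercent));
  return repairToolPairs(messages.slice(-keep));
}

// Example: derive the default budget from a hypothetical 200k-token context window.
const contextTokens = 200_000;
const session: Message[] = [
  { role: "assistant", content: "summary of earlier turns" },
  { role: "user", content: "x".repeat(400) },
  { role: "assistant", content: "calling tool", toolCallId: "t1" },
  { role: "tool", content: "y".repeat(400), toolCallId: "t1" },
  { role: "user", content: "latest question" },
];
const trimmed = trimToTargetTokens(session, contextTokens * DEFAULT_COMPACTION_TARGET_RATIO);
const fallback = fallbackCompact(session, DEFAULT_FALLBACK_RETAIN_PERCENT);
console.log(trimmed.length, fallback.length);
```

Note that in this sketch the tool pair repair runs after the cut point is chosen, so the result can end up somewhat smaller than the budget or the requested percentage. Whether the fallback result should be reported as a real compaction is a separate question, which the review above raises.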