#4042: agents: add proactive compaction before request

by freedomzt open 2026-01-29 15:39

agents

## Summary

- Check session tokens before sending API request
- If estimated tokens exceed threshold (default 85%), run compaction **before** sending
- Prevents context overflow when large prompt + existing session > context limit

## Problem

Current compaction is **reactive**: request sent → API error → then compact. By the time compaction runs, there may not be enough space to even compact.

## Solution

Add proactive check before `runEmbeddedAttempt()`:

```typescript
const estimated = estimateMessagesTokens(session) + promptTokens;
if (estimated > threshold) {
  await compactEmbeddedPiSessionDirect(...); // compact first
}
await runEmbeddedAttempt(...); // then send
```

## New Config Options

```json
{
  "agents": {
    "defaults": {
      "compaction": {
        "proactiveEnabled": true,
        "proactiveThresholdRatio": 0.85
      }
    }
  }
}
```

## Test

```bash
pnpm test src/agents/pi-embedded-runner/proactive-compaction.test.ts
```

Related: #2347

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds a proactive compaction check to the embedded Pi agent runner: before sending an attempt, it estimates total tokens (session + prompt) and triggers `compactEmbeddedPiSessionDirect` when the estimate exceeds a configurable threshold. It introduces a new `proactive-compaction.ts` helper (session JSONL parsing + threshold resolution) and adds two new agent default config options (`proactiveEnabled`, `proactiveThresholdRatio`) with schema/type updates, plus a focused Vitest suite.

The change integrates into the existing compaction flow in `src/agents/pi-embedded-runner/run.ts` alongside the existing reactive “context overflow → compact → retry” loop, aiming to avoid situations where the request overflows before compaction can even run.

<h3>Confidence Score: 4/5</h3>

- This PR looks safe to merge with minor edge-case fixes.
- Core logic is straightforward and localized, and the new helper is covered by unit tests. Main risk is an edge case where the computed threshold becomes 0 and proactive compaction may run on every request for small-context / high-reserve configurations; also the prompt token heuristic may reduce effectiveness in some languages/inputs.
- src/agents/pi-embedded-runner/proactive-compaction.ts and src/agents/pi-embedded-runner/run.ts
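
The zero-threshold edge case flagged in the review can be illustrated with a minimal sketch. This is not the PR's code: `resolveProactiveThreshold`, `shouldCompactProactively`, `estimatePromptTokens`, and the `reserveTokens` parameter are hypothetical names, and the roughly-4-characters-per-token heuristic is an assumption used only to show why such estimates can be off for some languages and inputs.

```typescript
// Hypothetical sketch; none of these names are taken from the PR.

interface ProactiveCompactionConfig {
  proactiveEnabled: boolean;
  proactiveThresholdRatio: number; // e.g. 0.85
}

// Assumed heuristic: about 4 characters per token. This tends to
// undercount for scripts or inputs where a token covers fewer characters.
function estimatePromptTokens(prompt: string): number {
  return Math.ceil(prompt.length / 4);
}

// Returns the token count above which compaction should run first,
// or null when the feature is disabled or no usable threshold exists.
function resolveProactiveThreshold(
  contextWindow: number,
  reserveTokens: number,
  config: ProactiveCompactionConfig,
): number | null {
  if (!config.proactiveEnabled) return null;
  // Clamp the ratio into [0, 1] so a bad config value cannot push the
  // threshold past the usable window.
  const ratio = Math.min(Math.max(config.proactiveThresholdRatio, 0), 1);
  const usable = contextWindow - reserveTokens;
  const threshold = Math.floor(usable * ratio);
  // A zero, negative, or NaN threshold would make every request look
  // oversized, so treat it as "no proactive check".
  return threshold > 0 ? threshold : null;
}

function shouldCompactProactively(
  sessionTokens: number,
  promptTokens: number,
  threshold: number | null,
): boolean {
  if (threshold === null) return false;
  return sessionTokens + promptTokens > threshold;
}
```

With a context window no larger than the reserve, `resolveProactiveThreshold` returns `null`, so a caller would skip the proactive step and rely on the existing reactive loop instead of compacting on every request. Whether skipping is the right policy for small-context configurations is a design choice the PR may settle differently.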