#20184: feat: memory plugin compaction control

by solstead open 2026-02-18 16:28

docs agents size: L

## Summary

Two plugin API additions that give memory plugins control over compaction: custom summaries and programmatic compaction triggers. Each change is isolated and backward-compatible. Builds on #13287 (`before_compaction`, `before_reset`, `extraBootstrapFiles`).

- **`provide_compaction_summary` hook**: new modifying hook that lets a plugin return a custom compaction summary, bypassing the default LLM compaction call entirely. Uses `sessionManager.appendCompaction()` with `fromHook: true`. Includes a skip-counter safety gate (max 3 consecutive skips).
- **`requestCompaction()` on hook context**: lets plugins trigger compaction programmatically (e.g., after idle-timeout extraction in `agent_end`). Deferred execution to avoid lane deadlocks.

Both are purely additive: `before_compaction` remains fire-and-forget, new hooks only activate when registered, and there is zero overhead for plugins that don't use them.

---

## Context

Today, every compaction runs an LLM call to summarize the conversation. A memory plugin that has already extracted structured facts and relationships from the conversation can represent the same information in far fewer tokens, but has no way to tell the gateway "I've got this."

Internal benchmarks with [Quaid](https://github.com/Steadman-Labs/quaid) (a local SQLite knowledge graph using sqlite-vec + FTS5 + Haiku reranker) show **~90% recall accuracy at ~12% of the input tokens** compared to full-context replay. But without these hooks, those token savings can't be captured: the gateway will still run the LLM compaction regardless of what the plugin knows.

These hooks enable a pattern where memory plugins provide their own compaction summaries from their knowledge graph, saving the cost and latency of the default LLM compaction call.

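A minimal sketch of the plugin side of that pattern, for illustration only. The event and result shapes, the `Fact` type, and the `provideCompactionSummary` function are assumptions made for this example; the PR names `tokensBefore` and `previousSummary` but does not spell out the handler signature here. Carrying `previousSummary` forward is one possible choice, not something the PR prescribes.

```typescript
// Hypothetical shapes: only `tokensBefore` and `previousSummary` are named
// in the PR text; the rest is assumed for this sketch.
interface CompactionSummaryEvent {
  tokensBefore?: number;
  previousSummary?: string;
}

interface CompactionSummaryResult {
  // A defined summary asks the gateway to use it instead of the LLM call.
  summary?: string;
}

// Stand-in for facts a memory plugin has already extracted.
interface Fact {
  subject: string;
  relation: string;
  object: string;
}

// Builds a compact summary from stored facts. Returns an empty result when
// the plugin has nothing to offer, so the default compaction would run.
function provideCompactionSummary(
  event: CompactionSummaryEvent,
  facts: readonly Fact[],
): CompactionSummaryResult {
  if (facts.length === 0) {
    return {};
  }
  const lines = facts.map((f) => `- ${f.subject} ${f.relation} ${f.object}`);
  const header = event.previousSummary
    ? `${event.previousSummary}\n\nNew facts:`
    : "Known facts:";
  return { summary: [header, ...lines].join("\n") };
}

const demoFacts: Fact[] = [
  { subject: "user", relation: "prefers", object: "TypeScript" },
  { subject: "project", relation: "uses", object: "vitest" },
];
const demoResult = provideCompactionSummary({ tokensBefore: 4200 }, demoFacts);
```

Returning an empty object when no facts are available keeps the fallback explicit: the handler only replaces the default summary when it has something concrete to contribute.
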
| Hook | Phase | Type | What It Controls |
|------|-------|------|-----------------|
| `before_compaction` | Compaction | void (unchanged) | Observe/extract (fire-and-forget) |
| `provide_compaction_summary` | Compaction | modifying (new) | Replace LLM summary with plugin summary |
| `requestCompaction()` | Agent hooks | method (new) | Trigger compaction programmatically |

---

## 1. `provide_compaction_summary`: Plugin-provided compaction summary

**Problem:** The `before_compaction` hook is fire-and-forget (`runVoidHook`). Plugins can observe and extract, but cannot provide a compaction summary. The gateway always runs the default LLM summarization, even when a memory plugin has already processed every message.

**Solution:** Add a new modifying hook `provide_compaction_summary` that fires after `before_compaction` handlers have been dispatched. The plugin can return a custom summary string; the gateway writes it directly via `sessionManager.appendCompaction()` with `fromHook: true`, skipping the LLM call entirely.

**Key implementation details:**

- `tokensBefore` populated by `estimateTokens()` loop over `preCompactionMessages`
- `previousSummary` extracted from last compaction entry in `sessionManager.getEntries()`
- Plugin summary path uses `sessionManager.appendCompaction(pluginSummary, lastEntryId, estTokensBefore, undefined, true)` (`fromHook: true` marks it as plugin-provided)
- Skip-counter safety gate: `MAX_PLUGIN_SKIP_COUNT = 3` consecutive skips before forced standard compaction
- Same timeout as default compaction (`EMBEDDED_COMPACTION_TIMEOUT_MS` = 300s)

**Files:** `types.ts`, `hooks.ts`, `compact.ts`

---

## 2. Plugin-triggered compaction via hook context

**Problem:** Plugins cannot trigger compaction programmatically. Use case: a memory plugin's idle-timeout handler (in `agent_end`) extracts facts after inactivity and wants to compact the session for the next turn.

**Solution:** Add `requestCompaction(customInstructions?)` to `PluginHookAgentContext`.
The method sets a deferred flag; after `agent_end` completes, the compaction is scheduled as a new lane task via dynamic import to avoid deadlocking the current session lane.

**Files:** `types.ts`, `run/attempt.ts`

---

## Backward Compatibility

- `before_compaction` remains fire-and-forget, so existing plugins are unaffected
- New hooks only activate when registered, with zero overhead otherwise
- `requestCompaction()` is optional on the context object
- `skipCompaction` has a 3-consecutive-skip safety gate to prevent unbounded context growth
- `after_compaction` fires regardless of which path was taken (plugin summary or default LLM)

## Tests

- 13 new vitest tests covering hook merge logic (OR-logic for `skipCompaction`, last-summary-wins), wiring (`hasHooks` detection, event passthrough, error handling), and type shapes
- 5 existing compaction hook tests still pass
- 119 total plugin tests pass

## Docs Updated

- `docs/concepts/agent-loop.md`: added `provide_compaction_summary` to hook list, noted `requestCompaction()` on `agent_end`
- `docs/concepts/compaction.md`: added "Plugin-provided summaries" section
- `docs/reference/session-management-compaction.md`: added detailed section on plugin-provided compaction summary and `requestCompaction()`
- `docs/tools/plugin.md`: added "Typed hooks" section with registration examples and hook category list

## Changes since initial submission

- Removed `provide_session_context` (deferred to a future PR, not needed yet)
- Fixed: `session.compact(pluginSummary)` → `sessionManager.appendCompaction()` for true LLM bypass
- Fixed: added skip-counter safety gate (`MAX_PLUGIN_SKIP_COUNT = 3`)
- Fixed: populated `tokensBefore` and `previousSummary` (were hardcoded `undefined`)
- Fixed: trigger label `"overflow"` → `"manual"`
- Added 13 new tests and 4 doc updates

Happy to answer questions or hop on Discord.

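For reviewers who want the merge rules and the safety gate in one place, here is a rough sketch of how they could fit together. It is not the code in this PR: `HookResult`, `mergeHookResults`, `choosePath`, and the `skipCounts` map are hypothetical names, and it assumes `skipCompaction` means "do not compact this round" and that tripping the gate forces the default LLM path. Only `MAX_PLUGIN_SKIP_COUNT = 3`, the OR-logic, and last-summary-wins come from the description above.

```typescript
// Assumed result shape; the PR names `skipCompaction` and a plugin summary.
interface HookResult {
  summary?: string;
  skipCompaction?: boolean;
}

// Value taken from the PR description.
const MAX_PLUGIN_SKIP_COUNT = 3;

// Merge results from several handlers: any handler may request a skip (OR),
// and the last defined summary replaces earlier ones.
function mergeHookResults(results: readonly HookResult[]): HookResult {
  const merged: HookResult = {};
  for (const r of results) {
    if (r.skipCompaction) merged.skipCompaction = true;
    if (r.summary !== undefined) merged.summary = r.summary;
  }
  return merged;
}

type CompactionPath = "skip" | "plugin-summary" | "default-llm";

// Hypothetical per-session count of consecutive skips.
const skipCounts = new Map<string, number>();

function choosePath(sessionId: string, merged: HookResult): CompactionPath {
  const skips = skipCounts.get(sessionId) ?? 0;
  if (merged.skipCompaction) {
    if (skips < MAX_PLUGIN_SKIP_COUNT) {
      skipCounts.set(sessionId, skips + 1);
      return "skip";
    }
    // Gate tripped: too many consecutive skips, force standard compaction.
    skipCounts.delete(sessionId);
    return "default-llm";
  }
  // Any real compaction ends the consecutive-skip streak.
  skipCounts.delete(sessionId);
  return merged.summary !== undefined ? "plugin-summary" : "default-llm";
}
```

With these assumptions, a plugin that always returns `skipCompaction: true` gets three skipped rounds and then a forced default compaction, which bounds how far the context can grow between compactions.
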
<h3>Greptile Summary</h3>

Adds two plugin API hooks for memory plugins to control compaction: `provide_compaction_summary` allows plugins to provide custom summaries (bypassing LLM calls), and `requestCompaction()` enables programmatic compaction triggers. Implementation is backward-compatible with proper safety gates (3-skip limit, bounded map with 1000-entry cap, timeout protection). Previous review issues have been addressed: skip counter is implemented, `tokensBefore`/`previousSummary` are populated, `sessionManager.appendCompaction()` correctly bypasses LLM, and trigger label is `"manual"`. Tests cover hook merge logic, wiring, and error handling. Documentation updated across 4 files.

- Implements plugin-provided compaction summaries with proper LLM bypass via `sessionManager.appendCompaction()`
- Adds `requestCompaction()` method to agent hook context with deferred execution to avoid lane deadlocks
- Safety gate prevents unbounded context growth (max 3 consecutive plugin skips before forced compaction)
- Skip counter map uses bounded eviction (FIFO at 1000 entries) to prevent memory leaks from abandoned sessions
- 13 new tests validate hook merge logic (OR-logic for `skipCompaction`, last-summary-wins), wiring, and error handling

<h3>Confidence Score: 4/5</h3>

- Safe to merge with minor considerations - well-tested backward-compatible feature addition with proper safety gates
- All previous review issues have been addressed (skip counter implemented, `tokensBefore`/`previousSummary` populated, correct LLM bypass via `sessionManager.appendCompaction()`, trigger label fixed to `"manual"`). Implementation includes comprehensive safety mechanisms (3-skip limit, bounded map, timeout protection), extensive test coverage (13 new tests), and clear documentation updates. The bounded map uses FIFO eviction (not true LRU) which is acceptable for preventing unbounded growth. The deferred compaction pattern correctly avoids lane deadlocks. Minor point: the TODO comment about adding a `"plugin"` trigger variant could be addressed but is acceptable as-is.
- No files require special attention - implementation is clean and previous review concerns have been resolved

<sub>Last reviewed commit: f98c1f6</sub>
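
The FIFO-versus-LRU distinction noted in the review can be illustrated with a small sketch. `BoundedCounterMap` is a hypothetical class written for this example, not the structure used in the PR; it relies only on the fact that a JavaScript `Map` iterates keys in insertion order and that updating an existing key does not move it.

```typescript
// Hypothetical bounded counter map with FIFO eviction.
class BoundedCounterMap {
  private readonly counts = new Map<string, number>();
  private readonly maxEntries: number;

  // The 1000 default mirrors the cap mentioned in the review.
  constructor(maxEntries: number = 1000) {
    this.maxEntries = maxEntries;
  }

  get(key: string): number {
    return this.counts.get(key) ?? 0;
  }

  increment(key: string): number {
    const next = this.get(key) + 1;
    if (!this.counts.has(key) && this.counts.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest insert.
      const oldest = this.counts.keys().next().value;
      if (oldest !== undefined) this.counts.delete(oldest);
    }
    // Setting an existing key keeps its original position (FIFO, not LRU).
    this.counts.set(key, next);
    return next;
  }

  reset(key: string): void {
    this.counts.delete(key);
  }

  get size(): number {
    return this.counts.size;
  }
}

const bounded = new BoundedCounterMap(2);
bounded.increment("a");
bounded.increment("b");
bounded.increment("a"); // "a" is updated but stays the oldest insert
bounded.increment("c"); // evicts "a", leaving "b" and "c"
```

In this sketch a recently updated key can still be evicted first, which is the trade-off the review calls acceptable: the goal is a hard bound on memory, not recency tracking.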