#12254: fix: add minimum token guard to prevent double compaction after flush
## Summary
- Add a 5% context-window floor to `shouldRunMemoryFlush()` in `memory-flush.ts`
- When `totalTokens` is below 5% of the context window, skip memory flush entirely
- Prevents a second lossy compaction pass from destroying already-compressed summaries
## Root Cause
After a successful compaction the session may drop to very few tokens (e.g. 1,489 on a 200K context window). The existing guard at line 100 checks `memoryFlushCompactionCount === compactionCount`, but if persistence of `memoryFlushCompactionCount` has a timing issue (e.g. the value is not flushed to disk before the next tick reads it), the guard fails and `shouldRunMemoryFlush()` returns `true`, triggering a second compaction on a tiny context that was already compressed.
The second pass re-summarizes the already-compressed summary, causing significant information loss.
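The failure mode can be sketched as follows. This is a simplified, hypothetical model of the count-based guard (names and shape are illustrative, not the actual `memory-flush.ts` implementation):

```typescript
// Hypothetical, simplified model of the count-based guard in memory-flush.ts.
interface SessionState {
  totalTokens: number;
  compactionCount: number;            // incremented after each compaction
  memoryFlushCompactionCount: number; // persisted after each flush; may lag on read
}

function shouldRunMemoryFlushCountGuardOnly(s: SessionState): boolean {
  // Existing guard: skip if we already flushed for this compaction cycle.
  if (s.memoryFlushCompactionCount === s.compactionCount) {
    return false;
  }
  return true;
}

// Failure mode: compaction #3 just ran and its flush completed, but the
// persisted count read back on the next tick is the stale value 2.
const stale: SessionState = {
  totalTokens: 1_489,
  compactionCount: 3,
  memoryFlushCompactionCount: 2,
};
console.log(shouldRunMemoryFlushCountGuardOnly(stale)); // true → second lossy compaction
```

With a stale count, the guard alone cannot tell "never flushed for this cycle" apart from "flushed, but the write didn't land yet", which is why a second, independent check on token usage is useful.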
## Fix
```typescript
// Never trigger memory flush when context usage is below 5% of the window.
const minFlushTokens = Math.max(1, Math.floor(contextWindow * 0.05));
if (totalTokens < minFlushTokens) {
  return false;
}
```
This is a defense-in-depth safety net: even if the compaction-count guard has a persistence timing issue, the token floor prevents a wasteful (and destructive) second compaction.
## Test plan
- [x] `memory-flush.test.ts` — 11 tests pass (including new regression test)
- [x] New test: "skips when totalTokens is below 5% of context window (post-compaction guard)" — reproduces #12170 scenario (1,489 tokens on 200K window)
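The regression scenario can be checked in isolation with the floor logic alone. This is an illustrative sketch (the real test lives in `memory-flush.test.ts` and exercises the actual `shouldRunMemoryFlush()`; `belowFlushFloor` is a hypothetical helper):

```typescript
// Hypothetical helper isolating the new 5% token floor.
function belowFlushFloor(totalTokens: number, contextWindow: number): boolean {
  const minFlushTokens = Math.max(1, Math.floor(contextWindow * 0.05));
  return totalTokens < minFlushTokens;
}

// #12170 scenario: 1,489 tokens on a 200K window → floor is 10,000 tokens.
console.log(belowFlushFloor(1_489, 200_000));  // true  → flush is skipped
console.log(belowFlushFloor(15_000, 200_000)); // false → flush may proceed
```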
Closes #12170
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds a defensive early-return to `shouldRunMemoryFlush()` so that a memory flush (and subsequent compaction) will not re-trigger when the session token usage is extremely low (defined as <5% of the model context window). The intent is to prevent a second lossy compaction pass from running immediately after an initial compaction, even if the existing `memoryFlushCompactionCount === compactionCount` guard fails due to persistence timing (referenced as #12170).
A regression test was added to cover the reported scenario (≈1.5K tokens remaining on a 200K window), asserting that `shouldRunMemoryFlush()` returns `false` under those conditions.
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk.
- The change is a small, deterministic guard in `shouldRunMemoryFlush()` with a focused regression test. The new behavior is intentional and limited to low-token post-compaction situations; no correctness issues were identified in the updated logic.
- No files require special attention
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
## Most Similar PRs
- #4999: fix(memory-flush): use contextTokens instead of totalTokens for thr... (Farfadium, 2026-01-30, 87.7%)
- #20713: fix(compaction): trigger memory flush after missed compaction cycles (zerone0x, 2026-02-19, 87.2%)
- #15945: fix(memory-flush): only write memoryFlushCompactionCount when compa... (aldoeliacim, 2026-02-14, 86.0%)
- #15196: fix: clear stale token totals after compaction (bufordtjustice2918, 2026-02-13, 83.7%)
- #15173: fix(session): reset totalTokens after compaction when estimate unav... (EnzoGaillardSystems, 2026-02-13, 83.2%)
- #12760: fix(memory-flush): fire on every compaction cycle instead of skippi... (lailoo, 2026-02-09, 82.2%)
- #9012: fix(memory): resilient flush for large sessions [AI-assisted] (cheenu1092-oss, 2026-02-04, 80.2%)
- #17041: fix(memory-flush): add softThresholdPercent for context-relative th... (Limitless2023, 2026-02-15, 79.9%)
- #19878: fix: Handle compaction when fallback model has smaller context window (gaurav10gg, 2026-02-18, 79.0%)
- #9620: fix: increase auto-compaction reserve buffer to 40k tokens (Arlo83963, 2026-02-05, 78.0%)