
#16915: fix: await compaction hooks with timeout to prevent cross-session data bleed

by maximalmargin · open · 2026-02-15 07:34
agents size: S
## Problem

Compaction hooks (`before_compaction` / `after_compaction`) were fire-and-forget:

- `process.chdir` was restored before hooks finished, so hooks ran in the wrong working directory
- Message arrays were passed by reference, allowing hooks to mutate session state
- No timeout meant a stuck hook could hang lane cleanup indefinitely

This is especially dangerous in multi-agent setups where multiple sessions compact concurrently.

Fixes #15990

## Solution

1. **Deep-clone messages** passed to hooks using `structuredClone` (with a manual fallback for non-cloneable objects)
2. **Await hook promises** via `Promise.allSettled()` in `finally` blocks before restoring `process.chdir`/env
3. **Add a 10s timeout** (`waitForHookWithTimeout`) so stuck hooks log a warning and release the lane
4. **Apply the same pattern to both code paths:**
   - `compactEmbeddedPiSessionDirect` (direct compaction)
   - `handleAutoCompactionStart` / `handleAutoCompactionEnd` (auto-compaction lifecycle)

## Tests

- New `compact.test.ts`: verifies hooks are awaited before the `process.chdir` restore, and that messages are deep-cloned (mutations don't leak back)
- Extended `wired-hooks-compaction.test.ts`: verifies await behavior and clone isolation in the auto-compaction handlers

<!-- greptile_comment -->
<h3>Greptile Summary</h3>

This PR fixes a real concurrency issue where compaction hooks were fire-and-forget, causing `process.chdir` to be restored before hooks finished (so hooks ran in the wrong working directory) and allowing hooks to mutate session state via shared message references.
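The clone and timeout helpers named in the solution could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper names come from the PR, but the bodies and the `HOOK_TIMEOUT_MS` constant are assumptions based on the description of a 10s timeout and a `structuredClone` fallback.

```typescript
// Illustrative sketch; the real implementations in compact.ts may differ.
const HOOK_TIMEOUT_MS = 10_000; // assumed constant matching the "10s timeout"

// Deep-clone a message so hooks cannot mutate session state.
// structuredClone throws on non-cloneable values (e.g. functions),
// so fall back to a JSON round-trip.
function cloneMessageForHook<T>(message: T): T {
  try {
    return structuredClone(message);
  } catch {
    return JSON.parse(JSON.stringify(message)) as T;
  }
}

function cloneMessagesForHook<T>(messages: T[]): T[] {
  return messages.map(cloneMessageForHook);
}

// Race a hook promise against a timer so a stuck hook cannot hang
// lane cleanup. A timed-out hook only logs a warning; rejection is
// swallowed here because the caller uses Promise.allSettled anyway.
async function waitForHookWithTimeout(
  hook: Promise<unknown>,
  label: string,
  timeoutMs: number = HOOK_TIMEOUT_MS,
): Promise<void> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => resolve("timeout"), timeoutMs);
  });
  const settled = hook.then(
    () => "done" as const,
    () => "done" as const,
  );
  const result = await Promise.race([settled, timedOut]);
  clearTimeout(timer);
  if (result === "timeout") {
    console.warn(`compaction hook "${label}" exceeded ${timeoutMs}ms; releasing lane`);
  }
}
```

Per the PR description, the compaction paths would then await something like `Promise.allSettled(pending.map((p) => waitForHookWithTimeout(p, name)))` inside a `finally` block, before restoring `process.chdir` and the environment.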
The fix is well-structured:

- Deep-clone messages via `structuredClone` (with a manual fallback) before passing them to hooks
- Await hook promises in `finally` blocks via `Promise.allSettled` before restoring `process.chdir`/env
- Add a 10s timeout (`waitForHookWithTimeout`) to prevent stuck hooks from hanging lane cleanup
- Apply the same pattern to both direct compaction and the auto-compaction lifecycle handlers

Key observations:

- The `void` prefix on the now-async auto-compaction handlers in `pi-embedded-subscribe.handlers.ts` is correct: the event handler is synchronous by design, and internal error handling (`.catch`) prevents unhandled promise rejections
- The type change to `PluginHookAfterCompactionEvent` (adding `messages?: unknown[]`) correctly aligns the type with runtime behavior
- ~63 lines of helper code (`waitForHookWithTimeout`, `cloneMessageForHook`, `cloneMessagesForHook`) are duplicated across `compact.ts` and `pi-embedded-subscribe.handlers.compaction.ts`; extracting them to a shared module would reduce the maintenance burden
- Tests are thorough: they verify hooks are awaited before the cwd restore, messages are deep-cloned (mutations don't leak), and stuck hooks time out correctly

<h3>Confidence Score: 4/5</h3>

- This PR is safe to merge: it fixes a real concurrency bug with correct timeout/cleanup semantics and thorough tests.
- The core logic changes are sound: hooks are properly awaited before cwd/env restoration, messages are deep-cloned to prevent mutation leaks, and timeouts prevent indefinite hangs. The only concern is the duplication of ~63 lines of helper functions across two files, a maintainability issue rather than a correctness one. Tests cover the key scenarios well.
- No files have correctness issues. `src/agents/pi-embedded-subscribe.handlers.compaction.ts` and `src/agents/pi-embedded-runner/compact.ts` share duplicated helper code that could be extracted to a shared module.
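The `void` pattern the review calls out can be illustrated as below. The handler name mirrors the PR, but the signatures and bodies are hypothetical stubs, not the actual code in `pi-embedded-subscribe.handlers.ts`.

```typescript
// Sketch of firing async work from a synchronous event handler.
type AutoCompactionEvent = { sessionId: string };

async function handleAutoCompactionStart(event: AutoCompactionEvent): Promise<void> {
  // In the PR this clones messages and awaits before_compaction hooks;
  // here it is a stub that can fail, to show the rejection path.
  if (!event.sessionId) throw new Error("missing session id");
}

// The subscriber stays synchronous by design: `void` marks the promise
// as intentionally un-awaited, and the trailing .catch contains any
// failure so no unhandled rejection escapes.
function onAutoCompactionStart(event: AutoCompactionEvent): void {
  void handleAutoCompactionStart(event).catch((err) => {
    console.warn(`auto-compaction handler failed for "${event.sessionId}":`, err);
  });
}
```

Because the rejection is consumed by `.catch`, even a failing handler never throws into the synchronous event-dispatch path.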
<sub>Last reviewed commit: de0c4eb</sub>
<!-- /greptile_comment -->
