
#12209: fix(skills): refresh stale skill snapshot after gateway restart

by mcaxtr open 2026-02-09 00:57
size: M trusted-contributor experienced-contributor
## Summary

- Fix stale skills in existing sessions after gateway restart (#12092)
- When the gateway restarts, the in-memory skills version resets to 0 while sessions retain snapshots from the prior process (version > 0)
- The `shouldRefreshSnapshot` check required `snapshotVersion > 0`, so it never triggered a rebuild after restart
- Add restart detection: when in-memory version is 0 but persisted version > 0, rebuild the snapshot

## Root Cause

`getSkillsSnapshotVersion()` returns from in-memory state (`workspaceVersions` / `globalVersion` in `refresh.ts`), which resets to 0 on process restart. The comparison at `session-updates.ts:147-148` was:

```ts
const shouldRefreshSnapshot =
  snapshotVersion > 0 && (nextEntry?.skillsSnapshot?.version ?? 0) < snapshotVersion;
```

Since `snapshotVersion` is 0 after restart, the condition `snapshotVersion > 0` is always false, so stale snapshots are reused forever.

## Fix

Extend the condition to also detect the restart scenario:

```ts
const shouldRefreshSnapshot =
  (snapshotVersion > 0 && persistedVersion < snapshotVersion) ||
  (snapshotVersion === 0 && persistedVersion > 0);
```

The second clause detects: "process just started (version 0) but session has a snapshot from a prior lifetime (version > 0)" → rebuild.

## Test Plan

- [x] Write failing test reproducing the restart scenario (stale snapshot returned)
- [x] Confirm test fails before fix
- [x] Implement fix (2-line change in `session-updates.ts`)
- [x] Confirm all 4 tests pass after fix
- [x] `pnpm build` passes
- [x] `pnpm check` passes (lint + format)
- [x] `codex review --base main` returns zero issues

### TDD: All 4 new tests fail before, pass after

1. **Restart scenario**: in-memory version 0, persisted version > 0 → rebuilds snapshot
2. **Normal operation**: in-memory version matches persisted → reuses snapshot
3. **Watcher fired**: in-memory version higher than persisted → rebuilds snapshot
4. **No prior snapshot**: no existing snapshot, version 0 → builds fresh

Fixes #12092

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR updates `ensureSkillSnapshot` to correctly refresh a session’s persisted skills snapshot after a gateway restart. It introduces a restart-detection clause: if the in-memory snapshot version resets to `0` (fresh process) but the session already has a persisted snapshot with `version > 0` from a prior process lifetime, the snapshot is rebuilt instead of reused indefinitely.

It also adds a focused Vitest suite covering:

- restart scenario (memory version 0 + persisted > 0 → rebuild)
- normal reuse when versions match
- rebuild when watcher bumps in-memory version above persisted
- edge case when `sessionEntry` is missing but `sessionStore` contains the stale snapshot
- fresh snapshot build when no prior snapshot exists.

These changes fit into the existing skills refresh model where `getSkillsSnapshotVersion()` is maintained in-memory (and resets on restart), while session snapshots are persisted in the session store.

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk.
- The change is narrowly scoped to the snapshot refresh decision logic, aligns with the described restart root cause, and is covered by targeted tests for restart, normal reuse, watcher refresh, and missing-sessionEntry edge cases. No additional call sites were affected beyond `ensureSkillSnapshot`, and the new logic deterministically rebuilds only when the persisted snapshot is known-stale relative to the process lifetime or version bump.
- No files require special attention

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
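
For readers who want to check the refresh condition against the four TDD scenarios, here is a minimal sketch that isolates the decision as a pure function. The standalone function signature, the `scenarios` table, and the concrete version numbers are assumptions made for illustration; in the PR the expression lives inline in `session-updates.ts`, and `persistedVersion` is assumed here to be `nextEntry?.skillsSnapshot?.version ?? 0`, as in the original comparison.

```typescript
// Hypothetical standalone form of the refresh decision described in this PR.
// The real code computes this inline; this signature is illustrative only.
function shouldRefreshSnapshot(snapshotVersion: number, persistedVersion: number): boolean {
  // Clause 1 (pre-existing): the watcher bumped the in-memory version past the persisted one.
  const watcherBumped = snapshotVersion > 0 && persistedVersion < snapshotVersion;
  // Clause 2 (added by the fix): fresh process (version 0) but the session still
  // carries a snapshot from a prior process lifetime (version > 0).
  const restarted = snapshotVersion === 0 && persistedVersion > 0;
  return watcherBumped || restarted;
}

// Illustrative inputs for the four scenarios; the version numbers are made up.
const scenarios: Array<{ name: string; inMemory: number; persisted: number }> = [
  { name: "restart scenario", inMemory: 0, persisted: 3 },
  { name: "normal operation", inMemory: 2, persisted: 2 },
  { name: "watcher fired", inMemory: 3, persisted: 2 },
  { name: "no prior snapshot", inMemory: 0, persisted: 0 },
];

for (const s of scenarios) {
  console.log(`${s.name}: refresh = ${shouldRefreshSnapshot(s.inMemory, s.persisted)}`);
}
```

Under these assumptions the predicate is true for the restart and watcher cases and false for the other two. In the "no prior snapshot" case the predicate alone does not request a refresh, so the fresh build presumably comes from a separate missing-snapshot branch in `ensureSkillSnapshot`; that branch is not shown in the PR description and is not modelled here.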
