← Back to PRs

#10283: fix(agents): close TOCTOU race in session write lock acquisition

by programming-pupil open 2026-02-06 09:10 View on GitHub →
agents stale
## Problem `acquireSessionWriteLock` has a TOCTOU (Time-of-Check to Time-of-Use) race condition when taking over stale locks: ```typescript if (stale || !alive) { await fs.rm(lockPath, { force: true }); // Step 1: Delete stale lock continue; // Step 2: Loop back to create new lock } // RACE WINDOW: Another process can create the lock between steps 1 and 2 ``` This can cause two processes to both believe they hold the lock, leading to data corruption in session files. ## Scenario 1. Process A detects stale lock, deletes it 2. Process B (running concurrently) also detects stale lock, deletes it 3. Process A creates new lock file 4. Process B creates new lock file (overwrites A's lock!) 5. Both processes think they have exclusive access ## Solution Use atomic rename instead of delete+create: 1. Create a temporary lock file with unique name (`{lockPath}.{pid}.{timestamp}`) 2. Use `fs.rename()` to atomically replace the stale lock 3. If rename fails (another process won), clean up temp file and retry `fs.rename()` is atomic on POSIX systems, closing the race window. ## Impact - Prevents potential data corruption in concurrent session access - No behavioral change for single-process usage - Minimal performance impact (one extra file operation only when taking over stale locks) <!-- greptile_comment --> <h2>Greptile Overview</h2> <h3>Greptile Summary</h3> - Updates `acquireSessionWriteLock` to attempt “atomic” stale-lock takeover by creating a temp lock file and renaming it over the existing lock. - Adds temp-file cleanup and retries when takeover fails. - Keeps the existing in-process reentrancy counter via `HELD_LOCKS` and existing release behavior. - Main concern: the new rename-based takeover does not guarantee exclusivity on POSIX and the post-takeover `r+` open no longer represents an exclusive lock, so concurrent writers can still occur. <h3>Confidence Score: 2/5</h3> - This PR is not safe to merge as-is due to a still-present race that can allow dual lock owners during stale-lock takeover. - The intended TOCTOU fix relies on `rename()` overwriting the destination; on POSIX this can let multiple contenders succeed (last writer wins), and the stored `r+` handle after takeover doesn’t provide exclusivity. That means the core correctness guarantee (single writer) is not restored under contention. - src/agents/session-write-lock.ts <!-- greptile_other_comments_section --> <sub>(5/5) You can turn off certain types of comments like style [here](https://app.greptile.com/review/github)!</sub> <!-- /greptile_comment -->

Most Similar PRs