#12651: fix: prevent stale timeout from triggering duplicate message sends
agents
stale
Cluster: Cron Job Stability Fixes
## Summary
Fixes duplicate message sends when a laptop wakes from sleep. The embedded agent runner's `setTimeout` callback can fire **after** the prompt completes but **before** `clearTimeout` executes, because the callback runs at the next `await` yield point. This incorrectly sets `timedOut=true`, causing the caller to rebuild and re-send payloads from already-delivered `assistantTexts`.
## Root Cause
**Timeline from real logs:**
- 09:47:22 UTC: Reply sent successfully via dispatcher (thread reply)
- 09:47:22 UTC: `dispatch complete (queuedFinal=true, replies=1)`
- [laptop sleeps]
- 10:34:28 UTC: `embedded run timeout` fires (600s timer, stale)
- 10:34:28 UTC: **Duplicate standalone message** sent (different code path, no replyToMessageId)
**Race condition:** Between `activeSession.prompt()` resolving (lines 826-828) and `clearTimeout(abortTimer)` executing (line 880), the timer can fire if the laptop wakes during this window. JavaScript is single-threaded, so the timer callback cannot interrupt running code; it executes at the next `await` yield point (e.g., `waitForCompactionRetry` at line 842).
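The window described above can be reproduced deterministically with a small model of a timer queue. This is only an illustrative sketch: `FakeClock`, `simulateUnguardedRun`, and the 5-second prompt duration are hypothetical and do not exist in the codebase, and the model stands in for the real event loop rather than reproducing it. Only the 600s timeout comes from the logs.

```typescript
// Hypothetical model of a timer queue; NOT the real Node.js event loop.
type Timer = { id: number; dueAt: number; callback: () => void };

class FakeClock {
  now = 0;
  private nextId = 1;
  private timers: Timer[] = [];

  setTimeout(callback: () => void, delayMs: number): number {
    const id = this.nextId++;
    this.timers.push({ id, dueAt: this.now + delayMs, callback });
    return id;
  }

  clearTimeout(id: number): void {
    this.timers = this.timers.filter((t) => t.id !== id);
  }

  // Time passes (for example while the laptop sleeps); no callbacks run here.
  jump(ms: number): void {
    this.now += ms;
  }

  // Models an `await` yield point: every timer that is already due fires.
  yieldToEventLoop(): void {
    const due = this.timers.filter((t) => t.dueAt <= this.now);
    this.timers = this.timers.filter((t) => t.dueAt > this.now);
    for (const t of due) t.callback();
  }
}

// Simulates one run WITHOUT the guard flag and reports whether the
// timeout path was taken even though the prompt had already completed.
function simulateUnguardedRun(sleepMs: number): { timedOut: boolean } {
  const clock = new FakeClock();
  let timedOut = false;

  // 600s abort timer, matching the timeout seen in the logs.
  const abortTimer = clock.setTimeout(() => {
    timedOut = true; // stands in for abortRun(true)
  }, 600_000);

  clock.jump(5_000); // prompt() resolves (assumed 5s); reply is delivered
  clock.jump(sleepMs); // laptop sleeps before cleanup is reached
  clock.yieldToEventLoop(); // next await, e.g. waiting on compaction retry
  clock.clearTimeout(abortTimer); // too late if the timer has already fired

  return { timedOut };
}

console.log(simulateUnguardedRun(0).timedOut); // false: no sleep, timer cleared in time
console.log(simulateUnguardedRun(2_800_000).timedOut); // true: stale timer fired first
```

In this model the second call corresponds to the logged sequence: the reply has been delivered, yet the run is still marked as timed out because the overdue timer fires at the first yield point after wake.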
## Solution
Add a `promptSettled` guard flag that prevents the stale timer callback from calling `abortRun`:
```typescript
let promptSettled = false;
const abortTimer = setTimeout(() => {
  if (promptSettled) {
    // Prompt already completed; timer is stale (e.g. laptop sleep-wake).
    return;
  }
  abortRun(true);
}, timeoutMs);

// ... in the finally block after prompt() resolves:
promptSettled = true;
```
**Why this works:** The flag is set synchronously in the `finally` block (which always executes), **before** the next `await` where the timer could fire. Because JavaScript runs callbacks one at a time, the timer callback cannot run between the prompt settling and the flag being set, so no race remains.
**Why this is safe:** If the prompt genuinely times out (hasn't completed), `promptSettled` remains `false` and `abortRun(true)` proceeds normally. The fix only affects the stale-timer scenario.
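The two cases above (stale timer versus genuine timeout) can be checked in isolation by pulling the guard into a small closure. This is a sketch under assumptions, not the code in the PR: `createStaleTimerGuard`, `onTimerFired`, and `markSettled` are hypothetical names, and the PR inlines the flag in `attempt.ts` rather than using a helper.

```typescript
// Hypothetical helper that isolates the promptSettled guard for illustration.
function createStaleTimerGuard(abortRun: (isTimeout: boolean) => void) {
  let promptSettled = false;
  return {
    // Intended to be passed to setTimeout as the timer callback.
    onTimerFired(): void {
      if (promptSettled) {
        // Prompt already completed; the timer is stale, so do nothing.
        return;
      }
      abortRun(true);
    },
    // Intended to be called synchronously in the finally block after prompt().
    markSettled(): void {
      promptSettled = true;
    },
  };
}

// Records every abortRun call so both scenarios can be inspected.
const abortCalls: boolean[] = [];
const recordAbort = (isTimeout: boolean): void => {
  abortCalls.push(isTimeout);
};

// Stale-timer scenario: the prompt settles first, then the timer fires.
const staleGuard = createStaleTimerGuard(recordAbort);
staleGuard.markSettled();
staleGuard.onTimerFired();
console.log(abortCalls.length); // 0: the stale callback was ignored

// Genuine timeout: the timer fires while the prompt is still pending.
const liveGuard = createStaleTimerGuard(recordAbort);
liveGuard.onTimerFired();
console.log(abortCalls.length); // 1: abortRun(true) was called
```

Keeping the flag in a closure makes the invariant easy to test without real timers, at the cost of an extra indirection compared with the inline flag.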
## Test Plan
- [x] Existing `attempt.test.ts` tests pass (3/3)
- [x] No regressions in the `src/agents/pi-embedded-runner/` test suite (the failures seen on the fix branch also occur on the main branch and are unrelated, pre-existing issues)
- [ ] Manual verification: Send message → close laptop → reopen → verify no duplicate (requires actual sleep scenario)
## References
- Reported issue with logs showing exact timeline
- Code path traced through `attempt.ts` → `run.ts` → `dispatch-from-config.ts` → duplicate `deliver()` call
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This change updates the embedded runner attempt logic (`src/agents/pi-embedded-runner/run/attempt.ts`) to prevent a stale `setTimeout` callback from aborting a run after the prompt has already completed (notably after laptop sleep/wake).
A `promptSettled` guard flag is introduced: the timeout handler checks this flag and returns early if the prompt has finished, avoiding an incorrect `timedOut=true`/abort path that can cause duplicate message delivery via downstream retry/rebuild logic. The flag is set in the `finally` block immediately after `activeSession.prompt(...)` completes, before subsequent awaits (e.g., compaction retry waiting), and the timer is still cleared in the outer `finally` as before.
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk.
- The change is small, localized, and directly addresses a real race: a timeout firing after the prompt finishes but before `clearTimeout` runs at the next yield point. The guard is set in the prompt’s `finally` before any subsequent awaits, and the original timer cleanup remains in place, so behavior is preserved for genuine timeouts while preventing the stale-timer duplicate-send scenario.
- No files require special attention
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #18432: fix(agents): clear active run state immediately on embedded timeout (by BinHPdev · 2026-02-16 · 78.6%)
- #22411: fix(cron): cancel timed-out runs before side effects (by Takhoffman · 2026-02-21 · 75.1%)
- #21828: fix: acquire session write lock in delivery mirror and gateway chat... (by inkolin · 2026-02-20 · 74.9%)
- #10327: Fix: persist original prompt to transcript, not plugin-modified pro... (by GodsBoy · 2026-02-06 · 74.6%)
- #17265: fix: abort streaming runs after 90s of inactivity (by jg-noncelogic · 2026-02-15 · 74.1%)
- #18468: fix(agents): prevent infinite retry loops in sub-agent completion a... (by BinHPdev · 2026-02-16 · 74.0%)
- #17743: fix(agents): disable orphaned user message deletion that causes ses... (by clawrl3000 · 2026-02-16 · 74.0%)
- #19414: fix: respect job timeoutSeconds for stuck runningAtMs detection (by namabile · 2026-02-17 · 73.8%)
- #6268: fix: add timeout to compaction retry to prevent session lockout (by batumilove · 2026-02-01 · 73.7%)
- #12477: fix(agents): prevent TimeoutOverflowWarning when timeout is disabled (by skylarkoo7 · 2026-02-09 · 73.6%)