#15193: fix(heartbeat): keep scheduler chain alive on requests-in-flight skip
stale
size: XS
Cluster:
Heartbeat Management Improvements
## Summary
Fix heartbeat runner control flow so `requests-in-flight` means **defer**, not early exit.
## Problem
`requests-in-flight` is documented as "skip now, retry later", but the current loop exits early in this branch, making scheduling continuity branch-dependent.
## Root cause
In `src/infra/heartbeat-runner.ts`, the `requests-in-flight` path returns from inside the per-agent loop.
## Fix
In that branch, switch to loop continuation (`continue`) so execution reaches the shared end-of-run scheduling path.
This complements:
- #14901 (scheduler continuity when `runOnce()` throws)
- #14396 (busy-lane skip conditions)
## Validation
- ✅ `src/infra/heartbeat-wake.test.ts`
- ✅ `src/infra/heartbeat-runner.ghost-reminder.test.ts`
## Risk
Minimal diff, no config/API changes, preserves intended "skip + retry later" semantics.
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR changes the heartbeat runner’s per-agent control flow so the `requests-in-flight` skip path uses `continue` instead of returning early, aiming to keep the shared end-of-run scheduling path (`scheduleNext()`) consistent.
However, `startHeartbeatRunner` is also the wake handler used by `heartbeat-wake.ts`, and that wake layer depends on the runner returning `{ status: "skipped", reason: "requests-in-flight" }` to trigger a short retry backoff. With the new `continue`, the runner can lose that skip reason and fall through to `{ status: "skipped", reason: isInterval ? "not-due" : "disabled" }`, which prevents the wake layer from retrying promptly under main-lane load.
<h3>Confidence Score: 2/5</h3>
- This PR has a functional regression in the wake retry control flow and is not safe to merge as-is.
- Although the change keeps `scheduleNext()` reachable, it causes `startHeartbeatRunner` to stop returning the `requests-in-flight` skip reason that `heartbeat-wake.ts` uses to schedule a fast retry. Under load, this can delay or suppress retries and misreport manual/exec/hook wakes as "disabled".
- src/infra/heartbeat-runner.ts
<sub>Last reviewed commit: a23f144</sub>
<!-- greptile_other_comments_section -->
<sub>(3/5) Reply to the agent's comments like "Can you suggest a fix for this @greptileai?" or ask follow-up questions!</sub>
<!-- /greptile_comment -->
Most Similar PRs
#19270: fix: retry event-driven heartbeats blocked by requests-in-flight
by deggertsen · 2026-02-17
83.7%
#17801: fix(heartbeat): enforce interval guard for non-interval wake reasons
by aldoeliacim · 2026-02-16
81.4%
#12786: fix: drop heartbeat runs that arrive while another run is active
by mcaxtr · 2026-02-09
81.0%
#19745: fix(heartbeat): enforce interval check regardless of trigger source
by misterdas · 2026-02-18
80.7%
#13574: fix(heartbeat): skip when session has active run
by 1kuna · 2026-02-10
80.6%
#14241: fix(heartbeat): propagate originating session key for exec event qu...
by aldoeliacim · 2026-02-11
79.2%
#15575: fix(heartbeat): suppress prefixed HEARTBEAT_OK ack replies (#15505)
by TsekaLuk · 2026-02-13
76.4%
#21014: fix(cron): suppress main-session summary for HEARTBEAT_OK responses
by nickjlamb · 2026-02-19
76.2%
#22277: fix: prevent heartbeat model override from bleeding into main session
by zhangjunmengyang · 2026-02-21
76.1%
#11657: fix(cron): treat skipped heartbeat as ok for one-shot jobs
by DukeDeSouth · 2026-02-08
75.9%