← Back to PRs

#11657: fix(cron): treat skipped heartbeat as ok for one-shot jobs

by DukeDeSouth open 2026-02-08 03:47 View on GitHub →
size: S
## Human View ### Summary Fixes #11612. When a one-shot cron job (`schedule.kind="at"`, `deleteAfterRun: true`) runs with `wakeMode: "now"`, the system event is enqueued **before** the synchronous heartbeat attempt. If the heartbeat returns `"skipped"` (e.g. another request in flight), the job was previously marked as `"skipped"` — which prevented both `deleteAfterRun` cleanup and the `computeJobNextRunAtMs` guard from returning `undefined`. Since the original `atMs` is in the past, the job became immediately due again, causing **infinite re-fires** (the exact symptom reported in #11612: job fired at 01:51, then again at 02:11, 02:13, 02:14). #### Root cause ``` enqueueSystemEvent(text) ← event delivered runHeartbeatOnce() ← returns "skipped" finish("skipped") ← shouldDelete requires "ok" computeJobNextRunAtMs() ← lastStatus="skipped" ≠ "ok" → returns past atMs → job is due again → re-fires → event delivered AGAIN → loop ``` #### Fix The system event is already in the queue and **will** be processed by the next natural heartbeat cycle. Therefore, both `"ran"` and `"skipped"` heartbeat results should be treated as `"ok"` — only `"error"` remains a failure. This is consistent with the existing `wakeMode: "next-heartbeat"` path (line 182), which already finishes with `"ok"` unconditionally after enqueuing. #### Changes - **`src/cron/service/timer.ts`** — In `executeJob()`, swap the `if/else` to only check for `"error"`, treating everything else as `"ok"`. - **Test** — Adds a regression test that verifies a one-shot `deleteAfterRun` job is properly removed even when the synchronous heartbeat reports `"skipped"`. ### Test plan - [x] New test: `deletes one-shot job even when heartbeat is skipped (#11612)` — passes - [x] All 59 existing cron tests pass (13 test files, 0 failures) - [x] No changes to recurring cron jobs (`every`, `cron` schedule kinds) --- ## AI View (DCCE Protocol v1.0) ### Metadata - **Generator**: Claude (Anthropic) via Cursor IDE - **Methodology**: AI-assisted development with human oversight and review ### AI Contribution Summary - Solution design and implementation ### Verification Steps Performed 1. Reproduced the reported issue 2. Analyzed source code to identify root cause 3. Implemented and tested the fix ### Human Review Guidance - Core changes are in: `src/cron/service/timer.ts`, `package-lock.json` Made with M7 [Cursor](https://cursor.com) <!-- greptile_comment --> <h2>Greptile Overview</h2> <h3>Greptile Summary</h3> This change updates the cron timer’s `executeJob()` main-session path to treat a synchronous heartbeat result of `"skipped"` the same as `"ran"` (i.e., finish the job with status `"ok"`), only treating `"error"` as a failure. This prevents one-shot (`schedule.kind="at"`) jobs with `deleteAfterRun: true` and `wakeMode: "now"` from being left in a `"skipped"` state and immediately re-firing. A regression test is added to cover the `runHeartbeatOnce() -> {status:"skipped"}` case and ensure the one-shot job is removed after enqueuing its system event. <h3>Confidence Score: 3/5</h3> - This PR is close to safe to merge but needs cleanup and a test fix. - The core cron logic change is small and aligns with the intended behavior for one-shot jobs, but the PR includes an unrelated `package-lock.json` addition and the new regression test asserts against a potentially stale job object rather than re-reading persisted state. - package-lock.json; src/cron/service.runs-one-shot-main-job-disables-it.test.ts <!-- greptile_other_comments_section --> <!-- /greptile_comment -->

Most Similar PRs