#13796: fix: skip recomputing nextRunAtMs for running cron jobs (#13739)
Cluster: Cron Job Management Fixes
## fix: skip recomputing nextRunAtMs for running cron jobs (#13739)
### Problem
Daily cron jobs systematically skip their scheduled execution, with `nextRunAtMs` advancing 48 hours instead of 24 hours.
### Root Cause
`recomputeNextRuns()` does not skip jobs that are currently running (`runningAtMs` is set). When a concurrent operation (e.g., `list`, `status`, `add`) triggers `recomputeNextRuns` while a job is executing, the function sees the job's `nextRunAtMs` as past-due and advances it to the next period (e.g., tomorrow). The schedule is therefore moved while the run is still in progress, preempting the post-run scheduling that `applyJobResult` performs for the current execution.
The race condition:
1. `onTimer` finds a due job, sets `runningAtMs`, persists, releases lock
2. Job starts executing (outside the lock)
3. Another operation (e.g., `list`) acquires the lock, calls `recomputeNextRuns`
4. `recomputeNextRuns` sees `nextRunAtMs` is past-due, advances it to tomorrow
5. `onTimer` finishes, `applyJobResult` computes `nextRunAtMs` = tomorrow (same value)
6. Result: job ran today but `nextRunAtMs` jumped to day-after-tomorrow
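Steps 1 through 4 above can be reproduced with a small model. This is a minimal sketch, not the code in `src/cron/service/jobs.ts`: only `nextRunAtMs` and `runningAtMs` come from the PR text, while the `CronJob` shape, the `everyMs` field, and `recomputeNextRunsUnguarded` are hypothetical stand-ins. It does not model `applyJobResult`; it only shows an unguarded recompute moving the schedule of a job that is still running.

```typescript
// Simplified job model. Only nextRunAtMs and runningAtMs are names from the PR;
// everything else here is a hypothetical stand-in for illustration.
interface CronJob {
  id: string;
  everyMs: number; // hypothetical fixed interval standing in for a real schedule
  nextRunAtMs: number;
  runningAtMs?: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Unguarded recompute: advances every past-due job, even one that is mid-run.
function recomputeNextRunsUnguarded(jobs: CronJob[], nowMs: number): void {
  for (const job of jobs) {
    while (job.nextRunAtMs <= nowMs) {
      job.nextRunAtMs += job.everyMs;
    }
  }
}

const dueAt = Date.UTC(2026, 1, 10, 9, 0, 0);
const job: CronJob = { id: "daily-report", everyMs: DAY_MS, nextRunAtMs: dueAt };

// Steps 1-2: the timer marks the job as running and starts it outside the lock.
job.runningAtMs = dueAt;

// Steps 3-4: a concurrent `list` call recomputes five seconds into the run.
recomputeNextRunsUnguarded([job], dueAt + 5000);

// The schedule has already moved a full period ahead while the job is still running.
console.log(job.nextRunAtMs === dueAt + DAY_MS); // true
```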
### Fix
Add a guard in `recomputeNextRuns` to skip jobs where `runningAtMs` is set (i.e., currently executing). The post-run scheduling in `applyJobResult` will handle setting the correct `nextRunAtMs` after execution completes.
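The guard could look roughly like the following. This is a sketch under assumptions, not the actual diff: the `CronJob` shape, the `everyMs` interval, and the advance loop are hypothetical, and only the idea of skipping jobs whose `runningAtMs` is set is taken from the description above.

```typescript
// Hypothetical simplified job model; only nextRunAtMs and runningAtMs are from the PR.
interface CronJob {
  id: string;
  everyMs: number; // hypothetical fixed interval standing in for a real schedule
  nextRunAtMs: number;
  runningAtMs?: number;
}

function recomputeNextRuns(jobs: CronJob[], nowMs: number): void {
  for (const job of jobs) {
    // Guard: a running job keeps its nextRunAtMs; post-run scheduling
    // (applyJobResult in the PR) is left to set the next value.
    if (typeof job.runningAtMs === "number") {
      continue;
    }
    while (job.nextRunAtMs <= nowMs) {
      job.nextRunAtMs += job.everyMs;
    }
  }
}

const HOUR_MS = 60 * 60 * 1000;
const now = Date.UTC(2026, 1, 10, 9, 0, 5);
const due = now - 5000;

const running: CronJob = { id: "running", everyMs: 24 * HOUR_MS, nextRunAtMs: due, runningAtMs: due };
const idle: CronJob = { id: "idle", everyMs: 24 * HOUR_MS, nextRunAtMs: due };

recomputeNextRuns([running, idle], now);

console.log(running.nextRunAtMs === due);              // true: untouched while running
console.log(idle.nextRunAtMs === due + 24 * HOUR_MS);  // true: advanced as before
```

Jobs that are not running are recomputed exactly as before, so the guard only changes behavior during the window in which a job is executing.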
Closes #13739
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
Adds a guard in `recomputeNextRuns` to skip jobs with `runningAtMs` set, preventing a race condition where concurrent operations (`list`, `status`, `add`) could advance `nextRunAtMs` while a job is executing, causing the job to skip its next scheduled run (e.g., 48-hour gap instead of 24 hours for daily jobs).
**Key changes:**
- Adds check at `src/cron/service/jobs.ts:128-130` to skip recomputation for running jobs
- Guard is placed after stuck-run cleanup but before the recompute logic
- Post-run scheduling in `applyJobResult` handles setting correct `nextRunAtMs` after execution
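The ordering described in these bullets (stuck-run cleanup first, then the guard, then the recompute) might be sketched as follows. The `STUCK_RUN_MS` threshold, the `CronJob` shape, and the `everyMs` interval are assumptions made for illustration; the source states only that the guard sits after stuck-run cleanup and before the recompute logic.

```typescript
// Hypothetical simplified job model; only nextRunAtMs and runningAtMs are from the PR.
interface CronJob {
  id: string;
  everyMs: number; // hypothetical fixed interval standing in for a real schedule
  nextRunAtMs: number;
  runningAtMs?: number;
}

const HOUR_MS = 60 * 60 * 1000;
const STUCK_RUN_MS = 2 * HOUR_MS; // hypothetical threshold, chosen for this sketch

function recomputeWithCleanup(jobs: CronJob[], nowMs: number): void {
  for (const job of jobs) {
    // 1. Stuck-run cleanup: drop a running marker that is older than the threshold.
    if (typeof job.runningAtMs === "number" && nowMs - job.runningAtMs > STUCK_RUN_MS) {
      delete job.runningAtMs;
    }
    // 2. Guard: a job that is still marked as running is left alone.
    if (typeof job.runningAtMs === "number") {
      continue;
    }
    // 3. Recompute: advance past-due jobs to their next period.
    while (job.nextRunAtMs <= nowMs) {
      job.nextRunAtMs += job.everyMs;
    }
  }
}

const nowMs = Date.UTC(2026, 1, 10, 12, 0, 0);
const stuckSince = nowMs - 3 * HOUR_MS;
const activeSince = nowMs - 60 * 1000;

const stuck: CronJob = { id: "stuck", everyMs: 24 * HOUR_MS, nextRunAtMs: stuckSince, runningAtMs: stuckSince };
const active: CronJob = { id: "active", everyMs: 24 * HOUR_MS, nextRunAtMs: activeSince, runningAtMs: activeSince };

recomputeWithCleanup([stuck, active], nowMs);

console.log(stuck.runningAtMs === undefined);                  // true: marker cleared
console.log(stuck.nextRunAtMs === stuckSince + 24 * HOUR_MS);  // true: recovered and rescheduled
console.log(active.nextRunAtMs === activeSince);               // true: skipped by the guard
```

Placing the cleanup before the guard matters: if the guard came first, a job whose marker was never cleared would be skipped on every recompute and never rescheduled.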
**How it works:**
The fix prevents the race condition where:
1. `onTimer` sets `runningAtMs` and releases lock
2. Job executes (outside lock)
3. Another operation acquires lock, calls `recomputeNextRuns`
4. Without this fix: `recomputeNextRuns` sees past-due `nextRunAtMs` and advances it
5. With this fix: `recomputeNextRuns` skips the job, letting `applyJobResult` schedule correctly after execution
The guard is consistent with existing patterns used in `findDueJobs`, `runMissedJobs`, `runDueJobs`, and the `run` operation.
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk
- The fix is surgical, well-documented, and follows existing patterns throughout the codebase. The guard is placed correctly after stuck-run cleanup, ensuring genuinely stuck jobs are still recovered. The change addresses a clear race condition with a minimal, defensive check that aligns with how `runningAtMs` is already used in multiple other locations (`findDueJobs`, `runMissedJobs`, `runDueJobs`, `isJobDue`, and the `run` operation).
- No files require special attention
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #12443: fix(cron): don't advance past-due jobs that haven't been executed (by rummangeminicode · 2026-02-09 · 90.0%)
- #12982: fix(cron): prevent status/list from advancing overdue job nextRunAtMs (by hclsys · 2026-02-10 · 89.4%)
- #17838: fix: prevent cron job spin loop by not recomputing nextRunAtMs for ... (by MisterGuy420 · 2026-02-16 · 88.6%)
- #11108: fix(cron): prevent missed jobs from being skipped on timer recompute (by Bentlybro · 2026-02-07 · 88.2%)
- #14667: fix: preserve missed cron runs when updating job schedule (by WalterSumbon · 2026-02-12 · 85.7%)
- #12747: fix: catch up missed cron-expression job runs on restart (by obin94-commits · 2026-02-09 · 85.6%)
- #12448: fix: prevent cron list/status from silently skipping due jobs (by Yida-Dev · 2026-02-09 · 84.6%)
- #16132: fix(cron): prevent duplicate job fires via MIN_REFIRE_GAP_MS guard (by widingmarcus-cyber · 2026-02-14 · 84.4%)
- #10918: fix(cron): add tolerance for timer precision and skip due jobs in r... (by Cherwayway · 2026-02-07 · 84.4%)
- #9684: fix: cron race condition - run due jobs before recomputing nextRunA... (by divol89 · 2026-02-05 · 83.8%)