← Back to PRs

#16132: fix(cron): prevent duplicate job fires via MIN_REFIRE_GAP_MS guard

by widingmarcus-cyber open 2026-02-14 09:42 View on GitHub →
stale size: S trusted-contributor
## fix(cron): prevent duplicate job fires via MIN_REFIRE_GAP_MS guard Fixes #16094 ### Problem After upgrading to 2026.2.13, cron jobs fire 2-3 times at the same scheduled time instead of once. The reporter observed: | Run | Start time | Duration | |-----|-----------|----------| | 1 | 16:00:00.009 | 155,721ms | | 2 | 16:02:29.514 | 9,534ms | Note: Run 2 started **6 seconds before** Run 1 finished — they overlapped. ### Root Cause When a long-running isolated cron job exceeds `MAX_TIMER_DELAY_MS` (60s), the re-arm logic in `onTimer` sets a recurring 60s timer to keep the scheduler alive. When the job finishes: 1. The `finally` block sets `state.running = false` and calls `armTimer(state)` 2. `armTimer` clears `state.timer` and sets a new timer 3. **But**: a re-arm timer callback may have already fired and queued itself in Node.js's event loop macrotask queue before `clearTimeout` could cancel it 4. The queued callback runs, finds `state.running === false`, enters the main execution path 5. `findDueJobs` finds the same job still "due" (because `nextRunAtMs` hasn't been recomputed yet or was reloaded from disk with the old value) 6. The job fires again → **duplicate execution** ### Fix Add a `MIN_REFIRE_GAP_MS` (30s) guard to `findDueJobs`, `runMissedJobs`, and `runDueJobs`: if a job's `lastRunAtMs` is more recent than the gap, skip it. This is a defense-in-depth measure that catches the race regardless of which code path triggers re-entry. The 30s gap is conservative — cron expressions have 60s minimum granularity, so a legitimate re-fire can never happen within 30s. ### Changed Files | File | Change | |------|--------| | `src/cron/service/timer.ts` | Add `MIN_REFIRE_GAP_MS` constant and guard in `findDueJobs`, `runMissedJobs`, `runDueJobs` | | `src/cron/service.prevents-refire-within-gap.test.ts` | New test: verifies recently-run jobs are skipped, old jobs still fire | ### Testing - All 103 cron tests pass (101 existing + 2 new) - Lint + format pass (oxlint + oxfmt) <!-- greptile_comment --> <h3>Greptile Summary</h3> Adds schedule-aware `MIN_REFIRE_GAP_MS` guard to prevent duplicate cron job executions caused by timer re-arm race conditions. The fix correctly addresses #16094 where long-running jobs could fire multiple times due to queued timer callbacks executing before `nextRunAtMs` was recomputed. Key changes: - Added `minRefireGapMs()` helper with schedule-aware logic: 30s for `cron` (60s granularity), `everyMs/2` capped at 30s for `every`, 0s for `at` (one-shot) - Guards added to `findDueJobs`, `runMissedJobs`, and `runDueJobs` to skip jobs where `now - lastRunAtMs < minRefireGapMs(job)` - Comprehensive test coverage including edge cases for short-interval `every` schedules The implementation is defense-in-depth and doesn't rely on fixing the underlying race condition, making it robust against re-entry from any code path. <h3>Confidence Score: 5/5</h3> - This PR is safe to merge with minimal risk - The implementation is well-designed with schedule-aware logic that prevents false positives, comprehensive test coverage validates the fix including edge cases, and the defense-in-depth approach makes it robust. Previous reviewer concerns about breaking `every` schedules have been addressed by using half-interval gaps. All 103 cron tests pass. - No files require special attention <sub>Last reviewed commit: fe9ee91</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment -->

Most Similar PRs