#8578: fix(cron): add failure limit and exponential backoff for isolated tasks
stale
Cluster: Cron Enhancements and Fixes
Fixes #8520
## What's Fixed
Prevents infinite retry loops in the cron scheduler for isolated tasks.
## Changes
- Add `MAX_CONSECUTIVE_FAILURES` (limit of 3 consecutive failures)
- Track consecutive failures in `CronJobState`
- Auto-disable jobs after 3 consecutive failures
- Apply exponential backoff to retries (delays of 1s, 2s, 4s)
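
As a rough illustration of the backoff schedule listed above, the delays can be derived by doubling a base delay on each consecutive failure. This is a minimal sketch, not the PR's code: `BASE_BACKOFF_MS` and `backoffDelayMs` are hypothetical names, and only `MAX_CONSECUTIVE_FAILURES` and the 1s/2s/4s values come from the description.

```typescript
// Only MAX_CONSECUTIVE_FAILURES is named in the PR description;
// BASE_BACKOFF_MS and backoffDelayMs are hypothetical, for illustration.
const MAX_CONSECUTIVE_FAILURES = 3;
const BASE_BACKOFF_MS = 1_000; // assumed base, matching the 1s first delay

/** Delay associated with the Nth consecutive failure (N >= 1). */
function backoffDelayMs(consecutiveFailures: number): number {
  if (!Number.isInteger(consecutiveFailures) || consecutiveFailures < 1) {
    throw new RangeError("consecutiveFailures must be a positive integer");
  }
  // Doubles on every additional failure: 1s, 2s, 4s, ...
  return BASE_BACKOFF_MS * 2 ** (consecutiveFailures - 1);
}

// One delay per failure count up to the limit.
const delays = Array.from(
  { length: MAX_CONSECUTIVE_FAILURES },
  (_, i) => backoffDelayMs(i + 1),
);
console.log(delays); // [ 1000, 2000, 4000 ]
```

Deriving the delay from the failure counter means no separate retry state has to be persisted; the counter in the job state is enough to recompute the delay.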
## Testing
- All 5054 tests pass :white_check_mark:
- 6 new comprehensive test cases pass
## :robot: AI-Assisted Development
Built with Gemini 3 Pro, Claude 4.5 Haiku and Opus, and OpenClaw. The code is fully tested and ready for production. The author confirms understanding of the implementation.
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds failure protection to the cron scheduler by introducing `CronJobState.consecutiveFailures`, enforcing a max consecutive error threshold (3) that auto-disables a job, and applying exponential backoff for isolated jobs on error. It also adds a focused vitest suite covering failure counting, disabling, and backoff behavior.
The changes are centered in `src/cron/service/timer.ts`’s `executeJob()` finish path, which now mutates job state based on `status` and adjusts `nextRunAtMs` scheduling decisions accordingly. `src/cron/types.ts` extends the persisted job state to track failures.
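
The finish-path behavior described here could look roughly like the sketch below. It is an assumption-laden illustration rather than the code in `src/cron/service/timer.ts`: the `CronJob` shape, `RunStatus`, `applyRunResult`, and the `regularNextRunAtMs` parameter are hypothetical, and only `consecutiveFailures`, `nextRunAtMs`, the threshold of 3, and the doubling delays come from the PR text. In this sketch the third failure disables the job, so only the 1s and 2s delays are ever scheduled; how the PR itself orders these steps is not stated.

```typescript
// Hypothetical types; the real CronJobState has more fields.
type RunStatus = "ok" | "error" | "skipped";

interface CronJobState {
  consecutiveFailures: number;
  nextRunAtMs?: number;
}

interface CronJob {
  enabled: boolean;
  state: CronJobState;
}

const MAX_CONSECUTIVE_FAILURES = 3;

/** Mutates job state after a run finishes (hypothetical helper). */
function applyRunResult(
  job: CronJob,
  status: RunStatus,
  nowMs: number,
  regularNextRunAtMs: number, // next run according to the normal schedule
): void {
  if (status === "ok") {
    job.state.consecutiveFailures = 0; // success clears the streak
    job.state.nextRunAtMs = regularNextRunAtMs;
    return;
  }
  if (status === "skipped") {
    // Assumption: a skip leaves the counter untouched, so earlier
    // failures carry forward. A real implementation must decide this.
    job.state.nextRunAtMs = regularNextRunAtMs;
    return;
  }
  job.state.consecutiveFailures += 1;
  if (job.state.consecutiveFailures >= MAX_CONSECUTIVE_FAILURES) {
    job.enabled = false; // auto-disable at the threshold
    delete job.state.nextRunAtMs;
    return;
  }
  // Exponential backoff: 1s after the first failure, 2s after the second.
  const delayMs = 1_000 * 2 ** (job.state.consecutiveFailures - 1);
  job.state.nextRunAtMs = nowMs + delayMs;
}

const job: CronJob = { enabled: true, state: { consecutiveFailures: 0 } };
applyRunResult(job, "error", 0, 60_000); // retry scheduled at 1000
applyRunResult(job, "error", 1_000, 60_000); // retry scheduled at 3000
applyRunResult(job, "error", 3_000, 60_000); // threshold reached
console.log(job.enabled, job.state.consecutiveFailures); // false 3
```

The `skipped` branch is where the carried-forward counter is easiest to overlook: a job that fails twice, is skipped, and then fails once more would be disabled in this sketch even though its failures were not strictly back to back.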
<h3>Confidence Score: 3/5</h3>
- This PR is likely safe to merge, but there are a couple of behavioral edge cases worth clarifying before landing.
- Core approach (failure counter + disable + backoff) is straightforward and well-covered by tests, but `skipped` handling can unintentionally carry failures forward, and removing the `finally`-block resync may affect long-running job scheduling/metadata. Neither is a compile-time issue, but both can cause surprising runtime behavior.
- File needing the most attention: `src/cron/service/timer.ts`
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #8825: fix: prevent cron infinite retry loop with exponential backoff · by dbottme · 2026-02-04 · 89.8%
- #12303: fix(cron): correct nextRunAtMs calculation and prevent timer stall · by colddonkey · 2026-02-09 · 82.3%
- #8698: fix(cron): default enabled to true for new jobs · by emmick4 · 2026-02-04 · 82.3%
- #18144: fix(cron): clear stuck runningAtMs after timeout and add maintenanc... · by taw0002 · 2026-02-16 · 82.1%
- #5428: fix(Cron): prevent one-shot loop on skip · by imshrishk · 2026-01-31 · 81.4%
- #3693: fix(cron): delete deleteAfterRun jobs regardless of execution status · by HirokiKobayashi-R · 2026-01-29 · 81.3%
- #5179: fix(cron): recover stale running markers · by thatdaveb · 2026-01-31 · 80.1%
- #8418: fix: notify user after consecutive heartbeat/cron failures · by liaosvcaf · 2026-02-04 · 80.1%
- #12086: fix(cron): ensure timer callback fires for scheduled jobs · by divol89 · 2026-02-08 · 80.0%
- #12443: fix(cron): don't advance past-due jobs that haven't been executed · by rummangeminicode · 2026-02-09 · 80.0%