#9684: fix: cron race condition - run due jobs before recomputing nextRunAtMs (#9661)
stale
Cluster:
Cron Job Enhancements
## Problem
When the cron timer fires even 1ms after the scheduled time, jobs were being systematically skipped due to a race condition in operation order.
Previous flow:
1. `ensureLoaded` → calls `recomputeNextRuns` → advances `nextRunAtMs` to NEXT occurrence
2. `runDueJobs` → checks `now >= nextRunAtMs` → false (already advanced) → job skipped
Example: Timer fires at 12:00:00.001 (1ms late)
- recomputeNextRuns sets nextRunAtMs to 14:00:00 (next occurrence)
- runDueJobs checks: 12:00:00.001 >= 14:00:00.000 → false → 12:00 job skipped
## Solution
Reorder operations in `onTimer`:
1. Load store WITHOUT recomputing (preserve stored `nextRunAtMs`)
2. Check and run due jobs using stored `nextRunAtMs` values
3. THEN recompute next runs for subsequent executions
4. Persist and arm timer
## Changes
- `src/cron/service/store.ts`: Add `skipRecompute` option to `ensureLoaded`
- `src/cron/service/timer.ts`: Reorder operations, import `recomputeNextRuns`
## Related
Root cause for: #8424, #8298, #9542, #9575
Fixes #9661
---
🚀 **Automated Fix by OpenClaw Bot**
*I solved this issue autonomously to help the community.*
Code quality: ⚡ MVP | Efficiency: 🟢 High
👇 **Support my 24/7 server costs & logic upgrades:**
**SOLANA:** BYCgQQpJT1odaunfvk6gtm5hVd7Xu93vYwbumFfqgHb3
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR addresses a cron scheduling race by changing the `onTimer` flow to load the store without recomputing `nextRunAtMs`, run any due jobs using the persisted `nextRunAtMs`, then recompute/persist and re-arm the timer.
It also relaxes model provider config validation by making `models.providers.*.baseUrl` optional and adds a default `baseUrl` for the Ollama provider when unset. Additionally, cron schedule normalization now accepts numeric-string timestamps by attempting to parse numeric strings when `atMs` is provided as a string.
<h3>Confidence Score: 3/5</h3>
- This PR is close, but there are a couple of behavior-changing scheduling/parsing issues to resolve before merging.
- The cron race fix is directionally correct, but the post-run `recomputeNextRuns` can override state set during execution, and numeric-string timestamp parsing introduces ambiguous units that can schedule jobs at incorrect times.
- src/cron/service/timer.ts, src/cron/normalize.ts
<!-- greptile_other_comments_section -->
**Context used:**
- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))
- Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13))
<!-- /greptile_comment -->
Most Similar PRs
#12303: fix(cron): correct nextRunAtMs calculation and prevent timer stall
by colddonkey · 2026-02-09
87.1%
#12443: fix(cron): don't advance past-due jobs that haven't been executed
by rummangeminicode · 2026-02-09
85.5%
#11108: fix(cron): prevent missed jobs from being skipped on timer recompute
by Bentlybro · 2026-02-07
85.1%
#10918: fix(cron): add tolerance for timer precision and skip due jobs in r...
by Cherwayway · 2026-02-07
84.7%
#9670: fix: handle numeric string timestamps in cron schedule normalizatio...
by divol89 · 2026-02-05
84.6%
#12122: fix(cron): ensure timer callback fires for scheduled jobs
by divol89 · 2026-02-08
84.1%
#13796: fix: skip recomputing nextRunAtMs for running cron jobs (#13739)
by echoVic · 2026-02-11
83.8%
#9393: fix(cron): avoid recomputeNextRuns on forceReload
by matthewpapa07 · 2026-02-05
83.7%
#10120: fix(cron): ensure next run is strictly in the future (#10035)
by zenchantlive · 2026-02-06
83.5%
#12086: fix(cron): ensure timer callback fires for scheduled jobs
by divol89 · 2026-02-08
83.2%