#23290: fix(cron): use lastRunAtMs for next schedule of interval jobs after restart
size: XS
Cluster: Cron Job Management Fixes
## Summary
- **Problem:** After a gateway restart, interval-based cron jobs (kind `every`) can show an unexpected "NEXT in" time. A 30-minute job that last ran 6 minutes ago may display "NEXT in 56m" instead of the expected ~24m.
- **Why it matters:** Users see confusing, non-obvious scheduling and may think their cron jobs are broken.
- **What changed:** In `computeJobNextRunAtMs` (`src/cron/service/jobs.ts`), when a job has a `lastRunAtMs` and `lastRunAtMs + everyMs` is still in the future, use that as the next run time instead of the anchor-based formula.
- **What did NOT change:** Anchor-based scheduling is still used as fallback when `lastRunAtMs` is not available or `lastRunAtMs + everyMs` is already in the past (e.g., long downtime catch-up). Cron-expression and one-shot (`at`) schedules are unchanged.
## Change Type (select all)
- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Closes #22895
## User-visible / Behavior Changes
- After gateway restart, interval jobs show the intuitive "NEXT in" time: `everyMs - timeSinceLastRun`
- Example: 30-min job, last ran 6 min ago → "NEXT in 24m" (was showing ~56m)
## Security Impact (required)
- New permissions/capabilities? `No`
- Secrets/tokens handling changed? `No`
- New/changed network calls? `No`
- Command/tool execution surface changed? `No`
- Data access scope changed? `No`
## Repro + Verification
### Environment
- OS: macOS 15.3 (arm64)
- Runtime: Node v22+
- Integration/channel: Cron service
### Steps
1. Create a cron job with `every: 30m`
2. Let it run at least once
3. Restart the gateway
4. Check the dashboard: "NEXT in" should reflect `interval - timeSinceLastRun`
### Expected
- "NEXT in 24m" (if last run was 6 minutes ago)
### Actual
- Before fix: "NEXT in 56m" (anchor-based formula computes non-obvious grid alignment)
- After fix: "NEXT in 24m" (`lastRunAtMs + everyMs - nowMs`)
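The 24m figure follows directly from `lastRunAtMs + everyMs - nowMs`. A minimal sketch of that arithmetic, where the helper name `minutesUntilNextRun` and the timestamp are illustrative and not taken from the codebase:

```typescript
const MINUTE_MS = 60000;

// Hypothetical helper: minutes remaining until lastRunAtMs + everyMs.
function minutesUntilNextRun(lastRunAtMs: number, everyMs: number, nowMs: number): number {
  const nextRunAtMs = lastRunAtMs + everyMs; // the value the fast path returns
  return (nextRunAtMs - nowMs) / MINUTE_MS;
}

// Arbitrary "now"; only the differences matter.
const exampleNowMs = Date.UTC(2026, 1, 21, 12, 0, 0);
const exampleLastRunAtMs = exampleNowMs - 6 * MINUTE_MS; // last ran 6 minutes ago
const exampleEveryMs = 30 * MINUTE_MS;                   // 30-minute interval

console.log(`NEXT in ${minutesUntilNextRun(exampleLastRunAtMs, exampleEveryMs, exampleNowMs)}m`);
// prints "NEXT in 24m"
```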
## Evidence
The fix adds a `lastRunAtMs`-based fast path before the anchor calculation:
```typescript
if (typeof job.state.lastRunAtMs === "number" && Number.isFinite(job.state.lastRunAtMs)) {
  const nextFromLast = job.state.lastRunAtMs + everyMs;
  if (nextFromLast > nowMs) {
    return nextFromLast;
  }
}
// fallback to anchor-based formula
```
This measures the interval from the last actual execution whenever `lastRunAtMs + everyMs` is still in the future, which matches user expectations for "every N minutes". In every other case the anchor-based formula still applies.
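To see how the fast path and the fallback interact, here is a self-contained sketch. The `IntervalJobSketch` type, the function name, and especially the anchor fallback (first grid point strictly after `nowMs`) are assumptions made for illustration; the real fallback in `jobs.ts` may differ.

```typescript
// Hypothetical, simplified job shape; the real type has more fields.
interface IntervalJobSketch {
  enabled: boolean;
  everyMs: number;
  anchorMs: number; // ASSUMED grid origin, e.g. job creation time
  state: { lastRunAtMs?: number };
}

function computeNextEveryRunSketch(job: IntervalJobSketch, nowMs: number): number | undefined {
  if (!job.enabled) return undefined;
  const everyMs = Math.max(1, Math.floor(job.everyMs));

  // Fast path from this PR: measure from the last actual run if still in the future.
  const last = job.state.lastRunAtMs;
  if (typeof last === "number" && Number.isFinite(last)) {
    const nextFromLast = last + everyMs;
    if (nextFromLast > nowMs) return nextFromLast;
  }

  // ASSUMED fallback: first grid point anchorMs + k * everyMs strictly after nowMs.
  if (nowMs < job.anchorMs) return job.anchorMs;
  const steps = Math.floor((nowMs - job.anchorMs) / everyMs) + 1;
  return job.anchorMs + steps * everyMs;
}

const MIN = 60000;
const baseJob: IntervalJobSketch = { enabled: true, everyMs: 30 * MIN, anchorMs: 0, state: {} };

// Last run 6 minutes before "now" (t = 100 min): fast path gives t = 124 min.
const withRecentRun = computeNextEveryRunSketch({ ...baseJob, state: { lastRunAtMs: 94 * MIN } }, 100 * MIN);
// No lastRunAtMs: assumed grid fallback gives t = 120 min.
const withoutRun = computeNextEveryRunSketch(baseJob, 100 * MIN);
console.log(withRecentRun! / MIN, withoutRun! / MIN);
```

Under these assumptions, a stale `lastRunAtMs` (for example a run at t = 10 min checked at t = 100 min) also falls through to the grid result of 120 min.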
## Human Verification (required)
- Verified scenarios: Traced the scheduling flow through `start()` → `runMissedJobs()` → `recomputeNextRuns()`; confirmed `lastRunAtMs` is set by `finishJob()` in `timer.ts`
- Edge cases checked: `lastRunAtMs` undefined (first run) → falls back to anchor; `lastRunAtMs + everyMs` in the past (long downtime) → falls back to anchor; disabled jobs return `undefined`
- What I did **not** verify: Multi-day downtime catch-up behavior with many missed intervals
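The guard that decides whether `lastRunAtMs` is usable can be exercised on its own. In this sketch the helper name `isUsableTimestamp` is hypothetical (the PR inlines the check). Note that `Number.isFinite` does not coerce its argument, so the `typeof` test mainly serves TypeScript's type narrowing:

```typescript
// Hypothetical helper wrapping the same check the fast path performs inline.
function isUsableTimestamp(value: unknown): value is number {
  return typeof value === "number" && Number.isFinite(value);
}

const candidateValues: unknown[] = [undefined, null, NaN, Infinity, "1700000000000", 1700000000000];

// Only the finite number survives; everything else would take the anchor fallback.
const acceptedValues = candidateValues.filter(isUsableTimestamp);
console.log(acceptedValues); // [ 1700000000000 ]
```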
## Compatibility / Migration
- Backward compatible? `Yes`; the anchor-based formula is preserved as fallback
- Config/env changes? `No`
- Migration needed? `No`
## Failure Recovery (if this breaks)
- How to disable/revert: Revert the `lastRunAtMs` check in `jobs.ts`
- Files/config to restore: `src/cron/service/jobs.ts`
- Known bad symptoms: If reverted, interval jobs may show non-intuitive "NEXT in" times after restart (existing behavior)
## Risks and Mitigations
- Risk: Cumulative timing drift. Using `lastRunAtMs + everyMs` instead of fixed grid points means execution latency accumulates over many runs
- Mitigation: The anchor-based formula kicks in whenever `lastRunAtMs + everyMs` falls behind `nowMs`, naturally re-aligning to the grid
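
The drift risk can be made concrete with a small simulation. It is a sketch under stated assumptions: every run starts a constant `latencyMs` after its scheduled time, `lastRunAtMs` records that actual start, and the next run is always taken from the fast path. The function name `simulateDrift` is hypothetical.

```typescript
// Returns how far the final run has slipped from a fixed grid after `runs` executions.
function simulateDrift(runs: number, everyMs: number, latencyMs: number): number {
  let scheduledAtMs = everyMs; // first run is due one interval after t = 0
  let lastRunAtMs = 0;
  for (let i = 0; i < runs; i++) {
    lastRunAtMs = scheduledAtMs + latencyMs; // ASSUMED: run starts late by latencyMs
    scheduledAtMs = lastRunAtMs + everyMs;   // fast path: next run measured from actual start
  }
  const gridTimeMs = runs * everyMs; // where the same run would sit on a fixed grid
  return lastRunAtMs - gridTimeMs;
}

// 48 runs of a 30-minute job (one day), each starting 2 seconds late.
const driftMs = simulateDrift(48, 30 * 60000, 2000);
console.log(driftMs); // 96000
```

Under these assumptions the slip grows linearly with the number of runs (here 96 seconds per day), while a pure grid schedule would keep each run within one latency of its slot.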
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Fixes interval-based cron job scheduling after gateway restart by prioritizing `lastRunAtMs` over anchor-based grid calculations.
- Adds fast-path in `computeJobNextRunAtMs` that calculates next run as `lastRunAtMs + everyMs` when available and still in future
- Falls back to anchor-based formula when `lastRunAtMs` is missing (first run) or when next run from last is already past (long downtime)
- Preserves backward compatibility by keeping anchor-based scheduling as fallback
- Addresses user confusion where 30-minute interval jobs showed unexpected "NEXT in" times after restart (e.g., "56m" instead of expected "24m")
- Implementation correctly handles edge cases: undefined `lastRunAtMs`, non-finite values, disabled jobs, and catch-up scenarios
<h3>Confidence Score: 5/5</h3>
- Safe to merge with minimal risk
- The fix is well-scoped to interval jobs, maintains backward compatibility through fallback logic, correctly handles all edge cases (undefined/non-finite values, first run, long downtime), and uses defensive programming practices (type checks, Number.isFinite, bounds checking). The fast-path optimization is sound and the anchor-based formula is preserved as a safety net.
- No files require special attention
<sub>Last reviewed commit: 9b034a9</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#22948: fix(cron): every-schedule boundary returns nowMs instead of next sl...
by echoVic · 2026-02-21
89.7%
#7022: fix(cron): prevent schedule drift on gateway restart for 'every' jobs
by marciob · 2026-02-02
86.4%
#12747: fix: catch up missed cron-expression job runs on restart
by obin94-commits · 2026-02-09
85.9%
#22911: fix(cron): correct next execution time calculation after gateway re...
by anandsuraj · 2026-02-21
84.6%
#9060: Fix: Preserve scheduled cron jobs after gateway restart
by vishaltandale00 · 2026-02-04
83.7%
#8034: fix(cron): run past-due one-shot jobs immediately on startup
by FelixFoster · 2026-02-03
83.3%
#18925: fix(cron): stagger missed jobs on restart to prevent gateway overload
by rexlunae · 2026-02-17
82.1%
#13065: fix(cron): Fix "every" schedule not re-arming after gateway restart
by trevorgordon981 · 2026-02-10
81.2%
#16132: fix(cron): prevent duplicate job fires via MIN_REFIRE_GAP_MS guard
by widingmarcus-cyber · 2026-02-14
81.1%
#12443: fix(cron): don't advance past-due jobs that haven't been executed
by rummangeminicode · 2026-02-09
80.8%