#16880: fix(cron): respect per-job timeoutSeconds in executeJob path (#16841)
size: S
Cluster: Cron Job Stability Fixes
## Summary
Fixes #16841
The `executeJob` function (used for missed-job recovery, `runDueJobs`, and manual/forced runs via `ops.ts`) called `executeJobCore` directly, without any timeout wrapper. The `onTimer` batch path already wrapped execution in `Promise.race` with `payload.timeoutSeconds`, but jobs running through `executeJob` relied solely on the inner agent timeout from `resolveAgentTimeoutMs`, which defaults to `cfg.agents.defaults.timeoutSeconds` and ignores the per-job `payload.timeoutSeconds` override.
## Changes
Added the same `Promise.race` timeout wrapper to `executeJob`. It reads `payload.timeoutSeconds` from the job config and falls back to `DEFAULT_JOB_TIMEOUT_MS` (10 min), matching the existing `onTimer` behavior.
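For reviewers unfamiliar with the pattern, here is a minimal sketch of what such a wrapper might look like. It is not the code from this PR: the helper names `resolveJobTimeoutMs` and `withJobTimeout`, the `JobPayload` shape, the error message, and the positive-number check are all assumptions made for illustration. Only `payload.timeoutSeconds`, `DEFAULT_JOB_TIMEOUT_MS` (10 min), `Promise.race`, and cleanup via `.finally()` come from the description and review.

```typescript
// Illustrative sketch only; helper names and types are hypothetical.
const DEFAULT_JOB_TIMEOUT_MS = 10 * 60_000; // 10 min fallback, per the PR description

interface JobPayload {
  // Optional per-job override, in seconds.
  timeoutSeconds?: number;
}

// Resolve the effective timeout: per-job override first, default otherwise.
// Assumption: non-positive or missing values fall back to the default.
function resolveJobTimeoutMs(payload: JobPayload): number {
  const seconds = payload.timeoutSeconds;
  return typeof seconds === "number" && seconds > 0
    ? seconds * 1000
    : DEFAULT_JOB_TIMEOUT_MS;
}

// Race the job against a timer and always clear the timer afterwards,
// so a job that finishes early does not leave a pending timeout behind.
function withJobTimeout<T>(run: () => Promise<T>, timeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error("cron: job execution timed out")),
      timeoutMs,
    );
  });
  return Promise.race([run(), timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Under these assumptions, a call site would look roughly like `withJobTimeout(() => executeJobCore(state, job), resolveJobTimeoutMs(job.payload))`, where the argument list of `executeJobCore` is a guess. Note that `Promise.race` only stops waiting on the job; it does not cancel the underlying work, so the inner agent timeout still matters for actually stopping a run.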
## Testing
- Existing cron tests pass (`service.issue-regressions`, `service.rearm-timer-when-running`)
- Linter and formatter pass (oxlint + oxfmt)
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Adds a `Promise.race` timeout wrapper to the `executeJob` function so that it respects the per-job `timeoutSeconds` configuration. Previously, jobs executed via `executeJob` (used for missed-job recovery, manual runs, and forced execution) bypassed the job-level timeout and relied only on the inner agent timeout. This change mirrors the existing timeout logic in `onTimer` (lines 230-245), applying the same `jobTimeoutMs` calculation and `Promise.race` pattern with proper timeout cleanup.
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk
- The implementation exactly mirrors the proven timeout pattern from `onTimer` (lines 230-245), uses the same timeout calculation logic, includes proper cleanup via `.finally()`, and maintains consistency across both execution paths. The change is minimal, focused, and directly addresses the documented issue.
- No files require special attention
<sub>Last reviewed commit: d29fc2c</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#19414: fix: respect job timeoutSeconds for stuck runningAtMs detection
by namabile · 2026-02-17
84.7%
#18144: fix(cron): clear stuck runningAtMs after timeout and add maintenanc...
by taw0002 · 2026-02-16
84.7%
#12018: fix(cron): clear stale running markers based on job timeout
by benzer25 · 2026-02-08
79.8%
#16888: fix(cron): execute missed jobs outside the lock to unblock list/sta...
by hou-rong · 2026-02-15
79.6%
#17561: fix(cron): add runtime staleness guard for runningAtMs (#17554)
by robbyczgw-cla · 2026-02-15
79.4%
#16132: fix(cron): prevent duplicate job fires via MIN_REFIRE_GAP_MS guard
by widingmarcus-cyber · 2026-02-14
79.4%
#17064: fix(cron): prevent control-plane starvation during startup catch-up...
by donggyu9208 · 2026-02-15
79.2%
#18960: fix: don't disable one-shot cron jobs on skipped status
by jwchmodx · 2026-02-17
79.1%
#17664: fix(cron): detect and clear stale runningAtMs marker in manual run ...
by echoVic · 2026-02-16
78.6%
#17895: fix(cron): add staleness check for runningAtMs on manual trigger
by PlayerGhost · 2026-02-16
78.5%