#6515: fix: in-process IPC for cron tool to avoid WS self-contention timeout
gateway
agents
Cluster: Cron Job Fixes
## Problem
When the cron tool runs during an active agent session, it opens a **new WebSocket connection** to the gateway. Because the gateway's event loop is busy processing the current agent turn, the new WebSocket handshake and request often cannot complete within the default 10s timeout.
The operations actually **succeed** (the gateway processes them after the timeout fires), but the tool reports a false timeout error. This causes agents to retry, creating **duplicate cron jobs**.
### Reproduction
1. Run an agent with `gateway.bind: 'tailnet'` (or any non-loopback binding)
2. During an active session, call the cron tool (add, list, etc.)
3. Observe `gateway timeout after 10000ms` error
4. Check `clawdbot cron list` from CLI — the job was actually created
## Solution
Register the `CronService` as a process-global singleton when the gateway starts. The cron tool checks for this in-process service first and calls it directly — bypassing the WebSocket entirely.
**Falls back to WebSocket when:**
- Running outside the gateway process (CLI usage)
- Targeting a remote gateway (custom URL/token provided)
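
The path selection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `CronServiceLike`, `registerCronService`, `clearCronService`, `chooseTransport`, and the `gatewayUrl`/`gatewayToken` option names are all hypothetical, and the WebSocket call is passed in as a plain callback rather than modelled on the real `callGatewayTool` signature.

```typescript
// Hypothetical sketch: every name and signature here is an assumption for illustration.
interface CronServiceLike {
  list(): Promise<string[]>;
}

// Process-global slot. It is only populated inside the gateway process.
let cronService: CronServiceLike | undefined;

function registerCronService(service: CronServiceLike): void {
  cronService = service;
}

function clearCronService(): void {
  cronService = undefined;
}

interface CronCallOptions {
  gatewayUrl?: string;
  gatewayToken?: string;
}

type Transport = "in-process" | "websocket";

// Use the in-process path only when no remote target was requested
// and a service is registered in this process.
function chooseTransport(opts: CronCallOptions): Transport {
  const targetsRemote =
    opts.gatewayUrl !== undefined || opts.gatewayToken !== undefined;
  if (!targetsRemote && cronService !== undefined) {
    return "in-process";
  }
  return "websocket";
}

async function listCronJobs(
  opts: CronCallOptions,
  viaWebSocket: () => Promise<string[]>,
): Promise<string[]> {
  const service = cronService;
  if (chooseTransport(opts) === "in-process" && service !== undefined) {
    return service.list(); // direct call, no socket involved
  }
  return viaWebSocket(); // CLI usage or remote gateway
}
```

With nothing registered (the CLI case), `chooseTransport({})` selects the WebSocket path; once a service is registered it selects the in-process path unless a custom URL or token is supplied.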
### Performance
- **Before:** 10s+ timeout (false failure) via WebSocket self-connection
- **After:** <1ms direct in-process call
## Files Changed
- `src/gateway/in-process.ts` (NEW): Lightweight service registry
- `src/agents/tools/cron-tool.ts`: In-process fast path for all 8 cron actions
- `src/gateway/server.impl.ts`: Register on startup, re-register on reload, clear on shutdown
## Design Notes
- Zero breaking changes — external CLI/remote usage is unaffected
- The `in-process.ts` registry is extensible for other gateway tools (sessions, message, etc.)
- Registration is maintained through config reloads
- Cleaned up on shutdown to prevent stale references
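
The registration lifecycle in the notes above (register on startup, re-register on reload, clear on shutdown) might look something like the sketch below. `ServiceSlot`, `GatewayLifecycle`, and the `generation` counter are hypothetical names used only for illustration, and the sketch assumes that a reload may construct a fresh service instance that has to replace the old one.

```typescript
// Hypothetical sketch: class and method names are assumptions, not the PR's API.
interface CronLike {
  readonly generation: number;
}

// A single-value registry slot that tools can read from.
class ServiceSlot<T> {
  private current: T | undefined;

  set(service: T): void {
    this.current = service;
  }

  get(): T | undefined {
    return this.current;
  }

  clear(): void {
    this.current = undefined;
  }
}

class GatewayLifecycle {
  private generation = 0;
  private readonly slot: ServiceSlot<CronLike>;

  constructor(slot: ServiceSlot<CronLike>) {
    this.slot = slot;
  }

  // Startup: publish the first service instance.
  start(): void {
    this.publish();
  }

  // Reload: publish the replacement so readers never see the old instance.
  reload(): void {
    this.publish();
  }

  // Shutdown: drop the reference so nothing stale remains reachable.
  stop(): void {
    this.slot.clear();
  }

  private publish(): void {
    this.generation += 1;
    this.slot.set({ generation: this.generation });
  }
}

const cronSlot = new ServiceSlot<CronLike>();
```

After `stop()`, `cronSlot.get()` returns `undefined`, so a tool that checks the slot first would fall back to its other transport instead of calling a torn-down service.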
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds an in-process service registry (`src/gateway/in-process.ts`) and wires it into gateway startup/reload/shutdown (`src/gateway/server.impl.ts`) so agent tools running inside the gateway process can call services directly instead of opening a self-referential WebSocket.
`src/agents/tools/cron-tool.ts` is updated to use this in-process fast path for all cron actions (status/list/add/update/remove/run/runs/wake), falling back to the existing WebSocket-based `callGatewayTool` when a custom gateway URL/token is provided or when not running inside the gateway process. This addresses false 10s gateway timeouts and resulting duplicate cron job retries during active agent turns.
<h3>Confidence Score: 4/5</h3>
- This PR is largely safe to merge and should reduce cron-tool flakiness, with one behavioral bug to fix in the in-process wake path.
- The change is localized (cron tool + simple in-process registry) and preserves the WebSocket fallback behavior; the main risk is the missing await in the new in-process `wake` path which can change return semantics compared to the existing WS implementation.
- src/agents/tools/cron-tool.ts (in-process `wake` fast path)
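
The affected code is not reproduced here, so the following is only a generic illustration of how a missing `await` can change what a tool returns. The `wake` function and both wrappers are hypothetical stand-ins, not the code under review.

```typescript
// Hypothetical stand-in for an async service method.
async function wake(): Promise<{ ok: boolean }> {
  return { ok: true };
}

// Without await, `result` is a pending Promise. A Promise has no
// enumerable own properties, so it serializes as an empty object.
async function wakeWithoutAwait(): Promise<string> {
  const result = wake();
  return JSON.stringify({ result });
}

// With await, `result` is the resolved value and serializes as expected.
async function wakeWithAwait(): Promise<string> {
  const result = await wake();
  return JSON.stringify({ result });
}
```

Here `wakeWithoutAwait()` resolves to `{"result":{}}`, while `wakeWithAwait()` resolves to `{"result":{"ok":true}}`. An unawaited call also means a rejection from the service would not surface through the tool's own error handling.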
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
## Most Similar PRs
- #8698: fix(cron): default enabled to true for new jobs · by emmick4 · 2026-02-04 · 78.7%
- #8034: fix(cron): run past-due one-shot jobs immediately on startup · by FelixFoster · 2026-02-03 · 78.0%
- #13055: fix: prevent cron RPC stalls with timeout and caching (#13018) · by trevorgordon981 · 2026-02-10 · 77.7%
- #8307: fix(cron): improve tool description with reliable reminder guidance · by vishaltandale00 · 2026-02-03 · 76.6%
- #10829: fix: prevent cron scheduler permanent death on transient startup/ru... · by meaadore1221-afk · 2026-02-07 · 76.3%
- #16888: fix(cron): execute missed jobs outside the lock to unblock list/sta... · by hou-rong · 2026-02-15 · 76.1%
- #11816: fix(cron): forward agent-specific exec config to isolated cron sess... · by AnonO6 · 2026-02-08 · 76.1%
- #8744: fix(cron): load persisted cron jobs on gateway startup · by revenuestack · 2026-02-04 · 76.0%
- #13065: fix(cron): Fix "every" schedule not re-arming after gateway restart · by trevorgordon981 · 2026-02-10 · 75.9%
- #6466: fix(gateway): add handshake timeout and connection error handling · by jarvis-raven · 2026-02-01 · 75.8%