
#6515: fix: in-process IPC for cron tool to avoid WS self-contention timeout

by amco3008 open 2026-02-01 18:58
gateway agents
Cluster: Cron Job Fixes
## Problem

When the cron tool runs during an active agent session, it opens a **new WebSocket connection** to the gateway. Since the gateway's event loop is busy processing the current agent turn, the new WS handshake + request often can't complete within the 10s default timeout.

The operations actually **succeed** (the gateway processes them after the timeout fires), but the tool reports a false timeout error. This causes agents to retry, creating **duplicate cron jobs**.

### Reproduction

1. Run an agent with `gateway.bind: 'tailnet'` (or any non-loopback binding)
2. During an active session, call the cron tool (add/list/etc)
3. Observe `gateway timeout after 10000ms` error
4. Check `clawdbot cron list` from the CLI: the job was actually created

## Solution

Register the `CronService` as a process-global singleton when the gateway starts. The cron tool checks for this in-process service first and calls it directly, bypassing the WebSocket entirely.

**Falls back to WebSocket when:**

- Running outside the gateway process (CLI usage)
- Targeting a remote gateway (custom URL/token provided)

### Performance

- **Before:** 10s+ timeout (false failure) via WebSocket self-connection
- **After:** <1ms direct in-process call

## Files Changed

- `src/gateway/in-process.ts` (NEW): Lightweight service registry
- `src/agents/tools/cron-tool.ts`: In-process fast path for all 8 cron actions
- `src/gateway/server.impl.ts`: Register on startup, re-register on reload, clear on shutdown

## Design Notes

- Zero breaking changes: external CLI/remote usage is unaffected
- The `in-process.ts` registry is extensible for other gateway tools (sessions, message, etc.)
- Registration is maintained through config reloads
- Cleaned up on shutdown to prevent stale references

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds an in-process service registry (`src/gateway/in-process.ts`) and wires it into gateway startup/reload/shutdown (`src/gateway/server.impl.ts`) so agent tools running inside the gateway process can call services directly instead of opening a self-referential WebSocket.

`src/agents/tools/cron-tool.ts` is updated to use this in-process fast path for all cron actions (status/list/add/update/remove/run/runs/wake), falling back to the existing WebSocket-based `callGatewayTool` when a custom gateway URL/token is provided or when not running inside the gateway process. This addresses false 10s gateway timeouts and resulting duplicate cron job retries during active agent turns.

<h3>Confidence Score: 4/5</h3>

- This PR is largely safe to merge and should reduce cron-tool flakiness, with one behavioral bug to fix in the in-process wake path.
- The change is localized (cron tool + simple in-process registry) and preserves the WebSocket fallback behavior; the main risk is the missing await in the new in-process `wake` path which can change return semantics compared to the existing WS implementation.
- src/agents/tools/cron-tool.ts (in-process `wake` fast path)

<!-- /greptile_comment -->
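
For illustration, the fast path described under Solution could look roughly like the sketch below. It is not the PR's actual code. `CronServiceLike`, `registerInProcessCron`, `clearInProcessCron`, `callCron`, and the `WsCall` fallback are hypothetical names, the real `CronService` API is not shown here, and only three of the eight cron actions are modeled. The sketch awaits the in-process `wake` call so that its result has the same shape as the WebSocket path, which is the concern raised in the review above.

```typescript
// Hypothetical minimal shape of the cron service (assumption: the real
// CronService has a richer API that is not shown in this PR description).
interface CronServiceLike {
  list(): Promise<string[]>;
  add(name: string): Promise<{ id: string }>;
  wake(text: string): Promise<{ ok: boolean }>;
}

// Process-global slot, standing in for the registry in src/gateway/in-process.ts.
let inProcessCron: CronServiceLike | undefined;

// Called on gateway startup and again after a config reload.
function registerInProcessCron(service: CronServiceLike): void {
  inProcessCron = service;
}

// Called on gateway shutdown so no stale reference is left behind.
function clearInProcessCron(): void {
  inProcessCron = undefined;
}

type CronAction =
  | { kind: "list" }
  | { kind: "add"; name: string }
  | { kind: "wake"; text: string };

// A custom URL or token means the caller is targeting a remote gateway.
interface GatewayTarget {
  url?: string;
  token?: string;
}

// Stand-in for the existing WebSocket-based call (hypothetical signature).
type WsCall = (action: CronAction, target: GatewayTarget) => Promise<unknown>;

interface CronCallResult {
  via: "in-process" | "websocket";
  result: unknown;
}

async function callCron(
  action: CronAction,
  target: GatewayTarget,
  wsFallback: WsCall,
): Promise<CronCallResult> {
  const service = inProcessCron;
  const isRemote = target.url !== undefined || target.token !== undefined;

  // Fall back to WebSocket when outside the gateway process or targeting a remote gateway.
  if (service === undefined || isRemote) {
    return { via: "websocket", result: await wsFallback(action, target) };
  }

  // In-process fast path: every branch awaits, so callers always receive
  // the resolved value rather than a pending promise.
  switch (action.kind) {
    case "list":
      return { via: "in-process", result: await service.list() };
    case "add":
      return { via: "in-process", result: await service.add(action.name) };
    case "wake":
      return { via: "in-process", result: await service.wake(action.text) };
  }
}
```

In this sketch the registry is a single module-level variable, so a caller outside the gateway process (where nothing has been registered) takes the WebSocket branch without any extra configuration.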
