
#9112: Fix: Prevent double SIGUSR1 restart on model switch

by vishaltandale00 open 2026-02-04 21:49
gateway cli stale
## Summary

Fixes #9097 - Double SIGUSR1 restart on model switch causes webchat message loss

## Problem

When switching models via the webchat UI (e.g. `/model opus`), the gateway receives **two SIGUSR1 signals** in rapid succession (~2 seconds apart), causing a double restart. This results in:

- Webchat disconnecting twice
- In-flight assistant responses being lost
- Poor user experience with "crashed out" appearance

## Root Cause

When `config.apply` or `config.patch` is called:

1. The config file is written to disk
2. `scheduleGatewaySigusr1Restart()` is called directly → **SIGUSR1 #1**
3. The chokidar file watcher detects the config file change
4. The file watcher triggers `onRestart()` → **SIGUSR1 #2**

Both paths increment the authorization count, so both restarts are authorized.

## Solution

Added a debounce mechanism to prevent duplicate restarts:

1. **Track restart timestamp** (`restart.ts`):
   - Added `lastScheduledRestartTs` to track when a restart was last scheduled
   - Added `isGatewaySigusr1RestartRecentlyScheduled()` to check if a restart was scheduled within the last 3 seconds
   - Updated `scheduleGatewaySigusr1Restart()` to record the timestamp
2. **Skip file watcher restart if recent** (`config-reload.ts`):
   - Added optional `isRestartRecentlyScheduled` callback parameter
   - Check before triggering restart in both "restart" and "hybrid" modes
   - Log "config reload skipped (restart already scheduled)" when skipping
3. **Wire up the check** (`server.impl.ts`):
   - Pass `isGatewaySigusr1RestartRecentlyScheduled` to the config reloader

## Design Decisions

- **3-second debounce window**: Long enough to cover the typical delay between programmatic restart and file watcher detection, but short enough to not interfere with legitimate sequential config changes
- **Optional callback**: Keeps the config reloader decoupled from restart implementation details
- **Preserve file watcher functionality**: External config changes (manual edits) still trigger restarts normally

## Testing

Manual testing expected workflow:

1. Start webchat session
2. Switch model via `/model opus`
3. Gateway should restart **once** (not twice)
4. Webchat reconnects cleanly
5. No message loss

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds a debounce mechanism to avoid issuing two SIGUSR1-triggered gateway restarts when config changes are initiated programmatically (e.g., model switch via webchat) and then observed again by the chokidar config file watcher.

It does this by recording the last time a SIGUSR1 restart was scheduled (`src/infra/restart.ts`), wiring a “recently scheduled” callback into the gateway config reloader (`src/gateway/config-reload.ts`), and passing that callback from the gateway server startup (`src/gateway/server.impl.ts`).

<h3>Confidence Score: 3/5</h3>

- This PR is close to safe to merge but contains a state bug that can suppress future restarts in some failure scenarios.
- Core idea is straightforward and changes are localized, but the new debounce path in the config reloader marks `restartQueued` even when it skips, which can permanently block subsequent restarts if the previously scheduled restart never occurs.
- src/gateway/config-reload.ts

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
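
The debounce described under Solution can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the factory functions `createRestartTracker` and `createConfigReloader`, the injected `now` clock, and the `handleConfigFileChange` method are all hypothetical names chosen for the example. Only `lastScheduledRestartTs`, the optional `isRestartRecentlyScheduled` callback, the 3-second window, and the log message come from the description above. The actual SIGUSR1 emission is left out.

```typescript
// Hypothetical sketch of the debounce idea; bodies are assumed, not copied
// from restart.ts or config-reload.ts.
const RESTART_DEBOUNCE_MS = 3000; // the 3-second window named in the PR

type Clock = () => number;

// Stand-in for the state kept in restart.ts. The clock is injectable
// (an assumption of this sketch) so the window can be checked without waiting.
function createRestartTracker(now: Clock = Date.now) {
  let lastScheduledRestartTs: number | null = null;

  return {
    // Plays the role of scheduleGatewaySigusr1Restart(): record the timestamp.
    // The real function would also arrange for SIGUSR1 to be sent.
    scheduleRestart(): void {
      lastScheduledRestartTs = now();
    },
    // Plays the role of isGatewaySigusr1RestartRecentlyScheduled().
    isRestartRecentlyScheduled(): boolean {
      return (
        lastScheduledRestartTs !== null &&
        now() - lastScheduledRestartTs < RESTART_DEBOUNCE_MS
      );
    },
  };
}

// Stand-in for the file-watcher path in config-reload.ts.
function createConfigReloader(opts: {
  onRestart: () => void;
  isRestartRecentlyScheduled?: () => boolean; // optional, keeps the reloader decoupled
  log?: (msg: string) => void;
}) {
  return {
    // Called when the watcher sees the config file change.
    handleConfigFileChange(): "restarted" | "skipped" {
      if (opts.isRestartRecentlyScheduled?.()) {
        opts.log?.("config reload skipped (restart already scheduled)");
        // No "restart queued" flag is set on this path, so a later change
        // outside the window can still trigger a restart.
        return "skipped";
      }
      opts.onRestart();
      return "restarted";
    },
  };
}
```

Wiring the two together amounts to passing `tracker.isRestartRecentlyScheduled` into the reloader options, which mirrors the `server.impl.ts` step. Returning early on the skip path without touching any queued-restart state is one way to avoid the blocked-restart scenario the Greptile review raises. Whether that matches the eventual fix in the PR is not known from this page.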
