#9112: Fix: Prevent double SIGUSR1 restart on model switch
gateway
cli
stale
Cluster: Gateway Restart Improvements
## Summary
Fixes #9097 - Double SIGUSR1 restart on model switch causes webchat message loss
## Problem
When switching models via the webchat UI (e.g. `/model opus`), the gateway receives **two SIGUSR1 signals** in rapid succession (~2 seconds apart), causing a double restart. This results in:
- Webchat disconnecting twice
- In-flight assistant responses being lost
- A poor user experience, since the webchat appears to have "crashed out"
## Root Cause
When `config.apply` or `config.patch` is called:
1. The config file is written to disk
2. `scheduleGatewaySigusr1Restart()` is called directly → **SIGUSR1 #1**
3. The chokidar file watcher detects the config file change
4. The file watcher triggers `onRestart()` → **SIGUSR1 #2**
Both paths increment the authorization count, so both restarts are authorized.
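The sequence above can be modeled with a small toy sketch. Everything here is an assumption made for illustration: `requestRestart`, `writeConfigFile`, `drainWatcherEvents`, and the placeholder config path are hypothetical stand-ins, not the gateway's real functions. It only shows why one `config.apply` call ends up producing two authorized signals.

```typescript
// Toy model of the two restart paths; not the gateway's actual code.
let authorizationCount = 0;
const signalsSent: string[] = [];

// Hypothetical stand-in for "authorize, then emit SIGUSR1".
function requestRestart(source: string): void {
  authorizationCount += 1; // each path authorizes its own restart
  signalsSent.push(`SIGUSR1 via ${source}`);
}

// The watcher reacts asynchronously, so writes are queued and handled later.
const pendingWatcherEvents: string[] = [];

function writeConfigFile(path: string): void {
  pendingWatcherEvents.push(path); // step 1: the write the watcher will notice
}

function applyConfig(path: string): void {
  writeConfigFile(path);
  requestRestart("config.apply"); // step 2: direct restart, SIGUSR1 #1
}

function drainWatcherEvents(): void {
  while (pendingWatcherEvents.length > 0) {
    pendingWatcherEvents.shift(); // step 3: change detected
    requestRestart("file watcher"); // step 4: onRestart(), SIGUSR1 #2
  }
}

applyConfig("gateway-config.json"); // placeholder path, assumed for the sketch
drainWatcherEvents();
// signalsSent now holds two entries and authorizationCount is 2.
```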
## Solution
Added a debounce mechanism to prevent duplicate restarts:
1. **Track restart timestamp** (`restart.ts`):
   - Added `lastScheduledRestartTs` to track when a restart was last scheduled
   - Added `isGatewaySigusr1RestartRecentlyScheduled()` to check if a restart was scheduled within the last 3 seconds
   - Updated `scheduleGatewaySigusr1Restart()` to record the timestamp
2. **Skip file watcher restart if recent** (`config-reload.ts`):
   - Added optional `isRestartRecentlyScheduled` callback parameter
   - Check before triggering restart in both "restart" and "hybrid" modes
   - Log "config reload skipped (restart already scheduled)" when skipping
3. **Wire up the check** (`server.impl.ts`):
   - Pass `isGatewaySigusr1RestartRecentlyScheduled` to the config reloader
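As a rough illustration of how the three pieces fit together, here is a minimal sketch. The identifiers `lastScheduledRestartTs`, `scheduleGatewaySigusr1Restart`, `isGatewaySigusr1RestartRecentlyScheduled`, and `isRestartRecentlyScheduled` come from the description above. The function bodies, the `ReloaderOptions` type, the `handleConfigChange` helper, and the injectable `now` parameter are assumptions and may not match the actual diff.

```typescript
// Sketch only: names mirror the PR description, bodies are assumptions.
const RESTART_DEBOUNCE_MS = 3_000; // the 3-second window from the description

let lastScheduledRestartTs = 0; // 0 means no restart has been scheduled yet

// Stand-in for the real function, which would also authorize and emit SIGUSR1.
function scheduleGatewaySigusr1Restart(now: number = Date.now()): void {
  lastScheduledRestartTs = now;
}

function isGatewaySigusr1RestartRecentlyScheduled(now: number = Date.now()): boolean {
  return lastScheduledRestartTs > 0 && now - lastScheduledRestartTs < RESTART_DEBOUNCE_MS;
}

// Hypothetical options shape for the config reloader.
type ReloaderOptions = {
  onRestart: () => void;
  isRestartRecentlyScheduled?: () => boolean; // optional, so the reloader stays decoupled
  log?: (message: string) => void;
};

// Hypothetical handler the file watcher calls when a change requires a restart.
function handleConfigChange(opts: ReloaderOptions): "restarted" | "skipped" {
  if (opts.isRestartRecentlyScheduled?.()) {
    opts.log?.("config reload skipped (restart already scheduled)");
    return "skipped";
  }
  opts.onRestart();
  return "restarted";
}
```

In this sketch the server startup would pass `isGatewaySigusr1RestartRecentlyScheduled` as the `isRestartRecentlyScheduled` option. Omitting the callback leaves the watcher's behavior unchanged.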
## Design Decisions
- **3-second debounce window**: Long enough to cover the typical delay (~2 seconds) between the programmatic restart and file watcher detection, but short enough not to interfere with legitimate sequential config changes
- **Optional callback**: Keeps the config reloader decoupled from restart implementation details
- **Preserve file watcher functionality**: External config changes (manual edits) still trigger restarts normally, provided they fall outside the debounce window
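To make the window arithmetic concrete, here is a small worked example. The helper `isWithinDebounceWindow` and the timestamps are hypothetical. Only the 3-second window and the roughly 2-second gap between signals come from the description.

```typescript
const DEBOUNCE_WINDOW_MS = 3_000;

// Hypothetical pure helper: true when `nowMs` falls inside the window
// that opened at `lastScheduledTs` (null means nothing was scheduled).
function isWithinDebounceWindow(
  lastScheduledTs: number | null,
  nowMs: number,
  windowMs: number = DEBOUNCE_WINDOW_MS,
): boolean {
  return lastScheduledTs !== null && nowMs - lastScheduledTs < windowMs;
}

const programmaticRestartAt = 0;

// Watcher echo about 2 s after the programmatic restart: suppressed.
const watcherEcho = isWithinDebounceWindow(programmaticRestartAt, 2_000); // true

// Manual edit 5 s later: outside the window, so it restarts normally.
const laterManualEdit = isWithinDebounceWindow(programmaticRestartAt, 5_000); // false

// Manual edit with no prior programmatic restart: never suppressed.
const manualEditOnly = isWithinDebounceWindow(null, 5_000); // false
```

One consequence of a time-based window is that a genuine external edit landing within 3 seconds of a programmatic restart would also be skipped in this sketch.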
## Testing
Expected workflow for manual testing:
1. Start a webchat session
2. Switch models via `/model opus`
3. The gateway should restart **once** (not twice)
4. The webchat reconnects cleanly
5. No messages are lost
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds a debounce mechanism to avoid issuing two SIGUSR1-triggered gateway restarts when config changes are initiated programmatically (e.g., model switch via webchat) and then observed again by the chokidar config file watcher. It does this by recording the last time a SIGUSR1 restart was scheduled (`src/infra/restart.ts`), wiring a “recently scheduled” callback into the gateway config reloader (`src/gateway/config-reload.ts`), and passing that callback from the gateway server startup (`src/gateway/server.impl.ts`).
<h3>Confidence Score: 3/5</h3>
- This PR is close to safe to merge but contains a state bug that can suppress future restarts in some failure scenarios.
- Core idea is straightforward and changes are localized, but the new debounce path in the config reloader marks `restartQueued` even when it skips, which can permanently block subsequent restarts if the previously scheduled restart never occurs.
- File needing attention: `src/gateway/config-reload.ts`
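The state bug described in the review can be sketched as follows. This is a hypothetical reconstruction, not the code in `src/gateway/config-reload.ts`: only the `restartQueued` name comes from the review, while `createReloader`, the `setFlagOnSkip` switch, and the counter are assumptions used to contrast the two behaviors.

```typescript
type Reloader = {
  onConfigChange: () => void;
  restartCount: () => number;
};

// Hypothetical factory; `setFlagOnSkip` toggles the behavior the review flags.
function createReloader(
  isRestartRecentlyScheduled: () => boolean,
  setFlagOnSkip: boolean,
): Reloader {
  let restartQueued = false;
  let restarts = 0;
  return {
    onConfigChange(): void {
      if (restartQueued) return; // a restart is believed to be pending
      if (isRestartRecentlyScheduled()) {
        // Flagged behavior: marking the restart as queued although none was
        // triggered here. If the earlier restart never happens, every later
        // change is ignored.
        if (setFlagOnSkip) restartQueued = true;
        return;
      }
      restartQueued = true;
      restarts += 1; // stand-in for calling onRestart()
    },
    restartCount: () => restarts,
  };
}
```

Under these assumptions, leaving `restartQueued` untouched on the skip path lets a later config change trigger a restart once the debounce window has passed.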
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#12953: fix: defer gateway restart until all replies are sent
by zoskebutler · 2026-02-10
81.2%
#13408: fix(gateway): skip SIGUSR1 restart in config.patch for noop reload ...
by rwmjhb · 2026-02-10
79.7%
#7128: feat: add gateway.restart RPC for graceful in-process restart
by AkashaBot · 2026-02-02
78.8%
#16170: fix: restart service manager after update.run
by Swader · 2026-02-14
78.3%
#11280: fix(gateway): add meta prefix to reload rules to prevent double SIG...
by cheenu1092-oss · 2026-02-07
77.1%
#5077: fix(windows): implement reliable gateway restart via schtasks helper
by romeoscript · 2026-01-31
76.1%
#20355: fix(gateway): enforce commands.restart guard for config.apply and c...
by Clawborn · 2026-02-18
75.5%
#3517: fix: trigger agent response for webchat sessions after restart
by dovewars · 2026-01-28
73.8%
#11746: fix: treat meta config paths as no-op to prevent unnecessary gatewa...
by QDenka · 2026-02-08
73.5%
#18254: add /update chat command for Telegram git updates
by dangmstaredu · 2026-02-16
73.5%