
#13772: feat: memory monitor, chat buffer leak fix, zombie exec session reaping

by niceysam open 2026-02-11 01:09
app: web-ui gateway agents stale
After running for ~13 hours the gateway process hit 1.9GB RSS, causing V8 GC thrashing and 70% CPU. Dug into it and found a few things leaking:

1. `chatRunBuffers` and `chatDeltaSentAt` entries were never cleaned up after runs complete
2. Aborted run metadata hung around for 60 minutes (way too long)
3. Exec sessions marked `exited` were never actually removed from the registry

To fix these:

- Added `.finally()` cleanup in chat.ts to delete buffer entries per `clientRunId`
- Reduced aborted run TTL from 60min to 10min
- Added `pruneFinishedSessions()` to reap zombie exec sessions past TTL

Also added a memory monitor (`server-memory-monitor.ts`) that checks RSS every 60s, logs warnings at configurable thresholds, and triggers a graceful SIGUSR1 restart if memory hits critical. Thresholds default to 75%/85% of system memory with reasonable min clamps. Config is optional via `gateway.memory` (warnMB/criticalMB).

12 new tests for the monitor + threshold logic. All 277 gateway tests pass.

Related: #13758
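For readers who want a feel for the two new pieces, here is a rough sketch of how session reaping and the threshold logic could look. It is not the code from this PR. The `ExecSession` shape, the `exitedAt` field, the `resolveThresholds`, `classifyRss`, and `startMemoryMonitor` helpers, and the clamp values (512MB/768MB) are all assumptions made for illustration. Only `pruneFinishedSessions`, `warnMB`/`criticalMB`, the 75%/85% defaults, and the 60s interval come from the description above.

```typescript
// Sketch only: names, shapes, and clamp values are assumptions, not the PR's code.
const MB = 1024 * 1024;

// --- Zombie exec session reaping ---
interface ExecSession {
  id: string;
  status: "running" | "exited";
  exitedAt?: number; // epoch ms when the process exited (assumed field)
}

// Deletes exited sessions older than ttlMs and returns how many were removed.
function pruneFinishedSessions(
  registry: Map<string, ExecSession>,
  ttlMs: number,
  now: number = Date.now(),
): number {
  let removed = 0;
  // Deleting entries while iterating a Map with forEach is safe in JavaScript.
  registry.forEach((session, id) => {
    if (session.status !== "exited" || session.exitedAt === undefined) return;
    if (now - session.exitedAt >= ttlMs) {
      registry.delete(id);
      removed++;
    }
  });
  return removed;
}

// --- Memory thresholds ---
interface MemoryConfig { warnMB?: number; criticalMB?: number; }
interface Thresholds { warnMB: number; criticalMB: number; }

const MIN_WARN_MB = 512;     // assumed clamp; the PR only says "reasonable"
const MIN_CRITICAL_MB = 768; // assumed clamp

// Explicit config wins; otherwise 75%/85% of system memory, clamped from below.
function resolveThresholds(totalMemBytes: number, cfg: MemoryConfig = {}): Thresholds {
  const totalMB = totalMemBytes / MB;
  const warnMB = cfg.warnMB ?? Math.max(MIN_WARN_MB, Math.floor(totalMB * 0.75));
  const criticalMB = cfg.criticalMB ?? Math.max(MIN_CRITICAL_MB, Math.floor(totalMB * 0.85));
  return { warnMB, criticalMB };
}

type MemoryLevel = "ok" | "warn" | "critical";

function classifyRss(rssBytes: number, t: Thresholds): MemoryLevel {
  const rssMB = rssBytes / MB;
  if (rssMB >= t.criticalMB) return "critical";
  if (rssMB >= t.warnMB) return "warn";
  return "ok";
}

// Polls RSS on an interval. readRss and onCritical are injected so the
// sketch stays independent of any particular runtime wiring.
function startMemoryMonitor(
  readRss: () => number,
  thresholds: Thresholds,
  onCritical: () => void,
  intervalMs: number = 60000,
): ReturnType<typeof setInterval> {
  return setInterval(() => {
    const rss = readRss();
    const level = classifyRss(rss, thresholds);
    if (level === "warn") console.warn(`memory warn: rss=${Math.round(rss / MB)}MB`);
    if (level === "critical") onCritical();
  }, intervalMs);
}
```

In Node, one plausible wiring is to pass `() => process.memoryUsage().rss` as `readRss`, `resolveThresholds(os.totalmem(), config)` as the thresholds, and `() => process.kill(process.pid, "SIGUSR1")` as `onCritical`, assuming the gateway already installs a SIGUSR1 handler that performs the graceful restart.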
