#13772: feat: memory monitor, chat buffer leak fix, zombie exec session reaping
app: web-ui
gateway
agents
stale
After running for ~13 hours, the gateway process hit 1.9 GB RSS, causing V8 GC thrashing and 70% CPU. I dug into it and found three leaks:
1. `chatRunBuffers` and `chatDeltaSentAt` entries were never cleaned up after runs completed
2. Aborted run metadata hung around for 60 minutes (way too long)
3. Exec sessions marked `exited` were never actually removed from the registry
To fix these:
- Added `.finally()` cleanup in chat.ts to delete buffer entries per `clientRunId`
- Reduced aborted run TTL from 60min to 10min
- Added `pruneFinishedSessions()` to reap zombie exec sessions past TTL
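
For reviewers, here is a minimal sketch of the two cleanup patterns described above. It is not the code from this PR: the map value types, the `trackRun` helper, the `ExecSession` shape, and the `exitedAt` field are assumptions made for illustration; only `chatRunBuffers`, `chatDeltaSentAt`, `clientRunId`, and `pruneFinishedSessions` are names taken from the description.

```typescript
// Hypothetical shapes: the real maps in chat.ts may hold different value types.
const chatRunBuffers = new Map<string, string>();
const chatDeltaSentAt = new Map<string, number>();

// Hypothetical helper: run a chat task and always drop its per-run entries,
// whether the task resolves, rejects, or throws synchronously.
function trackRun<T>(clientRunId: string, task: () => Promise<T>): Promise<T> {
  chatRunBuffers.set(clientRunId, "");
  chatDeltaSentAt.set(clientRunId, Date.now());
  return Promise.resolve()
    .then(task)
    .finally(() => {
      chatRunBuffers.delete(clientRunId);
      chatDeltaSentAt.delete(clientRunId);
    });
}

// Assumed session record; the real registry entry likely carries more fields.
interface ExecSession {
  status: "running" | "exited";
  exitedAt?: number; // epoch ms, set when the process exits
}

const sessions = new Map<string, ExecSession>();

// Remove sessions that exited more than ttlMs ago. Returns how many were reaped.
function pruneFinishedSessions(ttlMs: number, now: number = Date.now()): number {
  let removed = 0;
  // Deleting the current entry while iterating a Map with forEach is safe.
  sessions.forEach((session, id) => {
    const expired =
      session.status === "exited" &&
      session.exitedAt !== undefined &&
      now - session.exitedAt > ttlMs;
    if (expired) {
      sessions.delete(id);
      removed++;
    }
  });
  return removed;
}
```

Putting the deletes in `.finally()` means an aborted or failed run releases its buffers the same way a successful one does, which is the case the original code missed.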
Also added a memory monitor (`server-memory-monitor.ts`) that checks RSS every 60s, logs warnings at configurable thresholds, and triggers a graceful SIGUSR1 restart when RSS reaches the critical threshold. The warn and critical thresholds default to 75% and 85% of system memory, each with a minimum clamp. Config is optional via `gateway.memory` (`warnMB`/`criticalMB`).
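
The threshold logic can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `server-memory-monitor.ts`: the function names, the `Level` type, and the floor values `MIN_WARN_MB` and `MIN_CRITICAL_MB` are hypothetical (the description does not state the actual clamps). Only the 75%/85% defaults, the 60s interval, the SIGUSR1 signal, and the `warnMB`/`criticalMB` keys come from the description.

```typescript
import * as os from "node:os";

interface MemoryConfig {
  warnMB?: number;
  criticalMB?: number;
}

interface Thresholds {
  warnMB: number;
  criticalMB: number;
}

// Assumed floors for illustration only; the PR does not state its clamp values.
const MIN_WARN_MB = 512;
const MIN_CRITICAL_MB = 768;

// Explicit config wins; otherwise use 75% / 85% of system memory, clamped from below.
function resolveThresholds(totalMB: number, cfg: MemoryConfig = {}): Thresholds {
  const warnMB = cfg.warnMB ?? Math.max(MIN_WARN_MB, Math.floor((totalMB * 75) / 100));
  const criticalMB =
    cfg.criticalMB ?? Math.max(MIN_CRITICAL_MB, Math.floor((totalMB * 85) / 100));
  return { warnMB, criticalMB };
}

type Level = "ok" | "warn" | "critical";

function classify(rssMB: number, t: Thresholds): Level {
  if (rssMB >= t.criticalMB) return "critical";
  if (rssMB >= t.warnMB) return "warn";
  return "ok";
}

// Poll RSS on an interval. Assumes the gateway has installed a SIGUSR1 handler
// that performs the graceful restart; without one, Node treats SIGUSR1 as a
// request to start the inspector.
function startMemoryMonitor(cfg: MemoryConfig = {}, intervalMs = 60000) {
  const thresholds = resolveThresholds(os.totalmem() / (1024 * 1024), cfg);
  const timer = setInterval(() => {
    const rssMB = process.memoryUsage().rss / (1024 * 1024);
    const level = classify(rssMB, thresholds);
    if (level === "warn") {
      console.warn(`memory: RSS ${rssMB.toFixed(0)}MB >= warn ${thresholds.warnMB}MB`);
    } else if (level === "critical") {
      console.error(`memory: RSS ${rssMB.toFixed(0)}MB >= critical ${thresholds.criticalMB}MB`);
      process.kill(process.pid, "SIGUSR1");
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for the monitor
  return timer;
}
```

Keeping `resolveThresholds` and `classify` as pure functions separate from the timer is one way to make the threshold logic testable without touching real process memory.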
12 new tests for the monitor + threshold logic. All 277 gateway tests pass.
Related: #13758
Most Similar PRs
#8713: feat: gateway memory monitor, install linger, docs and failover
by quratus · 2026-02-04
70.7%
#16125: feat(gateway): add stuck session detection
by CyberSinister · 2026-02-14
69.6%
#22480: fix: memory leak, silent WS failures, and connection error handling
by Chase-Xuu · 2026-02-21
68.2%
#22131: fix: clear seqByRun entries in clearAgentRunContext to prevent memo...
by alanwilhelm · 2026-02-20
67.6%
#21944: feat(gateway): crash-loop protection with escalating backoff
by Protocol-zero-0 · 2026-02-20
64.3%
#22143: Fix memory leak in WhatsApp channel reconnection loop
by lancejames221b · 2026-02-20
64.3%
#17823: fix: memory leak in cron isolated runs — agent-events Maps never cl...
by techboss · 2026-02-16
62.7%
#10273: fix(agents): detect and auto-compact mid-run context overflow
by terryops · 2026-02-06
62.6%
#14993: fix(webchat): add heartbeat detection to prevent zombie WebSocket c...
by BenediktSchackenberg · 2026-02-12
62.2%
#16196: fix(gateway): add periodic cleanup to prevent memory leak in ToolEv...
by bianbiandashen · 2026-02-14
62.1%