#22131: fix: clear seqByRun entries in clearAgentRunContext to prevent memory leak
size: XS
Cluster: Memory Leak Fixes and Cleanup
## Summary
- `seqByRun` Map in `src/infra/agent-events.ts` grows unbounded because `clearAgentRunContext()` only deletes from `runContextById` but never from `seqByRun`
- Every agent run adds an entry via `emitAgentEvent()` that is never cleaned up, causing the gateway process to leak memory indefinitely
- Observed 65GB RSS over ~12 hours of active Discord/Slack messaging, with `DiscordMessageListener` taking 120-946 seconds per `MESSAGE_CREATE` due to GC pressure
- Fix: add `seqByRun.delete(runId)` to `clearAgentRunContext()`
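
The leak and the fix described above can be sketched as follows. This is a minimal, simplified model and not the actual contents of `src/infra/agent-events.ts`: the `AgentRunContext` shape, the `registerAgentRunContext` helper, the function signatures, and the assumption that sequence numbers start at 1 are all hypothetical. Only the two Map names, `emitAgentEvent()`, `clearAgentRunContext()`, and the added `seqByRun.delete(runId)` call come from this PR.

```typescript
// Hypothetical, simplified model of the module; real types and signatures may differ.
type AgentRunContext = { sessionKey?: string };

const runContextById = new Map<string, AgentRunContext>();
const seqByRun = new Map<string, number>();

// Hypothetical helper that stores per-run context when a run starts.
function registerAgentRunContext(runId: string, context: AgentRunContext): void {
  runContextById.set(runId, context);
}

// Each emitted event bumps a per-run counter, creating the Map entry on first use.
// (Assumption: counters start at 1.)
function emitAgentEvent(runId: string): number {
  const nextSeq = (seqByRun.get(runId) ?? 0) + 1;
  seqByRun.set(runId, nextSeq);
  return nextSeq;
}

function clearAgentRunContext(runId: string): void {
  runContextById.delete(runId);
  // The fix from this PR: without this line, one counter entry per run
  // stays in seqByRun for the lifetime of the process.
  seqByRun.delete(runId);
}

// Simulate many short-lived runs, each cleaned up at run end.
for (let i = 0; i < 1000; i++) {
  const runId = `run-${i}`;
  registerAgentRunContext(runId, {});
  emitAgentEvent(runId);
  emitAgentEvent(runId);
  clearAgentRunContext(runId);
}

console.log(`runContextById: ${runContextById.size}, seqByRun: ${seqByRun.size}`);
```

Removing the `seqByRun.delete(runId)` line from this sketch leaves 1000 entries in `seqByRun` after the loop, which mirrors the unbounded growth reported above.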
## Test plan
- [x] Verified all 3 call sites (`server-chat.ts` x2, `commands/agent.ts`) only invoke `clearAgentRunContext` at run end/error — no risk of deleting a counter mid-run
- [x] Confirmed existing tests pass (`agent-events.test.ts`)
- [x] Deployed patched build to production gateway; memory stable at ~600MB after 15+ minutes vs 1.1GB+ and climbing on unpatched version
- [x] Reviewed by OpenAI Codex — no correctness or maintainability issues found
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Fixed a critical memory leak in which the `seqByRun` Map grew unbounded because `clearAgentRunContext()` only deleted from `runContextById`.
**Key changes:**
- Added `seqByRun.delete(runId)` to `clearAgentRunContext()` to properly clean up sequence counters when agent runs complete
- Production deployment confirmed memory stabilized at ~600MB vs 1.1GB+ and climbing on unpatched version
**Issue found:**
- `resetAgentRunContextForTest()` should also clear `seqByRun` to prevent test pollution between test runs
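
A minimal sketch of the suggested follow-up, assuming the test helper currently clears only `runContextById`. The `createAgentEventsModel` factory is a hypothetical stand-in for the module-level state in `src/infra/agent-events.ts`, and the event and context shapes are simplified assumptions; the real helper may look different.

```typescript
// Hypothetical stand-in for the module-level state; not the real module.
function createAgentEventsModel() {
  const runContextById = new Map<string, { sessionKey?: string }>();
  const seqByRun = new Map<string, number>();

  return {
    registerAgentRunContext(runId: string, context: { sessionKey?: string }): void {
      runContextById.set(runId, context);
    },
    // Assumption: counters start at 1 and increase by one per event.
    emitAgentEvent(runId: string): number {
      const nextSeq = (seqByRun.get(runId) ?? 0) + 1;
      seqByRun.set(runId, nextSeq);
      return nextSeq;
    },
    resetAgentRunContextForTest(): void {
      runContextById.clear();
      seqByRun.clear(); // suggested addition so counters do not carry over between tests
    },
  };
}

const agentEvents = createAgentEventsModel();

// "Test A" emits two events for a run ID.
agentEvents.emitAgentEvent("run-1");
agentEvents.emitAgentEvent("run-1");

// Reset between tests.
agentEvents.resetAgentRunContextForTest();

// "Test B" reuses the same run ID and should see a fresh counter.
const firstSeqAfterReset = agentEvents.emitAgentEvent("run-1");
console.log(firstSeqAfterReset);
```

Without the `seqByRun.clear()` call, the second test in this sketch would observe a sequence number continuing from the first test, which is the kind of cross-test pollution the review flags.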
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge - it fixes a critical memory leak with a simple, targeted change
- The fix correctly addresses the memory leak by ensuring `seqByRun` entries are deleted alongside `runContextById` entries. All three call sites only invoke `clearAgentRunContext()` at run end/error, so there's no risk of mid-run deletion. Production deployment confirms the fix works. Score is 4 instead of 5 because `resetAgentRunContextForTest()` should also clear `seqByRun` for complete test isolation.
- `src/infra/agent-events.ts` needs a minor update to `resetAgentRunContextForTest()`
<sub>Last reviewed commit: bb9f387</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
## Most Similar PRs
- #22480: fix: memory leak, silent WS failures, and connection error handling (by Chase-Xuu · 2026-02-21 · 87.5%)
- #10281: fix(infra): clear seqByRun entry when agent run context is cleared (by programming-pupil · 2026-02-06 · 87.1%)
- #17823: fix: memory leak in cron isolated runs — agent-events Maps never cl... (by techboss · 2026-02-16 · 86.7%)
- #22143: Fix memory leak in WhatsApp channel reconnection loop (by lancejames221b · 2026-02-20 · 77.8%)
- #2541: fix(agents): add error handling to orphaned message cleanup (by Episkey-G · 2026-01-27 · 77.6%)
- #19328: Fix: preserve modelOverride in agent handler (#5369) (by CodeReclaimers · 2026-02-17 · 77.2%)
- #15945: fix(memory-flush): only write memoryFlushCompactionCount when compa... (by aldoeliacim · 2026-02-14 · 76.8%)
- #10636: fix: setTimeout integer overflow causing server crash (by devmangel · 2026-02-06 · 76.5%)
- #18029: infra: fix memory leak and error handling in event listeners (by MAhmadUzair · 2026-02-16 · 76.3%)
- #18432: fix(agents): clear active run state immediately on embedded timeout (by BinHPdev · 2026-02-16 · 76.2%)