#22143: Fix memory leak in WhatsApp channel reconnection loop
channel: whatsapp-web
size: XS
Cluster: WhatsApp Connection Stability Fixes
## Summary
Memory use of the openclaw-gateway (Node.js) process grows without bound during WhatsApp channel reconnection cycles. Observed growth was 2.3GB to 43.2GB in ~16 hours, consuming 67% of a 64GB box before the process crashed with `FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory`.
Three leaks compound during reconnection:
**Leak 1: Debouncer never cleaned up on close** (`src/auto-reply/inbound-debounce.ts`)
- `createInboundDebouncer()` returns only `{ enqueue, flushKey }` — no teardown
- Internal `buffers` Map holds message items with pending `setTimeout` callbacks
- Each item captures `sock` WebSocket references via closures (`sendComposing`, `reply`, `sendMedia`)
- When `monitorWebInbox.close()` is called on reconnect, these timers and references persist
**Leak 2: groupMetaCache never cleared on close** (`src/web/inbound/monitor.ts`)
- `groupMetaCache = new Map()` accumulates group metadata entries
- Never cleared in the `close()` handler — expired-TTL entries still occupy memory
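To illustrate why expired entries still occupy memory: a TTL check on the read path does not remove anything from the Map by itself. A minimal sketch, assuming entries shaped like `{ value, expiresAt }`; the `GroupMeta` type, the entry shape, and `getFresh` are hypothetical and not taken from `monitor.ts`:

```typescript
// Hypothetical entry shape; the real groupMetaCache may store something different.
type GroupMeta = { subject: string };
type CacheEntry = { value: GroupMeta; expiresAt: number };

const groupMetaCache = new Map<string, CacheEntry>();

// Read path that honors the TTL but never deletes stale entries.
function getFresh(jid: string, now: number): GroupMeta | undefined {
  const entry = groupMetaCache.get(jid);
  if (!entry || entry.expiresAt <= now) return undefined; // stale, but still stored
  return entry.value;
}

groupMetaCache.set("group-1@g.us", { value: { subject: "Team" }, expiresAt: 1000 });

const lookedUp = getFresh("group-1@g.us", 2000); // entry has expired by now
const sizeAfterExpiry = groupMetaCache.size;      // the stale entry is still held

groupMetaCache.clear();                           // the step missing from close()
const sizeAfterClear = groupMetaCache.size;
```

Readers never see the stale value, so the cache looks correct from the outside while the Map keeps every entry alive until something clears it.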
**Leak 3: Outer-scope Maps accumulate across reconnections** (`src/web/auto-reply/monitor.ts`)
- `groupHistories` and `groupMemberNames` Maps are created outside the `while(true)` reconnection loop
- They grow across every reconnection cycle and are never bounded or cleared
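The pattern can be sketched with the loop reduced to a fixed number of simulated cycles. Only the two Map names come from this PR; `runOneConnection`, `simulate`, the value types, and the one-new-group-per-cycle behavior are hypothetical simplifications:

```typescript
// Maps hoisted outside the reconnection loop, as described above.
const groupHistories = new Map<string, string[]>();
const groupMemberNames = new Map<string, Map<string, string>>();

// Hypothetical stand-in for one connection's lifetime: it records some state.
function runOneConnection(cycle: number): void {
  const jid = `group-${cycle}@g.us`;
  groupHistories.set(jid, [`message seen in cycle ${cycle}`]);
  groupMemberNames.set(jid, new Map([["user@s.whatsapp.net", "Alice"]]));
}

// Runs `cycles` simulated reconnects and reports how many histories remain.
function simulate(cycles: number, clearEachCycle: boolean): number {
  groupHistories.clear();
  groupMemberNames.clear();
  for (let cycle = 0; cycle < cycles; cycle++) { // stands in for while (true)
    if (clearEachCycle) {
      // Reset per-connection state at the top of each iteration.
      groupHistories.clear();
      groupMemberNames.clear();
    }
    runOneConnection(cycle);
  }
  return groupHistories.size;
}

const leakedEntries = simulate(5, false); // state from every cycle is retained
const boundedEntries = simulate(5, true); // only the latest cycle's state remains
```

Clearing at the top of the loop trades away history carried over from the previous connection in exchange for a hard bound on what one connection can retain.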
## Changes
- **`src/auto-reply/inbound-debounce.ts`**: Add `flushAll()` and `close()` methods to `createInboundDebouncer` return value. `close()` cancels all pending timers and drops buffer references.
- **`src/web/inbound/monitor.ts`**: Call `debouncer.close()` and `groupMetaCache.clear()` in the `monitorWebInbox` close handler before closing the WebSocket.
- **`src/web/auto-reply/monitor.ts`**: Clear `groupHistories` and `groupMemberNames` at the top of each reconnection loop iteration.
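A minimal sketch of what the debouncer teardown might look like. This is not the code from `inbound-debounce.ts`: the options object, the `onFlush` callback, the buffer shape, and the `pendingKeys` helper are assumptions made for illustration. Only the method names `enqueue`, `flushKey`, `flushAll`, and `close` come from this PR.

```typescript
type TimerHandle = ReturnType<typeof setTimeout>;

interface PendingBuffer<T> {
  items: T[];
  timer: TimerHandle;
}

// Hypothetical options; the real createInboundDebouncer signature may differ.
interface DebouncerOptions<T> {
  debounceMs: number;
  onFlush: (key: string, items: T[]) => void;
}

function createInboundDebouncer<T>({ debounceMs, onFlush }: DebouncerOptions<T>) {
  const buffers = new Map<string, PendingBuffer<T>>();

  // Deliver one key's buffered items now and cancel its timer.
  function flushKey(key: string): void {
    const buffer = buffers.get(key);
    if (!buffer) return;
    clearTimeout(buffer.timer);
    buffers.delete(key);
    onFlush(key, buffer.items);
  }

  // Buffer an item and restart the key's debounce window.
  function enqueue(key: string, item: T): void {
    const existing = buffers.get(key);
    if (existing) clearTimeout(existing.timer);
    const items = existing ? [...existing.items, item] : [item];
    const timer = setTimeout(() => flushKey(key), debounceMs);
    buffers.set(key, { items, timer });
  }

  // Deliver everything that is pending (copy the keys, since flushKey mutates the Map).
  function flushAll(): void {
    for (const key of [...buffers.keys()]) flushKey(key);
  }

  // Drop everything without delivering: cancel timers, release item references.
  function close(): void {
    for (const buffer of buffers.values()) clearTimeout(buffer.timer);
    buffers.clear();
  }

  // pendingKeys is a hypothetical helper added here only for inspection.
  return { enqueue, flushKey, flushAll, close, pendingKeys: () => buffers.size };
}

// Usage: close() discards buffered items instead of sending them.
const delivered: string[] = [];
const debouncer = createInboundDebouncer<string>({
  debounceMs: 60000,
  onFlush: (key, items) => delivered.push(`${key}:${items.join(",")}`),
});
debouncer.enqueue("chat-a", "hello");
debouncer.enqueue("chat-a", "world");
debouncer.enqueue("chat-b", "ping");
const pendingBeforeClose = debouncer.pendingKeys();
debouncer.close();
const pendingAfterClose = debouncer.pendingKeys();
```

In this sketch `close()` deliberately skips `onFlush`, because a flush during reconnect would run callbacks that still point at the old socket. `flushAll()` is the alternative for a shutdown where the socket is still usable.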
## Test plan
- [ ] Verify gateway starts without errors after applying changes
- [ ] Monitor `process.memoryUsage().heapUsed` across multiple reconnection cycles: it should return to roughly the pre-reconnect baseline after each reconnect
- [ ] Confirm no `FATAL ERROR: Reached heap limit` crashes over 24+ hours of operation
- [ ] Run existing test suite (`pnpm test`)
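
For the heap-monitoring item above, one possible way to make "drops back" checkable is to record a sample after each reconnect and compare it with the first one. A rough sketch; `heapUsedMiB`, `recordSample`, `returnsToBaseline`, and the tolerance value are hypothetical helpers for manual testing, not part of the gateway:

```typescript
// Current heap usage of this process, in MiB.
function heapUsedMiB(): number {
  return process.memoryUsage().heapUsed / (1024 * 1024);
}

// True when every sample after the first stays within `toleranceMiB` of the first.
function returnsToBaseline(samplesMiB: number[], toleranceMiB: number): boolean {
  const [baseline, ...rest] = samplesMiB;
  if (baseline === undefined) return true; // nothing recorded yet
  return rest.every((sample) => sample - baseline <= toleranceMiB);
}

const heapSamples: number[] = [];

// Call once after startup and once after each completed reconnect.
function recordSample(label: string): void {
  const mib = heapUsedMiB();
  heapSamples.push(mib);
  console.log(`[heap] ${label}: ${mib.toFixed(1)} MiB`);
}

recordSample("baseline");
```

`heapUsed` fluctuates with garbage collection timing, so individual samples are noisy. Starting Node with `--expose-gc` and calling `global.gc()` before each sample gives a steadier reading.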
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Fixes three distinct memory leaks in WhatsApp channel reconnection that caused unbounded heap growth (2.3GB to 43.2GB in 16 hours):
- **Debouncer cleanup**: Added `close()` method to cancel pending timers and clear buffer references that captured WebSocket closures (`sendComposing`, `reply`, `sendMedia`)
- **Group metadata cache**: Clear `groupMetaCache` Map on connection close to release accumulated group metadata entries
- **Reconnection loop state**: Clear `groupHistories` and `groupMemberNames` Maps at the start of each reconnection cycle
All three fixes correctly address the root causes described in the PR. The debouncer's `close()` method appropriately drops buffered messages during reconnection rather than attempting to flush them with a stale WebSocket connection. The `flushAll()` method was added but isn't currently used in the close path. This appears intentional, as flushing would fail with a closed socket.
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge with low risk - fixes critical memory leaks without introducing new behavior
- The fixes correctly address three specific memory leaks with surgical changes. The implementation is sound: timers are properly cancelled, Maps are cleared at appropriate lifecycle points, and the changes follow existing patterns. Score is 4 (not 5) because the `flushAll()` method is added but unused, and other channels using `createInboundDebouncer` may have similar leaks that aren't addressed in this PR.
- No files require special attention - all changes are straightforward cleanup operations
<sub>Last reviewed commit: 35b3681</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
## Most Similar PRs
- #9727: fix(whatsapp): retry reconnect loop on initial connection failure (by luizlf · 2026-02-05 · 79.2%)
- #17487: fix: WhatsApp connection stability - continue reconnection after ma... (by MisterGuy420 · 2026-02-15 · 78.8%)
- #22480: fix: memory leak, silent WS failures, and connection error handling (by Chase-Xuu · 2026-02-21 · 78.6%)
- #22131: fix: clear seqByRun entries in clearAgentRunContext to prevent memo... (by alanwilhelm · 2026-02-20 · 77.8%)
- #6302: fix: Add timeouts to prevent indefinite hangs (issues #4954, #4956,... (by batumilove · 2026-02-01 · 77.1%)
- #16923: fix(web): resolve stale socket race condition in WhatsApp auto-reply (by dorukardahan · 2026-02-15 · 76.8%)
- #19303: Fix WhatsApp internal error leakage + cron.run timeout defaults (by koala73 · 2026-02-17 · 76.6%)
- #22469: fix(gateway): avoid stale whatsapp labels on direct sessions (by loganprit · 2026-02-21 · 76.1%)
- #22367: fix(whatsapp): prevent permanent listener loss after abort during r... (by mcinteerj · 2026-02-21 · 76.0%)
- #16767: fix: auto-resync webchat on reconnect and prevent message flicker o... (by alewcock · 2026-02-15 · 74.9%)