#21054: fix(cli): fix memory search hang — close undici pool + destroy QMD stdio on timeout
cli
size: M
Cluster: Error Handling and Memory Management
## Problem
\`openclaw memory search\` (and any CLI command that calls a remote embedding provider) hangs indefinitely after completing — the process never exits without \`Ctrl-C\`. Two independent Node.js event loop leaks keep the process alive.
---
## Root Cause Analysis
### Leak 1 — undici HTTP connection pool
Remote embedding calls (OpenAI, Gemini, Voyage, etc.) go through undici's global dispatcher, which maintains keep-alive TLS connections. After a response is received, those Sockets stay open waiting for connection reuse. Nothing in the CLI shutdown path closed them, so the event loop could never drain.
### Leak 2 — QMD child stdio pipes (the subtle one)
When a \`qmd\` subprocess hangs and is killed with \`SIGKILL\` on timeout, the naive fix is to call \`child.unref()\`. **This is insufficient.** Here is why:
\`unref()\` only removes the \`ChildProcess\` object from the event loop's ref count. But a spawned child has three *independent* event loop handles — the \`stdout\`, \`stderr\`, and \`stdin\` Socket objects (the parent-side ends of the stdio pipes). These are not unref'd by \`child.unref()\`.
The hang scenario:
```
Node.js (parent)           qmd (child, SIGKILL'd)       grandchild (still alive)
stdout pipe read-end  ←──  stdout pipe write-end   ←──  inherited write-end (open)
```
\`qmd\` is an ML tool that can spawn inference worker subprocesses. \`SIGKILL\` only kills the direct child — not its descendants. If a grandchild inherited the stdio file descriptors and holds the write-end open, Node.js never receives EOF on the read-end. The Socket stays ref'd, and the event loop hangs regardless of \`unref()\`.
\`child.unref()\` accidentally "worked" in the common case where \`qmd\` had no grandchildren or all descendants were killed together — but failed silently in the grandchild scenario.
### Coverage gap — \`tryRouteCli\` early-return path
The initial fix wrapped only \`program.parseAsync\` in a \`try/finally\`. However, \`models status --probe\` is handled inside \`tryRouteCli\` (which has an early \`return\`) and makes HTTP requests through undici. That path bypassed the dispatcher cleanup entirely.
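The gap, and the shape that closes it, can be sketched in a few lines. This is an illustration only: \`runCli\`, the \`CliDeps\` type, and the assumption that \`tryRouteCli\` resolves to a boolean are hypothetical stand-ins, not the actual code in \`src/cli/run-main.ts\`.

```typescript
// Hypothetical stand-ins for the real routing, parsing, and cleanup functions.
type CliDeps = {
  tryRouteCli: (argv: string[]) => Promise<boolean>; // assumed: true means "handled"
  parseAsync: (argv: string[]) => Promise<void>;
  cleanup: () => Promise<void>; // e.g. closing the HTTP dispatcher
};

async function runCli(argv: string[], deps: CliDeps): Promise<void> {
  try {
    if (await deps.tryRouteCli(argv)) {
      // Early return: because it sits inside the try block,
      // the finally block below still runs.
      return;
    }
    await deps.parseAsync(argv);
  } finally {
    // Runs on the routed path, the parsed path, and on thrown errors.
    await deps.cleanup();
  }
}
```

If the \`try\` wrapped only the \`parseAsync\` call, the early \`return\` would leave the function before any cleanup ran, which is the bypass described above.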
---
## Fix
### Leak 1
\`closeGlobalFetchDispatcher()\` closes the undici connection pool. It runs in the \`finally\` block of a single \`try/finally\` that covers **both** \`tryRouteCli\` and \`program.parseAsync\`, so no CLI exit path is missed. It loads undici with a dynamic \`import()\` inside a \`try/catch\`, so it is safe when undici is absent.
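A minimal sketch of such a helper is shown below, assuming undici's documented \`getGlobalDispatcher()\` and \`Dispatcher.close()\` calls. The injectable \`loadUndici\` parameter, the \`UndiciLike\` interface, and the boolean return value are hypothetical additions that make the sketch testable without undici installed; the function in this PR may be shaped differently.

```typescript
// Minimal shape of the undici exports this sketch relies on.
interface UndiciLike {
  getGlobalDispatcher(): { close(): Promise<void> };
}

// Default loader: a dynamic import, so a missing undici surfaces as a
// rejected promise here instead of a failure at module load time.
// The specifier is typed as string so the sketch compiles even when
// undici's type declarations are not installed.
const loadUndiciDefault = async (): Promise<UndiciLike> => {
  const specifier: string = "undici";
  return (await import(specifier)) as UndiciLike;
};

// Hypothetical sketch: resolves to true if the pool was closed,
// false if undici could not be loaded or close() failed.
async function closeGlobalFetchDispatcher(
  loadUndici: () => Promise<UndiciLike> = loadUndiciDefault,
): Promise<boolean> {
  try {
    const undici = await loadUndici();
    // close() waits for in-flight requests, then closes idle keep-alive sockets.
    await undici.getGlobalDispatcher().close();
    return true;
  } catch {
    // Best-effort cleanup: never let shutdown code throw.
    return false;
  }
}
```

Swallowing the error is a deliberate choice in this sketch: the helper runs in a \`finally\` block, where a thrown cleanup error would mask the command's real result.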
### Leak 2
Replace \`child.unref()\` with explicit pipe destruction in the timeout handler:
```ts
child.stdout?.destroy();
child.stderr?.destroy();
child.stdin?.destroy();
```
\`destroy()\` closes the parent-side file descriptors immediately and unconditionally — it does not matter whether the child, its grandchildren, or any other process still holds the write-end open. Once the parent closes its end, those Socket handles are removed from the event loop.
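For illustration, the kill-and-destroy step can be wrapped in a small helper. \`killAndReleaseStdio\` and the two interfaces are hypothetical names introduced for this sketch; the actual change in \`src/memory/qmd-manager.ts\` places the three \`destroy()\` calls in the timeout handler.

```typescript
// Minimal shape of the ChildProcess members this sketch touches.
// A real ChildProcess from node:child_process satisfies this interface.
interface DestroyableStream {
  destroy(): void;
}

interface KillableChild {
  kill(signal: "SIGKILL"): boolean;
  stdout: DestroyableStream | null;
  stderr: DestroyableStream | null;
  stdin: DestroyableStream | null;
}

// Hypothetical helper: kill the direct child, then close every
// parent-side pipe end so none of them can keep the event loop alive.
function killAndReleaseStdio(child: KillableChild): void {
  child.kill("SIGKILL");
  // Optional chaining covers streams that are null, which happens
  // when a stdio slot was not configured as a pipe.
  child.stdout?.destroy();
  child.stderr?.destroy();
  child.stdin?.destroy();
}
```

In a timeout callback this would be called as \`killAndReleaseStdio(child)\`. A mock child with a flag per stream is enough to check that all three pipes are released.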
---
## What Changed
| File | Change |
|---|---|
| \`src/cli/run-main.ts\` | Single \`try/finally\` covering both \`tryRouteCli\` + \`parseAsync\`; \`closeGlobalFetchDispatcher()\` exported |
| \`src/memory/qmd-manager.ts\` | \`child.unref()\` → \`destroy()\` on all three stdio streams in the timeout handler |
| \`src/cli/run-main-dispatcher.test.ts\` | New file — isolated undici mock + 2 tests for \`closeGlobalFetchDispatcher\` |
| \`src/cli/run-main.test.ts\` | Restored to pure argv-parsing tests; undici mock extracted to the new file above |
| \`src/memory/qmd-manager.test.ts\` | \`MockStream\` type with per-stream \`destroyCalled\` tracking replaces the previous \`unref\`-based mock infrastructure; 1 new test verifying all three streams are destroyed on timeout |
**Production code delta: 4 lines changed across 2 files.** No behavioral changes outside the event-loop cleanup paths.
---
## Change Type
- [x] Bug fix
## Scope
- [x] Memory / storage
- [x] UI / DX
## Linked Issue
- Closes #21018
## User-visible Change
\`openclaw memory search\` (and any CLI command using a remote embedding provider) now exits cleanly after completing, without requiring \`Ctrl-C\`.
## Security Impact
- New permissions/capabilities? No
- Secrets/tokens handling changed? No
- New/changed network calls? No
- Command/tool execution surface changed? No
- Data access scope changed? No
## Verification
- \`pnpm build\` — clean (TypeScript + lint + format)
- 54 tests pass across all affected files
- All CI checks green
- Backward compatible, no config/env changes
## Most Similar PRs
| PR | Title | Author | Date | Similarity |
|---|---|---|---|---|
| #23247 | fix(cli): removes --query from memory cmd options | stuhorsman | 2026-02-22 | 73.3% |
| #19807 | fix: apply #19779 Docker/TS strict-build fixes | dalefrieswthat | 2026-02-18 | 73.0% |
| #17237 | fix(update): guard post-install imports after npm global update | tdjackey | 2026-02-15 | 72.3% |
| #17770 | refactor(cli): reuse shared option builders | iyoda | 2026-02-16 | 72.2% |
| #19391 | fix(process): destroy stdio streams on dispose and terminate childr... | nabbilkhan | 2026-02-17 | 72.2% |
| #9381 | Fix: Allow QMD CLI memory search when scope is restrictive | vishaltandale00 | 2026-02-05 | 72.1% |
| #20415 | fix(extensions): use dist/ import paths for bundled extensions | 88plug | 2026-02-18 | 72.1% |
| #23308 | fix(browser): accept upload paths that traverse symlinked tmp dirs | SidQin-cyber | 2026-02-22 | 72.0% |
| #18756 | fix the memory manager class hierarchy declared at the wrong level | leoh | 2026-02-17 | 71.6% |
| #22475 | fix(logging): correct levelToMinLevel mapping to match tslog numbering | ronaldslc | 2026-02-21 | 71.5% |