#17910: feat(memory): QMD daemon mode — persistent process with idle lifecycle
## Summary
- Problem: spawn-per-query QMD repeatedly pays cold-start cost and can miss interactive latency targets.
- Why it matters: users who opt into QMD want higher-quality retrieval without repeated startup penalties.
- What changed: this PR adds **optional** warm QMD daemon mode using QMD's HTTP MCP transport (`qmd mcp --http --daemon`) with OpenClaw-owned lifecycle (lazy start, health check, idle stop, shutdown stop).
- What changed: warm-path failures/timeouts automatically fall back to existing spawn-per-query behavior for the same query.
- What changed: docs/config were updated for npm install (`@tobilu/qmd@1.0.6`) and daemon port config.
- What did NOT change: default behavior is still non-daemon unless explicitly enabled.
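The warm-then-fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `searchWithFallback`, `withTimeout`, and the `warm`/`cold` callbacks are hypothetical names standing in for the daemon query and the existing spawn-per-query path.

```typescript
// Hypothetical sketch: names and signatures are illustrative assumptions.
type SearchFn<T> = (query: string) => Promise<T>;

/** Reject if `promise` does not settle within `ms` milliseconds. */
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

/** Try the warm daemon first; on any error or timeout, rerun the same query cold. */
async function searchWithFallback<T>(
  query: string,
  warm: SearchFn<T>,
  cold: SearchFn<T>,
  warmTimeoutMs: number,
): Promise<{ result: T; path: "warm" | "fallback" }> {
  try {
    return { result: await withTimeout(warm(query), warmTimeoutMs), path: "warm" };
  } catch {
    // Warm-path failure never surfaces to the caller; the cold path decides the outcome.
    return { result: await cold(query), path: "fallback" };
  }
}
```

The fallback reruns the same query, so a caller only sees an error if the spawn-per-query path fails as well.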
## Change Type (select all)
- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [x] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [x] Gateway / orchestration
- [x] Skills / tool execution
- [ ] Auth / tokens
- [x] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Closes #
- Related #9048, #9581, #15579, #9605, #16047
## Related Approaches
| Approach | Strengths | Limitations | Why this PR exists |
|---|---|---|---|
| Spawn-per-query QMD baseline | Simple isolation per call | Repeated cold starts | Keep as fallback/default path |
| Minimal MCP stdio wrapper | Smaller integration surface | Less lifecycle control/hardening | Add explicit lifecycle + fallback semantics |
| Builtin backend | No QMD runtime dependency | Different retrieval profile than QMD | Keep available as separate backend |
| This PR (optional warm daemon over HTTP MCP) | Warm reuse, idle lifecycle, fallback safety, explicit port control | Native model/runtime may still time out or crash | Improve QMD reliability/latency while preserving safe fallback |
Out of scope in this PR:
- Retrieval ranking policy changes (for example source weighting)
- Agent-side query-budget policy
- Long-duration soak tuning beyond current timeout + fallback controls
## User-visible / Behavior Changes
- New optional daemon mode under `memory.qmd.daemon.*` (still opt-in via `enabled`).
- New config key: `memory.qmd.daemon.port` (default `18790`).
- Warm daemon transport is HTTP MCP on loopback; the runtime accepts both IPv4 and IPv6 loopback endpoints (`127.0.0.1` and `::1`).
- Existing spawn-per-query path remains intact and is used on warm-path failures/timeouts.
- Docs now standardize QMD install as `npm i -g @tobilu/qmd@1.0.6`.
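As a rough illustration of the new keys, the sketch below resolves a partial `memory.qmd.daemon` block into effective settings. Only `enabled`, `port`, the opt-in default, and the default port `18790` come from this PR; the `QmdDaemonConfig` type, the `resolveDaemonConfig` helper, and the port range check are assumptions made for the example.

```typescript
// Hypothetical sketch: the type and helper names are not taken from the PR's source.
interface QmdDaemonConfig {
  enabled: boolean;
  port: number;
}

const DEFAULT_QMD_DAEMON_PORT = 18790; // default stated in this PR

/** Fill in defaults: daemon mode stays off unless explicitly enabled. */
function resolveDaemonConfig(raw?: Partial<QmdDaemonConfig>): QmdDaemonConfig {
  const port = raw?.port ?? DEFAULT_QMD_DAEMON_PORT;
  // Assumed validation: reject anything that is not a usable TCP port.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid daemon port: ${port}`);
  }
  return { enabled: raw?.enabled ?? false, port };
}
```

For example, `resolveDaemonConfig({ enabled: true })` keeps the default port, while `resolveDaemonConfig({ enabled: true, port: 18791 })` moves the daemon off a conflicting port.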
## Security Impact (required)
- New permissions/capabilities? (`No`)
- Secrets/tokens handling changed? (`No`)
- New/changed network calls? (`Yes`)
- Command/tool execution surface changed? (`Yes`)
- Data access scope changed? (`No`)
- If any `Yes`, explain risk + mitigation:
  - Network calls are loopback-only to local QMD daemon endpoint.
  - Daemon lifecycle commands are restricted to configured local `qmd` binary and existing memory scope.
  - On errors, manager falls back to existing spawn-per-query path instead of failing closed.
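One way to enforce the loopback-only constraint is to validate the daemon endpoint before any request is sent. The helper below is an illustrative sketch under that assumption; `isLoopbackEndpoint` is a hypothetical name, and the PR may implement the restriction differently.

```typescript
// Hypothetical sketch of a loopback-only guard for the daemon endpoint.
function isLoopbackEndpoint(endpoint: string): boolean {
  let url: URL;
  try {
    url = new URL(endpoint);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:") return false;
  // The WHATWG URL API keeps the brackets around IPv6 hostnames, e.g. "[::1]".
  return url.hostname === "127.0.0.1" || url.hostname === "[::1]";
}
```

The check compares the parsed hostname instead of matching on the raw string, so a host such as `127.0.0.1.example.com` is rejected. Hostnames like `localhost` are deliberately left out of this sketch because they depend on resolver configuration.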
## Repro + Verification
### Environment
- OS: macOS (Apple Silicon)
- Runtime/container: Node 22.x, pnpm
- Model/provider: QMD local model (`@tobilu/qmd`)
- Integration/channel (if any): dev gateway profile
- Relevant config (redacted): `memory.backend=qmd`, `memory.qmd.daemon.enabled=true`
### Steps
1. Install/verify QMD: `qmd --version` (validated on `1.0.6`).
2. Start dev gateway from branch and trigger `memory_search` calls.
3. Observe daemon lazy start + query behavior; verify fallback path remains functional on failures.
4. Verify `memory_get` compatibility with returned memory paths.
### Expected
- Daemon starts lazily on first warm-path query and remains warm until idle timeout/shutdown.
- Warm failures do not break memory search; query falls back to spawn-per-query.
- `memory_get` remains compatible with returned paths.
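The expected lifecycle (lazy start, stay warm while in use, stop on idle or shutdown) can be modeled as a small state machine. The sketch below is an assumption-laden illustration: `WarmDaemonLifecycle`, its hooks, and the idle timeout value are hypothetical, and time is passed in explicitly so the logic can be exercised without real timers.

```typescript
// Hypothetical sketch of an owned daemon lifecycle; not the PR's implementation.
interface DaemonHooks {
  start(): void; // e.g. launch the daemon process
  stop(): void;  // e.g. stop the daemon process
}

class WarmDaemonLifecycle {
  private running = false;
  private lastUsedMs = 0;
  private readonly idleTimeoutMs: number;
  private readonly hooks: DaemonHooks;

  constructor(idleTimeoutMs: number, hooks: DaemonHooks) {
    this.idleTimeoutMs = idleTimeoutMs;
    this.hooks = hooks;
  }

  /** Call before each warm query: starts the daemon lazily and records activity. */
  acquire(nowMs: number): void {
    if (!this.running) {
      this.hooks.start();
      this.running = true;
    }
    this.lastUsedMs = nowMs;
  }

  /** Call periodically: stops the daemon once it has been idle long enough. */
  tick(nowMs: number): void {
    if (this.running && nowMs - this.lastUsedMs >= this.idleTimeoutMs) {
      this.shutdown();
    }
  }

  /** Call on gateway shutdown; safe to call when the daemon is not running. */
  shutdown(): void {
    if (this.running) {
      this.hooks.stop();
      this.running = false;
    }
  }
}
```

In this model a query arriving after an idle stop simply triggers another lazy start, which matches the cold warm-path case.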
### Actual
- Verified locally with tests + dev gateway runs.
- Confirmed daemon startup log in successful warm run and preserved fallback behavior when daemon ownership conflicted.
## Evidence
Attach at least one:
- [x] Failing test/log before + passing after
- [x] Trace/log snippets
- [ ] Screenshot/recording
- [x] Perf numbers (if relevant)
### A) Cold warm-path (initial lazy load)
Goal: show first-query daemon startup cost plus successful warm-path completion.
Command used:
```bash
node dist/entry.js --profile dev agent --to +15555550123 --message "What do you know about BillSplitPro from memory notes?" --json --timeout 120 > /tmp/qmd-postfix4.json && jq '{status,summary,duration_ms:.result.meta.durationMs}' /tmp/qmd-postfix4.json
```
Output:
```json
{
"status": "ok",
"summary": "completed",
"duration_ms": 28200
}
```
Log snippet:
```text
2026-02-17T11:50:10.035Z embedded run tool start ... tool=memory_search
2026-02-17T11:50:10.567Z qmd daemon started (http://[::1]:18790/mcp)
2026-02-17T11:50:23.055Z embedded run tool end ... tool=memory_search
tool delta: ~13.0s
```
### B) Hot warm-path (daemon already loaded)
Goal: show query latency when daemon is already warm and no startup is required.
Command used:
```bash
node dist/entry.js --profile dev agent --to +15555550123 --message "Use memory_search exactly once for 'BillSplitPro receipt OCR', then reply with one sentence summary and citation." --json --timeout 120 > /tmp/qmd-postfix7.json && jq '{status,summary,duration_ms:.result.meta.durationMs}' /tmp/qmd-postfix7.json
```
Output:
```json
{
"status": "ok",
"summary": "completed",
"duration_ms": 7810
}
```
Log snippet:
```text
2026-02-17T12:19:26.219Z embedded run tool start ... tool=memory_search
2026-02-17T12:19:28.574Z embedded run tool end ... tool=memory_search
(no qmd daemon started line between start/end)
tool delta: ~2.36s
```
### C) Fallback path (intentional resilience case)
Goal: show that the query still succeeds when the warm-path daemon cannot be used because another process already owns it.
Injected condition: pre-existing daemon ownership conflict (stale process on daemon port).
```text
2026-02-17T11:48:20.787Z qmd daemon search failed, falling back to spawn-per-query: Already running (PID 13434). Run qmd mcp stop first.
2026-02-17T11:48:20.536Z embedded run tool start ... tool=memory_search
2026-02-17T11:48:46.643Z embedded run tool end ... tool=memory_search
tool delta: ~26.1s
```
Interpretation:
- Cold warm-path includes one-time startup overhead.
- Hot warm-path is materially faster once daemon is already loaded.
- Fallback is slower but preserves successful query completion under daemon failure/conflict.
Note: full `duration_ms` includes model response generation + non-memory work. Tool-level timing above isolates memory tool latency.
Timing decomposition (from log timestamps):
| Case | Full run `duration_ms` | Pre-memory (run start → memory_search start) | Memory tool time (memory_search start → end) | Post-memory (memory_search end → prompt end) |
|---|---:|---:|---:|---:|
| Cold warm-path (lazy start) | 28.2s | ~2.6s | ~13.0s | ~12.6s |
| Hot warm-path (already warm) | 7.8s | ~2.7s | ~2.36s | ~2.72s |
| Fallback path | 37.9s | ~3.2s | ~26.1s | ~8.6s |
Interpretation: this separates agent/model orchestration time from QMD memory-tool time; warm daemon benefit is visible in the memory-tool segment.
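The memory-tool column can be reproduced directly from the `memory_search` start/end timestamps in the log snippets. The helper below is a small sketch for that arithmetic; `deltaSeconds` is a hypothetical name, not a function from the PR.

```typescript
// Hypothetical helper: difference between two ISO-8601 log timestamps, in seconds.
function deltaSeconds(startIso: string, endIso: string): number {
  const start = Date.parse(startIso);
  const end = Date.parse(endIso);
  if (Number.isNaN(start) || Number.isNaN(end)) {
    throw new Error("unparseable timestamp");
  }
  return (end - start) / 1000;
}

// Timestamps copied from the log snippets above.
const cold = deltaSeconds("2026-02-17T11:50:10.035Z", "2026-02-17T11:50:23.055Z");     // 13.02
const hot = deltaSeconds("2026-02-17T12:19:26.219Z", "2026-02-17T12:19:28.574Z");      // 2.355
const fallback = deltaSeconds("2026-02-17T11:48:20.536Z", "2026-02-17T11:48:46.643Z"); // 26.107
console.log({ cold, hot, fallback });
```

On these three samples the hot warm-path spends roughly 10.7s less in the memory tool than the cold warm-path, and roughly 23.8s less than the fallback path.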
## Human Verification (required)
What you personally verified (not just CI), and how:
- Verified scenarios:
  - `pnpm check` on touched files
  - `pnpm vitest run src/memory/qmd-manager.test.ts`
  - `pnpm vitest run --config vitest.e2e.config.ts src/agents/tools/memory-tool.e2e.test.ts`
  - `pnpm build`
  - Dev gateway startup + memory query runs
- Edge cases checked:
  - QMD 1.0.6 HTTP MCP header/session compatibility
  - Loopback endpoint compatibility (`127.0.0.1` vs `::1`)
  - Daemon conflict/fallback behavior
- What you did **not** verify:
  - Multi-hour soak run
## Compatibility / Migration
- Backward compatible? (`Yes`)
- Config/env changes? (`Yes`)
- Migration needed? (`No`)
- If yes, exact upgrade steps:
  - Optional: set `memory.qmd.daemon.enabled=true`
  - Optional: set `memory.qmd.daemon.port` if `18790` conflicts
  - Ensure local QMD version supports HTTP daemon workflow (documented as `@tobilu/qmd@1.0.6`)
## Failure Recovery (if this breaks)
- How to disable/revert this change quickly:
  - Set `memory.qmd.daemon.enabled=false` (returns to spawn-per-query QMD)
  - Or set `memory.backend=builtin`
- Files/config to restore:
  - `src/memory/qmd-daemon.ts`
  - `src/memory/qmd-manager.ts`
  - `src/memory/backend-config.ts`
  - `src/config/types.memory.ts`
  - `src/config/zod-schema.ts`
  - `src/config/schema.help.ts`
  - `src/config/schema.labels.ts`
- Known bad symptoms reviewers should watch for:
  - Repeated daemon-start conflict warnings
  - Persistent warm timeouts (should still fall back and return results)
## Risks and Mitigations
- Risk: HTTP daemon contract differences across QMD versions.
  - Mitigation: docs pin to npm `@tobilu/qmd@1.0.6`; fallback path remains active.
- Risk: Stale external daemon process can conflict with owned daemon start.
  - Mitigation: explicit stop on idle/shutdown, configurable port, preserved fallback.
- Risk: Local model/runtime instability.
  - Mitigation: health checks + timeout-based fallback to spawn-per-query.
## Notes
- Implemented and validated with Patrick Shao using OpenClaw, Claude Code Opus 4.6, and Codex.