#22269: feat(memory): add remote memory backend via OpenAI-compatible Vector Store API
Labels: `gateway`, `commands`, `size: L`
## Summary
Add a `"remote"` memory backend that syncs workspace files to any OpenAI-compatible Vector Store API and enables semantic search for contextual retrieval.
- **RemoteVectorStoreClient**: HTTP client for Vector Store API (retry logic, timeouts, error handling)
- **RemoteVectorStoreManager**: Sync engine — watches workspace, uploads/chunks files, performs semantic search
- **RemoteManifest**: Tracks synced files locally (SHA-256 hash-based change detection)
- **Config**: `memory.backend = "remote"` with `baseUrl`, `vectorStoreName`, sync interval, search params
- **Feature-flagged**: Zero behavior change when `memory` config section is absent
Original implementation by @rootfs (Huamin Chen). Validated and benchmarked by @yossiovadia.
## Change Type (select all)
- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [ ] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [x] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Related: https://github.com/rootfs/openclaw/tree/remote-memory (original implementation)
## Motivation
OpenClaw's current memory system loads recognized bootstrap files (`MEMORY.md`, `USER.md`, etc.) into every session — the entire file, every turn, regardless of relevance. This has two limitations:
1. **No cross-file search**: Daily notes, session transcripts, and other workspace files are never surfaced unless explicitly loaded by bootstrap
2. **Fixed context cost**: `MEMORY.md` grows over time, consuming more tokens per turn even when most content isn't relevant to the current query
The remote memory backend addresses both by enabling semantic search across all workspace files. Only relevant chunks are injected per turn.
## How It Works
```
┌─────────────────────────────────────────────────────────────────┐
│ OpenClaw Agent │
│ │
│ ┌─── Bootstrap (always loaded) ──┐ ┌─── Remote Memory ─────┐ │
│ │ USER.md, SOUL.md, IDENTITY.md │ │ Semantic search via │ │
│ │ CLAUDE.md, AGENTS.md, TOOLS.md │ │ Vector Store API │ │
│ │ (fixed size, every turn) │ │ (~1-2KB relevant │ │
│ └────────────────────────────────┘ │ chunks, on demand) │ │
│ └──────────┬────────────┘ │
└─────────────────────────────────────────────────┼──────────────┘
│
┌──────────────────────────┐
│ Any OpenAI-Compatible │
│ Vector Store API │
│ │
│ POST /v1/vector_stores │
│ POST /v1/files │
│ POST /v1/vector_stores/ │
│ {id}/search │
└──────────────────────────┘
```
1. **On boot**: Syncs workspace files (`.md`, `.txt`, `.json`, `.csv`, `.html`) to the remote vector store
2. **On each turn**: Searches the store with the user's message, injects top-K relevant chunks
3. **Periodic sync**: Detects new/changed/deleted files every `syncIntervalMs`
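The per-turn lookup (step 2) amounts to one POST against the store's search endpoint plus client-side filtering. A minimal sketch, assuming an OpenAI-style response shape of `{ data: [{ content, score }] }` — the helper names and exact request fields here are illustrative, not the PR's actual API:

```typescript
type SearchHit = { content: string; score: number };

// Keep only hits at or above the similarity threshold, best-first,
// capped at maxResults -- mirrors searchScoreThreshold / searchMaxResults.
function selectChunks(hits: SearchHit[], maxResults: number, threshold: number): SearchHit[] {
  return hits
    .filter((h) => h.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}

// One search round-trip against an OpenAI-compatible Vector Store API.
// Endpoint shape follows the diagram above; request/response field names
// may vary per backend.
async function searchStore(
  baseUrl: string,
  storeId: string,
  query: string,
  opts: { maxResults: number; threshold: number; apiKey?: string },
): Promise<SearchHit[]> {
  const res = await fetch(`${baseUrl}/v1/vector_stores/${storeId}/search`, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      ...(opts.apiKey ? { authorization: `Bearer ${opts.apiKey}` } : {}),
    },
    body: JSON.stringify({ query, max_num_results: opts.maxResults }),
  });
  if (!res.ok) throw new Error(`vector store search failed: ${res.status}`);
  const body = (await res.json()) as { data: SearchHit[] };
  return selectChunks(body.data, opts.maxResults, opts.threshold);
}
```

The surviving chunks are what gets injected into the turn's context, which is why the per-turn cost scales with relevance rather than with total workspace size.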
## Configuration
```json
{
"memory": {
"backend": "remote",
"remote": {
"baseUrl": "http://127.0.0.1:8080",
"vectorStoreName": "openclaw-memory",
"syncIntervalMs": 30000,
"searchMaxResults": 5,
"searchScoreThreshold": 0.3
}
}
}
```
| Setting | Description | Default |
|---------|-------------|---------|
| `baseUrl` | Vector Store API base URL | required |
| `apiKey` | API key (if backend requires auth) | — |
| `headers` | Extra HTTP headers | `{}` |
| `vectorStoreId` | Use existing store by ID | — |
| `vectorStoreName` | Named store (auto-created if missing) | `openclaw-memory-{agentId}` |
| `syncIntervalMs` | Workspace file sync interval | `300000` (5 min) |
| `searchMaxResults` | Max chunks returned per search | `10` |
| `searchScoreThreshold` | Minimum similarity score (0-1) | `0.3` |
Works with any backend implementing the OpenAI Vector Store API surface. Tested with [vLLM Semantic Router](https://github.com/vllm-project/semantic-router).
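Applying the table's defaults can be sketched as a small resolver. This is a hypothetical `resolveRemoteDefaults` for illustration; the PR's actual `resolveRemoteConfig()` in `backend-config.ts` may differ:

```typescript
interface MemoryRemoteConfig {
  baseUrl: string;                  // required; backend stays inactive without it
  apiKey?: string;
  headers?: Record<string, string>;
  vectorStoreId?: string;           // use an existing store directly
  vectorStoreName?: string;         // otherwise resolved by name, auto-created
  syncIntervalMs?: number;
  searchMaxResults?: number;
  searchScoreThreshold?: number;
}

// Fill in the documented defaults; throws if baseUrl is missing, since
// the remote backend only activates with a valid baseUrl.
function resolveRemoteDefaults(cfg: MemoryRemoteConfig, agentId: string) {
  if (!cfg.baseUrl) throw new Error("memory.remote.baseUrl is required");
  return {
    ...cfg,
    headers: cfg.headers ?? {},
    vectorStoreName: cfg.vectorStoreName ?? `openclaw-memory-${agentId}`,
    syncIntervalMs: cfg.syncIntervalMs ?? 300_000,     // 5 min
    searchMaxResults: cfg.searchMaxResults ?? 10,
    searchScoreThreshold: cfg.searchScoreThreshold ?? 0.3,
  };
}
```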
## Recommended: Hybrid Mode
The remote memory backend works **alongside** bootstrap, not instead of it. Bootstrap still auto-loads recognized filenames (`USER.md`, `SOUL.md`, `IDENTITY.md`, `CLAUDE.md`, etc.) into every session. The remote backend adds semantic search on top.
However, `MEMORY.md` is a special case — it's loaded by bootstrap (full file, every turn) **and** indexed by the remote backend (searched semantically). This means it's sent twice: once in full via bootstrap, and again as relevant chunks via search. To avoid this redundancy and get the token savings shown in the benchmarks below:
**Rename `MEMORY.md`** to a name that bootstrap doesn't recognize (e.g., `long-term-memory.md`, `knowledge-base.md`, or move it to `memory/knowledge.md`). The remote backend will still index and search it, but bootstrap won't dump the full file into every turn.
| Layer | Files | When |
|-------|-------|------|
| Bootstrap (always loaded) | `USER.md`, `SOUL.md`, `IDENTITY.md`, `CLAUDE.md`, `AGENTS.md`, `TOOLS.md` | Every turn, full file |
| Remote memory (on demand) | Everything in workspace (including renamed `MEMORY.md`, daily notes, session transcripts) | Per turn, only relevant chunks |
Without the rename, the feature still works (cross-file search is enabled, no data loss) — you just won't see the per-turn token reduction since `MEMORY.md` is loaded by both systems.
## Validation Results
Tested on a live OpenClaw instance with 14 workspace files (knowledge base, daily notes, session transcripts). Compared old method (bootstrap + MEMORY.md) vs hybrid (bootstrap + remote semantic search).
### Retrieval Benchmark (20 questions, 4 categories)
```
OLD (bootstrap+MEMORY.md) HYBRID (bootstrap+VSR)
Recall: 14 / 20 (70%) 17 / 20 (85%)
Bytes per turn: 29,230 (fixed) ~18,828 (35% less)
```
| Category | OLD | HYBRID | Notes |
|----------|-----|--------|-------|
| Identity (5) | 5/5 | 5/5 | Both handle USER.md, SOUL.md via bootstrap |
| Factual (6) | 6/6 | 5/6 | OLD has MEMORY.md in bootstrap; HYBRID searches it |
| Cross-file (7) | 1/7 | 6/7 | **OLD can never reach daily notes/transcripts** |
| Convention (2) | 2/2 | 1/2 | CLAUDE.md in bootstrap for both |
### End-to-End Benchmark (8 questions, real LLM responses)
Actual questions sent to the LLM with each method's context injected. Responses checked for correctness.
```
OLD (bootstrap+MEMORY.md): 4/8 correct (50%)
HYBRID (bootstrap+remote): 7/8 correct (88%)
```
| # | Type | Question | OLD | HYBRID |
|---|------|----------|-----|--------|
| 1 | Identity | User's full name? | ✅ | ✅ |
| 2 | Factual | Dashboard port? | ✅ | ✅ |
| 3 | Factual | tmux socket path? | ✅ | ✅ |
| 4 | Factual | GitHub monitor frequency? | ✅ | ❌ |
| 5 | Cross-file | WebSocket error code? | ❌ | ✅ |
| 6 | Cross-file | Terminal Replay API endpoints? | ❌ | ✅ |
| 7 | Cross-file | LaunchAgent plist filename? | ❌ | ✅ |
| 8 | Cross-file | bufferutil architecture bug? | ❌ | ✅ |
On cross-file questions (5-8), the old method responds "I don't know" while the hybrid provides detailed, correct answers sourced from daily notes and session transcripts.
## Files Changed
### New files (730 lines)
| File | Lines | Purpose |
|------|-------|---------|
| `src/memory/remote-client.ts` | 246 | HTTP client for Vector Store API |
| `src/memory/remote-manager.ts` | 388 | Workspace sync engine + semantic search |
| `src/memory/remote-manifest.ts` | 96 | Local file tracking with hash-based change detection |
### Modified files (158 lines added, 11 removed)
| File | Change |
|------|--------|
| `src/memory/backend-config.ts` | Add `resolveRemoteConfig()` + `"remote"` backend path |
| `src/memory/search-manager.ts` | Wire `RemoteVectorStoreManager` into search pipeline |
| `src/memory/types.ts` | Add `"remote"` to `MemoryBackend` union type |
| `src/memory/index.ts` | Export remote manager |
| `src/config/types.memory.ts` | Add `MemoryRemoteConfig` type |
| `src/config/zod-schema.ts` | Zod schema validation for `memory.remote` |
| `src/config/schema.help.ts` | User-facing config documentation |
| `src/commands/doctor-memory-search.ts` | Health check for remote backend |
| `src/gateway/server-startup-memory.ts` | Boot sync initialization |
**Total: ~888 lines added (730 new + 158 in existing files), 11 lines removed.**
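The hash-based change detection in `remote-manifest.ts` boils down to comparing SHA-256 digests between the stored manifest and a fresh workspace scan. A sketch with illustrative names, not the file's actual exports:

```typescript
import { createHash } from "node:crypto";

// path -> sha256 of the content that was last synced
type Manifest = Record<string, string>;

function sha256(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// Diff current workspace contents against the manifest to decide what to
// upload, re-upload, or delete from the vector store on the next sync tick.
function diffWorkspace(manifest: Manifest, current: Record<string, string>) {
  const added: string[] = [];
  const changed: string[] = [];
  const removed: string[] = [];
  for (const [file, content] of Object.entries(current)) {
    const hash = sha256(content);
    if (!(file in manifest)) added.push(file);
    else if (manifest[file] !== hash) changed.push(file);
  }
  for (const file of Object.keys(manifest)) {
    if (!(file in current)) removed.push(file);
  }
  return { added, changed, removed };
}
```

Hashing content rather than comparing mtimes keeps the diff stable across clock skew and no-op touches: an unchanged file re-saved with a new timestamp produces the same digest and is skipped.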
## Impact on Existing Code
None when unconfigured. The remote backend only activates when `memory.backend` is explicitly set to `"remote"` with a valid `baseUrl`. All existing behavior (`builtin`, `qmd`) is unchanged.
## User-visible / Behavior Changes
New `memory.backend = "remote"` option. No changes to existing memory backends.
## Security Impact (required)
- New permissions/capabilities? `No` — uses existing HTTP client patterns
- Secrets/tokens handling changed? `apiKey` is an optional config field, handled the same way as existing provider API keys
- Request body limits: inherits from remote backend (not controlled by OpenClaw)
- File uploads: only workspace files (`.md`, `.txt`, etc.) are uploaded; paths validated to stay within workspace
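The workspace-containment check described above can be done with `path.resolve` plus a prefix test. A sketch of the stated guarantee (allowed-extension allowlist plus no `../` escapes), not the PR's actual validator:

```typescript
import * as path from "node:path";

const UPLOADABLE = new Set([".md", ".txt", ".json", ".csv", ".html"]);

// A file is eligible for upload only if it has an allowed extension and
// its resolved path stays inside the workspace root (blocks ../ escapes).
function isUploadable(workspaceRoot: string, candidate: string): boolean {
  const root = path.resolve(workspaceRoot);
  const resolved = path.resolve(root, candidate);
  const inside = resolved === root || resolved.startsWith(root + path.sep);
  return inside && UPLOADABLE.has(path.extname(resolved).toLowerCase());
}
```

Comparing resolved absolute paths (rather than the raw input string) is what defeats traversal: `../etc/passwd.md` resolves outside the root and fails the prefix test even though its extension is allowed.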
## Testing
- Tested locally on a live OpenClaw instance (macOS, gateway mode)
- 14 workspace files synced and indexed successfully
- Retrieval benchmark: 20 questions across 4 categories
- E2E benchmark: 8 questions with real LLM responses
- Boot sync, interval sync, file change detection, file deletion all verified
- Graceful handling when the vector store backend is unavailable