#10550: feat(memory-lancedb): local embeddings via node-llama-cpp
Labels: docs, extensions: memory-lancedb, stale, size: L
Cluster: Memory Database Enhancements
## feat(memory-lancedb): local embeddings via node-llama-cpp
> **AI-assisted:** This PR was developed with Claude Code/Opus 4.6. I understand what the code does, have reviewed all changes, and tested on Ubuntu Linux. The full gate (`pnpm build && pnpm check && pnpm test`) passes.
### Summary
Adds offline embedding support to the `memory-lancedb` plugin using [node-llama-cpp](https://node-llama-cpp.withcat.ai/), alongside dimension mismatch detection and a reindex command for safe provider switching.
The plugin previously required an OpenAI API key for all embedding operations. This PR adds a `local` provider that runs GGUF embedding models entirely on-device — no API key, no network, no ongoing cost. Users who hit rate limits, work in air-gapped environments, or simply prefer keeping data local can now use long-term memory without any cloud dependency.
### What changed
**Local embedding provider**
- `LocalEmbeddingProvider` class wraps node-llama-cpp's `EmbeddingContext`
- Supports GGUF model files (local path or `hf:` URL for auto-download)
- L2-normalized output vectors for consistent similarity scoring
- Lazy model loading — downloaded and initialized on first use
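
The L2 normalization step mentioned above can be sketched as follows. This is a minimal illustration, not the code from the PR; `l2Normalize` is a hypothetical helper name, and the zero-vector handling is an assumption.

```typescript
// Sketch only: l2Normalize is a hypothetical helper, not the plugin's actual code.
function l2Normalize(vector: readonly number[]): number[] {
  // Euclidean (L2) norm: square root of the sum of squared components.
  const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  // A zero vector has no direction; return a copy unchanged to avoid dividing by zero.
  if (norm === 0) return [...vector];
  return vector.map((x) => x / norm);
}

const unit = l2Normalize([3, 4]);
console.log(unit); // [0.6, 0.8]
```

With unit-length vectors, cosine similarity reduces to a dot product, and Euclidean distance ranks results in the same order as cosine similarity, which is one reason to normalize before storing.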
**Provider auto-detection**
- Models ending in `.gguf` or starting with `hf:` auto-select `local`
- Models starting with `text-embedding-` auto-select `openai`
- Explicit `provider` field takes precedence
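
The detection rules above could look roughly like this. It is a sketch under assumptions: `detectProvider` and the `EmbeddingConfig` shape are illustrative names, and returning `undefined` for an unrecognized model is a placeholder, since the PR description does not say what happens in that case.

```typescript
type EmbeddingProvider = "openai" | "local";

// Hypothetical config shape, reduced to the two fields the rules depend on.
interface EmbeddingConfig {
  provider?: EmbeddingProvider;
  model: string;
}

function detectProvider(config: EmbeddingConfig): EmbeddingProvider | undefined {
  // An explicit provider always takes precedence over the model name.
  if (config.provider) return config.provider;
  const model = config.model;
  // GGUF files and Hugging Face "hf:" URLs imply the local provider.
  if (model.endsWith(".gguf") || model.startsWith("hf:")) return "local";
  // OpenAI embedding models share the "text-embedding-" prefix.
  if (model.startsWith("text-embedding-")) return "openai";
  // Assumption: the fallback for unrecognized names is not stated in the PR.
  return undefined;
}

console.log(detectProvider({ model: "hf:nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.f16.gguf" })); // "local"
console.log(detectProvider({ model: "text-embedding-3-small" })); // "openai"
```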
**Dimension mismatch safety**
- On startup, reads the Arrow schema from the existing LanceDB table and compares vector dimensions against the current config
- Throws `DimensionMismatchError` with a clear message if they differ (e.g. switching from OpenAI 1536-dim to local 768-dim)
- Instance stays un-initialized on mismatch — no silent corruption
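
A minimal sketch of the startup check described above. Reading the dimension from the Arrow schema is left out on purpose; `assertDimensionsMatch`, the field names, and the error message wording are assumptions for illustration, and only the `DimensionMismatchError` name comes from the PR.

```typescript
// Sketch only: fields and message text are illustrative, not taken from the PR.
class DimensionMismatchError extends Error {
  readonly existingDim: number;
  readonly configuredDim: number;

  constructor(existingDim: number, configuredDim: number) {
    super(
      `Existing table stores ${existingDim}-dim vectors but the configured ` +
        `embedding model produces ${configuredDim}-dim vectors. ` +
        `Run "openclaw ltm reindex" to re-embed stored memories.`,
    );
    this.name = "DimensionMismatchError";
    this.existingDim = existingDim;
    this.configuredDim = configuredDim;
  }
}

// existingDim is undefined when no table exists yet, so there is nothing to compare.
function assertDimensionsMatch(existingDim: number | undefined, configuredDim: number): void {
  if (existingDim !== undefined && existingDim !== configuredDim) {
    throw new DimensionMismatchError(existingDim, configuredDim);
  }
}

assertDimensionsMatch(undefined, 768); // fresh database: passes
assertDimensionsMatch(768, 768); // matching table: passes
```

Throwing before the instance is marked initialized means no write can reach a table whose vector column has the wrong width.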
**`openclaw ltm reindex` CLI command**
- Opens the old table without dimension validation, reads all entries
- Drops and recreates with the new dimension, re-embeds each entry through the current provider
- Reports progress and success/failure counts
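
The reindex flow above can be sketched against small abstract interfaces. `MemoryStore`, `Embedder`, and `reindex` are hypothetical names that stand in for the plugin's LanceDB and provider code; they are not LanceDB or node-llama-cpp APIs. The per-entry failure handling is also an assumption.

```typescript
// Hypothetical abstractions; the real plugin talks to LanceDB and an embedding provider.
interface MemoryEntry {
  id: string;
  text: string;
}

interface Embedder {
  embed(text: string): Promise<number[]>;
}

interface MemoryStore {
  readAll(): Promise<MemoryEntry[]>; // read without dimension validation
  recreate(dim: number): Promise<void>; // drop and recreate with the new dimension
  insert(entry: MemoryEntry, vector: number[]): Promise<void>;
}

async function reindex(
  store: MemoryStore,
  embedder: Embedder,
  newDim: number,
): Promise<{ succeeded: number; failed: number }> {
  // Read everything first: the old rows are gone once the table is recreated.
  const entries = await store.readAll();
  await store.recreate(newDim);

  let succeeded = 0;
  let failed = 0;
  for (const entry of entries) {
    try {
      const vector = await embedder.embed(entry.text);
      if (vector.length !== newDim) {
        throw new Error(`expected ${newDim} dims, got ${vector.length}`);
      }
      await store.insert(entry, vector);
      succeeded++;
    } catch {
      // Assumption: one failed entry is counted and skipped rather than aborting the run.
      failed++;
    }
    console.log(`reindexed ${succeeded + failed}/${entries.length}`);
  }
  return { succeeded, failed };
}
```

Because the table is dropped before re-embedding, an entry that fails to embed is lost in this sketch; a more defensive variant would write to a new table and swap only after every entry succeeds.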
**Documentation**
- New plugin doc: `docs/plugins/memory-lancedb.md` — config, local setup, CLI reference, provider switching
- Added to sidebar and linked from plugin list
### Config examples
OpenAI (existing, unchanged):

```json
{
  "embedding": {
    "apiKey": "${OPENAI_API_KEY}",
    "model": "text-embedding-3-small"
  }
}
```

Local (new):

```json
{
  "embedding": {
    "provider": "local",
    "model": "hf:nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.f16.gguf"
  }
}
```
### Files changed
| File | Change |
|------|--------|
| `extensions/memory-lancedb/index.ts` | LocalEmbeddingProvider, DimensionMismatchError, reindex CLI, schema validation |
| `extensions/memory-lancedb/config.ts` | Provider detection, local config parsing, dynamic dimensions |
| `extensions/memory-lancedb/index.test.ts` | 15 new tests (provider detection, dimensions, config, mismatch) |
| `extensions/memory-lancedb/openclaw.plugin.json` | Extended schema and UI hints for local provider |
| `extensions/memory-lancedb/package.json` | Added node-llama-cpp as optional dependency |
| `docs/plugins/memory-lancedb.md` | New plugin documentation |
| `docs/docs.json` | Sidebar entry |
| `docs/plugin.md` | Link to plugin doc |
### Testing
**Tested on:** Ubuntu Linux
**Degree of testing:** Lightly tested locally, fully tested via automated suite.
- [x] `pnpm build && pnpm check && pnpm test` all pass (5452 tests, 0 failures)
- [x] Provider auto-detection: `.gguf` → local, `text-embedding-*` → openai, explicit override
- [x] Vector dimension calculation: 1536 for OpenAI small, 3072 for large, 768 default for local
- [x] Config rejects missing apiKey for OpenAI, accepts omitted apiKey for local
- [x] Dimension mismatch throws on init when existing DB has different dimensions
- [x] `initializeUnchecked()` bypasses validation for reindex access
- [x] `recreateTable()` drops old table and creates with new dimensions
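
For reference, the dimension values exercised by the tests above could be derived along these lines. This is a sketch, not the PR's `config.ts`: `vectorDimsFor`, the lookup table, and the optional `override` parameter are assumptions, and throwing on an unknown OpenAI model is an illustrative choice.

```typescript
// Known output sizes for the two OpenAI models named in the test list.
const OPENAI_DIMS: Record<string, number> = {
  "text-embedding-3-small": 1536,
  "text-embedding-3-large": 3072,
};

// Default for local models per the test list.
const DEFAULT_LOCAL_DIM = 768;

function vectorDimsFor(
  provider: "openai" | "local",
  model: string,
  override?: number, // hypothetical: lets a local model declare a non-default size
): number {
  if (provider === "local") return override ?? DEFAULT_LOCAL_DIM;
  const dim = OPENAI_DIMS[model];
  if (dim === undefined) {
    throw new Error(`Unknown OpenAI embedding model: ${model}`);
  }
  return dim;
}

console.log(vectorDimsFor("openai", "text-embedding-3-small")); // 1536
console.log(vectorDimsFor("local", "hf:nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.f16.gguf")); // 768
```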
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
- Adds a new `local` embedding provider backed by `node-llama-cpp`, plus provider auto-detection and updated config parsing/UI hints.
- Introduces LanceDB vector dimension validation on startup, raising a dedicated `DimensionMismatchError` when an existing table doesn’t match the configured embedding dimension.
- Adds `openclaw ltm reindex` CLI command to drop/recreate the table and re-embed all stored memories when switching providers.
- Updates docs and plugin schema to document/configure the new provider options and workflows.
<h3>Confidence Score: 3/5</h3>
- This PR is close to mergeable, but has a couple of config/CLI logic issues that can cause incorrect reindexing or guaranteed dimension mismatches in valid configurations.
- Core changes are straightforward and covered by new tests, but the runtime code currently computes vector dimensions from `embedding.model` even when local embeddings use `local.modelPath`, and the reindex command can reuse a stale embedding provider instance. Both can lead to incorrect data in the DB or startup failures after provider switching.
- extensions/memory-lancedb/index.ts (provider selection/vectorDim computation and ltm reindex embedding provider lifecycle)
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#17566: memory-lancedb: support local OpenAI-compatible embeddings
by lumenradley · 2026-02-15
85.5%
#19006: feat(memory-lancedb): OpenAI-compatible baseUrl + Ollama provider +...
by martinsen-assistant · 2026-02-17
85.0%
#20771: feat(memory-lancedb): support custom OpenAI-compatible embedding pr...
by marcodelpin · 2026-02-19
84.0%
#17030: feat(memory-lancedb): support Ollama and OpenAI-compatible embeddin...
by nightfullstar · 2026-02-15
81.4%
#17874: feat(memory-lancedb): Custom OpenAI BaseURL & Dimensions Support
by rish2jain · 2026-02-16
78.9%
#15639: fix(memory): serialize local embedding initialization to avoid dupl...
by SubtleSpark · 2026-02-13
77.7%
#12624: feat: add google-vertex embedding provider for Vertex AI ADC auth
by swseo92 · 2026-02-09
76.0%
#21816: Add configurable `dimensions` for embedding models (Matryoshka supp...
by matthewspear · 2026-02-20
75.4%
#20191: feat(memory): add Amazon Bedrock embedding provider (Nova 2)
by gabrielkoo · 2026-02-18
74.3%
#13251: fix: Windows local embeddings race condition
by fayrose · 2026-02-10
74.2%