#13251: fix: Windows local embeddings race condition
stale
size: S
Cluster: Memory Database Enhancements
## Problem
Local embeddings time out or hang on Windows when using node-llama-cpp: the process hangs indefinitely during model initialization and eventually times out after 300s.
## Root Cause
Race conditions in model initialization:
1. Concurrent calls to `embedBatch` triggering parallel llama model loads
2. No locking around the lazy initialization of the embeddings provider
3. No locking around manager creation
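The race in points 1-2 can be illustrated with a minimal, hypothetical sketch (names like `loadModel` and `getModelRacy` are illustrative, not the actual code): with an unlocked lazy init, two concurrent callers both observe the model as uninitialized and both start a load.

```typescript
// Hypothetical sketch of the race condition, not the actual provider code.
type Model = { ready: true };

let loadCount = 0;
let model: Model | null = null;

// Stand-in for the slow node-llama-cpp model initialization.
async function loadModel(): Promise<Model> {
  loadCount++; // counts how many real loads were started
  await new Promise((resolve) => setTimeout(resolve, 10)); // simulate slow load
  return { ready: true };
}

async function getModelRacy(): Promise<Model> {
  if (!model) {
    // Both concurrent callers pass this null check before either one
    // assigns `model`, so the model loads twice; on Windows the
    // duplicate node-llama-cpp init could hang rather than just waste work.
    model = await loadModel();
  }
  return model;
}
```

Two concurrent callers both pass the `!model` check before either assignment happens, so `loadModel` runs twice (`loadCount` ends up at 2).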
## Fix
1. **Sequential `embedBatch`** - Process embedding requests sequentially rather than in parallel
2. **Initialization lock in `embeddings.ts`** - Mutex to prevent concurrent model loading
3. **Initialization lock in `manager.ts`** - Mutex to prevent concurrent manager creation
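The three fixes follow a common pattern that can be sketched as below. This is a minimal, hypothetical illustration (the `loadModel`/`getModel` names and the mock model are assumptions, not the repo's code): cache the in-flight init promise so concurrent callers share one load, and walk the batch with a plain loop instead of `Promise.all`.

```typescript
// Hypothetical sketch of the init-promise lock and sequential batch,
// not the actual embeddings.ts implementation.
type Model = { embed: (text: string) => number[] };

let loadCount = 0;

// Stand-in for the real node-llama-cpp initialization.
async function loadModel(): Promise<Model> {
  loadCount++; // counts how many real loads happen
  await new Promise((resolve) => setTimeout(resolve, 10)); // simulate slow load
  return { embed: (text) => [text.length] }; // mock embedding
}

// Cache the in-flight promise: concurrent callers await the same load.
let initPromise: Promise<Model> | null = null;

function getModel(): Promise<Model> {
  if (!initPromise) {
    initPromise = loadModel().catch((err) => {
      initPromise = null; // clear the cache so a failed init can be retried
      throw err;
    });
  }
  return initPromise;
}

// Sequential embedBatch: one embedding call at a time, no Promise.all.
async function embedBatch(texts: string[]): Promise<number[][]> {
  const model = await getModel();
  const out: number[][] = [];
  for (const text of texts) {
    out.push(model.embed(text));
  }
  return out;
}
```

With this shape, two overlapping `embedBatch` calls trigger exactly one model load; the same promise-caching idea applies per cache key for manager creation in `manager.ts`.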
## Testing
- Tested on Windows 11 (x64), Node v22.22.0
- Model: `embeddinggemma-300M-Q8_0.gguf` via node-llama-cpp
- Before: 100% timeout rate on `memory_search`
- After: memory search completes in 2-3 seconds
## Related
- Addresses Windows-specific issues with local embedding provider
- May also help Linux/Mac users experiencing similar race conditions
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR addresses a Windows hang in local embeddings by serializing initialization and embedding calls:
- `src/memory/embeddings.ts` adds a per-provider init promise/error cache around `getLlama()`/model loading/context creation and changes `embedBatch()` to run sequentially to avoid concurrent `getEmbeddingFor()` calls.
- `src/memory/manager.ts` adds a per-cache-key initialization lock to prevent concurrent `MemoryIndexManager` creation (and thus concurrent provider/model initialization).
Overall the approach fits the existing lazy-init/cached-manager design: providers are created on demand and cached per agent/workspace/settings, with the new locks preventing concurrent initialization races that are especially problematic on Windows/node-llama-cpp.
<h3>Confidence Score: 4/5</h3>
- This PR is generally safe to merge once a small type-level issue in the manager init lock map is fixed.
- The concurrency/locking changes are localized and consistent with the reported race-condition root cause. The main remaining issue is an incorrect `INDEX_INIT_LOCKS` promise type that can leak `null` into the `get()` fast-path return and/or force incorrect downstream typing.
- src/memory/manager.ts
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
## Most Similar PRs
- #15639: fix(memory): serialize local embedding initialization to avoid dupl... (by SubtleSpark · 2026-02-13 · 85.4%)
- #21845: fix: use sequential embedding for local GGUF provider to prevent de... (by slegarraga · 2026-02-20 · 78.9%)
- #10550: feat(memory-lancedb): local embeddings via node-llama-cpp (by namick · 2026-02-06 · 74.2%)
- #17566: memory-lancedb: support local OpenAI-compatible embeddings (by lumenradley · 2026-02-15 · 71.8%)
- #8675: fix: Gemini batch embeddings state path, enum values, and download URL (by seasalim · 2026-02-04 · 70.8%)
- #23419: fix(memory): avoid cross-agent qmd embed serialization (by frankekn · 2026-02-22 · 69.8%)
- #20149: fix(memory): expose index concurrency as config option (by togotago · 2026-02-18 · 69.3%)
- #8309: fix: add emb_ prefix to batch embedding custom_id for OpenAI compli... (by vishaltandale00 · 2026-02-03 · 69.2%)
- #20315: fix(memory): add gemini-embedding-001 to GEMINI_MAX_INPUT_TOKENS (by Clawborn · 2026-02-18 · 69.2%)
- #20771: feat(memory-lancedb): support custom OpenAI-compatible embedding pr... (by marcodelpin · 2026-02-19 · 69.1%)