#15639: fix(memory): serialize local embedding initialization to avoid duplicate model loads
Labels: stale · size: S
Cluster: Memory Database Enhancements
## Summary
This PR fixes a race condition in local embeddings initialization by serializing `ensureContext()` in `createLocalEmbeddingProvider`.
- Add a cached `initPromise` in `src/memory/embeddings.ts` so concurrent callers share the same initialization path.
- Prevent duplicate `getLlama()` / `loadModel()` / `createEmbeddingContext()` calls under concurrent `embedBatch`.
- Add a regression test in `src/memory/embeddings.test.ts` (`local embedding ensureContext concurrency`) that verifies model/context initialization happens only once across 4 concurrent calls.
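The duplicate-load race and the shared-promise fix can be illustrated with a toy sketch (names like `naiveEnsure` and `cachedEnsure` are illustrative stand-ins, not the file's real code):

```typescript
// Toy illustration: naive per-caller init vs. a shared cached promise.
let naiveLoads = 0;
let cachedLoads = 0;

const delay = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Without memoization, every concurrent caller pays a full "model load".
async function naiveEnsure(): Promise<void> {
  naiveLoads++;
  await delay(10); // simulate slow model/context initialization
}

// With a cached initPromise, only the first caller triggers the load;
// later concurrent callers await the same in-flight promise.
let initPromise: Promise<void> | undefined;
async function load(): Promise<void> {
  cachedLoads++;
  await delay(10);
}
function cachedEnsure(): Promise<void> {
  initPromise ??= load();
  return initPromise;
}

const run = Promise.all([
  naiveEnsure(), naiveEnsure(), naiveEnsure(), naiveEnsure(),
  cachedEnsure(), cachedEnsure(), cachedEnsure(), cachedEnsure(),
]).then(() => console.log(`naive=${naiveLoads} cached=${cachedLoads}`)); // naive=4 cached=1
```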
## Why
When indexing memory files concurrently, multiple calls can enter `ensureContext()` before the first initialization completes, causing repeated local model loads and instability (VRAM pressure / hangs depending on platform and model size).
Related: #7547
## Scope / Non-goals
- No behavior change to embedding normalization.
- No manager-level lock changes in `src/memory/manager.ts`.
- No serialization change to `embedBatch` (keeps current parallel behavior).
## Testing
- Added unit regression test:
  - `src/memory/embeddings.test.ts`
  - `local embedding ensureContext concurrency`
  - Verifies `getLlama`, `loadModel`, and `createEmbeddingContext` are each called exactly once under concurrent `embedBatch`.
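The shape of that regression test can be sketched roughly as follows (plain TypeScript with hand-rolled counting stubs rather than the repo's actual mocking setup; `embedBatch` and the three stubs are stand-ins):

```typescript
// Rough sketch of the regression test's idea: count how often each
// init step runs when embedBatch is invoked concurrently.
const calls = { getLlama: 0, loadModel: 0, createEmbeddingContext: 0 };

// Counting stand-ins for the mocked node-llama-cpp entry points.
async function getLlama() { calls.getLlama++; return {}; }
async function loadModel() { calls.loadModel++; return {}; }
async function createEmbeddingContext() { calls.createEmbeddingContext++; return {}; }

let initPromise: Promise<void> | undefined;

async function initOnce(): Promise<void> {
  await getLlama();
  await loadModel();
  await createEmbeddingContext();
}

async function embedBatch(texts: string[]): Promise<number[][]> {
  initPromise ??= initOnce(); // all concurrent callers share one init
  await initPromise;
  return texts.map(() => [0, 0, 0]); // dummy embeddings
}

const run = Promise.all([
  embedBatch(["a"]),
  embedBatch(["b"]),
  embedBatch(["c"]),
  embedBatch(["d"]),
]).then(() =>
  console.log(calls.getLlama, calls.loadModel, calls.createEmbeddingContext), // 1 1 1
);
```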
## AI Assistance
AI-assisted PR (drafting + analysis). I reviewed and understood the code and test behavior before submitting.
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR fixes a concurrency race in the local embeddings provider by memoizing local model/context initialization (`ensureContext`) so concurrent `embedBatch`/`embedQuery` calls share a single `getLlama()` → `loadModel()` → `createEmbeddingContext()` path. It also adds a regression test that mocks `node-llama-cpp` and verifies initialization happens once across multiple concurrent `embedBatch` calls.
Main concern: the new `initPromise` memoization needs a retry path on failure; as written, a single initialization error can permanently poison the provider instance with a rejected promise.
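One common way to address this concern (a sketch of the general pattern, not this PR's code; names are illustrative) is to clear the cached promise when initialization rejects, so a later caller can retry:

```typescript
// Sketch: clear the cached promise on rejection so a transient init
// failure does not permanently poison the provider.
let attempts = 0;
let initPromise: Promise<string> | undefined;

async function init(): Promise<string> {
  attempts++;
  if (attempts === 1) throw new Error("transient failure"); // first try fails
  return "context-ready";
}

function ensureContext(): Promise<string> {
  if (!initPromise) {
    initPromise = init().catch((err) => {
      initPromise = undefined; // drop the rejected promise to allow retry
      throw err;
    });
  }
  return initPromise;
}

const demo = ensureContext() // first call fails with the transient error
  .catch(() => ensureContext()); // second call retries and succeeds

demo.then((ctx) => console.log(`attempts=${attempts} ctx=${ctx}`)); // attempts=2 ctx=context-ready
```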
<h3>Confidence Score: 4/5</h3>
- Mostly safe, but has one retry/robustness bug in local embeddings init memoization.
- The concurrency fix is straightforward and scoped, and the regression test covers the intended race. However, caching `initPromise` without clearing it on rejection can leave the provider stuck in a permanent failure state after a transient init error, which should be addressed before merging.
- src/memory/embeddings.ts
<sub>Last reviewed commit: 2bbe896</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
## Most Similar PRs
- #13251: fix: Windows local embeddings race condition · by fayrose · 2026-02-10 (85.4%)
- #21845: fix: use sequential embedding for local GGUF provider to prevent de... · by slegarraga · 2026-02-20 (79.0%)
- #10550: feat(memory-lancedb): local embeddings via node-llama-cpp · by namick · 2026-02-06 (77.7%)
- #17566: memory-lancedb: support local OpenAI-compatible embeddings · by lumenradley · 2026-02-15 (74.8%)
- #20149: fix(memory): expose index concurrency as config option · by togotago · 2026-02-18 (74.7%)
- #20771: feat(memory-lancedb): support custom OpenAI-compatible embedding pr... · by marcodelpin · 2026-02-19 (74.4%)
- #20882: fix(memory): add gpu config option for local embeddings and surface... · by irchelper · 2026-02-19 (74.3%)
- #19006: feat(memory-lancedb): OpenAI-compatible baseUrl + Ollama provider +... · by martinsen-assistant · 2026-02-17 (73.9%)
- #12195: fix(agents): sync config fallback for lookupContextTokens cold-star... · by mcaxtr · 2026-02-09 (73.6%)
- #11179: fix(memory): replace confusing "No API key" errors in memory tools ... · by liuxiaopai-ai · 2026-02-07 (73.5%)