#21845: fix: use sequential embedding for local GGUF provider to prevent deadlock
size: XS
Cluster: Memory Database Enhancements
## Summary
Replaces `Promise.all` with a sequential loop in the local (node-llama-cpp) embedding provider's `embedBatch` implementation.
### Problem
Concurrent calls to `getEmbeddingFor()` on the same `LlamaEmbeddingContext` can deadlock, causing memory indexing to hang indefinitely. Users see repeated "batch start" logs with 0 chunks indexed, and the process may eventually get SIGKILLed.
### Fix
Process texts one at a time instead of concurrently. This is slightly slower, but local embedding batches are typically small (memory files are few), and it eliminates the hang entirely.
### Before
```ts
const embeddings = await Promise.all(
  texts.map(async (text) => {
    const embedding = await ctx.getEmbeddingFor(text);
    return sanitizeAndNormalizeEmbedding(Array.from(embedding.vector));
  }),
);
```
### After
```ts
const embeddings: number[][] = [];
for (const text of texts) {
  const embedding = await ctx.getEmbeddingFor(text);
  embeddings.push(sanitizeAndNormalizeEmbedding(Array.from(embedding.vector)));
}
```
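To illustrate the difference in concurrency behavior, here is a minimal, self-contained sketch using a hypothetical mock context (not the real node-llama-cpp API): it counts how many `getEmbeddingFor` calls are in flight at once. The sequential loop never exceeds one concurrent call, while `Promise.all` overlaps every call on the shared context, which is the condition that triggers the deadlock.

```typescript
// Hypothetical mock standing in for LlamaEmbeddingContext; it only
// tracks call overlap, it does not compute real embeddings.
type Vector = number[];

class MockEmbeddingContext {
  private inFlight = 0;
  maxInFlight = 0;

  async getEmbeddingFor(text: string): Promise<{ vector: Vector }> {
    this.inFlight++;
    this.maxInFlight = Math.max(this.maxInFlight, this.inFlight);
    await new Promise((resolve) => setTimeout(resolve, 5)); // simulate native work
    this.inFlight--;
    return { vector: [text.length] };
  }
}

// Sequential variant, mirroring the fixed embedBatch loop.
async function embedSequential(ctx: MockEmbeddingContext, texts: string[]): Promise<Vector[]> {
  const out: Vector[] = [];
  for (const text of texts) {
    const { vector } = await ctx.getEmbeddingFor(text);
    out.push(vector);
  }
  return out;
}

async function main() {
  const texts = ["a", "bb", "ccc"];

  const seqCtx = new MockEmbeddingContext();
  await embedSequential(seqCtx, texts);
  console.log(seqCtx.maxInFlight); // 1: calls never overlap

  const parCtx = new MockEmbeddingContext();
  await Promise.all(texts.map((t) => parCtx.getEmbeddingFor(t)));
  console.log(parCtx.maxInFlight); // 3: all calls overlap on the shared context
}

main();
```

With the real context, that overlap is what can deadlock; serializing the calls keeps at most one request against the native context at any time.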
### Testing
- TypeScript compiles clean (`pnpm tsgo --noEmit`)
- Minimal change, follows the fix suggested in the original issue
### AI Disclosure
🤖 AI-assisted (Claude via OpenClaw). Code reviewed and understood by contributor.
Closes #7547
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Replaces concurrent `Promise.all` with sequential for-loop in `embedBatch` to prevent deadlock with node-llama-cpp's embedding context. The fix addresses a critical issue where concurrent calls to `getEmbeddingFor()` on the same `LlamaEmbeddingContext` can hang indefinitely during memory indexing.
- Fixed deadlock in local GGUF embedding provider by processing texts sequentially
- Added explanatory comment with issue reference (#7547)
- Maintains same functionality with slightly slower but reliable behavior
- Existing tests remain valid and will pass with sequential execution
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk
- The change is minimal, well-documented, and directly addresses a critical deadlock issue. The sequential approach is a proven pattern for avoiding concurrency issues with shared resources. The change preserves all existing functionality and test compatibility. While slightly slower, local embedding batches are typically small (memory files are few), making the performance trade-off acceptable for reliability.
- No files require special attention
<sub>Last reviewed commit: 6bced30</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs

- #15639: fix(memory): serialize local embedding initialization to avoid dupl... — SubtleSpark · 2026-02-13 · 79.0%
- #13251: fix: Windows local embeddings race condition — fayrose · 2026-02-10 · 78.9%
- #8675: fix: Gemini batch embeddings state path, enum values, and download URL — seasalim · 2026-02-04 · 72.3%
- #10550: feat(memory-lancedb): local embeddings via node-llama-cpp — namick · 2026-02-06 · 71.8%
- #20149: fix(memory): expose index concurrency as config option — togotago · 2026-02-18 · 71.0%
- #5808: fix(memory): truncate oversized chunks before embedding — douvy · 2026-02-01 · 70.3%
- #8309: fix: add emb_ prefix to batch embedding custom_id for OpenAI compli... — vishaltandale00 · 2026-02-03 · 70.3%
- #20913: fix: intercept Discord embed images to enforce mediaMaxMb — MumuTW · 2026-02-19 · 70.3%
- #7810: fix: add fetch timeouts to prevent memory indexing hangs (#4370) — Kaizen-79 · 2026-02-03 · 69.3%
- #23419: fix(memory): avoid cross-agent qmd embed serialization — frankekn · 2026-02-22 · 69.3%