#21845: fix: use sequential embedding for local GGUF provider to prevent deadlock
size: XS
Cluster: Memory Database Enhancements
## Summary
Replaces `Promise.all` with a sequential loop in the local (node-llama-cpp) embedding provider's `embedBatch` implementation.
### Problem
Concurrent calls to `getEmbeddingFor()` on the same `LlamaEmbeddingContext` can deadlock, causing memory indexing to hang indefinitely. Users see repeated "batch start" logs with 0 chunks indexed, and the process may eventually get SIGKILLed.
### Fix
Process texts one at a time instead of concurrently. This is slightly slower, but local embedding batches are typically small (memory files are few), and it eliminates the hang entirely.
### Before
```ts
const embeddings = await Promise.all(
  texts.map(async (text) => {
    const embedding = await ctx.getEmbeddingFor(text);
    return sanitizeAndNormalizeEmbedding(Array.from(embedding.vector));
  }),
);
```
### After
```ts
const embeddings: number[][] = [];
for (const text of texts) {
  const embedding = await ctx.getEmbeddingFor(text);
  embeddings.push(sanitizeAndNormalizeEmbedding(Array.from(embedding.vector)));
}
```
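To illustrate the difference in concurrency behavior, here is a minimal, self-contained sketch using a hypothetical mock context (not the real node-llama-cpp API): it counts how many `getEmbeddingFor` calls are in flight at once. The sequential loop never exceeds one concurrent call, while `Promise.all` overlaps every call on the shared context, which is the condition that triggers the deadlock.

```typescript
// Hypothetical mock standing in for LlamaEmbeddingContext; it only
// tracks call overlap, it does not compute real embeddings.
type Vector = number[];

class MockEmbeddingContext {
  private inFlight = 0;
  maxInFlight = 0;

  async getEmbeddingFor(text: string): Promise<{ vector: Vector }> {
    this.inFlight++;
    this.maxInFlight = Math.max(this.maxInFlight, this.inFlight);
    await new Promise((resolve) => setTimeout(resolve, 5)); // simulate native work
    this.inFlight--;
    return { vector: [text.length] };
  }
}

// Sequential variant, mirroring the fixed embedBatch loop.
async function embedSequential(ctx: MockEmbeddingContext, texts: string[]): Promise<Vector[]> {
  const out: Vector[] = [];
  for (const text of texts) {
    const { vector } = await ctx.getEmbeddingFor(text);
    out.push(vector);
  }
  return out;
}

async function main() {
  const texts = ["a", "bb", "ccc"];

  const seqCtx = new MockEmbeddingContext();
  await embedSequential(seqCtx, texts);
  console.log(seqCtx.maxInFlight); // 1: calls never overlap

  const parCtx = new MockEmbeddingContext();
  await Promise.all(texts.map((t) => parCtx.getEmbeddingFor(t)));
  console.log(parCtx.maxInFlight); // 3: all calls overlap on the shared context
}

main();
```

With the real context, that overlap is what can deadlock; serializing the calls keeps at most one request against the native context at any time.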
### Testing
- TypeScript compiles clean (`pnpm tsgo --noEmit`)
- Minimal change, follows the fix suggested in the original issue
### AI Disclosure
🤖 AI-assisted (Claude via OpenClaw). Code reviewed and understood by contributor.
Closes #7547
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Replaces concurrent `Promise.all` with sequential for-loop in `embedBatch` to prevent deadlock with node-llama-cpp's embedding context. The fix addresses a critical issue where concurrent calls to `getEmbeddingFor()` on the same `LlamaEmbeddingContext` can hang indefinitely during memory indexing.
- Fixed deadlock in local GGUF embedding provider by processing texts sequentially
- Added explanatory comment with issue reference (#7547)
- Maintains same functionality with slightly slower but reliable behavior
- Existing tests remain valid and will pass with sequential execution
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk
- The change is minimal, well-documented, and directly addresses a critical deadlock issue. The sequential approach is a proven pattern for avoiding concurrency issues with shared resources. The change preserves all existing functionality and test compatibility. While slightly slower, local embedding batches are typically small (memory files are few), making the performance trade-off acceptable for reliability.
- No files require special attention
<sub>Last reviewed commit: 6bced30</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs

- #15639: fix(memory): serialize local embedding initialization to avoid dupl... — SubtleSpark · 2026-02-13 · 79.0%
- #13251: fix: Windows local embeddings race condition — fayrose · 2026-02-10 · 78.9%
- #8675: fix: Gemini batch embeddings state path, enum values, and download URL — seasalim · 2026-02-04 · 72.3%
- #10550: feat(memory-lancedb): local embeddings via node-llama-cpp — namick · 2026-02-06 · 71.8%
- #20149: fix(memory): expose index concurrency as config option — togotago · 2026-02-18 · 71.0%
- #5808: fix(memory): truncate oversized chunks before embedding — douvy · 2026-02-01 · 70.3%
- #8309: fix: add emb_ prefix to batch embedding custom_id for OpenAI compli... — vishaltandale00 · 2026-02-03 · 70.3%
- #20913: fix: intercept Discord embed images to enforce mediaMaxMb — MumuTW · 2026-02-19 · 70.3%
- #7810: fix: add fetch timeouts to prevent memory indexing hangs (#4370) — Kaizen-79 · 2026-02-03 · 69.3%
- #23419: fix(memory): avoid cross-agent qmd embed serialization — frankekn · 2026-02-22 · 69.3%