#8675: fix: Gemini batch embeddings state path, enum values, and download URL
commands
size: S
Cluster: Gemini API Enhancements
## Summary
Fixes #5774. Three bugs prevented Gemini batch embeddings from working for memory indexing:
- **State path**: Code read `status.state` but API returns `status.metadata.state`
- **Enum values**: Code checked `"SUCCEEDED"` but API returns `"BATCH_STATE_SUCCEEDED"`
- **Download URL**: Code used `:download` suffix but API requires `?alt=media`
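To make the download URL item concrete, here is a minimal sketch of the URL construction. The helper name `buildGeminiFileDownloadUrl` and its `baseUrl` and `fileName` parameters are hypothetical, chosen for illustration; only the switch from a `:download` suffix to an `?alt=media` query comes from this PR.

```typescript
// Hypothetical helper (not from the PR): builds the download URL for a
// batch output file. `baseUrl` and `fileName` are illustrative parameters.
function buildGeminiFileDownloadUrl(baseUrl: string, fileName: string): string {
  // Avoid a double slash when joining the two parts.
  const trimmedBase = baseUrl.replace(/\/+$/, "");
  const trimmedFile = fileName.replace(/^\/+/, "");
  // Before the fix the code appended ":download"; the PR uses "?alt=media".
  return `${trimmedBase}/${trimmedFile}?alt=media`;
}
```

Keeping the URL construction in one small function would make the suffix easy to cover with a unit test, though the actual change may simply edit the template string in place.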
## Changes
- Added `state?: string` to `GeminiBatchStatus.metadata` type
- Added `normalizeGeminiBatchState()` helper that:
  - Checks `metadata.state` first, falls back to `status.state`
  - Strips `BATCH_STATE_` and `JOB_STATE_` prefixes
  - Maintains backwards compatibility with unprefixed values
- Fixed download URL from `${file}:download` to `${file}?alt=media`
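The normalization described above could look roughly like the sketch below. The type is trimmed to the two fields this PR mentions, and both the function signature and the `"UNKNOWN"` default for a missing state are assumptions, not the merged implementation.

```typescript
// Trimmed, illustrative version of the status type: only the two places
// where a state string may appear are modeled here.
interface GeminiBatchStatus {
  state?: string;
  metadata?: { state?: string };
}

// Sketch of the helper: prefer metadata.state, fall back to the top-level
// state, then strip the enum prefix so unprefixed comparisons keep working.
function normalizeGeminiBatchState(status: GeminiBatchStatus): string {
  // "UNKNOWN" is an assumed default for responses that carry no state at all.
  const raw = status.metadata?.state ?? status.state ?? "UNKNOWN";
  return raw.replace(/^(?:BATCH_STATE_|JOB_STATE_)/, "");
}
```

With this shape, `{ metadata: { state: "BATCH_STATE_SUCCEEDED" } }`, `{ state: "JOB_STATE_SUCCEEDED" }`, and `{ state: "SUCCEEDED" }` all normalize to `"SUCCEEDED"`, so existing checks against unprefixed values need no changes.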
## Tests Completed
- [x] `pnpm build` passes
- [x] `pnpm check` passes (lint/format)
- [x] `pnpm test src/memory/embeddings.test.ts` passes
- [x] Manual test with `OPENCLAW_DEBUG_MEMORY_EMBEDDINGS=1 pnpm openclaw memory index --force --verbose` completes successfully
🤖 Generated with [Claude Code](https://claude.ai/code)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR fixes Gemini async batch embeddings integration by (1) reading the batch state from `status.metadata.state` (with fallback to `status.state`), (2) normalizing prefixed enum values like `BATCH_STATE_SUCCEEDED`/`JOB_STATE_RUNNING` to the unprefixed values the code expects, and (3) adjusting file download URLs from `:download` to `?alt=media`.
These changes improve reliability of memory indexing via Gemini batch embeddings by aligning the client code with the API’s actual response schema and download semantics.
<h3>Confidence Score: 4/5</h3>
- This PR is largely safe to merge and aligns the Gemini batch client with the API’s actual response shapes.
- Changes are localized to the Gemini batch embedding flow and primarily adjust parsing/normalization and URL construction. The only noteworthy concern is the empty-string fallback for `outputFileId` in the early-completion branch, which can obscure missing-output issues and is inconsistent with the wait path.
- `src/memory/batch-gemini.ts` (early-completion `outputFileId` fallback)
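One way to address the `outputFileId` concern is to fail loudly instead of substituting an empty string. The sketch below is illustrative only: `requireOutputFileId` and its parameters are hypothetical names, and how the id is extracted from the batch response is left to the caller.

```typescript
// Hypothetical guard (not from the PR): rejects a missing or empty output
// file id rather than silently passing "" to the download step.
function requireOutputFileId(outputFileId: string | undefined, batchName: string): string {
  if (!outputFileId) {
    throw new Error(`Gemini batch ${batchName} completed without an output file id`);
  }
  return outputFileId;
}
```

Calling the same guard from both the early-completion branch and the wait path would keep their behavior consistent when a completed batch has no output.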
<!-- greptile_other_comments_section -->
**Context used:**
- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))
- Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13))
<!-- /greptile_comment -->
Most Similar PRs
- #8309: fix: add emb_ prefix to batch embedding custom_id for OpenAI compli... (by vishaltandale00 · 2026-02-03 · 80.9%)
- #21843: fix: add retry/backoff to Gemini embedding batch API calls (by slegarraga · 2026-02-20 · 80.4%)
- #15301: Feat/gemini overflow and tags (by divisonofficer · 2026-02-13 · 79.9%)
- #15585: fix: add retry/backoff for Gemini embedding API calls (by WalterSumbon · 2026-02-13 · 78.7%)
- #17701: fix(memory-lancedb): add gemini-embedding-001 and baseUrl support (by Phineas1500 · 2026-02-16 · 78.7%)
- #7913: fix: fixed gemini-cli usage not working for preview models (by RomanHotsiy · 2026-02-03 · 77.4%)
- #7781: fix: resolve Google Gemini CLI auth credential extraction #4585 (by ManojPanda3 · 2026-02-03 · 76.3%)
- #5808: fix(memory): truncate oversized chunks before embedding (by douvy · 2026-02-01 · 76.2%)
- #15852: fix: pass agentId when resolving IRC session paths (by MisterGuy420 · 2026-02-14 · 75.5%)
- #16786: fix: support google-antigravity OAuth for Gemini embeddings (by outsourc-e · 2026-02-15 · 75.1%)