#19246: feat(media): add Google Vertex AI media provider
size: M
Cluster: Google and Amazon AI Providers
# PR: Add Google Vertex AI Media Provider
## Summary
- **Problem**: There was no native Google Vertex AI provider for media understanding.
- **Why it matters**: Enterprise users running Gemini models on Google Cloud need native Application Default Credentials (ADC) support to transcribe audio and describe images and videos without managing API keys manually.
- **What changed**: Added `google-vertex` media provider using the `pi-ai` SDK, integrated it into auto-resolution logic, and added unit tests.
- **What did NOT change**: Core orchestration logic for other providers is unchanged.
## Change Type (select all)
- [ ] Bug fix
- [x] Feature
- [ ] Docs (Removed from this PR)
- [x] Testing (Added unit tests for Vertex provider)
## Scope (select all touched areas)
- [x] Integrations
- [x] API / contracts
- [x] Testing
## Linked Issue/PR
- Related # (Insert relevant issue number if applicable)
## User-visible / Behavior Changes
- New media provider `google-vertex` available for `tools.media.audio`, `tools.media.image`, and `tools.media.video`.
- Media pipeline now auto-resolves to `google-vertex` if the agent's primary model is a Vertex model.
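The auto-resolution behavior described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `provider/model` reference format, the `MEDIA_PROVIDERS` table, `example-provider`, and the function names are hypothetical and are not taken from this PR's source.

```typescript
// Hypothetical sketch: none of these names come from the PR's source code.
type MediaCapability = "audio" | "image" | "video";

// Assumption: model references look like "<provider>/<model-id>".
function providerOf(modelRef: string): string | undefined {
  const slash = modelRef.indexOf("/");
  return slash > 0 ? modelRef.slice(0, slash) : undefined;
}

// Providers assumed to support each capability (illustrative values only).
const MEDIA_PROVIDERS: Record<MediaCapability, readonly string[]> = {
  audio: ["example-provider", "google-vertex"],
  image: ["example-provider", "google-vertex"],
  video: ["google-vertex"],
};

// Prefer the primary model's provider when it supports the capability;
// otherwise fall back to the first listed candidate.
function resolveMediaProvider(
  capability: MediaCapability,
  primaryModel: string,
): string | undefined {
  const candidates = MEDIA_PROVIDERS[capability];
  const primary = providerOf(primaryModel);
  if (primary !== undefined && candidates.includes(primary)) {
    return primary;
  }
  return candidates[0];
}
```

With this sketch, an agent whose primary model is `google-vertex/gemini-3-flash-preview` resolves audio to `google-vertex`, while an unrecognized primary provider falls back to the first candidate.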
## Security Impact (required)
- New permissions/capabilities? (`No`)
- Secrets/tokens handling changed? (`No` - uses standard GCP ADC)
- New/changed network calls? (`Yes` - calls to Google Vertex AI endpoints)
- Command/tool execution surface changed? (`No`)
- Data access scope changed? (`No`)
- If any `Yes`, explain risk + mitigation: Media data is sent to Google Vertex AI. Users must opt in by enabling the provider and configuring GCP credentials.
## Repro + Verification
### Environment
- OS: Linux
- Model/provider: Google Vertex AI (Gemini 3 Flash Preview)
### Steps
1. Set `GOOGLE_APPLICATION_CREDENTIALS`.
2. Configure agent with `google-vertex` model.
3. Send media (audio/image) to the agent.
### Expected
- Automatic processing via Vertex Gemini models.
### Actual
- Media processed successfully; confirmed via local logs and human verification.
## Evidence
- [x] New unit tests in [src/media-understanding/providers/google-vertex/index.test.ts](src/media-understanding/providers/google-vertex/index.test.ts)
## Human Verification (required)
Verified audio transcription and image description scenarios using a live Google Cloud project with Application Default Credentials. Tested the auto-resolution logic by setting the primary model to `google-vertex`.
## Compatibility / Migration
- Backward compatible? (`Yes`)
- Config/env changes? (`Yes` - requires standard GCP environment variables)
- Migration needed? (`No`)
## Failure Recovery (if this breaks)
- How to disable/revert this change quickly: Revert the commit or unset Vertex configuration in `openclaw.json`.
- Files/config to restore: N/A.
- Known bad symptoms reviewers should watch for: 403 errors if Vertex AI API is not enabled in the GCP console.
## Risks and Mitigations
- Risk: User might have misconfigured ADC.
- Mitigation: Users should check GCP project settings; 401/403 errors will appear in debug logs.
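As a rough illustration of the mitigation, a caller could map the two status codes mentioned above to actionable hints. The helper name and the hint wording below are hypothetical and are not part of this PR.

```typescript
// Hypothetical helper; the name and messages are illustrative only.
function vertexAuthHint(status: number): string | undefined {
  switch (status) {
    case 401:
      // Unauthenticated: the request carried no valid credentials.
      return "Credentials missing or invalid: check Application Default Credentials.";
    case 403:
      // Forbidden: the identity is known but not allowed to call the API.
      return "Permission denied: confirm the Vertex AI API is enabled and the identity has access.";
    default:
      return undefined; // not an auth-related status
  }
}
```

Logging the hint next to the raw error would keep the original failure visible while pointing at the likely configuration problem.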
---
**AI-Assisted PR**: This PR was prepared with assistance from GitHub Copilot (Gemini 3 Flash Preview).
https://docs.openclaw.ai/nodes/audio
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Adds a new `google-vertex` media provider for audio transcription, image description, and video description using the `pi-ai` SDK with Google Cloud Application Default Credentials (ADC). The provider is registered in the media understanding pipeline and added to all auto-resolution arrays.
- **Bug**: `google-vertex` is added to `AUTO_IMAGE_KEY_PROVIDERS` but missing from `DEFAULT_IMAGE_MODELS`, which causes image auto-resolution to silently skip this provider. Add `"google-vertex": "gemini-3-flash-preview"` to `DEFAULT_IMAGE_MODELS` to fix.
- The provider implementation itself is clean, with a shared `completeVertexMedia` helper handling timeout/abort and error propagation across all three media types.
- Test coverage includes audio, image, empty response, and timeout scenarios but omits `describeVideo` and uses minimal mock params that skip required type fields.
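The timeout/abort handling attributed to the shared helper can be sketched generically as follows. This is not the PR's `completeVertexMedia`; `withTimeout` and its parameters are hypothetical names, and the sketch only assumes that the wrapped request accepts an `AbortSignal`.

```typescript
// Hypothetical sketch of a timeout/abort wrapper, not the PR's implementation.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();

  // The Promise executor runs synchronously, so `onTimeout` is set before use.
  let onTimeout: (reason: Error) => void = () => undefined;
  const timedOut = new Promise<never>((_resolve, reject) => {
    onTimeout = reject;
  });

  const timer = setTimeout(() => {
    controller.abort(); // ask the in-flight request to stop
    onTimeout(new Error(`media request timed out after ${timeoutMs}ms`));
  }, timeoutMs);

  try {
    // Whichever settles first wins; errors thrown by `run` propagate unchanged.
    return await Promise.race([run(controller.signal), timedOut]);
  } finally {
    clearTimeout(timer); // never leave the timer pending
  }
}
```

A single wrapper of this shape lets the audio, image, and video paths share one timeout policy, which is also why a timeout test on one path gives some confidence about the others.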
<h3>Confidence Score: 3/5</h3>
- This PR has a bug where image auto-resolution silently skips google-vertex; audio and video paths should work correctly.
- The missing `DEFAULT_IMAGE_MODELS` entry for `google-vertex` is a logic bug that prevents the advertised image auto-resolution feature from working. The provider implementation itself is sound, and audio/video auto-resolution should work since they don't require entries in a default models map. Score of 3 reflects a real but non-breaking bug: the feature is partially functional.
- Pay close attention to `src/media-understanding/defaults.ts` — the missing `DEFAULT_IMAGE_MODELS` entry for `google-vertex` needs to be added before merge.
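To make the reported failure mode concrete, here is a minimal sketch of how a provider with no default model could be skipped without any error. Only the name `DEFAULT_IMAGE_MODELS` and the suggested model id come from the review; `pickImageModel`, `missingDefaults`, and their signatures are hypothetical.

```typescript
// Hypothetical sketch of the failure mode; not the repository's resolver.
function pickImageModel(
  providers: readonly string[],
  defaults: Record<string, string>,
): { provider: string; model: string } | undefined {
  for (const provider of providers) {
    const model = defaults[provider];
    if (!model) continue; // silently skipped: no default model registered
    return { provider, model };
  }
  return undefined;
}

// A guard that a unit test could use to catch the gap: every auto-resolved
// provider should have a default model.
function missingDefaults(
  providers: readonly string[],
  defaults: Record<string, string>,
): string[] {
  return providers.filter((provider) => !defaults[provider]);
}

const providers = ["google-vertex"];
const withoutEntry: Record<string, string> = {};
const withEntry: Record<string, string> = {
  "google-vertex": "gemini-3-flash-preview", // entry suggested in the review
};

console.log(pickImageModel(providers, withoutEntry)); // undefined
console.log(pickImageModel(providers, withEntry));
console.log(missingDefaults(providers, withoutEntry)); // [ 'google-vertex' ]
```

An assertion along the lines of `missingDefaults(...)` being empty would turn this class of omission into a test failure instead of a silent skip.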
<sub>Last reviewed commit: f4426b3</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #12624: feat: add google-vertex embedding provider for Vertex AI ADC auth (by swseo92 · 2026-02-09 · 83.2%)
- #17546: feat(memory): add native google-vertex embedding provider (by mike-hyperverse · 2026-02-15 · 77.6%)
- #21263: fix: add google-vertex support for Gemini 3.1 models (by pdd-cli · 2026-02-19 · 77.1%)
- #17462: fix(cache): enable cache retention for Google Vertex AI (#15525) (by rrenamed · 2026-02-15 · 76.4%)
- #14208: feat(media): add AssemblyAI audio transcription provider (by jmoraispk · 2026-02-11 · 75.0%)
- #15205: fix(models): normalize google-antigravity api field from google-gem... (by wboudy · 2026-02-13 · 74.4%)
- #23424: feat: add Gemini 3.1 Pro Preview support (google-gemini-cli) (by hongchanroh · 2026-02-22 · 74.1%)
- #13075: [Feature]: Add Gemini (Google Search grounding) as web_search provider (by akoscz · 2026-02-10 · 73.5%)
- #16786: fix: support google-antigravity OAuth for Gemini embeddings (by outsourc-e · 2026-02-15 · 72.5%)
- #21491: fix: classify Google 503 UNAVAILABLE as transient failover [AI-assi... (by ZPTDclaw · 2026-02-20 · 72.1%)