#11877: feat(ollama): auto-detect vision capability via /api/show
Labels: agents, stale
Cluster: Ollama Model Enhancements
## Summary
For Ollama models not explicitly configured in `openclaw.json`, the `resolveModel()` function falls back to `input: ["text"]`, i.e. no image support. This causes pi-ai to filter out all image content before sending it to Ollama, even when the underlying model supports vision.
This PR adds automatic vision capability detection by querying Ollama's `/api/show` endpoint.
## Changes
- Added `queryOllamaVisionCapability()` helper that queries Ollama's `/api/show` endpoint to check if a model has `"vision"` in its capabilities
- Made `resolveModel()` async to support the API query
- Updated all callers (`run.ts`, `compact.ts`, `tts.ts`) to await the function
- Updated tests to be async
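The call-site change is mechanical. A minimal sketch (types and signatures here are illustrative, not the project's real ones):

```typescript
// Illustrative model shape; the real types live in the project.
type Model = { id: string; input: string[] };

// resolveModel is now async, so every caller must await it.
async function resolveModel(id: string): Promise<Model> {
  // Fallback sketch: unconfigured models default to text-only input.
  return { id, input: ["text"] };
}

// A caller such as run.ts goes from `const model = resolveModel(id)`
// to awaiting the promise before using the result:
async function run(id: string): Promise<string> {
  const model = await resolveModel(id);
  return model.input.join(",");
}
```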
## How It Works
When falling back for Ollama models not in the explicit config:
1. POST the model name to the Ollama host's `/api/show` endpoint
2. Check if `capabilities` array includes `"vision"`
3. If yes, set `input: ["text", "image"]` instead of just `["text"]`
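The steps above could be sketched roughly as follows (helper names and error handling are assumptions, not the PR's exact code):

```typescript
// Shape of the relevant part of Ollama's /api/show response.
type ShowResponse = { capabilities?: string[] };

// Pure helper: map an /api/show payload to input modalities.
function inputsFromShow(res: ShowResponse): ("text" | "image")[] {
  return res.capabilities?.includes("vision") ? ["text", "image"] : ["text"];
}

// Network probe: POST the model name to the Ollama host's /api/show.
async function queryOllamaVisionCapability(
  baseUrl: string,
  model: string,
): Promise<boolean> {
  try {
    const resp = await fetch(`${baseUrl}/api/show`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model }),
    });
    if (!resp.ok) return false;
    const data = (await resp.json()) as ShowResponse;
    return data.capabilities?.includes("vision") ?? false;
  } catch {
    // Treat unreachable or older Ollama servers as text-only.
    return false;
  }
}
```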
## Test Plan
1. Configure Ollama provider in `openclaw.json` without listing a vision-capable model
2. Use `/model ollama/llava` (or another vision model)
3. Send an image via the dashboard chat
4. Verify the model receives and processes the image
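As a quick manual sanity check (assuming a local Ollama on the default port), the capabilities array can be inspected directly:

```shell
# Ask a running Ollama instance for the model's capabilities.
# A vision-capable model should list "vision" alongside "completion".
curl -s http://localhost:11434/api/show -d '{"model": "llava"}' |
  jq '.capabilities'
```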
🤖 Generated with [Claude Code](https://claude.ai/code)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR makes `resolveModel()` asynchronous so it can query Ollama’s `/api/show` endpoint when falling back to an unconfigured Ollama model, enabling automatic detection of `vision` capability (and therefore allowing `input: ["text", "image"]` in the fallback model). Call sites in the embedded runner (`run.ts`, `compact.ts`) and TTS summarization (`tts.ts`) were updated to `await` the model resolution, and unit tests were adjusted accordingly.
Main concern is the new outbound HTTP call for capability detection: it currently uses raw `fetch()` rather than the project’s guarded fetch utilities/SSRF policy layer. If the Ollama `baseUrl` is configurable in environments where configs are untrusted, this could allow requests to arbitrary/private hosts.
<h3>Confidence Score: 3/5</h3>
- This PR is likely mergeable after addressing a security-relevant fetch/SSRF policy gap in the new Ollama capability probe.
- Core logic changes are localized and call sites were updated to await the new async `resolveModel()`. However, the new `/api/show` probe uses raw `fetch()` and may bypass existing SSRF/network policy protections applied elsewhere to configurable base URLs, which is important to fix before merging.
- src/agents/pi-embedded-runner/model.ts
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
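One way to address the SSRF concern raised above is to validate the configured `baseUrl` before probing it. A sketch under stated assumptions (`isPrivateHost` is hypothetical; real code should route through the project's guarded fetch utilities, and Ollama deployments that legitimately run on localhost would need an explicit opt-in):

```typescript
import { isIP } from "node:net";

// Hypothetical check for private/internal hosts (IPv4 sketch only).
function isPrivateHost(hostname: string): boolean {
  if (hostname === "localhost" || hostname.endsWith(".internal")) return true;
  if (isIP(hostname)) {
    // Block loopback, link-local, and RFC 1918 ranges.
    return /^(127\.|10\.|192\.168\.|169\.254\.|172\.(1[6-9]|2\d|3[01])\.)/.test(
      hostname,
    );
  }
  return false;
}

// Guard the probe: reject private hosts unless explicitly allowed.
function assertProbeAllowed(baseUrl: string, allowLocal: boolean): void {
  const { hostname } = new URL(baseUrl);
  if (!allowLocal && isPrivateHost(hostname)) {
    throw new Error(`Refusing to probe private host: ${hostname}`);
  }
}
```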
## Most Similar PRs
- #4782: fix: Auto-discover Ollama models without requiring explicit API key (spiceoogway, 2026-01-30, 82.3%)
- #7278: feat(ollama): optimize local LLM support with auto-discovery and ti... (alltomatos, 2026-02-02, 81.2%)
- #5115: fix: guard against undefined model.name in Ollama discovery (#5062) (TheWildHustle, 2026-01-31, 80.5%)
- #16098: fix: omit tools param for models without tool support, surface erro... (claw-sylphx, 2026-02-14, 79.9%)
- #9257: this is my first fork (demonking369, 2026-02-05, 77.9%)
- #11875: fix(ollama): accept /model directive for configured providers (Nina-VanKhan, 2026-02-08, 77.4%)
- #21977: Preserve provider API for discovered Ollama models (graysurf, 2026-02-20, 76.7%)
- #19612: feat(onboarding): add Ollama to onboarding provider list (ParthSareen, 2026-02-18, 76.3%)
- #7432: Comprehensive Ollama Support PR (charlieduzstuf, 2026-02-02, 76.0%)
- #18587: fix(ollama): improve timeout handling and cooldown logic for local ... (manthis, 2026-02-16, 76.0%)