#8062: feat: add image pre-analysis with imageModel for non-vision models
agents
size: M
Cluster: Image Model Enhancements
## Why This Feature Matters
Many users rely on cost-effective or high-performance models that don't have native vision capabilities. This feature bridges that gap without requiring users to switch to more expensive vision models for every request.
## Summary
When `agents.defaults.imageModel` is configured, images in user messages are first analyzed using the configured imageModel, then the text analysis results are passed to the main model.
This enables models without native vision capabilities (e.g., MiniMax M2.1, GLM) to understand image content through a vision-capable model (e.g., Gemini Flash, GPT-5).
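As a rough illustration of the "analysis text is passed to the main model" step, the sketch below prepends per-image descriptions to the user's prompt. The `ImageAnalysis` shape, the `buildPromptWithAnalysis` name, and the `[Image N analysis]` label format are all hypothetical and not taken from the PR; the actual module may combine the text differently.

```typescript
// Hypothetical shape for one image's analysis result; not taken from the PR.
interface ImageAnalysis {
  index: number; // zero-based position of the image in the user message
  description: string; // text returned by the imageModel
}

// Prepend the analysis text to the user's prompt so a text-only model can read it.
// The label format is an assumption made for this sketch.
function buildPromptWithAnalysis(prompt: string, analyses: ImageAnalysis[]): string {
  if (analyses.length === 0) return prompt; // nothing to add
  const sections = analyses.map(
    (a) => `[Image ${a.index + 1} analysis]\n${a.description.trim()}`,
  );
  return `${sections.join("\n\n")}\n\n${prompt}`;
}
```

With this layout the main model receives plain text only, so it does not need any image input support.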
## How it works
**Before (current behavior):**
```
User message (with image) → Main model → Response
                                ↑
              (if model supports images, they're passed directly;
               otherwise images are ignored)
```
**After (with this PR):**
```
User message (with image) → imageModel configured?
  ├─ Yes → imageModel analyzes image
  │    ├─ Success → Analysis text + prompt → Main model → Response
  │    └─ Failed → Fallback to main model (if it supports images)
  └─ No → Main model handles directly (existing behavior)
```
## Key Behavior
1. **imageModel takes priority**: When configured, imageModel is always used for image analysis first, even if the main model supports images
2. **Graceful fallback**: If the imageModel call fails and the main model supports images, the images are passed to the main model directly
3. **Backward compatible**: Without imageModel configured, behavior is unchanged
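The three rules above can be sketched as a single routing decision. This is a minimal sketch, not the code in `image-pre-analysis.ts`: `ImageRoute`, `RouteInput`, and `routeImages` are hypothetical names, the analyzer is synchronous for brevity (a real model call would be asynchronous), and the `images-ignored` outcome after a failed analysis on a non-vision main model is an assumption extrapolated from the existing behavior, since the PR does not spell that case out.

```typescript
// Possible outcomes for a user message that contains images.
type ImageRoute =
  | { kind: "pre-analysis"; analysisText: string } // analysis text goes to the main model
  | { kind: "direct" } // images go to the main model unchanged
  | { kind: "images-ignored" }; // main model sees the prompt only

interface RouteInput {
  imageModelConfigured: boolean;
  mainModelSupportsImages: boolean;
  // Hypothetical analyzer: returns text, or throws when the imageModel call fails.
  analyze: () => string;
}

function routeImages(input: RouteInput): ImageRoute {
  // Existing behavior: pass images through if supported, otherwise drop them.
  const existingBehavior: ImageRoute = input.mainModelSupportsImages
    ? { kind: "direct" }
    : { kind: "images-ignored" };

  // Rule 3: without imageModel, nothing changes.
  if (!input.imageModelConfigured) return existingBehavior;

  try {
    // Rule 1: imageModel runs first, even if the main model has vision.
    return { kind: "pre-analysis", analysisText: input.analyze() };
  } catch {
    // Rule 2: on failure, fall back to the existing behavior (assumed here).
    return existingBehavior;
  }
}
```

Keeping the decision separate from the model call makes each branch easy to cover in unit tests with a stubbed `analyze`.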
## Configuration Example
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "minimax/MiniMax-M2.1",
        "fallbacks": ["anthropic/claude-3-opus"]
      },
      "imageModel": {
        "primary": "gemini-crs/gemini-3-flash-preview",
        "fallbacks": ["openai/gpt-4o"]
      }
    }
  }
}
```
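One plausible reading of the `primary` / `fallbacks` pair is that candidates are tried in order until one succeeds. The sketch below illustrates that reading only; `ModelSelection`, `candidateModels`, and `firstSuccessful` are hypothetical names, the attempt callback is synchronous for brevity, and the PR description does not confirm how fallbacks are actually resolved.

```typescript
// Mirrors the shape of the "imageModel" entry in the config above.
interface ModelSelection {
  primary: string;
  fallbacks?: string[];
}

// Ordered list of model ids to try: primary first, then each fallback.
function candidateModels(selection: ModelSelection): string[] {
  return [selection.primary, ...(selection.fallbacks ?? [])];
}

// Run `attempt` against each candidate and return the first success.
// Returns undefined when every candidate throws.
function firstSuccessful<T>(
  selection: ModelSelection,
  attempt: (modelId: string) => T,
): { modelId: string; result: T } | undefined {
  for (const modelId of candidateModels(selection)) {
    try {
      return { modelId, result: attempt(modelId) };
    } catch {
      // This candidate failed; move on to the next one.
    }
  }
  return undefined;
}

const imageModel: ModelSelection = {
  primary: "gemini-crs/gemini-3-flash-preview",
  fallbacks: ["openai/gpt-4o"],
};
```

Under this reading, an `undefined` result would correspond to the "Failed" branch of the flow, where the main model handles the images itself if it can.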
## Changes
| File | Description |
|------|-------------|
| `src/agents/pi-embedded-runner/run/image-pre-analysis.ts` | New module with `shouldUseImagePreAnalysis()` and `analyzeImagesWithImageModel()` functions |
| `src/agents/pi-embedded-runner/run/image-pre-analysis.test.ts` | Unit tests for the new module (10 tests) |
| `src/agents/pi-embedded-runner/run/attempt.ts` | Integrated image pre-analysis into the prompt flow |
## Test Results
```
✓ src/agents/pi-embedded-runner/run/image-pre-analysis.test.ts (10 tests) 3ms
Test Files 1 passed (1)
Tests 10 passed (10)
```
## Manual Testing
- [x] Configured `imageModel` with gemini-flash
- [x] Configured main `model` with opus (supports images)
- [x] Sent image via Feishu
- [x] Verified image was analyzed by imageModel first
- [x] Verified analysis text was passed to main model
---
**Note**: This is a re-submission of #4802, which was automatically closed. All feedback has been addressed and the branch has been rebased onto the latest main.