
#16346: feat: support image attachments in OpenAI chat completions endpoint

by sh1nj1 open 2026-02-14 17:13
gateway size: M
## What

Add multimodal image support to the `/v1/chat/completions` endpoint, enabling clients to send images alongside text in chat messages.

## Why

The OpenAI-compatible chat completions endpoint currently only extracts text content from messages. Clients sending multimodal content (e.g., `image_url` parts with base64 data URLs) have their images silently dropped. The underlying `agentCommand` already supports an `images` parameter, so this change wires up the missing extraction and passthrough.

## Changes

**`src/gateway/openai-http.ts`**

- Added `ImageContent` type (structurally compatible with `src/commands/agent/types.ts`)
- Added `extractImages()`: parses `image_url` content parts, extracts base64 data URLs into `ImageContent` objects (remote URLs filtered out to avoid SSRF)
- Updated `buildAgentPrompt()` to track images per conversation entry via parallel array, returns images from the last user message
- Updated validation to accept image-only messages (no text), using `[image]` placeholder for `agentCommand` compatibility
- Both streaming and non-streaming `agentCommand` calls now pass `images: prompt.images`

**`src/gateway/openai-http-images.test.ts`** (new)

- 12 unit tests for `extractImages` and `extractTextContent` covering: single/multiple images, non-data-url filtering, null/empty handling, text extraction from multimodal content, `input_text` type/property handling

**`src/gateway/openai-http.e2e.test.ts`**

- 5 e2e tests: image passthrough to agentCommand, text-only produces no images, only last user message images forwarded, image-only message acceptance, non-data-url filtering

## Security

Only base64 `data:` URLs are accepted. Remote URLs in `image_url` parts are filtered out to prevent SSRF.

## Testing

- **Fully tested**: 12 unit tests + 5 e2e tests
- `pnpm build && pnpm check` pass ✅
- Tested locally with real image payloads via [Collavre](https://github.com/sh1nj1/plan42) integration

## AI Disclosure

🤖 Generated with AI assistance (Claude).
I understand what the code does and have reviewed all changes.

---

*Replaces #16308 (rebased on latest main for clean review)*

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

Adds multimodal image support to the `/v1/chat/completions` endpoint by extracting base64 data URLs from `image_url` content parts and passing them to `agentCommand`. The implementation properly filters remote URLs to prevent SSRF attacks and handles image-only messages using a `[image]` placeholder for compatibility with the existing message format requirements. The changes are well-tested with 12 unit tests and 5 e2e tests covering various scenarios including single/multiple images, filtering, and edge cases.

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk
- The implementation is well-designed with proper security considerations (SSRF prevention), comprehensive test coverage (17 tests total), and clean integration with existing code. The ImageContent type matches the existing type definition, and the image-only message handling is correctly implemented. The PR has already addressed a previous review comment about test drift.
- No files require special attention

<sub>Last reviewed commit: 6165da8</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
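For readers without the diff at hand, the extraction and "last user message" behavior described above might look roughly like the sketch below. This is not the PR's code. The field names on `ImageContent` (`type`, `data`, `mimeType`), the `ChatMessage` shape, and the helper `lastUserImages` are assumptions made for illustration. The real `extractImages()` and `buildAgentPrompt()` in `src/gateway/openai-http.ts` may differ in signature and detail.

```typescript
// Hypothetical shape; the PR only says its type is structurally compatible
// with src/commands/agent/types.ts, so these field names are an assumption.
type ImageContent = { type: "image"; data: string; mimeType: string };

// Assumed minimal message shape for an OpenAI-style chat request.
type ChatMessage = { role: string; content: unknown };

const DATA_PREFIX = "data:";
const BASE64_MARKER = ";base64,";

// Collect base64 data-URL images from OpenAI-style content parts.
// Anything that is not a `data:` URL (e.g. https://...) is skipped and never
// fetched, which is how the SSRF concern is avoided.
function extractImages(content: unknown): ImageContent[] {
  if (!Array.isArray(content)) return []; // plain string, null, etc.
  const images: ImageContent[] = [];
  for (const part of content) {
    if (!part || typeof part !== "object") continue;
    const p = part as { type?: unknown; image_url?: unknown };
    if (p.type !== "image_url") continue;
    // OpenAI's format nests the URL as image_url.url
    const url = (p.image_url as { url?: unknown } | undefined)?.url;
    if (typeof url !== "string" || !url.startsWith(DATA_PREFIX)) continue;
    const idx = url.indexOf(BASE64_MARKER);
    if (idx < 0) continue; // data URL without base64 encoding
    const mimeType = url.slice(DATA_PREFIX.length, idx);
    const data = url.slice(idx + BASE64_MARKER.length);
    if (!mimeType || !data) continue;
    images.push({ type: "image", mimeType, data });
  }
  return images;
}

// Hypothetical helper: return the images attached to the last user message
// only, mirroring the behavior the PR describes for buildAgentPrompt().
function lastUserImages(messages: ChatMessage[]): ImageContent[] {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m && m.role === "user") return extractImages(m.content);
  }
  return [];
}

// Usage: the remote URL is dropped, the data URL is kept.
const example: ChatMessage[] = [
  {
    role: "user",
    content: [
      { type: "text", text: "What is in this picture?" },
      { type: "image_url", image_url: { url: "data:image/png;base64,iVBORw0KGgo=" } },
      { type: "image_url", image_url: { url: "https://example.com/cat.png" } },
    ],
  },
];
console.log(lastUserImages(example).map((img) => img.mimeType));
```

Silently skipping remote URLs, rather than rejecting the request, matches the "filtered out" wording in the PR. A stricter gateway could return a 400 instead, at the cost of breaking clients that mix remote and inline images.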
