#16777: feat(gateway): add multimodal image support to /v1/chat/completions

by dzianisv open 2026-02-15 03:45
gateway stale size: M
## Summary

- Parse OpenAI-format `image_url` content parts from user messages and pass them through to the agent pipeline via `agentCommand()`
- Follow the same pattern used by the `/v1/responses` endpoint for image extraction and validation
- Add configurable image limits and increase default body size for base64 payloads

## Changes

### Image extraction (`src/gateway/openai-http.ts`)

- `extractRawImageUrls()`: parses `{ type: "image_url", image_url: { url } }` content parts from user messages
- `parseImageUrlToSource()`: handles both `data:` URIs (extracts mime + base64) and plain `https://` URLs
- `resolveImages()`: collects raw image URLs, validates via `extractImageContentFromSource()`, returns `ImageContent[]` or `undefined`
- `resolveImageLimits()`: merges user config with sensible defaults
- Both streaming and non-streaming `agentCommand()` calls now pass `images`
- Default max body size increased from 1MB → 20MB (`DEFAULT_CHAT_COMPLETIONS_BODY_BYTES`)

### Config plumbing

- `GatewayHttpChatCompletionsConfig`: new `maxBodyBytes` and `images` fields (`src/config/types.gateway.ts`)
- Config threaded through: `server-runtime-config.ts` → `server.impl.ts` → `server-runtime-state.ts` → `server-http.ts` → `openai-http.ts`

### Tests (`src/gateway/openai-http.e2e.test.ts`)

4 e2e test cases:

1. Single base64 image extracted and passed
2. Multiple images from multiple user messages
3. No images → `images` is `undefined` (not empty array)
4. Images in non-user messages (system/assistant) are ignored

## Testing

```
✓ 4/4 e2e tests pass (openai-http.e2e.test.ts)
✓ 249/249 gateway unit tests pass (vitest.gateway.config.ts)
✓ tsc --noEmit passes (only pre-existing TS6059 ui/ warnings)
```

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

Added multimodal image support to the `/v1/chat/completions` endpoint. Follows the same pattern as the existing `/v1/responses` endpoint for image extraction and validation.

- Parses OpenAI-format `image_url` content parts from user messages only (system/assistant messages are correctly ignored)
- Supports both base64 data URIs and remote URLs with SSRF protection via `fetchWithGuard`
- Config properly threaded through all layers (server-runtime-config → server.impl → server-runtime-state → server-http → openai-http)
- Default max body size increased from 1MB to 20MB to accommodate base64 image payloads
- Image limits (allowUrl, allowedMimes, maxBytes, maxRedirects, timeoutMs) are configurable via `gateway.http.endpoints.chatCompletions.images`
- Tests cover single image, multiple images, no images (returns `undefined`, not empty array), and non-user role filtering

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk
- The implementation follows established patterns from the existing `/v1/responses` endpoint, reuses battle-tested image extraction logic with SSRF guards, has comprehensive e2e test coverage for all key scenarios, and the config changes are properly threaded through all layers. The code is well-structured and defensive.
- No files require special attention

<sub>Last reviewed commit: 40030e8</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
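The description names `extractRawImageUrls()` and `parseImageUrlToSource()` but does not show their bodies. The following is a minimal sketch of the behavior described above, not the PR's actual code. The `ChatMessage`, `ContentPart`, and `ImageSource` types and the `collectImageSources()` wrapper are hypothetical; the wrapper stands in for `resolveImages()` and omits validation, limits, and remote fetching.

```typescript
// Hypothetical types: the PR's real definitions are not shown in the description.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string | ContentPart[];
}

type ImageSource =
  | { kind: "base64"; mimeType: string; data: string }
  | { kind: "url"; url: string };

// Collect raw image URLs from user messages only; other roles are skipped.
function extractRawImageUrls(messages: ChatMessage[]): string[] {
  const urls: string[] = [];
  for (const message of messages) {
    if (message.role !== "user" || typeof message.content === "string") continue;
    for (const part of message.content) {
      if (part.type === "image_url" && part.image_url.url) {
        urls.push(part.image_url.url);
      }
    }
  }
  return urls;
}

// Split a data: URI into mime type and base64 payload, or accept a plain https:// URL.
function parseImageUrlToSource(url: string): ImageSource | undefined {
  const match = /^data:([^;,]+);base64,(.+)$/.exec(url);
  if (match && match[1] && match[2]) {
    return { kind: "base64", mimeType: match[1], data: match[2] };
  }
  if (url.startsWith("https://")) {
    return { kind: "url", url };
  }
  return undefined; // unsupported scheme or malformed data URI
}

// Hypothetical wrapper: returns undefined (not []) when no images are present,
// mirroring e2e test case 3 in the description.
function collectImageSources(messages: ChatMessage[]): ImageSource[] | undefined {
  const sources = extractRawImageUrls(messages)
    .map((url) => parseImageUrlToSource(url))
    .filter((source): source is ImageSource => source !== undefined);
  return sources.length > 0 ? sources : undefined;
}
```

Under these assumptions, a request with a single base64 `image_url` part in a user message yields one `base64` source, while an `image_url` part in a system or assistant message is ignored. Returning `undefined` instead of an empty array lets the caller skip the `images` argument entirely for text-only requests.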
