
#14239: Add Azure OpenAI Completions provider

by KJFromMicromonic · open · 2026-02-11 19:29
Labels: `agents`, `stale`
## Summary

- Registers a new `azure-openai-completions` API provider using the `AzureOpenAI` client from the `openai` SDK
- Enables streaming chat completions from Azure-deployed models (Azure AI Foundry / Cognitive Services)
- The standard `openai-completions` provider cannot target Azure because the OpenAI SDK's `buildURL()` overwrites the `?api-version` query param, and Azure requires an `api-key` header plus `/deployments/{model}` path injection, both of which `AzureOpenAI` handles automatically

### Changes

- Add `openai` as a direct dependency (for the `AzureOpenAI` import)
- Expand `ModelApiSchema` and the `ModelApi` type with the `azure-openai-completions` and `azure-openai-responses` literals
- Create `src/providers/azure-openai-completions.ts` — the streaming loop mirrors pi-ai's `openai-completions`, using the `AzureOpenAI` client
- Wire up auto-registration via a side-effect import in `models-config.providers.ts`
- Add the Azure types to the transcript policy's `OPENAI_MODEL_APIS` set

### User config example

```json
{
  "models": {
    "providers": {
      "my-azure": {
        "baseUrl": "https://myresource.cognitiveservices.azure.com/openai",
        "apiKey": "my-azure-api-key",
        "api": "azure-openai-completions",
        "models": [
          {
            "id": "grok-4-fast-reasoning",
            "name": "Grok 4 Fast (Azure)",
            "reasoning": true,
            "input": ["text"],
            "contextWindow": 131072,
            "maxTokens": 32000,
            "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }
          }
        ]
      }
    }
  }
}
```

- `model.id` = the Azure deployment name
- `baseUrl` = the Azure resource URL up to `/openai`
- `api-version` defaults to `2024-12-01-preview`, overridable via the `AZURE_OPENAI_API_VERSION` env var

## Test plan

- [x] `pnpm build` passes with no type/build errors
- [x] Gateway starts and recognizes the `azure-openai-completions` provider
- [x] Streaming responses work end-to-end from an Azure-deployed model via the TUI

## Greptile Overview

### Greptile Summary

Adds an `azure-openai-completions` API provider using the `AzureOpenAI` client from the `openai` SDK, enabling streaming chat completions from Azure-deployed models. The implementation mirrors the existing `openai-completions` provider's streaming loop and registers via a side-effect import.

- New `src/providers/azure-openai-completions.ts` handles client construction, parameter building, and streaming with text/reasoning/tool-call support
- The `ModelApi` type and Zod schema are expanded with the `azure-openai-completions` and `azure-openai-responses` literals
- The transcript policy's `OPENAI_MODEL_APIS` set is updated for the Azure APIs
- `openai` is added as a direct dependency (it was already a transitive dependency via `@mariozechner/pi-ai`)
- **Issue found**: the output token counting on line 264 likely double-counts reasoning tokens, since `completion_tokens` already includes `reasoning_tokens` per the OpenAI API spec
- `azure-openai-responses` is registered in the types/schema but has no provider implementation — this could confuse users who configure it

### Confidence Score: 3/5

- Functional, but has a token counting bug that could cause incorrect usage/cost reporting.
- The PR follows existing patterns well and the streaming implementation is solid. However, the reasoning-token double-counting bug on line 264 would inflate output token counts and could produce incorrect cost calculations for reasoning models. The phantom `azure-openai-responses` type could also confuse users.
- `src/providers/azure-openai-completions.ts` — the token counting logic needs review; `src/config/zod-schema.core.ts` — unimplemented `azure-openai-responses` API type
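The routing constraint behind this PR can be illustrated with a small sketch: Azure injects the deployment name into the request path and requires an `api-version` query parameter, which is exactly what the plain `openai-completions` provider's URL building clobbers. The helper below is hypothetical and only shows the URL shape; the actual provider delegates this to the `AzureOpenAI` client rather than building URLs by hand.

```typescript
// Illustrative only: the shape of the URL that AzureOpenAI builds internally.
// `resourceUrl` is the user's baseUrl (the Azure resource URL up to /openai);
// `deployment` is model.id (the Azure deployment name).
const DEFAULT_API_VERSION = "2024-12-01-preview"; // overridable via AZURE_OPENAI_API_VERSION

function azureChatCompletionsUrl(
  resourceUrl: string,
  deployment: string,
  apiVersion: string = DEFAULT_API_VERSION,
): string {
  return `${resourceUrl}/deployments/${deployment}/chat/completions?api-version=${apiVersion}`;
}
```

With the example config above, this yields `…/openai/deployments/grok-4-fast-reasoning/chat/completions?api-version=2024-12-01-preview`, a path the stock provider cannot produce.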

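The double-counting issue flagged in the review can be shown with a minimal sketch (types abbreviated from the OpenAI usage object; this is not the provider's actual code). Per the OpenAI API spec, `completion_tokens` already includes `completion_tokens_details.reasoning_tokens`, so the breakdown must not be added on top.

```typescript
// Abbreviated shape of the OpenAI usage object returned with a completion.
interface Usage {
  completion_tokens: number;
  completion_tokens_details?: { reasoning_tokens?: number };
}

// Buggy variant: completion_tokens already contains reasoning_tokens,
// so this double-counts output tokens for reasoning models.
function outputTokensBuggy(u: Usage): number {
  return u.completion_tokens + (u.completion_tokens_details?.reasoning_tokens ?? 0);
}

// Correct variant: report completion_tokens as-is; reasoning_tokens is a
// breakdown of that total, not an addition to it.
function outputTokens(u: Usage): number {
  return u.completion_tokens;
}
```

For a reasoning model emitting 100 completion tokens of which 40 are reasoning tokens, the buggy variant reports 140 and would inflate cost calculations accordingly.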