#19415: fix(agents): enable repairToolUseResultPairing for OpenAI models

by wu-tian807 open 2026-02-17 19:36

agents size: XS

## Problem

`limitHistoryTurns()` can orphan `toolResult` messages by removing the preceding `assistant` message that contained the matching `tool_calls`. The code already documents this in `attempt.ts`, lines 671-673:

```
// Re-run tool_use/tool_result pairing repair after truncation, since
// limitHistoryTurns can orphan tool_result blocks by removing the
// assistant message that contained the matching tool_use.
```

The subsequent repair step (`sanitizeToolUseResultPairing`) handles this correctly, but `repairToolUseResultPairing` is disabled for OpenAI in `transcript-policy.ts`:

```typescript
repairToolUseResultPairing: !isOpenAi && repairToolUseResultPairing,
```

OpenAI's Chat Completions API is the strictest about tool pairing: it rejects orphaned `toolResult` messages with `400 Invalid parameter: messages with role 'tool' must be a response to a preceding message with 'tool_calls'`. Once this happens, the session enters a death loop: retry → same truncated history → same 400 → retry.

Note: this does **not** affect the OpenAI Responses API (`openai-responses`, `openai-codex-responses`), which is stateful and manages tool pairing server-side.

### Reproduction paths

1. **Pure OpenAI single-provider:** Session accumulates many tool calls → `limitHistoryTurns` truncates and orphans a `toolResult` → repair skipped → 400 error.
2. **Cross-provider heartbeat:** Upstream natively supports `heartbeat.model` with a different provider (e.g. Anthropic heartbeat, OpenAI primary). Heartbeat produces `tool_calls` in the shared session → truncation orphans a `toolResult` → primary model (OpenAI) picks up the session → repair skipped → 400 error.

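For reviewers who want to see the failure mode in isolation, here is a minimal sketch of how tail truncation can orphan a tool result and how dropping orphans restores a valid sequence. The message shapes and the helpers `truncateTail`, `findOrphanedToolResults`, and `dropOrphanedToolResults` are hypothetical simplifications written for this illustration. They are not the project's `limitHistoryTurns` or `repairToolUseResultPairing`, and the sketch leaves out the placeholder synthesis for missing results.

```typescript
// Hypothetical, simplified message shapes; the real session types are richer.
type ToolCall = { id: string; name: string };
type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; content: string; toolCalls?: ToolCall[] }
  | { role: "toolResult"; toolCallId: string; content: string };

// Naive stand-in for history truncation: keep only the last `keep` messages.
function truncateTail(messages: Message[], keep: number): Message[] {
  return keep > 0 ? messages.slice(-keep) : [];
}

// Collect the ids of toolResult messages that have no earlier assistant tool call.
function findOrphanedToolResults(messages: Message[]): string[] {
  const seen = new Set<string>();
  const orphans: string[] = [];
  for (const m of messages) {
    if (m.role === "assistant") {
      for (const call of m.toolCalls ?? []) seen.add(call.id);
    } else if (m.role === "toolResult" && !seen.has(m.toolCallId)) {
      orphans.push(m.toolCallId);
    }
  }
  return orphans;
}

// Drop toolResult messages whose matching tool call is no longer in the history.
function dropOrphanedToolResults(messages: Message[]): Message[] {
  const seen = new Set<string>();
  return messages.filter((m) => {
    if (m.role === "assistant") {
      for (const call of m.toolCalls ?? []) seen.add(call.id);
      return true;
    }
    return m.role !== "toolResult" || seen.has(m.toolCallId);
  });
}

const history: Message[] = [
  { role: "user", content: "list files" },
  { role: "assistant", content: "", toolCalls: [{ id: "call_1", name: "ls" }] },
  { role: "toolResult", toolCallId: "call_1", content: "a.txt" },
  { role: "assistant", content: "Done." },
];

// Truncation removes the assistant message that issued call_1 ...
const truncated = truncateTail(history, 2);
// ... so the remaining toolResult is orphaned until the repair drops it.
const repaired = dropOrphanedToolResults(truncated);
```

In this sketch `truncated` starts with a `toolResult` for `call_1` whose `assistant` message is gone, which is the shape the Chat Completions API rejects. `repaired` contains only the final `assistant` message.
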
## Fix

```typescript
// New: distinguish stateful Responses API from stateless Chat Completions
const isOpenAiResponsesApi =
  params.modelApi === "openai-responses" ||
  params.modelApi === "openai-codex-responses";

// Before:
repairToolUseResultPairing: !isOpenAi && repairToolUseResultPairing,

// After:
repairToolUseResultPairing: repairToolUseResultPairing || (isOpenAi && !isOpenAiResponsesApi),
```

This enables the existing, already-tested `repairToolUseResultPairing()` function for OpenAI Chat Completions models. The function only drops orphaned `toolResult` messages and synthesizes placeholder results for missing `toolResult`, both safe operations that align with OpenAI's strict message format requirements.

**Behavior change by provider:**

- Google / Anthropic: unchanged (`true || false` = `true`)
- **OpenAI Chat Completions (`openai`, `openai-completions`): `false` → `true`** (fix)
- OpenAI Responses API (`openai-responses`, `openai-codex-responses`): unchanged (`false`)
- Other providers: unchanged (`false || false` = `false`)

## Test plan

- [x] Added 4 regression tests verifying `repairToolUseResultPairing` is enabled for OpenAI Chat Completions, disabled for OpenAI Responses API, enabled for Anthropic, enabled for Google
- [x] `vitest run src/agents/transcript-policy.test.ts`: 8 tests pass
- [x] `vitest run src/agents/pi-embedded-runner.sanitize-session-history.test.ts`: passes (including "does not synthesize tool results for openai-responses")
- [x] `pnpm tsgo --noEmit`: type check passes
- [x] `pnpm format`: formatted with project oxfmt

### Changes (2 files, +44 -1)

- `src/agents/transcript-policy.ts`: Add `isOpenAiResponsesApi` guard, enable repair for OpenAI Chat Completions
- `src/agents/transcript-policy.test.ts`: 4 regression tests
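
The per-provider outcomes listed under "Behavior change by provider" can be checked with a small standalone sketch of the new boolean expression. The `PolicyInput` type and `resolveRepairFlag` function are hypothetical names for this illustration only, and the non-OpenAI `modelApi` labels are placeholders. The sketch assumes the base `repairToolUseResultPairing` value is `true` for Google and Anthropic and `false` otherwise, as the list implies. The real `transcript-policy.ts` derives these inputs itself.

```typescript
// Hypothetical inputs; the real policy resolver computes these from its params.
type PolicyInput = {
  isOpenAi: boolean;
  modelApi: string;
  // Base value before the OpenAI-specific adjustment (assumed true for Google/Anthropic).
  repairToolUseResultPairing: boolean;
};

// Mirrors the "After" expression from the fix.
function resolveRepairFlag(input: PolicyInput): boolean {
  const isOpenAiResponsesApi =
    input.modelApi === "openai-responses" ||
    input.modelApi === "openai-codex-responses";
  return (
    input.repairToolUseResultPairing ||
    (input.isOpenAi && !isOpenAiResponsesApi)
  );
}

// Placeholder labels for the non-OpenAI cases; only the OpenAI ids come from the PR.
const cases: Array<{ label: string; input: PolicyInput; expected: boolean }> = [
  {
    label: "Anthropic",
    input: { isOpenAi: false, modelApi: "anthropic-placeholder", repairToolUseResultPairing: true },
    expected: true,
  },
  {
    label: "OpenAI Chat Completions",
    input: { isOpenAi: true, modelApi: "openai-completions", repairToolUseResultPairing: false },
    expected: true,
  },
  {
    label: "OpenAI Responses API",
    input: { isOpenAi: true, modelApi: "openai-responses", repairToolUseResultPairing: false },
    expected: false,
  },
  {
    label: "Other provider",
    input: { isOpenAi: false, modelApi: "other-placeholder", repairToolUseResultPairing: false },
    expected: false,
  },
];

// True when every case resolves to the value stated in the behavior list.
const allMatch = cases.every((c) => resolveRepairFlag(c.input) === c.expected);
```

The only row whose result differs from the old `!isOpenAi && repairToolUseResultPairing` expression is OpenAI Chat Completions, which matches the single behavior change the PR claims.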