
#12812: fix(transcript-policy): sanitize tool call IDs for all non-OpenAI providers

by justin-nevins open 2026-02-09 18:33
agents stale
## Summary

- Expands tool call ID sanitization from Google/Mistral-only to **all non-OpenAI providers**
- Fixes session corruption when fallback models (e.g. nvidia/kimi) generate non-compliant tool IDs

## Problem

When a primary model (e.g. Anthropic Claude) temporarily fails and a fallback model like nvidia/kimi handles the request, the fallback can generate tool call IDs containing invalid characters:

- `functions.write:0` (dots, colons)
- `" toolu_012e... <|tool_call_argument_begin|>"` (raw model tokens)
- Empty strings

These IDs get stored in session history. When the primary model recovers, Anthropic rejects the entire session with:

```
messages.310.content.1.tool_use.id: String should match pattern '^[a-zA-Z0-9_-]+$'
```

This permanently corrupts the session: Anthropic can't process it, and the fallback model times out on the now-large context.

## Fix

Changed `transcript-policy.ts` to sanitize tool call IDs for **all non-OpenAI providers** instead of just Google and Mistral:

```typescript
// Before
const sanitizeToolCallIds = isGoogle || isMistral;

// After
const sanitizeToolCallIds = !isOpenAi;
```

This is safe because:

1. **Idempotent**: sanitizing an ID twice gives the same result as sanitizing it once, and already-compliant IDs change only minimally (e.g. `toolu_01` → `toolu01`)
2. **Consistent**: the mapping is applied to both tool calls and results via the same `resolve()` function
3. **OpenAI exempted**: OpenAI's own format requirements are preserved

## Test plan

- [x] Updated test: "does not sanitize tool call ids for non-Google APIs" → "sanitizes tool call ids for Anthropic APIs" (expects `true`)
- [x] Existing OpenAI test still expects `sanitizeToolCallIds: false` ✓
- [x] All 9 tests in `pi-embedded-runner.sanitize-session-history.test.ts` pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR broadens session-history tool call ID sanitization so it applies to all non-OpenAI providers, preventing invalid tool call IDs (e.g. containing `:`/`.` or raw model tokens) from being persisted and later rejected when sessions are replayed to Anthropic.

The change is implemented in `src/agents/transcript-policy.ts` by switching `sanitizeToolCallIds` from a Google/Mistral-only condition to `!isOpenAi`. The test `src/agents/pi-embedded-runner.sanitize-session-history.test.ts` is updated to assert sanitization for Anthropic APIs while preserving the existing OpenAI exemption behavior.

<h3>Confidence Score: 4/5</h3>

- Mostly safe to merge, but clarify OpenAI exemption logic for OpenAI APIs behind non-OpenAI providers.
- The change is small and covered by targeted tests, but the new `!isOpenAi` condition keys off provider/empty-provider behavior and may unintentionally sanitize tool IDs for OpenAI APIs when routed via aggregators, which conflicts with the stated intent to exempt OpenAI formatting requirements.
- src/agents/transcript-policy.ts
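To make the "Consistent" point from the Fix section concrete, here is a minimal sketch of how a shared resolver can keep tool calls and their results paired after sanitization. It is not the project's implementation: `createIdResolver`, `sanitizeBlocks`, the `Block` type, the `"toolcall"` fallback for empty IDs, and the numeric collision suffix are all hypothetical names and choices made for illustration. The alphanumeric-only stripping is an assumption inferred from the `toolu_01` → `toolu01` example in the PR description.

```typescript
// Hypothetical sketch; names and behavior are assumptions, not the PR's code.

// Pattern quoted in the Anthropic error message from the PR description.
const VALID_ID = /^[a-zA-Z0-9_-]+$/;

type Block =
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; tool_use_id: string; content: string };

// Returns a resolve() function that maps each raw ID to one sanitized ID.
// The same raw ID always yields the same output, so a tool call and its
// result stay linked after rewriting.
function createIdResolver(): (rawId: string) => string {
  const mapping = new Map<string, string>();
  const used = new Set<string>();

  return function resolve(rawId: string): string {
    const cached = mapping.get(rawId);
    if (cached !== undefined) return cached;

    // Assumption: keep only letters and digits, as in `toolu_01` -> `toolu01`.
    let base = rawId.replace(/[^a-zA-Z0-9]/g, "");
    // Assumption: empty results get a placeholder so the ID is never empty.
    if (base === "") base = "toolcall";

    // Two different raw IDs may strip to the same text; add a numeric
    // suffix so distinct raw IDs never share a sanitized ID.
    let candidate = base;
    let n = 1;
    while (used.has(candidate)) {
      candidate = `${base}${n}`;
      n++;
    }

    used.add(candidate);
    mapping.set(rawId, candidate);
    return candidate;
  };
}

// Rewrites IDs on both tool calls and tool results with one shared resolver.
function sanitizeBlocks(blocks: Block[]): Block[] {
  const resolve = createIdResolver();
  return blocks.map((b) =>
    b.type === "tool_use"
      ? { ...b, id: resolve(b.id) }
      : { ...b, tool_use_id: resolve(b.tool_use_id) },
  );
}

// Usage: the call and its result both end up with the ID "functionswrite0".
const cleaned = sanitizeBlocks([
  { type: "tool_use", id: "functions.write:0", name: "write" },
  { type: "tool_result", tool_use_id: "functions.write:0", content: "ok" },
]);
console.log(cleaned.map((b) => (b.type === "tool_use" ? b.id : b.tool_use_id)));
```

One limitation of keying the mapping on the raw ID: two distinct tool calls that both carry an empty ID would resolve to the same sanitized ID, so a real implementation would likely need positional information to tell them apart.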
