#15649: fix: sanitize tool_use IDs on session write path

by aldoeliacim open 2026-02-13 18:04

agents stale size: S

## Summary

Fixes #15621: compaction writes invalid tool_use IDs to session JSONL.

## Problem

When compaction generates a summary via the LLM, the model can produce `tool_use` blocks with human-readable IDs like `"functions.message:0"` or `"functions.web_search:1"`. These violate Anthropic's `^[a-zA-Z0-9_-]+$` pattern and corrupt the session JSONL, causing all subsequent API calls to fail.

## Fix

Sanitize tool call IDs in the `guardedAppend` write path (`session-tool-result-guard.ts`) before persistence, using the existing `sanitizeToolCallId()` utility from `tool-call-id.ts`. IDs that already match the safe pattern (`[a-zA-Z0-9_-]+`) are left untouched, so there are no unnecessary rewrites.

The sanitization runs on all assistant messages at persist time, so it catches both compaction-generated and any other LLM-produced invalid IDs.

## Tests

- Added 2 e2e tests: one verifying invalid IDs are sanitized, one verifying valid IDs are preserved unchanged.
- All 13 existing tests in `session-tool-result-guard.e2e.test.ts` pass.

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This change adds a persistence-time sanitization pass for assistant tool call IDs in `installSessionToolResultGuard`, using `sanitizeToolCallId()` to prevent compaction-generated `tool_use` IDs (e.g. `functions.web_search:1`) from being written to session JSONL and breaking subsequent API calls. Two e2e tests were added to validate sanitization and preservation behavior.

The new sanitizer is applied right after `sanitizeToolCallInputs()` on assistant messages, before `persistMessage()` and before tool-call IDs are added to the pending map used to synthesize missing tool results.

Issues to address before merge:

- The guard currently treats `^[a-zA-Z0-9_-]+$` as safe, but the repo’s strict tool-id sanitizer/validator in `tool-call-id.ts` is strictly alphanumeric; this can leave still-invalid IDs persisted.
- Per-block sanitization can introduce ID collisions without de-duplication, potentially breaking toolResult matching.
- The new tests encode conflicting expectations about whether underscores are allowed.

<h3>Confidence Score: 2/5</h3>

- This PR has correctness gaps around tool ID validity and uniqueness that can still break provider tool-call handling.
- The core idea (sanitize at persistence) is sound, but the implemented “safe” regex contradicts the repo’s strict validator, and the per-block sanitization can introduce duplicate IDs. The new tests also conflict, suggesting the intended constraint isn’t enforced consistently.
- src/agents/session-tool-result-guard.ts, src/agents/session-tool-result-guard.e2e.test.ts

<sub>Last reviewed commit: bd289ad</sub>
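
The collision concern raised in the review can be illustrated with a minimal sketch. This is not the repository's `sanitizeToolCallId()` and not the code in this PR. The names `STRICT_ID` and `sanitizeIdsUnique`, the `toolcall` fallback, and the `x2`/`x3` suffix scheme are all hypothetical. It assumes the strict rule is alphanumeric-only, as the review describes for `tool-call-id.ts`.

```typescript
// Hypothetical sketch only; not the repo's sanitizeToolCallId() implementation.
// Assumption: the strict validator accepts alphanumeric characters only.
const STRICT_ID = /^[a-zA-Z0-9]+$/;

// Maps each original tool call ID to a persisted ID that is valid under
// STRICT_ID and unique within the given list.
function sanitizeIdsUnique(ids: string[]): Map<string, string> {
  const mapping = new Map<string, string>(); // original ID -> persisted ID
  const used = new Set<string>();

  // First pass: reserve IDs that are already valid so they stay unchanged
  // and rewritten IDs can never take their place.
  for (const id of ids) {
    if (STRICT_ID.test(id)) {
      mapping.set(id, id);
      used.add(id);
    }
  }

  // Second pass: rewrite invalid IDs, adding a suffix when a collision occurs.
  for (const id of ids) {
    if (mapping.has(id)) continue;
    // Drop disallowed characters; fall back to a placeholder if nothing is left.
    const base = id.replace(/[^a-zA-Z0-9]/g, "") || "toolcall";
    let candidate = base;
    let n = 2;
    while (used.has(candidate)) {
      candidate = `${base}x${n}`;
      n += 1;
    }
    used.add(candidate);
    mapping.set(id, candidate);
  }

  return mapping;
}
```

In a sketch like this, `functions.web_search:1` and `functions.websearch:1` both reduce to the same stripped base, which is the case where per-block sanitization without de-duplication would produce duplicate IDs. A caller would presumably apply the same mapping to the matching tool results so that pairing survives the rewrite.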