
#16894: Fix text truncation splitting surrogate pairs in web-fetch, subagents, and channel metadata

by Clawborn open 2026-02-15 06:48
agents stale size: S trusted-contributor
Several truncation helpers (`truncateText`, `truncate`) use raw `String.slice()`, which can split a surrogate pair (emoji such as 🎉, CJK Extension B and later) and leave a lone surrogate in the output. The cron tool and cron normalizer already use `truncateUtf16Safe` for this; this PR aligns the remaining call sites.

**Affected files:**
- `src/agents/tools/web-fetch-utils.ts`: `truncateText`, used for `web_fetch` tool output
- `src/agents/tools/subagents-tool.ts`: `truncate`, used for subagent result summaries
- `src/security/channel-metadata.ts`: `truncateText`, used for untrusted channel metadata

**Fix:** Replace `value.slice(0, n)` with `truncateUtf16Safe(value, n)` from `src/utils.ts`, which already handles surrogate boundary detection.

**Tests:** 5 new test cases across two files verify that emoji and CJK text are not corrupted at truncation boundaries.

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

Replaces raw `String.slice()` truncation with the existing `truncateUtf16Safe` utility at three call sites (`web-fetch-utils.ts`, `subagents-tool.ts`, `channel-metadata.ts`) to prevent splitting UTF-16 surrogate pairs (emoji, CJK Extension B and later) during text truncation. This aligns these helpers with the cron tool and cron normalizer, which already use the safe variant.

- **`web-fetch-utils.ts`**: `truncateText` now uses `truncateUtf16Safe` instead of `value.slice(0, maxChars)`
- **`subagents-tool.ts`**: `truncate` helper now uses `truncateUtf16Safe` instead of `text.slice(0, maxLength)`
- **`channel-metadata.ts`**: `truncateText` for untrusted metadata now uses `truncateUtf16Safe` instead of `value.slice(0, ...)`
- Two new test files with 5 test cases verify that emoji and mixed CJK/emoji text are not corrupted at truncation boundaries

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge: it makes minimal, well-scoped changes that replace unsafe string slicing with a utility function already in use.
- All three changes are mechanical substitutions of `String.slice()` with the existing `truncateUtf16Safe` utility that is already used elsewhere in the codebase. The utility's behavior is well-defined and tested. The new test files provide adequate coverage of the surrogate pair safety invariant. No behavioral regressions are introduced; the only difference is that truncation now backs off by one UTF-16 code unit when the cut would otherwise split a surrogate pair.
- No files require special attention

<sub>Last reviewed commit: b014183</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
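
For readers unfamiliar with the failure mode, the following is a minimal sketch of surrogate-safe truncation. It is not the code in `src/utils.ts`; the name `truncateUtf16SafeSketch` and its helpers are hypothetical, and the real `truncateUtf16Safe` may differ in signature and edge-case handling. It only illustrates the behavior described above: when the cut would land between a high and a low surrogate, back off by one code unit.

```typescript
// Hypothetical illustration only; the actual truncateUtf16Safe in
// src/utils.ts may be implemented differently.

// High (leading) surrogates occupy U+D800..U+DBFF.
function isHighSurrogate(codeUnit: number): boolean {
  return codeUnit >= 0xd800 && codeUnit <= 0xdbff;
}

// Low (trailing) surrogates occupy U+DC00..U+DFFF.
function isLowSurrogate(codeUnit: number): boolean {
  return codeUnit >= 0xdc00 && codeUnit <= 0xdfff;
}

// Truncate to at most maxUnits UTF-16 code units without splitting a pair.
function truncateUtf16SafeSketch(value: string, maxUnits: number): string {
  const limit = Math.max(0, Math.floor(maxUnits));
  if (value.length <= limit) {
    return value;
  }
  let end = limit;
  // If the last kept unit is a high surrogate and the first dropped unit is
  // its low surrogate, the cut splits a pair: drop the high surrogate too.
  if (
    end > 0 &&
    isHighSurrogate(value.charCodeAt(end - 1)) &&
    isLowSurrogate(value.charCodeAt(end))
  ) {
    end -= 1;
  }
  return value.slice(0, end);
}

// "ab🎉cd" is 6 code units long: a, b, 0xD83C, 0xDF89, c, d.
const sample = "ab🎉cd";
const unsafe = sample.slice(0, 3); // ends in a lone high surrogate
const safe = truncateUtf16SafeSketch(sample, 3); // "ab"
console.log(unsafe.length, safe.length); // 3 2
```

Under this sketch the result can be one code unit shorter than the requested limit, which matches the only behavioral difference noted in the review.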
