
#20023: Fix surrogate pair splitting in channel metadata truncation

by Clawborn open 2026-02-18 12:06
size: S trusted-contributor
`truncateText` in `channel-metadata.ts` uses `.slice()`, which can cut between a UTF-16 high and low surrogate, producing invalid strings. This affects channel metadata for groups/topics with emoji in their names. The fix checks for a high surrogate at the cut boundary and steps back one position. Also adds unit tests for `buildUntrustedChannelMetadata`.

<h3>Greptile Summary</h3>

Fixes a bug where `truncateText` in `channel-metadata.ts` could split UTF-16 surrogate pairs when truncating channel metadata strings (e.g., emoji in group/topic names). The fix checks for a high surrogate at the cut boundary and steps back one position to keep the pair intact. Also adds a comprehensive test suite for `buildUntrustedChannelMetadata`.

- **Bug fix**: Prevents invalid strings from being produced when channel metadata with emoji is truncated at a surrogate pair boundary.
- **Tests**: New `channel-metadata.test.ts` covers empty entries, null/undefined handling, deduplication, whitespace normalization, surrogate pair safety, and custom `maxChars`.
- **Style note**: The codebase already has `sliceUtf16Safe`/`truncateUtf16Safe` in `src/utils.ts` that handle surrogate-pair-safe slicing. The inline fix duplicates that logic; consider reusing the existing utility for consistency.

<h3>Confidence Score: 4/5</h3>

- This PR is safe to merge: the fix is correct and well-tested, with only a minor style suggestion about reusing existing utilities.
- The surrogate pair fix is logically sound and addresses a real bug. New tests provide good coverage. The only note is a style concern about duplicating existing utility logic from `src/utils.ts`.
- No files require special attention.

<sub>Last reviewed commit: 13733db</sub>
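
For readers who want to see the boundary check concretely, here is a minimal sketch of the technique described above. It is not the code from this PR: the names `isHighSurrogate` and `truncateSurrogateSafe` are hypothetical, the real `truncateText` signature is not shown here, and any ellipsis or suffix handling it may do is left out.

```typescript
// Hypothetical helper, not taken from the PR: true if the UTF-16 code unit
// is a high (leading) surrogate, i.e. in the range U+D800..U+DBFF.
function isHighSurrogate(codeUnit: number): boolean {
  return codeUnit >= 0xd800 && codeUnit <= 0xdbff;
}

// Hypothetical stand-in for the fixed truncation logic. Truncates to at most
// `maxChars` UTF-16 code units without leaving a dangling high surrogate.
function truncateSurrogateSafe(value: string, maxChars: number): string {
  if (maxChars <= 0) return "";
  if (value.length <= maxChars) return value;

  let end = maxChars;
  // If the last code unit we would keep is a high surrogate, its low
  // surrogate falls outside the cut, so step back one position.
  if (isHighSurrogate(value.charCodeAt(end - 1))) {
    end -= 1;
  }
  return value.slice(0, end);
}

// "ab😀cd" is 6 code units long: 'a', 'b', 0xD83D, 0xDE00, 'c', 'd'.
const sample = "ab\u{1F600}cd";
console.log(truncateSurrogateSafe(sample, 3)); // cut would land mid-pair, so the emoji is dropped
console.log(truncateSurrogateSafe(sample, 4)); // cut lands after the pair, so the emoji is kept
```

With a plain `.slice(0, 3)` the same input would end in a lone `\uD83D`. As the review notes, the repository's existing `sliceUtf16Safe`/`truncateUtf16Safe` utilities may already cover this case, so a sketch like this mainly illustrates what such a helper has to guard against.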
