
#5588: fix(media): skip binary audio in file extraction to prevent false UTF-16 detection

by NSEvent open 2026-01-31 17:46
## Summary

Audio files should be handled by the transcription pipeline, not file extraction. This prevents binary audio data from being falsely detected as UTF-16 text and injected as garbage into the context window.

**Simplified fix:** Skip audio files unconditionally in `extractFileBlocks()` (unless they have a text extension like `.txt`). The complex BOM/heuristic detection is removed in favor of a simple `kind === "audio"` check.

Closes #5552
Closes #5590

## Test plan

- [x] Added tests for binary OGG/Opus files that would trigger false UTF-16 detection
- [x] Updated CSV/TSV tests to use appropriate file extensions
- [x] All 18 tests pass
- [x] Lint and build pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)
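
For readers who have not opened the diff, the skip rule from the summary can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the `Attachment` type, the `TEXT_EXTENSIONS` set, and the helpers `hasTextExtension` and `shouldSkipExtraction` are hypothetical names. Only the `kind === "audio"` check and the `.txt` exception come from the description above.

```typescript
// Hypothetical shape of an attachment; the real type in the codebase may differ.
type MediaKind = "audio" | "image" | "video" | "document" | "unknown";

interface Attachment {
  name: string;
  kind: MediaKind;
}

// Assumed list of text extensions. The PR description only names `.txt`
// as an example; the other entries are placeholders.
const TEXT_EXTENSIONS = new Set([".txt", ".csv", ".tsv", ".md", ".json"]);

// Returns true when the file name ends in one of the assumed text extensions.
function hasTextExtension(name: string): boolean {
  const dot = name.lastIndexOf(".");
  if (dot < 0) return false;
  return TEXT_EXTENSIONS.has(name.slice(dot).toLowerCase());
}

// Returns true when file extraction should leave the attachment to the
// transcription pipeline: it is audio and has no text extension.
function shouldSkipExtraction(file: Attachment): boolean {
  return file.kind === "audio" && !hasTextExtension(file.name);
}

// Usage: a binary voice note is skipped, a text file is not.
const voiceNote: Attachment = { name: "voice.ogg", kind: "audio" };
const notes: Attachment = { name: "notes.txt", kind: "audio" };
console.log(shouldSkipExtraction(voiceNote), shouldSkipExtraction(notes));
```

Keying the decision on the declared media kind rather than on byte-level inspection avoids the failure mode described in the summary, where binary audio bytes happen to look like UTF-16 text.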
