#5401: fix(media-understanding): detect audio binary by magic bytes to prevent context injection
Cluster: Media Handling Improvements
## Summary
- Adds magic byte detection for audio files (MP3, WAV, OGG, FLAC, M4A, AAC, AIFF, WMA) to prevent malicious text content from being injected into LLM context
- Audio files are now validated by checking their binary signature before processing, similar to existing image validation
- Rejects files that carry a valid audio extension but actually contain executable prompts or malicious instructions, so attackers cannot use them to inject text into the LLM context
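
As a rough illustration of the magic-byte check described above, here is a minimal sketch. The helper names (`asciiBytes`, `matchesAt`, `hasAudioMagic`) and the short signature list are assumptions made for this example, not the code added in this PR, and only a subset of the listed formats is covered.

```typescript
// Illustrative sketch only: names and the signature list are assumptions,
// not the implementation in src/media-understanding/apply.ts.

/** Convert an ASCII string to its byte values. */
function asciiBytes(text: string): number[] {
  return text.split("").map((ch) => ch.charCodeAt(0));
}

/** True when `buf` contains `bytes` starting at `offset`. */
function matchesAt(buf: Uint8Array, offset: number, bytes: number[]): boolean {
  if (buf.length < offset + bytes.length) return false;
  return bytes.every((b, i) => buf[offset + i] === b);
}

/** Best-effort check for a few well-known audio signatures. */
function hasAudioMagic(buf: Uint8Array): boolean {
  if (matchesAt(buf, 0, asciiBytes("OggS"))) return true; // Ogg container
  if (matchesAt(buf, 0, asciiBytes("fLaC"))) return true; // FLAC stream
  if (matchesAt(buf, 0, asciiBytes("ID3"))) return true; // ID3v2 tag, usually MP3
  // WAV: a RIFF container whose form type at offset 8 is "WAVE".
  if (matchesAt(buf, 0, asciiBytes("RIFF")) && matchesAt(buf, 8, asciiBytes("WAVE"))) {
    return true;
  }
  // MPEG audio frame sync: the first 11 bits of the frame header are set.
  const second = buf[1] ?? 0;
  if (buf.length >= 2 && buf[0] === 0xff && (second & 0xe0) === 0xe0) return true;
  return false;
}
```

Because Node's `Buffer` is a `Uint8Array`, a helper shaped like this could be called directly on an attachment buffer. Several of these signatures are printable ASCII, so a check of this kind is a first filter and not proof that the rest of the file is audio.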
## Test plan
- [x] Unit tests added for all supported audio formats
- [x] Tests verify that valid audio files pass validation
- [x] Tests verify that fake audio files (text content with audio extension) are rejected
- [x] All existing tests pass (4905 tests)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds an audio “magic bytes” detector in `src/media-understanding/apply.ts` and uses it during file-attachment processing to try to prevent text content disguised as audio from being extracted into the LLM context.
The new logic introduces a set of audio-like extensions and a small list of binary signatures (OGG/ID3/MP3 frame sync/RIFF/FLAC/EBML). During `extractFileBlocks`, it checks the attachment’s extension and buffer and conditionally skips file extraction.
<h3>Confidence Score: 2/5</h3>
- This PR has a likely logic inversion that can undermine the intended security fix.
- While the change is localized, the new early-`continue` triggers when audio magic bytes are present, which appears opposite to the stated intent (reject text masquerading as audio). If merged as-is, it may still allow prompt injection via “.mp3”/etc files containing text, and may also skip legitimate file attachments. Additionally, some signatures (RIFF/EBML) are broad and can misclassify non-audio containers.
- File needing attention: `src/media-understanding/apply.ts` (audio magic-byte gating logic and signature strictness)
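
One way to read the concern above is that the decision to skip text extraction should depend on the extension, with the magic-byte result used only to tell genuine audio from spoofed audio. The sketch below illustrates that reading. All names (`AUDIO_EXTENSIONS`, `classifyAttachment`, `shouldExtractText`) are hypothetical, the extension list is taken from the PR summary, and the magic-byte check is passed in as a parameter so no particular detector is assumed.

```typescript
// Hypothetical sketch of the gating decision; not the PR's actual code.
const AUDIO_EXTENSIONS = new Set([
  ".mp3", ".wav", ".ogg", ".flac", ".m4a", ".aac", ".aiff", ".wma",
]);

type AudioVerdict = "not-audio" | "genuine-audio" | "spoofed-audio";

/** Classify an attachment by extension first, then by its leading bytes. */
function classifyAttachment(
  fileName: string,
  buf: Uint8Array,
  looksLikeAudio: (bytes: Uint8Array) => boolean,
): AudioVerdict {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot).toLowerCase() : "";
  if (!AUDIO_EXTENSIONS.has(ext)) return "not-audio";
  // Audio extension without an audio signature: likely text masquerading as audio.
  return looksLikeAudio(buf) ? "genuine-audio" : "spoofed-audio";
}

/** Only attachments that do not claim to be audio proceed to text extraction. */
function shouldExtractText(verdict: AudioVerdict): boolean {
  return verdict === "not-audio";
}
```

Under this reading, a `.mp3` file containing plain text is classified as `spoofed-audio` and never reaches text extraction, while genuine audio is also skipped and left to the transcription path. Whether spoofed files should be silently skipped or reported is a separate choice that this sketch does not make.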
<!-- greptile_other_comments_section -->
**Context used:**
- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))
- Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13))
<!-- /greptile_comment -->
Most Similar PRs
- #4235: fix(media): skip audio in extractFileBlocks + hasBinaryAudioMagic d... (by null-runner · 2026-01-29 · 86.5% similar)
- #8048: Media: add regression test for audio text blocks (#7970) (by Abhishek-B-R · 2026-02-03 · 84.2% similar)
- #11443: LINE: fix buffer guards in detectContentType + add tests (by MdRahmatUllah · 2026-02-07 · 79.5% similar)
- #8388: fix(media): auto-skip tiny/empty audio files before transcription (... (by Glucksberg · 2026-02-04 · 78.6% similar)
- #11160: Media: add missing audio MIME-to-extension mappings (aac, flac, opu... (by lailoo · 2026-02-07 · 78.4% similar)
- #7454: fix: skip UTF-16 heuristic for audio/video/image MIME types (#7444) (by gavinbmoore · 2026-02-02 · 76.5% similar)
- #21110: fix(tts): deliver audio via structured mediaUrl instead of MEDIA: t... (by hydro13 · 2026-02-19 · 75.0% similar)
- #14794: fix: parse inline MEDIA: tokens in agent replies (by explainanalyze · 2026-02-12 · 74.9% similar)
- #19868: fix: prevent media token regex from matching markdown bold text (by sanketgautam · 2026-02-18 · 74.2% similar)
- #18811: fix(media): require file extension for ambiguous MEDIA: path detection (by aldoeliacim · 2026-02-17 · 73.8% similar)