#8388: fix(media): auto-skip tiny/empty audio files before transcription (#8127)
size: M
experienced-contributor
Cluster: Media Handling Improvements
Fixes #8127
When tiny or empty audio files are sent for transcription, the Whisper API returns unhelpful errors.
**Changes:**
- Added `MIN_AUDIO_FILE_BYTES = 1024` constant
- Added `"tooSmall"` to `MediaUnderstandingSkipReason`
- File size guard in both provider path (`runProviderEntry`) and CLI path (`runCliEntry`)
- Files below 1KB are skipped with a descriptive `MediaUnderstandingSkipError`
- 3 new tests (tiny file skip, empty file skip, valid file proceeds)
- Updated existing test fixtures to use buffers above the threshold
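A minimal sketch of the kind of size guard described above, not the PR's actual code. Only `MIN_AUDIO_FILE_BYTES`, the `"tooSmall"` reason, and the `MediaUnderstandingSkipError` name come from the description. The error's constructor shape and the `assertAudioLargeEnough` helper are assumptions made for illustration.

```typescript
// Threshold named in the PR description: files below 1 KB are skipped.
const MIN_AUDIO_FILE_BYTES = 1024;

// The real union presumably has more members; only "tooSmall" is named in the PR text.
type MediaUnderstandingSkipReason = "tooSmall";

// Assumed shape: the PR names this error class but does not show its constructor.
class MediaUnderstandingSkipError extends Error {
  readonly reason: MediaUnderstandingSkipReason;

  constructor(reason: MediaUnderstandingSkipReason, message: string) {
    super(message);
    this.name = "MediaUnderstandingSkipError";
    this.reason = reason;
  }
}

// Hypothetical helper: throws a skip error when the audio payload is below the threshold.
// A shared check like this could be called from both runProviderEntry and runCliEntry.
function assertAudioLargeEnough(sizeBytes: number): void {
  if (sizeBytes < MIN_AUDIO_FILE_BYTES) {
    throw new MediaUnderstandingSkipError(
      "tooSmall",
      `Audio file is ${sizeBytes} bytes; minimum is ${MIN_AUDIO_FILE_BYTES} bytes`,
    );
  }
}
```

Calling the helper with the byte length of the audio before any upload or CLI invocation keeps the check cheap: an empty buffer (`0` bytes) or a 512-byte clip throws with reason `"tooSmall"`, while a payload of exactly 1024 bytes passes.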
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
Adds a minimum-audio-size guard (`MIN_AUDIO_FILE_BYTES = 1024`) so tiny/empty audio attachments are skipped before being sent to transcription providers/CLI, returning a `MediaUnderstandingSkipError` with new reason `tooSmall`. Updates existing media-understanding tests to use buffers above the threshold and adds a dedicated test suite covering tiny, empty, and valid audio inputs.
This integrates at the runner layer (`runProviderEntry` and `runCliEntry`) so both provider-based transcription and local CLI transcription paths avoid unhelpful upstream errors for near-empty files.
<h3>Confidence Score: 4/5</h3>
- This PR is likely safe to merge; the core guard is simple and well-covered by tests.
- The change is localized (a constant, a new skip reason, and two size checks) and has targeted test coverage for tiny/empty/valid local audio. Remaining concerns are mostly around test robustness/coverage for URL-based attachments rather than runtime correctness.
- Files that merit a closer look: `src/media-understanding/runner.skip-tiny-audio.test.ts` (assertion robustness) and `src/media-understanding/apply.test.ts` (remote/URL tiny-audio coverage)
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #8048: Media: add regression test for audio text blocks (#7970) · by Abhishek-B-R · 2026-02-03 · 80.4%
- #12717: fix: add "audio" to openai provider capabilities · by openjay · 2026-02-09 · 79.5%
- #5401: fix(media-understanding): detect audio binary by magic bytes to pre... · by RiadJamal07 · 2026-01-31 · 78.6%
- #4235: fix(media): skip audio in extractFileBlocks + hasBinaryAudioMagic d... · by null-runner · 2026-01-29 · 77.7%
- #14794: fix: parse inline MEDIA: tokens in agent replies · by explainanalyze · 2026-02-12 · 76.1%
- #8014: fix(media-understanding): support legacy {file} placeholder in CLI ... · by Glucksberg · 2026-02-03 · 76.0%
- #11160: Media: add missing audio MIME-to-extension mappings (aac, flac, opu... · by lailoo · 2026-02-07 · 75.7%
- #7454: fix: skip UTF-16 heuristic for audio/video/image MIME types (#7444) · by gavinbmoore · 2026-02-02 · 75.0%
- #18811: fix(media): require file extension for ambiguous MEDIA: path detection · by aldoeliacim · 2026-02-17 · 74.7%
- #21110: fix(tts): deliver audio via structured mediaUrl instead of MEDIA: t... · by hydro13 · 2026-02-19 · 74.0%