#8048: Media: add regression test for audio text blocks (#7970)
stale
Cluster: Media Handling Improvements
#### fixes https://github.com/openclaw/openclaw/issues/7970
### Problem
Audio files (e.g., OGG voice messages) can be incorrectly included in the message body as `<file mime="text/plain">` blocks when `looksLikeUtf8Text()` returns true for compressed audio. Some OGG files contain enough bytes in the printable ASCII range (32-126) to pass the >85% threshold, so binary content is sent to the model as text, which wastes tokens and can confuse the model.
The root cause was in `extractFileBlocks`: the check `if (!forcedTextMimeResolved && kind === "audio" && !textLike)` skipped audio only when it did not look like text, so audio files that "looked like" text fell through to file extraction, even though audio transcription handles audio separately.
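To make the failure mode concrete, here is a minimal sketch of a printable-ASCII ratio heuristic. It is not the repository's `looksLikeUtf8Text()`; the function names, the sample bytes, and the exact comparison are assumptions chosen only to mirror the 32-126 range and the >85% threshold described above.

```typescript
// Hypothetical stand-in for the heuristic described in this PR.
// The real looksLikeUtf8Text() may differ in detail.
function printableAsciiRatio(bytes: Uint8Array): number {
  if (bytes.length === 0) return 0;
  let printable = 0;
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    // Count bytes in the printable ASCII range 32-126.
    if (b >= 32 && b <= 126) printable++;
  }
  return printable / bytes.length;
}

// Assumed threshold of 85%, taken from the PR description.
function looksLikeTextSketch(bytes: Uint8Array, threshold = 0.85): boolean {
  return printableAsciiRatio(bytes) > threshold;
}

// Illustrative buffer: the "OggS" capture pattern plus two non-printable
// header bytes, followed by a CSV-like payload made of printable characters.
const header = [0x4f, 0x67, 0x67, 0x53, 0x00, 0x02];
const payload = Array.from("a,b,c;1,2,3;4,5,6;7,8,9;x,y,z;", (c) => c.charCodeAt(0));
const oggLike = new Uint8Array([...header, ...payload]);

// Clearly binary bytes for comparison.
const binaryLike = new Uint8Array([0, 1, 2, 3, 200, 255, 65, 66]);

const oggPasses = looksLikeTextSketch(oggLike);
const binaryPasses = looksLikeTextSketch(binaryLike);
```

In this sketch the OGG-like buffer passes the text check even though it starts with an audio container signature, which is why a byte-ratio heuristic alone cannot decide whether an attachment is text.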
### Solution
Added a regression test that ensures audio files are never treated as text file blocks, regardless of their binary content. The test verifies that an OGG file with CSV-like bytes (which would pass `looksLikeUtf8Text()`) is skipped and does not produce a `<file>` block.
Note: The production code in `apply.ts` already implements the correct behavior (skipping audio/image/video unless explicitly forced to text by filename/path). This PR adds test coverage to prevent regressions.
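The difference between the condition quoted in the Problem section and the behavior described in the note can be sketched as two predicates. This is an illustration, not the code in `apply.ts`: the `Candidate` shape, the `"document"` kind, the boolean type of `forcedTextMimeResolved`, and both function names are assumptions.

```typescript
// Hypothetical types for illustration; the real types in apply.ts may differ.
type MediaKind = "audio" | "image" | "video" | "document";

interface Candidate {
  kind: MediaKind;
  // True when the filename/path explicitly forces a text MIME type (assumed shape).
  forcedTextMimeResolved: boolean;
  // Result of the text-likeness heuristic on the file bytes.
  textLike: boolean;
}

// Condition quoted in the Problem section: audio is skipped only when
// its bytes do NOT look like text.
function skippedByQuotedCheck(c: Candidate): boolean {
  return !c.forcedTextMimeResolved && c.kind === "audio" && !c.textLike;
}

// Behavior described in the note: audio/image/video are skipped unless
// explicitly forced to text, regardless of what the bytes look like.
function skippedByIntendedRule(c: Candidate): boolean {
  const isBinaryMedia = c.kind === "audio" || c.kind === "image" || c.kind === "video";
  return !c.forcedTextMimeResolved && isBinaryMedia;
}

// The scenario from issue #7970: an OGG file whose bytes look like text.
const textLikeOgg: Candidate = { kind: "audio", forcedTextMimeResolved: false, textLike: true };

const quotedResult = skippedByQuotedCheck(textLikeOgg);
const intendedResult = skippedByIntendedRule(textLikeOgg);
```

With the quoted condition, the text-like OGG is not skipped and would reach file extraction; with the intended rule it is skipped. The regression test pins the second outcome.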
### File Changes
- `src/media-understanding/apply.test.ts`: Added the test case "never treats text-like audio as a file block when audio understanding is disabled", which exercises the exact scenario from issue #7970 and asserts that audio files are never included as text blocks.
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
Adds a regression test in `src/media-understanding/apply.test.ts` to ensure audio attachments (e.g., OGG) are never emitted as `<file mime="text/plain">` blocks even if their bytes “look like” UTF-8 text, specifically when audio understanding is disabled (issue #7970). This strengthens the media-understanding pipeline by locking in the intended behavior: binary media (audio/image/video) should be skipped from file-block extraction unless explicitly forced to text by name/path heuristics.
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge; it only adds a targeted regression test.
- The change is isolated to a new test case and matches existing behavior expectations; the only notable issue is the existing temp directory cleanup pattern in the tests, which can leave artifacts behind over time but doesn’t affect production code.
- src/media-understanding/apply.test.ts (temp dir cleanup pattern)
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#5401: fix(media-understanding): detect audio binary by magic bytes to pre...
by RiadJamal07 · 2026-01-31
84.2%
#4235: fix(media): skip audio in extractFileBlocks + hasBinaryAudioMagic d...
by null-runner · 2026-01-29
83.3%
#8388: fix(media): auto-skip tiny/empty audio files before transcription (...
by Glucksberg · 2026-02-04
80.4%
#11443: LINE: fix buffer guards in detectContentType + add tests
by MdRahmatUllah · 2026-02-07
78.0%
#14794: fix: parse inline MEDIA: tokens in agent replies
by explainanalyze · 2026-02-12
77.6%
#19868: fix: prevent media token regex from matching markdown bold text
by sanketgautam · 2026-02-18
76.6%
#7454: fix: skip UTF-16 heuristic for audio/video/image MIME types (#7444)
by gavinbmoore · 2026-02-02
76.4%
#18811: fix(media): require file extension for ambiguous MEDIA: path detection
by aldoeliacim · 2026-02-17
76.2%
#11160: Media: add missing audio MIME-to-extension mappings (aac, flac, opu...
by lailoo · 2026-02-07
75.7%
#12717: fix: add "audio" to openai provider capabilities
by openjay · 2026-02-09
75.4%