#9026: fix(session-memory): sanitize content to prevent binary data in memory files
stale
Cluster: Error Handling in Agent Tools
## Problem
When session content contained embedded file attachments (audio, images) via `<file>` tags, the binary data was written directly to memory files, causing:
1. **Massive file sizes** (500KB+ for a simple session)
2. **Context overflow** when the files were later read
3. **Corrupted memory files** with invalid UTF-8
### Root Cause
The `session-memory` hook extracts conversation content from session JSONL files and writes it to `memory/*.md` files. When voice messages or images were processed, their binary data was embedded in `<file>` tags and passed through unfiltered.
## Solution
Added a `sanitizeForMemory()` function that strips:
- `<file>...</file>` tags (embedded audio/image binary data)
- Base64 image data URIs (`data:image/...`)
- Long base64-like sequences (>500 chars)
- Control characters (except newline/tab)
The sanitization runs **before** writing to memory files, preserving readable conversation text while removing binary blobs.
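The four stripping rules above could be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: the regular expressions, the constant names, and the `[... removed]` placeholder strings are all assumptions; only the function name and the four rule categories come from the description.

```typescript
// Hypothetical patterns; the PR's real expressions may differ.
const FILE_TAG = /<file\b[^>]*>[\s\S]*?<\/file>/gi;
const IMAGE_DATA_URI = /data:image\/[a-zA-Z0-9.+-]+;base64,[A-Za-z0-9+/=]+/g;
// Runs of more than 500 base64-alphabet characters.
const LONG_BASE64 = /[A-Za-z0-9+/=]{501,}/g;
// Control characters except tab (\x09) and newline (\x0A).
const CONTROL_CHARS = /[\x00-\x08\x0B-\x1F\x7F]/g;

function sanitizeForMemory(content: string): string {
  return content
    .replace(FILE_TAG, "[file removed]") // assumed placeholder text
    .replace(IMAGE_DATA_URI, "[image removed]") // assumed placeholder text
    .replace(LONG_BASE64, "[binary data removed]") // assumed placeholder text
    .replace(CONTROL_CHARS, "");
}
```

The order matters in this sketch: `<file>` blocks and data URIs are removed first so that their payloads are replaced by a specific placeholder rather than being caught by the generic long-base64 rule.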
## Testing
- Added 8 new unit tests for `sanitizeForMemory()`
- All 17 tests in `handler.test.ts` pass
## Related
Related to #3160 (`sessions_list` image stripping). This PR fixes the same class of bug in a different code path:
- **#3160**: `sessions_list` tool output → fixed with `stripImageData()`
- **This PR**: `session-memory` hook → fixed with `sanitizeForMemory()`
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds a `sanitizeForMemory()` step to the `session-memory` hook to strip embedded `<file>...</file>` blobs, base64 image data URIs, long base64-like sequences, and control characters before writing session content into `memory/*.md`. It also adds unit tests covering these sanitization behaviors.
This fits into the existing `session-memory` flow by sanitizing the extracted recent user/assistant message text right after `getRecentSessionContent()` and before slug generation + memory file write, preventing large/corrupted memory files when sessions contain attachment payloads.
<h3>Confidence Score: 4/5</h3>
- Mostly safe to merge, but needs a small type/contract fix in sanitizeForMemory.
- Core sanitization logic is localized and covered by new tests, but the current implementation/test suite codifies returning nullish values despite a `string` return type, which can introduce type-unsound behavior for future callers.
- src/hooks/bundled/session-memory/handler.ts; src/hooks/bundled/session-memory/handler.test.ts
<!-- greptile_other_comments_section -->
**Context used:**
- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))
- Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13))
<!-- /greptile_comment -->
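The type/contract concern raised in the review (nullish values returned from a function declared to return `string`) could be addressed in more than one way. One possible approach, sketched here under assumptions, is to widen the accepted input type and normalize nullish input to an empty string. The `withNullishGuard` helper and the stand-in sanitizer below are hypothetical and do not appear in the PR.

```typescript
type Sanitizer = (content: string) => string;

// Hypothetical helper: wraps a string-only sanitizer so that callers can pass
// null or undefined and still always receive a string back.
function withNullishGuard(
  sanitize: Sanitizer,
): (content: string | null | undefined) => string {
  return (content) => (content == null ? "" : sanitize(content));
}

// Stand-in for the real sanitizer; it only strips control characters
// (except tab and newline) to keep the example short.
const stripControlChars: Sanitizer = (content) =>
  content.replace(/[\x00-\x08\x0B-\x1F\x7F]/g, "");

const safeSanitize = withNullishGuard(stripControlChars);
```

With this shape, the declared return type stays `string` and the nullish cases are handled in one place. The alternative is to keep the nullish passthrough and declare the return type as `string | null | undefined`, which pushes the check onto every caller.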
Most Similar PRs
- #10591: feat(hooks): add session-start-memory bundled hook (by morningstar-daemon · 2026-02-06 · 80.1%)
- #14243: fix: fire session-memory hook on auto-resets + topic-aware memory p... (by TheDude135 · 2026-02-11 · 79.8%)
- #18103: fix: session-memory hook reads reset file after /new or /reset (by MisterGuy420 · 2026-02-16 · 78.0%)
- #10786: fix: strip thinking signatures from sessions_list and add includeTh... (by 1kuna · 2026-02-07 · 77.9%)
- #3647: fix: sanitize tool arguments in session history (by nhangen · 2026-01-29 · 77.7%)
- #6858: feat(hooks): improve session-memory with LLM summarization (by alauppe · 2026-02-02 · 77.4%)
- #3392: fix(hooks): remove debug console.log statements from session-memory... (by WinJayX · 2026-01-28 · 77.1%)
- #14576: Fix/memory loss bugs (by ENCHIGO · 2026-02-12 · 77.0%)
- #8172: fix(sessions_list): strip base64 image data to prevent context over... (by Flamrru · 2026-02-03 · 76.8%)
- #20526: fix(session-memory): search state-dir sessions path when workspace ... (by 7Sageer · 2026-02-19 · 75.6%)