#23557: fix: catch session JSONL write failures instead of crashing
agents
size: M
Cluster: Session Lock Improvements
Closes #23556
## Summary
- Introduces a `safeAppend` helper in `installSessionToolResultGuard` that wraps `originalAppend` in a try/catch
- Replaces all three bare `originalAppend` call sites with `safeAppend`
- On write failure, logs `[session-guard] session write failed (<reason>); message dropped (file: <path>)` to stderr and returns gracefully instead of crashing
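
The wrapper described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `SessionMessage` and `AppendFn` types, the `makeSafeAppend` factory, and the injectable `log` parameter are assumptions made so the example is self-contained. Only the names `safeAppend` and `originalAppend` and the log format come from the description.

```typescript
// Hypothetical shapes: the real session manager types are not shown in this PR description.
type SessionMessage = { role: string; content: string };
type AppendFn = (message: SessionMessage) => string | undefined;

// Builds a safeAppend-style wrapper around originalAppend.
// The log sink is injectable here only to make the sketch easy to exercise;
// it defaults to stderr, matching the behavior described in the summary.
function makeSafeAppend(
  originalAppend: AppendFn,
  sessionFile: string,
  log: (line: string) => void = (line) => console.error(line),
): AppendFn {
  return (message) => {
    try {
      return originalAppend(message);
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      log(`[session-guard] session write failed (${reason}); message dropped (file: ${sessionFile})`);
      // The entry is dropped; the caller keeps running.
      return undefined;
    }
  };
}

// Usage: a write that fails the way a transient EACCES would.
const logged: string[] = [];
const failing: AppendFn = () => {
  throw new Error("EACCES: permission denied");
};
const safeAppend = makeSafeAppend(failing, "/tmp/session.jsonl", (line) => logged.push(line));
const result = safeAppend({ role: "assistant", content: "hello" });
```

In this sketch the failed write leaves `result` as `undefined` and records one log line, so the caller never sees the exception.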
## Motivation
Under Colima with virtiofs mounts, the session JSONL file can transiently become unwritable (`EACCES`). The existing code lets the thrown error propagate uncaught, which crashes the gateway with `exit(1)`. That is a disproportionate response to a non-fatal I/O condition: the active conversation can continue even if one history entry is lost.
## Test plan
- [ ] New e2e tests in `session-tool-result-guard.e2e.test.ts` cover all three `safeAppend` call paths (flush, persisted, finalMessage) with a mock `originalAppend` that throws
- [ ] Verify gateway stays running when `originalAppend` throws — error is logged, not propagated
- [ ] Existing tests pass unchanged
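
The shape of such a test can be sketched without a test framework. Everything here is a hypothetical stand-in: `Entry`, `createThrowingAppend`, and `runPath` are invented for illustration, and the real tests in `session-tool-result-guard.e2e.test.ts` presumably drive `installSessionToolResultGuard` itself rather than a stub. The point is only the pattern: a mock append that always throws, run once per call path, with a check that nothing propagates.

```typescript
// Hypothetical entry shape; the three kinds mirror the call paths named in the test plan.
type Entry = { kind: "flush" | "persisted" | "finalMessage"; text: string };

// Mock originalAppend that always throws and counts how often it was called.
function createThrowingAppend(code: string) {
  let calls = 0;
  const append = (_entry: Entry): void => {
    calls += 1;
    const err = new Error(`${code}: permission denied`) as Error & { code?: string };
    err.code = code;
    throw err;
  };
  return { append, callCount: () => calls };
}

// Stand-in for one guarded call path: the write may fail, the caller must not.
function runPath(kind: Entry["kind"], append: (e: Entry) => void, errors: string[]): boolean {
  try {
    append({ kind, text: "example" });
    return true;
  } catch (err) {
    errors.push(`${kind}: ${err instanceof Error ? err.message : String(err)}`);
    return false;
  }
}

const mock = createThrowingAppend("EACCES");
const errors: string[] = [];
const outcomes = (["flush", "persisted", "finalMessage"] as const).map((kind) =>
  runPath(kind, mock.append, errors),
);
```

Reaching the end of the script without an uncaught exception is the core check; `outcomes`, `errors`, and `mock.callCount()` let a test confirm that each path attempted exactly one write and recorded exactly one failure.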
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Wraps session JSONL writes in a try/catch via a `safeAppend` helper to prevent gateway crashes from transient I/O errors (such as `EACCES` under Colima virtiofs). All three `originalAppend` call sites now use `safeAppend`, which logs write failures to stderr and returns gracefully instead of propagating the exception.
- Main change: introduces `safeAppend` wrapper (lines 153-168) that catches exceptions from `originalAppend`
- Error handling: logs `[session-guard] session write failed (<reason>); message dropped (file: <path>)` and returns `undefined`
- Test coverage: new tests verify all three call paths (flush, tool result, final message) with mocked write failures
- Impact: conversation continues even if session history entry is lost due to I/O error
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge with one minor concern about type safety
- The implementation correctly wraps all append calls with error handling and has comprehensive test coverage. The approach is sound: continuing the conversation when a single history write fails is more graceful than crashing. However, the type-unsafe cast (`undefined as unknown as ReturnType<typeof originalAppend>`) could cause issues if callers depend on the return value, although the existing code paths show that `undefined` is already a valid return value in some scenarios.
- No files require special attention - the implementation is straightforward and well-tested
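
To make the type-safety concern concrete, the sketch below contrasts the cast with a widened return type. It assumes a hypothetical `AppendFn` that returns a `string`; the real signature of `originalAppend` is not shown in this PR, and `wrapWithCast` and `wrapWidened` are illustrative names only.

```typescript
// Hypothetical signature: the real return type of originalAppend is not shown in this PR.
type AppendFn = (entry: string) => string;

// Variant A: the cast hides the failure case from the type checker.
function wrapWithCast(originalAppend: AppendFn): AppendFn {
  return (entry) => {
    try {
      return originalAppend(entry);
    } catch {
      return undefined as unknown as ReturnType<AppendFn>;
    }
  };
}

// Variant B: the widened return type forces callers to handle a dropped write.
function wrapWidened(originalAppend: AppendFn): (entry: string) => string | undefined {
  return (entry) => {
    try {
      return originalAppend(entry);
    } catch {
      return undefined;
    }
  };
}

const alwaysFails: AppendFn = () => {
  throw new Error("EACCES");
};

// Typed as string, but undefined at runtime: calling castId.length would throw.
const castId = wrapWithCast(alwaysFails)("entry");
// Typed as string | undefined, so the compiler requires the check below.
const widenedId = wrapWidened(alwaysFails)("entry");
const widenedLength = widenedId === undefined ? 0 : widenedId.length;
```

Whether widening is practical depends on how many callers consume the return value. If none do, the cast is harmless in practice, which matches the reviewer's "minor concern" rating.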
<sub>Last reviewed commit: 7b67c26</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #23583: fix(agents): catch session JSONL write failures instead of crashing (by mohandshamada · 2026-02-22 · 93.4%)
- #21828: fix: acquire session write lock in delivery mirror and gateway chat... (by inkolin · 2026-02-20 · 77.1%)
- #12296: security: persistence-only secret redaction for session transcripts (by akoscz · 2026-02-09 · 76.7%)
- #15649: fix: sanitize tool_use IDs on session write path (by aldoeliacim · 2026-02-13 · 76.0%)
- #15050: fix: transcript corruption resilience — strip aborted tool_use bloc... (by yashchitneni · 2026-02-12 · 76.0%)
- #12260: fix: redact secrets in tool results before persisting to session tr... (by Yida-Dev · 2026-02-09 · 75.8%)
- #10915: fix: prevent session bloat from oversized tool results and improve ... (by DukeDeSouth · 2026-02-07 · 75.5%)
- #9011: fix(session): auto-recovery for corrupted tool responses [AI-assisted] (by cheenu1092-oss · 2026-02-04 · 75.3%)
- #3647: fix: sanitize tool arguments in session history (by nhangen · 2026-01-29 · 75.2%)
- #16061: fix(sessions): tolerate invalid sessionFile metadata (by haoyifan · 2026-02-14 · 75.0%)