#18852: fix: Voice-call state persistence is fire-and-forget, causing silent rec
channel: voice-call
size: L
trusted-contributor
Cluster:
Voice Call Enhancements and Fixes
## Fix Summary
Voice-call state writes are performed as non-awaited `appendFile` calls and failures are only logged. This can silently drop call state transitions and processed-event IDs used for crash recovery, leading to replayed events, stale active calls after restart, and incorrect call lifecycle behavior.
## Issue Linkage
Fixes #18851
## Security Snapshot
- CVSS v3.1: 8.1 (High)
- CVSS v4.0: 7.2 (High)
## Implementation Details
### Files Changed
- `extensions/voice-call/src/manager/events.test.ts` (+70/-1)
- `extensions/voice-call/src/manager/events.ts` (+72/-15)
- `extensions/voice-call/src/manager/outbound.test.ts` (+116/-0)
- `extensions/voice-call/src/manager/outbound.ts` (+68/-6)
- `extensions/voice-call/src/manager/store.test.ts` (+44/-0)
- `extensions/voice-call/src/manager/store.ts` (+1/-4)
- `extensions/voice-call/src/manager/timers.test.ts` (+121/-0)
- `extensions/voice-call/src/manager/timers.ts` (+16/-2)
- `extensions/voice-call/src/webhook.test.ts` (+251/-0)
- `extensions/voice-call/src/webhook.ts` (+5/-2)
### Technical Analysis
Root cause: Voice-call state writes are performed as non-awaited `appendFile` calls and failures are only logged. The vulnerable behavior originates in `extensions/voice-call/src/manager/store.ts:10`. When the documented execution path is triggered, security controls fail to enforce the expected boundary. Fix approach: enforce secure defaults on this path, validate/guard untrusted inputs at the boundary, and add regression tests that cover the failing scenario.
## Validation Evidence
- Command: `pnpm build && pnpm check && pnpm test -- extensions/voice-call/src/manager/events.test.ts extensions/voice-call/src/manager/outbound.test.ts extensions/voice-call/src/manager/timers.test.ts extensions/voice-call/src/webhook.test.ts`
- Status: passed
## Risk and Compatibility
non-breaking; no known regression impact
## AI-Assisted Disclosure
- AI-assisted: yes
- Model: GPT-5.3-Codex
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR fixes a data durability bug in the voice-call extension where `persistCallRecord` was a fire-and-forget async write (`fsp.appendFile` with `.catch` logging). Failures were silently swallowed, which could drop call state transitions and processed-event IDs used for crash recovery, leading to replayed events, stale active calls after restart, and incorrect call lifecycle behavior.
**Key changes:**
- `store.ts`: Replaced async `fsp.appendFile(...).catch(...)` with synchronous `fs.appendFileSync(..., { flush: true })` so persistence errors propagate to callers
- `events.ts`: Added transactional rollback in `processEvent` — snapshots call state before mutation, rolls back in-memory maps (`activeCalls`, `providerCallIdMap`) on persist failure, and defers side effects (timers, answered hook, dedup marking) until after successful persistence
- `outbound.ts`: Wrapped all `persistCallRecord` calls in try/catch with snapshot-restore rollback via `cloneCallRecord`/`restoreCallRecord`; persist-before-activate pattern in `initiateCall` prevents orphaned in-memory state
- `timers.ts`: Timer callback now catches both persistence and timeout handler errors independently; persist failure is gracefully logged and the timeout handler still executes
- `webhook.ts`: Event processing errors now propagate (re-thrown) instead of being swallowed, surfacing 500s to let providers retry webhook delivery
- New test files (`store.test.ts`, `outbound.test.ts`, `timers.test.ts`, `webhook.test.ts`) and additions to `events.test.ts` provide thorough regression coverage for rollback and retry behavior
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge — it fixes a genuine data durability issue with well-structured rollback logic and comprehensive test coverage.
- Score of 4 reflects a solid fix with thorough tests. The core change (sync persistence + error propagation + rollback) is well-implemented. Minor style concerns (variable shadowing in events.ts, duplicated cloneCallRecord helper) are not functional issues. The switch from async to sync I/O is an intentional trade-off for durability that is appropriate for the use case.
- `extensions/voice-call/src/manager/events.ts` has a `previousProviderCallId` variable shadowing issue that could cause confusion during future maintenance.
<sub>Last reviewed commit: dfa7108</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#9760: fix(voice-call): discard stale calls on plugin restart
by leszekszpunar · 2026-02-05
80.4%
#19103: fix(voice-call): replace console.log with runtime logging
by Clawborn · 2026-02-17
77.7%
#4325: fix(voice-call): verify call status with provider before loading st...
by garnetlyx · 2026-01-30
77.5%
#7652: fix(voice-call): fix Telnyx transcription (STT) not working
by tturnerdev · 2026-02-03
76.6%
#6702: fix(voice-call): mark calls as ended when media stream disconnects
by johngnip · 2026-02-01
75.9%
#7704: fix(voice-call): add authentication to WebSocket media stream endpoint
by coygeek · 2026-02-03
75.7%
#21566: feat(voice-call): bridge call transcripts to main agent session
by MegaPhoenix92 · 2026-02-20
75.7%
#21101: Security/Voice Call: gate skipSignatureVerification to local-dev or...
by bmendonca3 · 2026-02-19
74.8%
#20770: fix: prevent stale session-entry overwrite during reset-model persi...
by coygeek · 2026-02-19
74.5%
#21197: Security/Voice Call: enforce exact webhook path matching
by bmendonca3 · 2026-02-19
74.4%