
#4325: fix(voice-call): verify call status with provider before loading stale calls

by garnetlyx open 2026-01-30 02:44
channel: voice-call
## Problem

When the gateway restarts, `loadActiveCalls()` reloads non-terminal calls from `calls.jsonl`. However, these calls may have already ended (e.g., Twilio timed them out, or the webhook couldn't reach the local URL) and are now stale. This causes the concurrent call limit to be reached with phantom calls.

## Solution

- Add `getCallStatus()` method to `VoiceCallProvider` interface
- Implement `getCallStatus()` for all providers (Twilio, Plivo, Telnyx, Mock)
- On load, verify each non-terminal call with the provider before adding to `activeCalls`
- Skip calls that the provider reports as terminal (completed, failed, etc.)
- Also skip calls older than `maxDurationSeconds` as a fallback

This is an improvement over PR #2810, which only uses time-based cleanup. By querying the provider, we can accurately determine whether a call is still active.

## Testing

1. Initiate a call
2. Kill the gateway before the call ends (or let Twilio time out without a webhook)
3. Restart the gateway
4. Verify that stale calls are logged as skipped (provider status check)
5. Verify that new calls can be initiated

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds a `getCallStatus()` method to the `VoiceCallProvider` interface and implements it across providers (Twilio/Plivo/Telnyx/Mock). On gateway startup, `CallManager.initialize()` now restores persisted non-terminal calls from `calls.jsonl` only after verifying with the provider that each call is still active, plus a time-based fallback using `maxDurationSeconds`. The runtime and tests were updated to await the async initialization.

Overall, this fits well into the existing crash-recovery flow (`calls.jsonl` restore) and directly addresses phantom calls consuming the concurrency limit after a restart.

<h3>Confidence Score: 3/5</h3>

- Mostly safe to merge, but there are a couple of edge cases that can cause incorrect call restoration behavior after restart.
- Core change (provider-side verification before restoring persisted calls) is sound and reduces phantom-call risk, but error handling currently treats many provider lookup failures/unknowns as terminal, which can drop still-active calls, and restored calls may miss max-duration hangup timers, allowing long-lived `activeCalls` after restart.
- extensions/voice-call/src/manager.ts, extensions/voice-call/src/providers/{twilio,plivo,telnyx}.ts

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
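For readers following the review, here is a minimal sketch of the restore-time check described above. It is not the PR's code. The type names (`PersistedCall`, `StatusProvider`, `ProviderCallStatus`), the `getCallStatus(providerCallId)` signature, and the helpers `decideRestore` and `restoreActiveCalls` are all assumptions made for illustration. It also deliberately differs from the behavior the review describes: a failed provider lookup is mapped to `"unknown"` rather than treated as terminal, so only the age fallback can drop such a call.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
type ProviderCallStatus = "active" | "terminal" | "unknown";

interface PersistedCall {
  callId: string;
  providerCallId: string;
  startedAt: number; // epoch milliseconds
}

interface StatusProvider {
  // Assumed signature; the PR description only names the method getCallStatus().
  getCallStatus(providerCallId: string): Promise<ProviderCallStatus>;
}

type RestoreDecision = "restore" | "skip-terminal" | "skip-expired";

// Pure decision step: the provider's verdict comes first, the age fallback second.
function decideRestore(
  call: PersistedCall,
  status: ProviderCallStatus,
  nowMs: number,
  maxDurationSeconds: number,
): RestoreDecision {
  if (status === "terminal") return "skip-terminal";
  const ageSeconds = (nowMs - call.startedAt) / 1000;
  if (ageSeconds > maxDurationSeconds) return "skip-expired";
  return "restore"; // "active", or "unknown" but still within the max duration
}

// Checks every persisted call and returns only those worth restoring.
async function restoreActiveCalls(
  persisted: PersistedCall[],
  provider: StatusProvider,
  maxDurationSeconds: number,
  nowMs: number = Date.now(),
): Promise<Map<string, PersistedCall>> {
  const activeCalls = new Map<string, PersistedCall>();
  for (const call of persisted) {
    let status: ProviderCallStatus;
    try {
      status = await provider.getCallStatus(call.providerCallId);
    } catch {
      // Assumption: a lookup failure is not proof that the call ended.
      status = "unknown";
    }
    const decision = decideRestore(call, status, nowMs, maxDurationSeconds);
    if (decision === "restore") {
      activeCalls.set(call.callId, call);
    } else {
      console.log(`skipping stale call ${call.callId}: ${decision}`);
    }
  }
  return activeCalls;
}
```

Keeping the decision in a pure function makes the provider-first, age-second ordering easy to unit test without network access. A restored call would still need its max-duration hangup timer re-armed, which this sketch does not attempt.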
