#5499: fix(voice-call): wait for session creation before sending config update

by lailoo open 2026-01-31 15:22

channel: voice-call agents

Cluster: Voice Call and TTS Improvements

Closes #5243

The OpenAI Realtime API requires waiting for the `transcription_session.created` event before sending `transcription_session.update`. Previously, the update was sent immediately on WebSocket open, causing 'Missing required parameter: session' errors.

Changes:
- Wait for `transcription_session.created` before sending session config
- Track `sessionReady` state to prevent sending audio before session is configured
- Add `pendingSessionConfig` flag to handle the initialization sequence
- Remove `DefaultResourceLoader` usage (no longer exported from pi-coding-agent)

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR updates the OpenAI Realtime STT provider to follow the Realtime API’s init handshake by deferring `transcription_session.update` until after `transcription_session.created`, and gates audio sends until the session is configured. It also updates the embedded runner codepaths to stop constructing a `DefaultResourceLoader` (no longer exported) and instead pass `additionalExtensionPaths` and the computed `systemPrompt` directly into `createAgentSession`.

<h3>Confidence Score: 4/5</h3>

- Generally safe to merge; main remaining risk is an edge-case where the session may never become ready and audio is silently dropped.
- The changes are localized and align with the Realtime API’s required event ordering, and the embedded runner refactor matches the local type augmentation for `additionalExtensionPaths`. The primary concern is that readiness is now contingent on receiving `transcription_session.updated`; if that event is missed or fails, the session stays non-ready and audio is dropped without surfacing an error.
- extensions/voice-call/src/providers/stt-openai-realtime.ts
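For reviewers, the init handshake described in this PR can be sketched as a small state machine. This is a minimal illustration, not the code in `stt-openai-realtime.ts`: the class name `RealtimeSttHandshake`, the `SendFn` type, the injected `send` callback, and the `droppedAudioChunks` counter are all hypothetical. Only the event names and the `sessionReady` / `pendingSessionConfig` flags come from the description above. The audio message shape (`input_audio_buffer.append` with an `audio` field) is assumed from the Realtime API and may differ from what the provider actually sends.

```typescript
// Hypothetical sketch of the handshake; names other than the event types and
// the two flags are illustrative and not taken from the PR diff.
type SendFn = (message: Record<string, unknown>) => void;

class RealtimeSttHandshake {
  private readonly send: SendFn;
  private readonly sessionConfig: Record<string, unknown>;
  private pendingSessionConfig = false;
  private sessionReady = false;
  // Assumption: counting dropped chunks is one way to make the silent-drop
  // edge case from the review visible; the PR does not necessarily do this.
  droppedAudioChunks = 0;

  constructor(send: SendFn, sessionConfig: Record<string, unknown>) {
    this.send = send;
    this.sessionConfig = sessionConfig;
  }

  // Call when the WebSocket opens. Nothing is sent yet; we only record that
  // the session config still has to go out.
  onOpen(): void {
    this.pendingSessionConfig = true;
    this.sessionReady = false;
  }

  // Call for every text frame received from the server.
  onMessage(raw: string): void {
    const event = JSON.parse(raw) as { type?: string };
    if (event.type === "transcription_session.created" && this.pendingSessionConfig) {
      // The server has created the session, so the update is now accepted.
      this.pendingSessionConfig = false;
      this.send({ type: "transcription_session.update", session: this.sessionConfig });
    } else if (event.type === "transcription_session.updated") {
      // The config has been applied; audio may flow from here on.
      this.sessionReady = true;
    }
  }

  // Returns true if the chunk was forwarded, false if it was dropped because
  // the session is not configured yet.
  sendAudio(base64Audio: string): boolean {
    if (!this.sessionReady) {
      this.droppedAudioChunks += 1;
      return false;
    }
    this.send({ type: "input_audio_buffer.append", audio: base64Audio });
    return true;
  }
}
```

In this sketch, audio is dropped both before `transcription_session.created` and in the window between sending the update and receiving `transcription_session.updated`. If `updated` never arrives, `sendAudio` keeps returning `false`, which is the edge case flagged in the confidence notes; a caller could watch `droppedAudioChunks` or add a timeout to surface it.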