
#8251: fix(voice-call): remove redundant transcript from extraSystemPrompt

by geodeterra open 2026-02-03 20:06
channel: voice-call stale
## Problem

The transcript was being included in `extraSystemPrompt`, but session history already contains the conversation. This caused context to be duplicated, roughly doubling token usage per turn and increasing latency for longer voice calls.

## Root Cause Analysis

The voice call response generator was:

1. Building transcript as text in `extraSystemPrompt`
2. Calling `runEmbeddedPiAgent` which loads session history

Both contained the same conversation, effectively doubling the context.

## Fix

Remove the transcript from `extraSystemPrompt`, keeping only the voice-specific instructions. The session history already provides conversation context.

## Results

**Before (with prompt cache):**

- ~7,700-8,200 input tokens per turn
- Growing over time as transcript accumulated

**After (with prompt cache):**

- ~90-160 input tokens per turn
- Minimal growth (only session messages)

**~98% reduction** in per-turn token usage after cache warmup.

## Testing

Tested with automated Twilio calls (test number calling main number). Verified token counts in session logs showed dramatic reduction.

## Breaking Changes

None. The `transcript` parameter is kept for backwards compatibility but marked as deprecated.

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR removes the accumulated call transcript from `extraSystemPrompt` in the voice response generator, relying on embedded Pi agent session history instead to avoid duplicated context and reduce token usage/latency.

It also extends the voice-call provider interface with optional `answerCall` support and updates the Telnyx provider to (a) explicitly answer inbound calls and (b) normalize transcription payload fields, plus adds webhook-side auto-response for inbound `call.speech` events.

Key issues to address before merge:

- Telnyx transcription debug logging currently prints full payloads (likely includes user speech/PII) and can balloon logs.
- Telnyx direction normalization defaults unknown/missing directions to `outbound`, which can misclassify inbound events.
- Inbound call handling mutates `event.callId` after record creation, which can create subtle inconsistencies for retries/other consumers.
- Auto-response can potentially trigger twice (streaming handler + webhook handler) without a guard/deduplication.

<h3>Confidence Score: 3/5</h3>

- This PR is reasonably safe but has a few correctness/logging concerns to address before merging.
- Main change (removing transcript from extraSystemPrompt) is straightforward and should reduce token usage as intended, but the PR also introduces new inbound/Telnyx behaviors (answering, direction mapping, webhook auto-response) where small logic issues could cause misclassification, duplicate responses, or sensitive log leakage.
- extensions/voice-call/src/providers/telnyx.ts, extensions/voice-call/src/manager.ts, extensions/voice-call/src/webhook.ts

<!-- greptile_other_comments_section -->

<sub>(3/5) Reply to the agent's comments like "Can you suggest a fix for this @greptileai?" or ask follow-up questions!</sub>

**Context used:**

- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))
- Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13))

<!-- /greptile_comment -->
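
For readers without the diff at hand, the core change described under "Fix" can be sketched roughly as follows. This is an illustrative sketch, not the code from the PR: only `extraSystemPrompt` and the deprecated `transcript` parameter come from the description above, while `VoicePromptParams`, `TranscriptEntry`, `basePrompt`, and both builder functions are hypothetical names invented for the example.

```typescript
// Hypothetical shape of one transcript line; the real type in the repo may differ.
interface TranscriptEntry {
  speaker: "user" | "bot";
  text: string;
}

// Hypothetical parameter object for building the voice-call system prompt.
interface VoicePromptParams {
  /** Voice-specific instructions (assumed name). */
  basePrompt: string;
  /** @deprecated Kept for backwards compatibility; session history already holds the conversation. */
  transcript?: TranscriptEntry[];
}

// "Before" behaviour as described: the transcript is inlined into the prompt,
// duplicating what the agent later loads from session history.
function buildExtraSystemPromptBefore(params: VoicePromptParams): string {
  const history = (params.transcript ?? [])
    .map((entry) => `${entry.speaker}: ${entry.text}`)
    .join("\n");
  return history.length > 0
    ? `${params.basePrompt}\n\nConversation so far:\n${history}`
    : params.basePrompt;
}

// "After" behaviour as described: only the voice-specific instructions remain.
// The transcript parameter is still accepted but intentionally ignored.
function buildExtraSystemPromptAfter(params: VoicePromptParams): string {
  return params.basePrompt;
}

// Usage: the "after" prompt stays the same size no matter how long the call runs.
const example: VoicePromptParams = {
  basePrompt: "You are on a phone call. Keep replies short.",
  transcript: [
    { speaker: "user", text: "Hi, can you hear me?" },
    { speaker: "bot", text: "Yes, loud and clear." },
  ],
};
console.log(buildExtraSystemPromptBefore(example).length);
console.log(buildExtraSystemPromptAfter(example).length);
```

Keeping the parameter in the signature while ignoring it is one way to match the "no breaking changes" claim: existing callers still compile, and the deprecation tag signals that they can stop passing it.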
