#9092: fix: skip retry when block content already streamed to user

by benleavett open 2026-02-04 21:25

agents stale

Cluster: Telegram Message Handling Fixes

## Summary

- Tracks whether block replies were emitted during an LLM attempt via new `didEmitBlockReply` state flag
- Skips auth profile rotation retry when content was already delivered to the user
- Prevents duplicate responses when LLM returns error after streaming content (e.g., Gemini 500 INTERNAL after complete response)

## Problem

When Gemini (and potentially other providers) streams a complete response but then returns a 500 error on stream close, the retry logic would rotate auth profiles and re-run the prompt, causing the same response to be delivered multiple times.

## Solution

Track when block replies are emitted during streaming via `didEmitBlockReply`, and check this flag before triggering retry. If content was already streamed to the user, skip the retry to avoid duplicate messages.

## Test plan

- [x] Add unit test: "skips retry when block content was already streamed"
- [x] All existing auth profile rotation tests pass (9 tests)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds a new per-attempt flag (`didStreamBlockReply`) derived from subscription state (`didEmitBlockReply`) to detect whether block replies were streamed during an LLM attempt. `runEmbeddedPiAgent` then uses that flag to skip auth-profile rotation retries when a provider returns an error after content has already been streamed, preventing duplicate responses (notably for Gemini errors on stream close).

The change threads through `subscribeEmbeddedPiSession` → `runEmbeddedAttempt` → `runEmbeddedPiAgent`, and includes a unit test covering the “streamed content + terminal error should not retry” scenario.

<h3>Confidence Score: 3/5</h3>

- This PR is close to safe to merge, but the new retry guard can be bypassed in real streaming configurations.
- Core logic is small and well-targeted, but `didStreamBlockReply` currently reflects `onBlockReply` invocation rather than “content already delivered”, so the guard may fail to prevent duplicates in some streaming paths. Tests in this environment couldn’t be executed due to missing dependencies, so confidence relies on static review.
- src/agents/pi-embedded-subscribe.ts, src/agents/pi-embedded-runner/run.ts

**Context used:**

- Context from `dashboard` - CLAUDE.md ([source](https://app.greptile.com/review/custom-context?memory=fd949e91-5c3a-4ab5-90a1-cbe184fd6ce8))
- Context from `dashboard` - AGENTS.md ([source](https://app.greptile.com/review/custom-context?memory=0d0c8278-ef8e-4d6c-ab21-f5527e322f13))
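
The retry guard described in the Solution section can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: only the flag name `didStreamBlockReply` comes from the description above, while `AttemptOutcome`, `shouldRotateAndRetry`, and the `remainingProfiles` parameter are hypothetical names invented for the example.

```typescript
// Hypothetical shape of a single LLM attempt's result (assumed for illustration).
interface AttemptOutcome {
  // True when the provider reported an error for this attempt.
  failed: boolean;
  // Per-attempt flag named in the PR: block content was streamed during the attempt.
  didStreamBlockReply: boolean;
}

// Decides whether to rotate to the next auth profile and re-run the prompt.
// Hypothetical helper; the real decision lives inside the runner.
function shouldRotateAndRetry(
  outcome: AttemptOutcome,
  remainingProfiles: number,
): boolean {
  // Nothing to retry if the attempt succeeded.
  if (!outcome.failed) return false;
  // Content already reached the user: retrying would duplicate the reply.
  if (outcome.didStreamBlockReply) return false;
  // Otherwise retry only while another auth profile is available.
  return remainingProfiles > 0;
}

// Example: an error arrives on stream close after a full response was streamed.
const errorAfterStream: AttemptOutcome = { failed: true, didStreamBlockReply: true };
// Example: the provider fails before emitting any content.
const errorBeforeStream: AttemptOutcome = { failed: true, didStreamBlockReply: false };

console.log(shouldRotateAndRetry(errorAfterStream, 2));
console.log(shouldRotateAndRetry(errorBeforeStream, 2));
```

Note that a guard of this shape is only as reliable as the flag feeding it. As the review points out, if the flag records that `onBlockReply` was invoked rather than that content was actually delivered, some streaming paths may still produce duplicates.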