#18077: fix: deduplicate TTS audio delivered via tool results
agents
stale
size: S
Cluster: Voice Call and TTS Improvements
## Problem
When the TTS tool returns a `MEDIA:` path, the audio is delivered immediately from the tool result. However, the model often echoes the `MEDIA:` line in its follow-up assistant message (encouraged by the old "Copy the MEDIA line exactly" instruction), causing the same audio file to be sent to the user twice.
Closes #17991
## Root Cause
Two issues:
1. **TTS tool description** explicitly instructs the model to "Copy the MEDIA line exactly" — directly causing the duplicate
2. **`filterMessagingToolDuplicates`** only deduplicates text content, not media paths, so when the model does echo the MEDIA path there is no safety net
## Fix
Two-layer approach:
### Layer 1: Prevent (tool description)
Updated the TTS tool description to tell the model that the audio is delivered automatically and that it must NOT repeat the MEDIA line.
### Layer 2: Catch (media path dedup)
Added `toolResultMediaPaths` tracking through the full pipeline:
- `pi-embedded-subscribe` state tracks media paths extracted from tool results
- Paths flow through: subscribe → attempt result → run result → dedup callsites
- `filterMessagingToolDuplicates` now accepts optional `sentMediaPaths` and drops payloads whose only content is a duplicate media path
- Payloads with duplicate media but meaningful text are kept (only media stripped conceptually)
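
The dedup rule described above can be sketched roughly as follows. This is a minimal illustration, not the code in `reply-payloads.ts`: the payload shape, the function name `filterDuplicateMediaSketch`, and the trimming of paths before comparison are assumptions. Only the optional `sentMediaPaths` parameter, the case-insensitive match, and the "drop only when no meaningful text remains" rule come from this PR.

```typescript
// Hypothetical payload shape; the real type in reply-payloads.ts may differ.
interface ReplyPayloadSketch {
  text?: string;
  mediaUrl?: string;
}

// Normalize a path for case-insensitive comparison (trimming is an assumption).
const normalizePath = (path: string): string => path.trim().toLowerCase();

// Drop payloads whose only content is a media path that a tool result already delivered.
function filterDuplicateMediaSketch(
  payloads: ReplyPayloadSketch[],
  sentMediaPaths: string[] = [], // optional, so existing callers keep working
): ReplyPayloadSketch[] {
  if (sentMediaPaths.length === 0) return payloads;
  const sent = new Set(sentMediaPaths.map(normalizePath));
  return payloads.filter((payload) => {
    const isDuplicate =
      payload.mediaUrl !== undefined && sent.has(normalizePath(payload.mediaUrl));
    if (!isDuplicate) return true;
    // Keep the payload when it still carries meaningful (non-whitespace) text.
    return (payload.text ?? "").trim().length > 0;
  });
}
```

Defaulting `sentMediaPaths` to an empty array keeps callers that pass no media paths on the old behavior, which is presumably what the backward-compatibility tests check.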
### Files Changed
- `tts-tool.ts` — updated description
- `pi-embedded-subscribe.handlers.tools.ts` — track delivered media paths in both `emitToolOutput` and direct `onToolResult` paths
- `pi-embedded-subscribe.handlers.types.ts` — added `toolResultMediaPaths` to state type
- `pi-embedded-subscribe.ts` — init + reset + expose getter
- `run/attempt.ts` — destructure + pass through
- `run/types.ts` — added to attempt result type
- `run.ts` — pass through to run result (both code paths)
- `pi-embedded-runner/types.ts` — added to run result type
- `agent-runner.ts` — pass to `buildReplyPayloads`
- `agent-runner-payloads.ts` — accept + forward to dedup
- `followup-runner.ts` — forward to dedup
- `reply-payloads.ts` — enhanced `filterMessagingToolDuplicates`
- `reply-payloads.media-dedup.test.ts` — 7 new tests
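
For orientation, the tracking half of the change (init, record, reset, getter) might look something like the sketch below. Only the field name `toolResultMediaPaths` comes from this PR. The helper names, the state shape, and the assumption that a media directive is a line starting with `MEDIA:` followed by a path are illustrative and may not match the actual parser.

```typescript
// Hypothetical subscribe state; only the field name is taken from the PR.
interface SubscribeStateSketch {
  toolResultMediaPaths: string[];
}

// Init: start each run with no delivered media paths.
function createStateSketch(): SubscribeStateSketch {
  return { toolResultMediaPaths: [] };
}

// Assumed directive format: a line of the form "MEDIA:<path>".
function extractMediaPathsSketch(toolOutput: string): string[] {
  const paths: string[] = [];
  for (const line of toolOutput.split(/\r?\n/)) {
    const match = /^\s*MEDIA:\s*(\S.*)$/.exec(line);
    const path = match?.[1]?.trim();
    if (path) paths.push(path);
  }
  return paths;
}

// Record: remember every media path that a tool result already delivered.
function recordToolResultSketch(state: SubscribeStateSketch, toolOutput: string): void {
  state.toolResultMediaPaths.push(...extractMediaPathsSketch(toolOutput));
}

// Reset: clear tracked paths so one run cannot suppress media in the next.
function resetStateSketch(state: SubscribeStateSketch): void {
  state.toolResultMediaPaths = [];
}

// Getter: expose a copy so downstream code cannot mutate the tracked state.
function getToolResultMediaPathsSketch(state: SubscribeStateSketch): string[] {
  return [...state.toolResultMediaPaths];
}
```

The getter's result is what would then be threaded through the attempt result and run result types until it reaches the dedup callsites.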
## Testing
- 7 new unit tests covering media path dedup (case insensitivity, text+media combos, backward compat, whitespace-only text)
- All 389 existing tests pass with zero regressions
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR fixes duplicate TTS audio delivery with a well-structured two-layer approach: (1) updating the TTS tool description to stop telling the model to echo `MEDIA:` lines, and (2) adding a dedup safety net that tracks media paths delivered via tool results and filters them out of follow-up assistant messages.
- **Layer 1 (Prevention):** The TTS tool description in `tts-tool.ts` is updated to tell the model the audio is delivered automatically, removing the "Copy the MEDIA line exactly" instruction that was directly causing duplicates.
- **Layer 2 (Safety net):** `toolResultMediaPaths` tracking is plumbed through the full pipeline: subscribe state → attempt result → run result → `buildReplyPayloads` → `filterMessagingToolDuplicates`. The dedup function now accepts optional `sentMediaPaths` and drops payloads whose only content is a duplicate media path (case-insensitive matching). Payloads with meaningful text alongside duplicate media are preserved.
- **Testing:** 7 new unit tests cover the key dedup scenarios. The fixture file is updated for type compliance.
- **Minor gap:** The dedup only checks `mediaUrl` (singular), not `mediaUrls` (plural). Since `mediaUrl` is always set to the first element when `mediaUrls` is present, this covers the primary TTS use case but wouldn't catch secondary media paths in multi-media payloads.
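
One conservative way to close this gap, sketched under assumptions, is to treat a payload as a duplicate only when every media URL it carries, in either field, was already delivered. The payload shape and the helper names below are hypothetical and are not part of this PR.

```typescript
// Hypothetical payload shape that carries both the singular and plural fields.
interface MultiMediaPayloadSketch {
  text?: string;
  mediaUrl?: string;
  mediaUrls?: string[];
}

// Gather every media URL on the payload, without double-counting mediaUrl.
function collectMediaUrlsSketch(payload: MultiMediaPayloadSketch): string[] {
  const urls = [...(payload.mediaUrls ?? [])];
  if (payload.mediaUrl !== undefined && !urls.includes(payload.mediaUrl)) {
    urls.push(payload.mediaUrl);
  }
  return urls;
}

// True only when the payload has no meaningful text and ALL of its media was already sent.
function isMediaOnlyDuplicateSketch(
  payload: MultiMediaPayloadSketch,
  sentMediaPaths: string[],
): boolean {
  const sent = new Set(sentMediaPaths.map((path) => path.toLowerCase()));
  const urls = collectMediaUrlsSketch(payload);
  const hasText = (payload.text ?? "").trim().length > 0;
  return !hasText && urls.length > 0 && urls.every((url) => sent.has(url.toLowerCase()));
}
```

With this predicate, a payload holding one already-sent path and one new path is kept, so secondary media is never silently dropped.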
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge with minimal risk — changes are additive and backward-compatible, with good test coverage.
- Score of 4 reflects: clean two-layer approach with both prevention and safety net, consistent plumbing through a complex pipeline, 7 new tests with zero regressions, and backward compatibility (sentMediaPaths is optional with default []). Docked one point for the minor mediaUrls (plural) dedup gap, though it doesn't affect the primary TTS use case.
- `src/auto-reply/reply/reply-payloads.ts` — the core dedup logic only checks `mediaUrl` (singular), not `mediaUrls` (plural), which could matter for future multi-media tool results.
<sub>Last reviewed commit: 3bc21bb</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #21513: Agents: track TTS media in duplicate filter state (by DevvGwardo · 2026-02-20 · 87.6%)
- #21110: fix(tts): deliver audio via structured mediaUrl instead of MEDIA: t... (by hydro13 · 2026-02-19 · 82.1%)
- #20735: fix: skip auto-attaching tool MEDIA: paths already sent via message t… (by anillBhoi · 2026-02-19 · 81.2%)
- #13501: fix: extend assistant text dedup across message boundaries (by shakir-abdo · 2026-02-10 · 78.4%)
- #19399: telegram: fix MEDIA false positives and partial final drop (by HOYALIM · 2026-02-17 · 78.1%)
- #19439: fix(tts): pass audioAsVoice flag through tool result pipeline (by brandonwise · 2026-02-17 · 78.1%)
- #21276: fix(telegram): stabilize partial finalization and MEDIA dedupe (AI-... (by HOYALIM · 2026-02-19 · 77.7%)
- #18890: fix(media): parse tool-result MEDIA directives with shared parser (by teededung · 2026-02-17 · 76.6%)
- #19868: fix: prevent media token regex from matching markdown bold text (by sanketgautam · 2026-02-18 · 74.0%)
- #20992: fix(tts): apply TTS processing to agentCommand outbound delivery path (by mmyyfirstb · 2026-02-19 · 73.6%)