#23761: fix: suppress partial NO_REPLY tokens at lifecycle boundary
gateway
size: XS
## Problem
When the model streams `NO_REPLY` as multiple tokens (e.g. `"NO"` then `"_REPLY"`), the `lifecycle:end` event can fire while the buffer contains only the partial prefix. `isSilentReplyText()` returns `false` for `"NO"`, so the partial text leaks to connected clients as a real assistant message.
This is particularly visible on node clients (VS Code extensions, mobile nodes) that display every `chat-final` message as a bubble: users see stray `"NO"` or `"NO_"` messages appear.
## Fix
Add a partial-prefix check in `emitChatFinal`: if the buffered text is at least 2 characters long, is a strict prefix of `SILENT_REPLY_TOKEN` (that is, shorter than the full token), and contains only `[A-Z_]` characters, treat it as a silent reply and suppress it.
The check is deliberately conservative:
- Minimum 2 characters (avoids matching single letters)
- Must be strictly shorter than the full token (full match already handled by `isSilentReplyText`)
- Only uppercase letters and underscores (won't match real messages)
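The three conditions above can be sketched as a small predicate. This is a minimal illustration, not the code from the PR: the helper name `isPartialSilentReplyPrefix` is hypothetical, and the value of `SILENT_REPLY_TOKEN` is assumed to be `"NO_REPLY"` based on the description.

```typescript
// Assumption: the token value is inferred from the PR text; the real
// constant is defined elsewhere in the codebase.
const SILENT_REPLY_TOKEN = "NO_REPLY";

// Hypothetical helper: returns true when `text` looks like a truncated
// silent-reply token that should be suppressed rather than emitted.
function isPartialSilentReplyPrefix(
  text: string,
  token: string = SILENT_REPLY_TOKEN,
): boolean {
  return (
    text.length >= 2 &&            // minimum 2 characters
    text.length < token.length &&  // full match is handled by isSilentReplyText
    /^[A-Z_]+$/.test(text) &&      // only uppercase letters and underscores
    token.startsWith(text)         // must be a prefix of the token
  );
}

// Example inputs: partial tokens are suppressed, ordinary text is not.
for (const sample of ["NO", "NO_", "NO_REPLY", "N", "NOPE", "no"]) {
  console.log(JSON.stringify(sample), isPartialSilentReplyPrefix(sample));
}
```

One consequence of this design is that a genuine assistant reply consisting of exactly `NO` would also be suppressed; the sketch accepts that trade-off in the same way the described check does.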
## Testing
Tested in production with a VS Code node client (Pawr) over multiple days; no leaked partial tokens were observed.
Fixes #3340
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Adds logic to suppress partial `NO_REPLY` tokens that leak when the model streams the token across multiple chunks and `lifecycle:end` fires before the complete token arrives. The fix checks if buffered text is a strict prefix of `SILENT_REPLY_TOKEN` before emitting `chat-final` events.
- Prevents partial tokens like `"NO"` or `"NO_"` from appearing as assistant messages in client UIs
- Conservative approach with minimum 2-char length and character class validation
- Complements existing `isSilentReplyText()` which handles complete tokens
<h3>Confidence Score: 4/5</h3>
- Safe to merge - addresses real production bug with conservative logic
- The fix correctly addresses the partial token leak with appropriate boundary checks. The logic has been tested in production. A minor style improvement is suggested on the regex flag, but it doesn't affect correctness.
- No files require special attention
<sub>Last reviewed commit: abe3c10</sub>
<!-- /greptile_comment -->
## Most Similar PRs
- #21462: fix(agents): hold back partial NO_REPLY token in pi-embedded streaming (by algal · 2026-02-20 · 85.0%)
- #19576: fix: tighten isSilentReplyText to match whole-text only (by aldoeliacim · 2026-02-18 · 82.8%)
- #19648: fix: suppress silent-reply partial tokens during streaming (by bradleypriest · 2026-02-18 · 81.5%)
- #19916: fix: strict silent-reply detection to prevent false positives with ... (by hayoial · 2026-02-18 · 81.3%)
- #16361: Gateway: suppress NO_REPLY in webchat (by shadril238 · 2026-02-14 · 80.9%)
- #8493: fix(tui): filter NO_REPLY token from chat display (by gavinbmoore · 2026-02-04 · 80.8%)
- #15118: Fix webchat ghost bubble when model replies with NO_REPLY (by jwchmodx · 2026-02-13 · 79.7%)
- #8334: fix(webchat): Filter NO_REPLY messages from chat history (by vishaltandale00 · 2026-02-03 · 77.1%)
- #4495: Fix: emit final assistant event when reply tags hide stream (by ukeate · 2026-01-30 · 76.5%)
- #16733: fix(ui): avoid injected newlines when tool output is hidden (by jp117 · 2026-02-15 · 75.7%)