#20301: Security: scrub untrusted metadata from user-facing replies
Labels: channel: whatsapp-web, gateway, agents, size: S
## Summary
- add a shared scrubber to remove untrusted metadata JSON blocks from user-facing text
- apply the scrubber in `sanitizeUserFacingText` and in gateway history sanitization for user messages
- harden WhatsApp media-failure fallback copy to avoid exposing raw internal error messages
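
As a rough illustration of the scrubber described in the first two bullets, here is a minimal sketch. It is not the code in `src/shared/untrusted-metadata.ts`. The header strings, the function name `stripMetadataBlocksSketch`, and the choice to drop everything up to the end of the text when a closing fence is missing are all assumptions made for the example.

```typescript
// Hypothetical header lines, for illustration only. The real list is generated
// elsewhere in the codebase and is not reproduced here.
const ASSUMED_HEADERS: readonly string[] = [
  "Conversation info (untrusted metadata):",
  "Sender (untrusted metadata):",
];

// Three backticks, built at runtime so this example can sit inside a fenced block.
const FENCE = "`".repeat(3);

// Sketch: remove "<header>" + fenced JSON block pairs from a piece of text.
function stripMetadataBlocksSketch(text: string): string {
  const lines = text.split("\n");
  const kept: string[] = [];
  let i = 0;
  while (i < lines.length) {
    const isHeader = ASSUMED_HEADERS.includes(lines[i].trim());
    const opensFence =
      i + 1 < lines.length && lines[i + 1].trim().startsWith(FENCE);
    if (isHeader && opensFence) {
      i += 2; // skip the header line and the opening fence
      // Skip the (possibly multi-line) JSON body up to the closing fence.
      // Assumption: if the fence never closes, the rest of the text is dropped.
      while (i < lines.length && lines[i].trim() !== FENCE) i++;
      i++; // skip the closing fence, or step past the end if it is missing
      continue;
    }
    kept.push(lines[i]);
    i++;
  }
  // Collapse the blank-line runs left behind by removed blocks.
  return kept.join("\n").replace(/\n{3,}/g, "\n\n").trim();
}
```

A caller would apply this to text before showing it to a user or passing it on; text that contains none of the assumed headers is returned unchanged apart from trimming.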
## Changes
- Added `src/shared/untrusted-metadata.ts`
- Updated `src/agents/pi-embedded-helpers/errors.ts`
- Updated `src/gateway/chat-sanitize.ts`
- Updated `src/web/auto-reply/deliver-reply.ts`
- Added/updated regression tests for all affected paths
## Verification
- `pnpm test -- src/gateway/chat-sanitize.test.ts`
- `pnpm test -- src/auto-reply/reply/reply-utils.test.ts`
- `pnpm test -- src/web/auto-reply/deliver-reply.test.ts`
- `pnpm vitest run --config vitest.e2e.config.ts src/agents/pi-embedded-helpers.sanitizeuserfacingtext.e2e.test.ts`
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Adds a shared `stripUntrustedMetadataBlocks` scrubber that removes internal metadata JSON blocks (conversation info, sender info, thread context, etc.) from user-facing text, preventing untrusted envelope metadata from leaking into replies. The scrubber is integrated into both `sanitizeUserFacingText` (for outbound reply normalization) and the gateway history sanitization path (for user messages sent to the LLM). Additionally hardens the WhatsApp media-failure fallback to use a static message instead of exposing raw `err.message`.
- New `src/shared/untrusted-metadata.ts` with `stripUntrustedMetadataBlocks` that matches all 6 metadata header types generated by `buildInboundUserContextPrefix` in `inbound-meta.ts`
- Integrated into `sanitizeUserFacingText` in `errors.ts` to strip metadata from outbound replies
- Integrated into `chat-sanitize.ts` to strip metadata from user messages before they reach the LLM context
- WhatsApp media-failure fallback now uses a static `"⚠️ Media failed. Sending text only."` instead of interpolating `err.message`
- Regression tests added across all affected paths
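
The media-failure hardening in the list above might look roughly like the following sketch. Only the static notice string comes from this PR; the function name `buildMediaFailureFallback`, its parameters, and the logging callback are hypothetical and do not reflect the actual code in `deliver-reply.ts`.

```typescript
// Static, user-safe notice (the string quoted in the summary above).
const MEDIA_FAILURE_NOTICE = "⚠️ Media failed. Sending text only.";

// Hypothetical helper: build the text-only fallback after a media send fails.
// The raw error detail goes to an internal log and never into the reply.
function buildMediaFailureFallback(
  caption: string | undefined,
  err: unknown,
  log: (message: string) => void,
): string {
  const detail = err instanceof Error ? err.message : String(err);
  log(`media send failed: ${detail}`); // internal only

  const text = caption?.trim();
  // No interpolation of err.message into user-facing copy.
  return text ? `${MEDIA_FAILURE_NOTICE}\n\n${text}` : MEDIA_FAILURE_NOTICE;
}
```

The point of the design is that an error message can contain file paths, URLs, or other internals, so the reply carries a fixed string while the detail stays in logs.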
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge — it adds defensive stripping of internal metadata and hardens error messages with no behavioral regressions.
- The changes are focused and well-scoped: a new shared utility with clear, testable logic; straightforward integration at three call sites; and a simple hardening fix for the WhatsApp fallback. The metadata header list exactly matches the generation source in inbound-meta.ts. All edge cases (missing closing fence, multi-line JSON, residual blank lines) are handled correctly. Comprehensive regression tests cover all affected paths. No logic errors, security issues, or regressions identified.
- No files require special attention.
<sub>Last reviewed commit: fb14d8a</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #23271: fix(chat): strip untrusted metadata blocks from Control UI messages (by lbo728 · 2026-02-22 · 84.4%)
- #23312: fix(gateway): strip inbound metadata in chat history sanitization (by SidQin-cyber · 2026-02-22 · 83.3%)
- #20231: fix: strip untrusted metadata blocks from chat history (by MisterGuy420 · 2026-02-18 · 81.3%)
- #10196: fix(whatsapp): sanitize raw mention IDs in outbound messages (by koala73 · 2026-02-06 · 79.2%)
- #12325: fix: trim leading/trailing whitespace from outbound messages (by jordanstern · 2026-02-09 · 78.8%)
- #22088: fix(web): sanitize media errors to prevent PII leak (by ashiabbott · 2026-02-20 · 78.7%)
- #15395: Auto-reply: strip leaked protocol transcript lines from inbound con... (by kiranjd · 2026-02-13 · 77.5%)
- #16733: fix(ui): avoid injected newlines when tool output is hidden (by jp117 · 2026-02-15 · 76.2%)
- #8052: fix(whatsapp): strip leading whitespace from outbound messages (by FelixFoster · 2026-02-03 · 75.9%)
- #22442: test(ci): unbreak baseline tui metadata + msteams local-file assert... (by SmithLabsLLC · 2026-02-21 · 75.4%)