#22088: fix(web): sanitize media errors to prevent PII leak

by ashiabbott open 2026-02-20 18:00 View on GitHub →

channel: whatsapp-web size: S

Cluster: Metadata Sanitization and Security Fixes

## Summary - **Problem:** Media fetch failures could bubble up raw error strings into user-facing replies, leaking sensitive metadata (phone numbers, WhatsApp JIDs, filesystem paths, etc.). - **Why it matters:** This is a privacy/security issue—errors should never expose identifiers or internal paths to end users. - **What changed:** Added sanitizeMediaError() and routed media failure messaging through it. The sanitizer extracts safe, known reasons (timeout/404/etc.), redacts messages matching common PII patterns (phone numbers, JIDs, paths), and truncates overly-long messages. - **What did NOT change:** Media fetching behavior itself; this only affects user-facing error text. ## Change Type (select all) - [x] Bug fix - [x] Security hardening ## Scope (select all touched areas) - [x] Integrations - [x] Security hardening ## Linked Issue/PR - Closes #20279 ## User-visible / Behavior Changes User-facing replies will now show a **sanitized** media failure reason instead of potentially leaking raw metadata. ## Security Impact (required) - New permissions/capabilities? No - Secrets/tokens handling changed? No - New/changed network calls? No - Command/tool execution surface changed? No - Data access scope changed? No ## Repro + Verification ### Steps 1. Trigger a media fetch failure that includes raw metadata (e.g., WhatsApp identifiers or a local path) in the thrown error 2. Observe the user-facing reply ### Expected Reply contains a generic/safe reason without identifiers/paths. ### Actual (before fix) Reply could include raw phone numbers, group JIDs, or local filesystem paths. ## Evidence - [x] Code review of sanitizer logic ## Human Verification (required) - Verified: sanitizer covers common patterns (phone numbers, JIDs, paths) and truncation - Not verified: end-to-end live run against every channel ## Compatibility / Migration - Backward compatible? Yes - Config/env changes? No - Migration needed? No ## Failure Recovery (if this breaks) Revert the sanitizeMediaError() addition and restore prior message formatting in src/web/auto-reply/deliver-reply.ts. ## Risks and Mitigations - Risk: false positives could over-redact legitimate error messages. - Mitigation: redaction is limited to clear patterns and fallback truncation; it is safer to over-redact than leak identifiers.  <h3>Greptile Summary</h3> This PR adds a `sanitizeMediaError()` function to prevent PII leakage in media fetch error messages displayed to users. The implementation extracts safe error reasons (timeouts, HTTP codes) and redacts patterns matching phone numbers, WhatsApp JIDs, file paths, and JSON dumps. **Key changes:** - Added regex-based sanitization for error messages before showing them to users - Extracts known-safe error types (timeout, 404, etc.) when present - Redacts messages containing phone numbers, JIDs, Windows/Unix paths, or JSON - Truncates messages over 80 characters as a safety fallback - Applied sanitization at line 231 where media errors are formatted for user display **Issues found:** - Phone number pattern `/\+\d{7,15}/` misses common formats with spaces, dashes, or `00` prefix - Unix path pattern `/\/home\//` only catches `/home/`, missing `/var/`, `/tmp/`, `/Users/`, etc. - Safe reason extraction (lines 40-42) returns first match which could still contain PII context (e.g., "timeout" in "timeout for user +1234567890") - JSON pattern requires 20+ chars after `{`, allowing shorter JSON with PII through - No unit tests for `sanitizeMediaError()` - security-critical code should have comprehensive test coverage <h3>Confidence Score: 3/5</h3> - This PR improves security but has gaps in PII pattern coverage that could allow leaks - The sanitization logic is a good security improvement but has several pattern-matching gaps (phone formats, path coverage, safe reason extraction context). The issues are logical flaws that could allow PII to slip through in specific error scenarios. No tests verify the security-critical sanitization logic. Score of 3 reflects meaningful security improvement with notable gaps requiring fixes before full confidence. - Pay close attention to `src/web/auto-reply/deliver-reply.ts` - the PII redaction patterns need strengthening and test coverage is required <sub>Last reviewed commit: c98730f</sub>  <sub>(3/5) Reply to the agent's comments like "Can you suggest a fix for this @greptileai?" or ask follow-up questions!</sub>