#20537: feat: centralized outbound message sanitization gate

by echoVic open 2026-02-19 02:18

size: S

Cluster: Metadata Sanitization and Security Fixes

## Problem

Fixes #16673

The current error sanitization is scattered across multiple per-tool and per-error-type handlers. Each new tool or API integration can introduce new leak vectors that require patching individually. This has led to multiple related issues:

- #7867: Malformed tool call errors leaked verbatim
- #9951: Context overflow errors leak to user
- #11038: Context corruption exposes raw API errors
- #18937: API error messages (401) leaked to user channel
- #20004: Internal error messages leaking to WhatsApp contacts
- #20279: PII exposed in "Media failed" error messages

## Solution

Add a single `sanitizeOutboundText()` gate in `deliver.ts` at the `deliverOutboundPayloadsCore()` boundary. Every `payload.text` gets sanitized before hitting any channel send function, regardless of source.

### Architecture

```
Before: Tool errors → payloads.ts (per-tool sanitization) → deliver.ts → channels
After:  Tool errors → payloads.ts (raw) → deliver.ts (central sanitizer) → channels
```

### What it catches

- Raw API JSON error payloads (`{"type":"error",...}`)
- Context overflow / rate limit / billing / timeout errors
- Cloudflare/HTML error pages
- Internal `[openclaw]` system messages
- Conversation metadata / PII leaks
- Stack traces
- HTTP errors with raw JSON bodies

### Performance

A fast-path heuristic (`looksLikeLeakedContent()`) avoids running expensive regex checks on normal assistant messages; only suspicious text triggers the full sanitization pipeline.

<h3>Greptile Summary</h3>

Added centralized sanitization gate in `deliverOutboundPayloadsCore()` at deliver.ts:529. Every outbound message now passes through `sanitizeOutboundText()` before hitting any channel send function, catching leaked API errors, raw JSON payloads, internal metadata, and stack traces regardless of source.

Key implementation details:

- Fast-path heuristic (`looksLikeLeakedContent()`) avoids expensive regex checks on normal assistant messages
- Reuses existing error detection helpers from `pi-embedded-helpers/errors.ts`
- Sanitization runs after `message_sending` hook but before actual channel delivery
- Replaces leaked content with user-friendly error messages

Found one inconsistency in the fast-path heuristic where one check operates on the untrimmed original text while others use the lowercased trimmed version.

<h3>Confidence Score: 4/5</h3>

- Safe to merge with one minor syntax fix
- The implementation follows sound architectural principles with a single sanitization gate, reuses well-tested error detection helpers, and includes a performance optimization via fast-path heuristic. One minor inconsistency exists in the heuristic that should be fixed, but it's unlikely to cause issues in practice since trimmed text would still be caught by downstream regex checks
- src/infra/outbound/sanitize-outbound.ts requires a one-line fix for the fast-path heuristic inconsistency

<sub>Last reviewed commit: 64481fe</sub>
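
For readers without the diff open, here is a minimal sketch of how a fast-path check in front of a central gate could look. Only the names `sanitizeOutboundText()` and `looksLikeLeakedContent()` come from this PR. The marker list, regex rules, fallback messages, `OutboundPayload` shape, and `sanitizePayloads()` helper are all hypothetical and are not taken from `sanitize-outbound.ts` or `deliver.ts`. The sketch normalizes the text once, so every fast-path check sees the same trimmed, lowercased input, which is one way to avoid the inconsistency noted in the review.

```typescript
// Hypothetical sketch: markers, rules, and messages below are illustrative
// assumptions, not the contents of src/infra/outbound/sanitize-outbound.ts.

const GENERIC_FALLBACK =
  "Something went wrong while processing your request. Please try again.";

// Cheap substring markers, all written in lowercase because they are
// compared against a lowercased copy of the text.
const SUSPICIOUS_MARKERS: readonly string[] = [
  '{"type":"error"',
  "[openclaw]",
  "<!doctype html",
  "<html",
  "rate limit",
  "\n    at ", // typical stack-trace frame indentation
];

// Fast path: substring checks only, no regex.
function looksLikeLeakedContent(text: string): boolean {
  // Normalize once so every check operates on the same input.
  const normalized = text.trim().toLowerCase();
  if (normalized.length === 0) return false;
  return SUSPICIOUS_MARKERS.some((marker) => normalized.includes(marker));
}

interface LeakRule {
  pattern: RegExp; // no `g` flag, so `test()` keeps no state between calls
  message: string; // user-facing replacement text
}

// Slow path: stricter regex rules, evaluated only for suspicious text.
const LEAK_RULES: readonly LeakRule[] = [
  { pattern: /^\s*\{\s*"type"\s*:\s*"error"/i, message: GENERIC_FALLBACK },
  { pattern: /\[openclaw\]/i, message: GENERIC_FALLBACK },
  { pattern: /<!doctype html|<html[\s>]/i, message: GENERIC_FALLBACK },
  { pattern: /\n\s+at\s+.+:\d+:\d+\)?/, message: GENERIC_FALLBACK },
  {
    pattern: /rate limit/i,
    message: "The service is busy right now. Please try again shortly.",
  },
];

function sanitizeOutboundText(text: string): string {
  // Normal assistant messages return here without touching any regex.
  if (!looksLikeLeakedContent(text)) return text;
  for (const rule of LEAK_RULES) {
    if (rule.pattern.test(text)) return rule.message;
  }
  // Suspicious but no rule matched: deliver unchanged.
  return text;
}

// Hypothetical payload shape; the real type in deliver.ts may differ.
interface OutboundPayload {
  text?: string;
}

// Apply the gate to every payload right before channel delivery.
function sanitizePayloads(payloads: readonly OutboundPayload[]): OutboundPayload[] {
  return payloads.map((payload) =>
    payload.text === undefined
      ? payload
      : { ...payload, text: sanitizeOutboundText(payload.text) },
  );
}
```

In this sketch the fast path is deliberately looser than the rules behind it. A false positive only costs a few regex evaluations, while a false negative would let leaked content through, so the marker list should err on the side of matching too much.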