#13042: feat(security): add guard model for prompt injection sanitization

by TGambit65 open 2026-02-10 02:31 View on GitHub →

docs channel: msteams gateway size: M

Cluster: Security Enhancements and Guardrails

Implements a lightweight guard model pipeline to sanitize external untrusted content (emails, web, tools) before it reaches the main agent context. - Adds `security.guardModel` configuration - Implements core sanitization logic in `src/security/guard-model.ts` - Provides true upstream isolation for dirty content - Aligns with security roadmaps for Agentic AI safety References Discussion #11130  <h2>Greptile Overview</h2> <h3>Greptile Summary</h3> This PR adds a new “guard model” sanitization pipeline intended to pre-process untrusted external content before it is passed to the primary agent. Changes include: - A new `GuardModelConfig` shape (Zod + inferred type) and a new `guardModel` field on `AgentDefaultsConfig`. - A new `sanitizeWithGuardModel` implementation under `src/security/guard-model.ts` plus unit tests. - Documentation describing the concept and configuration. The main integration point is the agent defaults schema/type plumbing, which makes the new config available across the existing config system. <h3>Confidence Score: 3/5</h3> - This PR is close, but has a definite schema import/export issue that will break config parsing until fixed. - Most additions are isolated (new module + tests + docs), but `AgentDefaultsSchema` currently imports `GuardModelConfigSchema` from a module that doesn’t export it, which will cause a runtime/build failure when the schema is loaded. There’s also an unused import in the new guard-model module that may fail CI depending on TS/ESLint settings. - src/config/zod-schema.agent-defaults.ts, src/security/guard-model.ts