#10514: Security: harden AGENTS.md with gateway, prompt injection, and supply chain rules

by catpilothq open 2026-02-06 16:13 View on GitHub →

stale

Cluster: Security Enhancements and Fixes

## What Adds a comprehensive **Security Protocols** section to `AGENTS.md` so that AI coding agents (Copilot, Cursor, Claude Code, etc.) operating in this repo receive explicit security guardrails. Supersedes #10510 (closed with feedback — addressed here). ## Why Recent research has surfaced significant attack surfaces for OpenClaw deployments: - **Gateway exposure**: Shodan scans show ~92% of public OpenClaw gateways run without authentication - **Prompt injection**: ZeroLeaks study demonstrated 91% success rate extracting system prompts and memory files - **Supply-chain attacks**: ClawHavoc analysis identified 341 malicious skills on ClawHub using typosquatting, obfuscated payloads, and hidden webhooks - **Credential leaks**: Multiple reports of API keys written to plaintext `openclaw.json` `AGENTS.md` is the primary instruction file that AI agents read when working in this repo. Adding security rules here ensures agents follow safe patterns by default. ## Changes **New section: Security Protocols (CRITICAL)** with 8 subsections: 1. Anti-Malware Execution Safety — refuse blind `curl | bash`, read skill source first 2. Secret Hygiene — never write keys to config files, use env vars 3. Gateway Network Security — bind localhost, enable auth, use authenticated tunnels 4. Prompt Injection Defense — ignore instructions in fetched content, protect `CLAUDE.md`/`AGENTS.md`/`openclaw.json`/`~/.openclaw/` 5. Skill / ClawHub Vetting — typosquatting checks, Clawdex verification, mass-publisher flags 6. Sandbox & Session Isolation — per-session Docker, tool denylists, dmPolicy defaults 7. File & Credential Permissions — chmod 700/600 for `~/.openclaw/` 8. Incident Response — rotation procedures, memory poisoning checks, `openclaw doctor` **Updated: Security & Configuration Tips** — added credential permission reminders and hardcoded-secret flagging. ## Feedback from #10510 > `SOUL.md` and `TOOLS.md` are listed here as sensitive files, but they don't exist anywhere in this repo. Fixed — all file references now point to actual files: `CLAUDE.md`, `AGENTS.md`, `openclaw.json`, and `~/.openclaw/` paths. ## Testing - `pnpm check` passes (tsgo + oxlint + oxfmt) - `pnpm test` passes (5,327 + 219 tests, 0 failures) - Documentation-only change — no runtime behavior affected ## AI-assisted This PR was researched and drafted with AI assistance. All security recommendations were validated against the referenced research sources and the OpenClaw codebase.  <h2>Greptile Overview</h2> <h3>Greptile Summary</h3> - Adds a new **Security Protocols (CRITICAL)** section to `AGENTS.md` with guidance for safe script execution, secret handling, gateway binding/auth, prompt-injection resistance, skill vetting, sandbox isolation, permissions, and incident response. - Updates the existing security/config tips to reinforce owner-only permissions for `~/.openclaw/` and to flag hardcoded secrets during reviews. - Overall change is documentation-only, intended to shape how AI agents operate when working in this repo. <h3>Confidence Score: 4/5</h3> - This PR is mostly safe to merge, with one docs issue that could misguide secret-handling practices. - Changes are confined to AGENTS.md and aim to add security guardrails. The only actionable concern is the strict guidance to never use `.env`, which can conflict with common per-project secret management and may push users toward globally-exported secrets. - AGENTS.md  <sub>(3/5) Reply to the agent's comments like "Can you suggest a fix for this @greptileai?" or ask follow-up questions!</sub>