#8197: [AI-Assisted] feat: Add "Hardball" Security Framework (MFA-protected Agent Integrity)
docs
stale
Cluster: Security Enhancements and Fixes
## 🛡️ Proposal: The "Hardball" Framework for Agent Integrity
This PR introduces a comprehensive security architecture called **"Hardball"** to the OpenClaw documentation.
### Why it matters
Autonomous agents are vulnerable to prompt injection and identity hijacking. While model-level resistance is important, system-level governance is crucial. The Hardball Framework implements **Human-in-the-loop MFA** for critical system modifications.
### Key Features:
- **Dual-Layer Defense:** Separates security "Instincts" (`SOUL.md`) from operational "Playbooks" (`SECURITY.md`).
- **Hardened MFA Standards:** Defines ephemeral, RAM-only OTP protocols to prevent persistent credential leaks.
- **Camouflage Policy:** Best practices for brief, non-descriptive refusals, which limit what an attacker can learn about the agent's security logic.
- **Native Integration:** Shows how to implement this using OpenClaw's existing environment variables and Markdown-based persona architecture.
This framework has been implemented and tested in a production-like environment, demonstrating how OpenClaw can reach enterprise-grade safety without core code modifications.
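As an illustration of the ephemeral OTP and terse-refusal ideas listed above, here is a minimal Python sketch. It is not the implementation from this PR: the class `EphemeralOtp`, the function `authorize_change`, the `VITAL_FILES` set, the six-digit code length, and the five-minute lifetime are all assumptions made for the example. Note that "RAM-only" here means the code is never written to disk; Python cannot guarantee that a string is wiped from memory.

```python
import hmac
import secrets
import time


class EphemeralOtp:
    """Holds one one-time code in process memory only (hypothetical helper)."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._code = None
        self._expires_at = 0.0

    def issue(self):
        # secrets draws from the OS CSPRNG; six digits is an illustrative choice.
        self._code = f"{secrets.randbelow(10**6):06d}"
        self._expires_at = self._clock() + self._ttl
        return self._code

    def verify(self, candidate):
        # Single use: the stored code is cleared on any attempt, right or wrong.
        code, self._code = self._code, None
        if code is None or self._clock() > self._expires_at:
            return False
        # Constant-time comparison avoids leaking digits through timing.
        return hmac.compare_digest(code.encode(), candidate.encode())


# Assumed set of protected files, named after the two layers described above.
VITAL_FILES = {"SOUL.md", "SECURITY.md"}


def authorize_change(path, otp, supplied_code):
    """Return (allowed, message); the refusal text is deliberately uninformative."""
    if path not in VITAL_FILES:
        return True, "ok"
    if supplied_code is not None and otp.verify(supplied_code):
        return True, "ok"
    return False, "Request denied."
```

Clearing the code on every verification attempt means a wrong guess forces a fresh code to be issued, which trades some convenience for a smaller brute-force window.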
---
**AI-Assisted:** This proposal was developed collaboratively with an OpenClaw agent.
**Testing:** Fully tested implementation using Gmail SMTP for MFA delivery and strict SOUL/SECURITY playbooks.
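For readers who want to picture the Gmail SMTP delivery step, the following is a rough sketch using only the Python standard library. It is an assumption about how such delivery could look, not the code tested in this PR: the function names and the environment variables `HARDBALL_SMTP_USER` and `HARDBALL_SMTP_APP_PASSWORD` are hypothetical, and the neutral subject line is an illustrative choice.

```python
import os
import smtplib
from email.message import EmailMessage


def build_otp_email(sender, recipient, code):
    """Build the message; the subject is kept neutral and non-descriptive."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Verification code"
    msg.set_content(f"Your one-time code is {code}. It expires shortly.")
    return msg


def send_otp_email(msg):
    """Send via Gmail's SMTP-over-SSL endpoint (smtp.gmail.com, port 465)."""
    # Hypothetical variable names; credentials come from the environment,
    # never from files tracked alongside the agent's Markdown playbooks.
    user = os.environ["HARDBALL_SMTP_USER"]
    password = os.environ["HARDBALL_SMTP_APP_PASSWORD"]
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(user, password)
        smtp.send_message(msg)
```

Keeping message construction separate from sending lets the message be inspected in tests without any network access.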
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
Adds documentation for a proposed “Hardball” agent-integrity framework and links it from `SECURITY.md`.
- `SECURITY.md` gains a new Operational Guidance link pointing to the Hardball doc.
- `docs/concepts/hardball-security.md` introduces a two-tier “SOUL.md + SECURITY.md” governance model and an MFA/OTP flow example (Gmail SMTP) for “vital file” changes.
Overall this fits into the existing security docs, but a few doc-style and accuracy issues (internal link format, branding consistency, and questionable CVE references) should be addressed so the guidance is reliable and renders correctly on Mintlify.
<h3>Confidence Score: 3/5</h3>
- Safe to merge from a runtime perspective, but documentation correctness/style issues should be fixed first.
- Changes are documentation-only, so they won’t affect runtime behavior; however, the new guidance includes style-guide violations (Mintlify link format, branding) and potentially incorrect CVE references that could mislead users.
- Affected files: `SECURITY.md`, `docs/concepts/hardball-security.md`
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #10514: Security: harden AGENTS.md with gateway, prompt injection, and supp... (by catpilothq · 2026-02-06 · 82.8% similar)
- #7983: feat(security): add secure coding guidelines to system prompt (by TGambit65 · 2026-02-03 · 77.6% similar)
- #7346: Security: add hardening module and secure-bot extension (by AlphonseC · 2026-02-02 · 77.3% similar)
- #15583: docs: Autonomous Governance Framework for bot ecosystem (by Insider77Circle · 2026-02-13 · 77.1% similar)
- #15757: feat(security): add hardening gap audit checks (by saurabhsh5 · 2026-02-13 · 76.6% similar)
- #15122: feat(docs): add CLAWS.md capability contract standard (by igindin · 2026-02-13 · 76.2% similar)
- #8086: feat(security): Add prompt injection guard rail (by bobbythelobster · 2026-02-03 · 75.6% similar)
- #7892: Claude/setup agent firewall ww xsv (by starwreckntx · 2026-02-03 · 75.4% similar)
- #8821: Security: Holistic capability-based sandbox (replaces pattern-match... (by tonioloewald · 2026-02-04 · 74.6% similar)
- #19500: Custom rust ultimate rewrite (by adybag14-cyber · 2026-02-17 · 74.2% similar)