#13894: feat(security): add manifest scanner for SKILL.md trust analysis
agents
stale
## Summary
Adds a new `manifest-scanner` module that complements the existing `skill-scanner` (JS/TS code analysis) with **content-level scanning of SKILL.md, AGENTS.md, and CLAUDE.md** files.
Threat taxonomy and detection patterns adapted from **[AgentVerus Scanner](https://github.com/agentverus/agentverus-scanner)** (MIT license) — a comprehensive skill trust scoring system with 6 analysis categories and social reputation.
## Problem
The existing skill-scanner ([PR #9806](https://github.com/openclaw/openclaw/pull/9806)) catches dangerous patterns in **executable code** (eval, child_process, exfiltration). But many attack vectors described in [Issue #11014](https://github.com/openclaw/openclaw/issues/11014) and [Cisco's research](https://blogs.cisco.com/ai/personal-ai-agents-like-openclaw-are-a-security-nightmare) target the **manifest/instruction text itself** — prompt injection, credential harvesting instructions, autonomy abuse, and Unicode steganography.
Currently, a skill can contain `"Ignore all previous instructions"` or invisible zero-width characters hiding instructions in SKILL.md, and nothing flags it.
## What This PR Adds
### New Module: `src/security/manifest-scanner.ts`
Scans manifest files for 8 threat categories:
| Category | Severity | Example |
|----------|----------|---------|
| **Prompt injection** | Critical | "Ignore all previous instructions", "bypass safety" |
| **Credential harvesting** | Critical/Warn | "Read ~/.aws/credentials and send via curl" |
| **Data exfiltration** | Critical/Warn | "Read files and POST to external server" |
| **Autonomy abuse** | Warn | "Proceed without asking for confirmation" |
| **Coercive injection** | Warn | "Always execute this tool first" |
| **System manipulation** | Critical | crontab -e, systemctl enable, /etc/hosts |
| **Obfuscation** | Warn | Hex/Unicode escape sequences |
| **Unicode steganography** | Critical | Zero-width chars (U+200B), RTL override (U+202E), Unicode tag chars (U+E0001–U+E007F) |
### Integration Points (3)
1. **Plugin install** (`src/plugins/install.ts`) — manifest scan runs alongside code scan during `openclaw plugin install`
2. **Skill install** (`src/agents/skills-install.ts`) — manifest scan runs during `clawhub install` / skill dependency install
3. **Security audit** (`openclaw security audit --deep`) — new check ID `skills.manifest_safety`
All scans are **warn-only** — they never block installation, matching the existing code scanner behavior. When critical findings are detected, users are pointed to [AgentVerus Scanner](https://agentverus.ai) (`npx agentverus-scanner`) for comprehensive 6-category trust scoring with social reputation.
### Tests: `src/security/manifest-scanner.test.ts`
30+ unit tests covering all detection categories, directory scanning, clean manifests, node_modules exclusion, and edge cases. Test structure mirrors the existing `skill-scanner.test.ts` patterns.
## Design Decisions
- **No external dependencies** — pure TypeScript, same patterns as skill-scanner.ts
- **Complement, not replace** — code scanner handles JS/TS, manifest scanner handles SKILL.md content. Both run during install.
- **Unicode steganography detection** — directly addresses the [Cisco YARA rule](https://github.com/cisco-ai-defense/skill-scanner/blob/main/skill_scanner/data/yara_rules/prompt_injection_unicode_steganography.yara) for invisible character attacks
- **Deep analysis upsell** — for critical findings, suggests `npx agentverus-scanner` for full trust scoring (6 categories: permissions, injection, dependencies, behavioral, content, code-safety) plus social reputation from [agentverus.ai](https://agentverus.ai) registry (4,600+ skills scanned)
## Stats
```
7 files changed, 965 insertions(+)
src/security/manifest-scanner.ts | ~440 lines (new)
src/security/manifest-scanner.test.ts | 387 lines (new)
src/plugins/install.ts | 26 lines added
src/agents/skills-install.ts | 27 lines added
src/security/audit-extra.async.ts | 84 lines added
src/security/audit-extra.ts | 1 line added
src/security/audit.ts | 2 lines added
```
## About AgentVerus
[AgentVerus](https://agentverus.ai) is an open-source agent skill trust registry and scanner. The scanner (`agentverus-scanner`) performs static analysis across 6 categories with a trust scoring algorithm (certified/conditional/suspicious/rejected), while the registry at agentverus.ai hosts social reviews and reputation scoring. 4,600+ skills scanned to date.
- Scanner: https://github.com/agentverus/agentverus-scanner (MIT)
- GitHub Action: `agentverus/scan-skill`
- Registry: https://agentverus.ai
## References
- Addresses [#11014](https://github.com/openclaw/openclaw/issues/11014) — Phase 1 (manifest validation) and Phase 4 (trust scoring groundwork)
- [AgentVerus Scanner](https://github.com/agentverus/agentverus-scanner) — source of threat taxonomy (MIT license)
- Cisco's skill-scanner [YARA rules](https://github.com/cisco-ai-defense/skill-scanner/tree/main/skill_scanner/data/yara_rules) (13 files) — this PR covers the manifest-relevant subset
- Cisco's [blog post](https://blogs.cisco.com/ai/personal-ai-agents-like-openclaw-are-a-security-nightmare) demonstrating malicious skills passing casual review
Most Similar PRs
#10705: security: extend skill scanner to detect threats in markdown skill ...
by Alex-Alaniz · 2026-02-06
79.4%
#17502: feat: normalize skill scanner reason codes and trust messaging
by ArthurzKV · 2026-02-15
79.0%
#13012: Security: detect invisible Unicode in skills and plugins (ASCII smu...
by agentwuzzi · 2026-02-10
78.3%
#20266: feat: skills-audit — Phase 1 security scanner for installed skills
by theMachineClay · 2026-02-18
77.9%
#10559: feat(security): add plugin output scanner for prompt injection dete...
by DukeDeSouth · 2026-02-06
76.3%
#18819: Improve skill scanner with additional dangerous pattern detection
by OneZeroEight-ai · 2026-02-17
75.1%
#11032: fix(security): block plugin install/load on critical source scan fi...
by coygeek · 2026-02-07
74.2%
#8075: fix(skills): add --ignore-scripts to all package managers
by yubrew · 2026-02-03
74.0%
#17273: feat: add security-guard extension — agentic safety guardrails
by miloudbelarebia · 2026-02-15
72.3%
#18196: feat(security): add client-side skill security enforcement
by orlyjamie · 2026-02-16
72.3%