#20592: feat: Unified Natural Language Expanso Pipeline Builder & Validator
docs
channel: discord
channel: telegram
gateway
cli
commands
docker
agents
size: XL
## Summary
This PR transforms the Expanso pipeline experience by introducing a unified natural language interface for building and validating Expanso pipelines, integrated into the Crusty bot (Discord & Telegram).
## What Was Built
### Core Pipeline Infrastructure (US-001 to US-005)
- **TypeBox schemas** for Expanso pipelines and validation results (`expanso-schemas.ts`)
- **NL-to-Pipeline Generator** (`expanso-generator.ts`): converts plain English descriptions to validated YAML pipelines via LLM
- **Cloud Validation Sandbox** (`expanso-sandbox.ts`): runs the real `expanso validate` binary in Docker isolation
- **Unified Agent Tool** (`expanso-tool.ts`): combines build + validate into a single LLM-callable tool
### Crusty Bot Integration (US-006 to US-008)
- **System prompt & personas**: Expanso-aware instructions for the Crusty bot agent
- **Bot command handlers**: `/expanso build`, `/expanso validate`, `/expanso fix` for Discord & Telegram
- **Interactive Fix button** (`expanso-fix-button.ts`): one-click pipeline repair after validation failure, for both Discord (component buttons) and Telegram (inline keyboard callbacks)
### Security & Documentation (US-009 to US-010)
- **Security model**: sandbox isolation, audit logging, bot-side rate limiting and allowlist controls
- **Unified documentation** (`docs/tools/expanso.md`): end-to-end guide covering NL pipeline building, cloud validation, bot commands, platform setup guides, and agent tool reference
## Why
- `validate.expanso.io` and `mcp.expanso.io` currently operate independently; this PR creates a cohesive natural language interface to both
- Users can describe pipelines in plain English and immediately get valid YAML + validation results without learning the full Expanso DSL
- The Fix button creates an interactive feedback loop: validate → see errors → click Fix → get corrected pipeline
## What Was Tested
- **66 feature tests** across 3 test suites:
- US-001 schema tests: 19 tests (pipeline schema, validation result schema)
- US-002 generator tests: 23 tests (NL-to-YAML, error handling, YAML round-trip, DI mock)
- US-003 sandbox config tests: 24 tests (Docker config, validation result structure)
- **47 Fix-button unit tests** (constants, Discord helpers, Telegram helpers, formatters)
- **9 Telegram callback integration tests** (handler registration, system event dispatch)
- **Zero TypeScript errors** in all new Expanso files (`npx tsc --noEmit`)
- **No regressions**: existing sandbox tests (tool-policy.test.ts: 3 tests) still pass
- **Build succeeds** (pre-existing TS errors in `safe-mode.ts`/`atomic-config.ts` are unrelated to this feature)
## Files Changed
```
src/agents/tools/expanso-schemas.ts # Pipeline & validation TypeBox schemas
src/agents/tools/expanso-schemas.test.ts # 19 schema tests
src/agents/tools/expanso-generator.ts # NL-to-pipeline generator
src/agents/tools/expanso-generator.test.ts # 23 generator tests
src/agents/tools/expanso-validator.ts # Cloud validation tool
src/agents/tools/expanso-tool.ts # Unified build+validate tool
src/agents/tools/expanso-fix-button.ts # Interactive Fix button helpers
src/agents/tools/expanso-fix-button.test.ts # 47 fix-button tests
src/telegram/bot-handlers.ts # Telegram Fix callback handler
src/telegram/bot-handlers.expanso-fix.test.ts # 9 integration tests
docs/tools/expanso.md # Unified documentation
.gitignore # Added *.key, *.pem
```
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR introduces a natural language interface for building and validating Expanso pipelines, with bot integration for Discord and Telegram. The implementation includes comprehensive test coverage (2100+ test lines) and well-structured TypeBox schemas.
## Key Changes
- **Pipeline tooling**: NL-to-YAML generator using Claude Opus 4.6, validator with structured error parsing, unified tool combining build/validate/fix actions
- **Bot integration**: Discord and Telegram handlers with interactive "Fix" button for one-click pipeline repair
- **Security infrastructure**: Audit logging, sandbox configuration (Docker isolation specs), security findings collectors
- **Atomic config system**: New config management with backup/rollback, validation, and 12-factor compliance checks
## Critical Issues Found
**Security Gap - Validator Not Using Docker Isolation**: The core security claim is unimplemented. `defaultValidateYaml()` (line 185-244 in `src/agents/tools/expanso-validator.ts`) runs the `expanso` binary directly on the host via `execFileAsync` without any Docker wrapper. Line 201-202 states "In production, this would be wrapped in the Docker sandbox" but this IS the production code. The Docker sandbox config (`resolveExpansoValidationSandboxDockerConfig`) and `Dockerfile.expanso-sandbox` are never invoked. Line 309 explicitly logs `sandboxed: false`. This contradicts the PR description's claim of "Docker isolation" and creates a security vulnerability when validating untrusted YAML.
**Binary Identity Confusion**: `Dockerfile.expanso-sandbox` downloads `benthos` from `benthosdev/benthos` GitHub releases but renames it to `expanso` (lines 47-53). The relationship between Benthos and Expanso is unclear - documentation should clarify if they're the same tool or explain the rename.
## Additional Issues
- Secret detection regex patterns in `atomic-config.ts:145-150` are too broad (will flag `"robot"`, `"sk-example"`)
- System prompt sent as user message instead of system parameter in `expanso-generator.ts:161-163`
- Fix action returns incorrect dummy pipeline when starting YAML is already valid (`expanso-tool.ts:273-277`)
## Positive Notes
- Comprehensive test coverage with dependency injection for mockability
- Well-documented TypeBox schemas with clear descriptions
- Proper audit logging structure
- Interactive fix button UX is well-designed for both platforms
<h3>Confidence Score: 2/5</h3>
- This PR has a critical security gap where the validator runs untrusted code on the host without Docker isolation
- Score of 2 reflects a critical security issue: the validator's `defaultValidateYaml` function executes the `expanso` binary directly on the host without Docker isolation, contradicting the PR's security claims. While the Docker sandbox configuration exists, it's never used. The code explicitly logs `sandboxed: false`. This creates a security vulnerability when processing untrusted YAML. Additional concerns include binary identity confusion (Benthos vs Expanso) and overly broad secret detection patterns. The comprehensive test coverage and well-structured code would merit a higher score if the core security implementation were complete.
- `src/agents/tools/expanso-validator.ts` requires immediate attention - the default validator must be updated to use Docker isolation. `Dockerfile.expanso-sandbox` needs clarification on the Benthos/Expanso relationship. `src/config/atomic-config.ts` secret detection patterns need refinement.
<sub>Last reviewed commit: 51075e4</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#8186: fix(sandbox): validate setupCommand to prevent shell injection
by yubrew · 2026-02-03
74.7%
#21136: fix(security): harden agent autonomy controls
by novalis133 · 2026-02-19
74.1%
#23175: feat(security): runtime safety — transcript retention, tool call bu...
by ihsanmokhlisse · 2026-02-22
73.3%
#17007: fixed stacy voice
by tashen247 · 2026-02-15
73.1%
#8821: Security: Holistic capability-based sandbox (replaces pattern-match...
by tonioloewald · 2026-02-04
72.7%
#19500: Custom rust ultimate rewrite
by adybag14-cyber · 2026-02-17
72.6%
#8161: fix(sandbox): block dangerous environment variables from Docker con...
by yubrew · 2026-02-03
72.3%
#14222: core: add needsApproval to before_tool_call; move AgentShield to ex...
by Eventedge · 2026-02-11
72.0%
#20435: fix(exec): prioritize user 'always allow' config over tool defaults...
by ChisomUma · 2026-02-18
72.0%
#23629: fix(msteams): sanitize error messages sent to users (CWE-209)
by lewiswigmore · 2026-02-22
71.9%