#8258: feat: Add smart model tiering for cost optimization
docs
agents
stale
Cluster:
Model Fallbacks and Rate Limiting
## Summary
- Automatically route simple queries (greetings, yes/no, short questions) to cheaper models
- - Preserve expensive models for complex tasks (coding, multi-step reasoning, long-form content)
- - Adds comprehensive budget guide documentation for running OpenClaw with local/cheap models
## Configuration
```json
{
"agents": {
"defaults": {
"model": {
"primary": "anthropic/claude-sonnet-4-5",
"tiering": {
"enabled": true,
"simple": "ollama/llama3.3"
}
}
}
}
}
```
## Features
- Pattern-based complexity classification (code requests, debugging, planning)
- - Configurable length threshold for complex queries (default: 500 chars)
- - Custom regex patterns for domain-specific complexity rules
- - Falls back to primary model for unclassified queries
## Test plan
- [x] Unit tests for complexity classification (28 tests passing)
- [ ] - [x] Tests for config resolution and model selection
- [ ] - [x] Edge cases (empty queries, custom patterns, invalid regex)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds “smart model tiering” so the reply pipeline can automatically route simple user messages (greetings/acks/very short prompts) to a configured cheaper model, while leaving complex queries on the primary model. It introduces a new `agents.defaults.model.tiering` config block (Zod schema + UI labels/help), integrates the model switch into `resolveReplyDirectives`, and adds a new `src/agents/model-tiering.ts` module with unit tests.
Docs are updated to include a new budget guide page and link it from the providers index, plus navigation is updated to surface the new doc.
<h3>Confidence Score: 4/5</h3>
- This PR is generally safe to merge, with a couple of schema/type consistency issues worth addressing.
- Core tiering logic is self-contained and covered by unit tests, and integration is gated on the absence of an explicit model directive. Main concerns are maintainability/type drift: `TieringConfig.enabled` is required in one place but optional everywhere else, and `rateLimitStrategy` is added to the Zod schema/UI hints without corresponding TS types or runtime behavior in this PR.
- src/agents/model-tiering.ts; src/config/zod-schema.agent-defaults.ts; src/config/types.agent-defaults.ts; src/config/schema.ts
<!-- greptile_other_comments_section -->
<sub>(3/5) Reply to the agent's comments like "Can you suggest a fix for this @greptileai?" or ask follow-up questions!</sub>
<!-- /greptile_comment -->
Most Similar PRs
#8256: feat: Add rate limit strategy configuration
by revenuestack · 2026-02-03
78.3%
#13658: fix: silent model failover with fallback notification
by taw0002 · 2026-02-10
78.1%
#9123: Feat/smart router backport and custom model provider
by JuliusYang3311 · 2026-02-04
77.1%
#16677: feat(routing): intelligent model routing via pre-route hook
by gonesurfing · 2026-02-15
76.2%
#7770: feat(routing): Smart Router V2 - Configuration-driven model dispatc...
by zzjj7000 · 2026-02-03
75.9%
#6730: feat: Make OpenAI Codex CLI models usable - reasoning effort directive
by levineam · 2026-02-02
75.7%
#7292: feat: Implement subagent model inheritance system
by levineam · 2026-02-02
75.6%
#20587: feat: add Tetrate Agent Router Service provider
by RicHincapie · 2026-02-19
75.4%
#14640: feat(agents): support per-agent temperature and maxTokens in agents...
by lailoo · 2026-02-12
75.2%
#11089: feat(compaction): support customInstructions and model override for...
by p697 · 2026-02-07
75.1%