#6730: feat: Make OpenAI Codex CLI models usable - reasoning effort directive
Labels: agents · Cluster: Context Window and Model Updates
# Make OpenAI Codex CLI Models Actually Usable
## The Problem
Using OpenAI models through Codex CLI in OpenClaw is painful:
1. **Model switching is cryptic** — You need to know exact model IDs like `openai-codex/gpt-5.2-codex` instead of just saying "switch to codex"
2. **Reasoning effort is inaccessible** — Codex CLI supports `model_reasoning_effort` (low/medium/high) but there's no way to control it from OpenClaw. You're stuck with whatever's in your config file.
3. **New models require manual config** — When OpenAI releases new models, you have to manually add them to your allowlist
**What users want:**
```
switch to codex high reasoning
```
**What they have to do today:**
```
/model openai-codex/gpt-5.2-codex
# Then manually edit ~/.openclaw/config.json to set reasoning effort
# Then restart the gateway
```
## The Solution
### 1. Natural Language Model Switching + Reasoning Effort
Just say what you want:
```
switch to codex high reasoning
use gpt-5.2 reasoning effort medium
/effort high
```
The system parses the natural-language request, switches the model, and sets the reasoning effort in one step, passing `-c model_reasoning_effort="high"` to Codex CLI automatically.
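The natural-language parsing above could be sketched as a small pattern matcher. This is a hypothetical illustration, not the PR's actual implementation; the function name and regexes are assumptions (the real parsing lives in `auto-reply/reply/directives.js`):

```javascript
// Hypothetical sketch: extract a reasoning-effort level from free-form text.
// Patterns and function name are illustrative, not the PR's real code.
function extractReasoningEffort(text) {
  const patterns = [
    /\/effort\s+(low|medium|high|none)\b/i,          // "/effort high"
    /\b(low|medium|high)\s+reasoning\b/i,            // "high reasoning"
    /\breasoning(?:\s+effort)?\s+(low|medium|high|none)\b/i, // "reasoning effort medium"
  ];
  for (const re of patterns) {
    const m = text.match(re);
    if (m) return m[1].toLowerCase();
  }
  return null; // no effort directive found
}

console.log(extractReasoningEffort("switch to codex high reasoning")); // "high"
console.log(extractReasoningEffort("/effort medium"));                 // "medium"
```

A plain-text message with no effort phrase returns `null`, so the directive layer can leave the session's current setting untouched.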
### 2. Auto-Discover Available Models
```bash
openclaw models sync --yes
```
Discovers all models available through your CLI backends and adds them to your allowlist. No more manual config editing when new models drop.
## How It Works
### Reasoning Effort Directive
New `/effort` directive (like `/thinking` or `/verbose`):
- `/effort high` — deep reasoning
- `/effort medium` — balanced
- `/effort low` — faster responses
- `/effort none` — disable
Also parses natural language:
- "high reasoning"
- "reasoning effort medium"
- "switch to codex low reasoning"
Session-persistent — set it once, it sticks until you change it.
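The file list below mentions a `normalizeReasoningEffort()` helper; a plausible sketch of such a normalizer is shown here. The alias table is an assumption for illustration, not the PR's actual mapping:

```javascript
// Hypothetical sketch of normalizeReasoningEffort(): map loose user input to a
// canonical effort level, or null when unrecognized. Aliases are assumptions.
const CANONICAL = new Set(["low", "medium", "high", "none"]);
const ALIASES = { minimal: "low", med: "medium", max: "high", off: "none" };

function normalizeReasoningEffort(value) {
  if (typeof value !== "string") return null;
  const v = value.trim().toLowerCase();
  if (CANONICAL.has(v)) return v;
  return ALIASES[v] ?? null;
}
```

Returning `null` for unrecognized input lets the caller decide whether to warn the user or fall back to the session default, which is exactly the "invalid effort levels stripped without feedback" edge case flagged in the review.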
### Model Sync Command
```bash
openclaw models sync # Discover all
openclaw models sync --provider codex # Just Codex models
openclaw models sync --dry-run # Preview changes
```
## Files Changed
**Model Sync Command** (3 files)
- `commands/models/sync.js` — NEW: main sync implementation
- `cli/models-cli.js` — register CLI command
- `commands/models.js` — export for CLI binding
**Reasoning Effort Directive** (7 files)
- `auto-reply/thinking.js` — `normalizeReasoningEffort()`
- `auto-reply/reply/directives.js` — `extractReasoningEffortDirective()`
- `auto-reply/reply/directive-handling.parse.js` — parsing chain
- `auto-reply/reply/directive-handling.impl.js` — session persistence
- `auto-reply/reply/get-reply-run.js` — pass to runner
- `auto-reply/reply/get-reply-directives-apply.js` — directive detection
- `auto-reply/reply/get-reply-directives-utils.js` — clear directives
**CLI Runner Integration** (4 files)
- `auto-reply/reply/agent-runner-execution.js` — build configOverrides for Codex
- `agents/cli-runner.js` — accept configOverrides param
- `agents/cli-runner/helpers.js` — add to CLI args
- `agents/cli-backends.js` — `configArg: "-c"` for Codex backend
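The runner integration above can be sketched as a helper that expands `configOverrides` into repeated `configArg` pairs on the CLI argv. This is an illustrative sketch, not the PR's real helper; the function name is assumed, and the quoting convention (JSON-stringified values, matching the `-c model_reasoning_effort="high"` form shown earlier) may differ from what a given CLI expects:

```javascript
// Hypothetical sketch: turn configOverrides into repeated "-c key=value"
// argv pairs using the backend's configArg. Names are illustrative.
function buildConfigArgs(configOverrides, configArg = "-c") {
  const args = [];
  for (const [key, value] of Object.entries(configOverrides ?? {})) {
    // JSON.stringify quotes string values: model_reasoning_effort="high"
    args.push(configArg, `${key}=${JSON.stringify(value)}`);
  }
  return args;
}

console.log(buildConfigArgs({ model_reasoning_effort: "high" }));
// ["-c", 'model_reasoning_effort="high"']
```

Keying the flag off the backend's `configArg` (rather than hardcoding `-c`) keeps the helper reusable for other CLI backends that take a different override flag.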
## Testing
```bash
# Test model sync
openclaw models sync --dry-run
# Test reasoning effort
/effort high
# Then send a message — check logs for: -c model_reasoning_effort="high"
```
## Why This Matters
OpenAI's reasoning models are powerful, but the UX for using them through OpenClaw is terrible. This PR makes it dead simple: say what you want in plain English and it just works.
---
*All changes are backward compatible. Existing configs and workflows unaffected.*
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds a new “reasoning effort” directive, persists it in session state, and threads it through the reply/runner pipeline as provider-specific CLI config overrides (notably for the Codex backend via a configurable `configArg`). It also extends CLI backend config to support passing arbitrary config overrides down into the CLI argv.
Main risk areas are correctness of the override value formatting and directive edge cases (invalid effort levels being stripped without user feedback, and no clear/unset path back to defaults). There’s also a very large new npm lockfile which may be unintended churn given the repo’s pnpm/bun workflow.
<h3>Confidence Score: 3/5</h3>
- Moderately safe to merge, but there are a couple of correctness edge cases that can make the new feature silently not work as intended.
- Core changes are localized and follow existing directive/session patterns, but the Codex config override formatting looks likely to be wrong (quoted effort token), and invalid effort directives can be stripped without feedback. The new package-lock.json is large and may be unintended churn.
- src/auto-reply/reply/agent-runner-execution.ts, src/auto-reply/reply/directives.ts, src/auto-reply/reply/directive-handling.impl.ts, package-lock.json
<!-- /greptile_comment -->