#6730: feat: Make OpenAI Codex CLI models usable - reasoning effort directive
Labels: agents · Cluster: Context Window and Model Updates
# Make OpenAI Codex CLI Models Actually Usable
## The Problem
Using OpenAI models through Codex CLI in OpenClaw is painful:
1. **Model switching is cryptic** — You need to know exact model IDs like `openai-codex/gpt-5.2-codex` instead of just saying "switch to codex"
2. **Reasoning effort is inaccessible** — Codex CLI supports `model_reasoning_effort` (low/medium/high) but there's no way to control it from OpenClaw. You're stuck with whatever's in your config file.
3. **New models require manual config** — When OpenAI releases new models, you have to manually add them to your allowlist
**What users want:**
```
switch to codex high reasoning
```
**What they have to do today:**
```
/model openai-codex/gpt-5.2-codex
# Then manually edit ~/.openclaw/config.json to set reasoning effort
# Then restart the gateway
```
## The Solution
### 1. Natural Language Model Switching + Reasoning Effort
Just say what you want:
```
switch to codex high reasoning
use gpt-5.2 reasoning effort medium
/effort high
```
The system parses the natural-language request, switches the model, and sets the reasoning effort in one step, passing `-c model_reasoning_effort="high"` to Codex CLI automatically.
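The natural-language parsing above could be sketched as a small pattern matcher. This is a hypothetical illustration, not the PR's actual implementation; the function name and regexes are assumptions (the real parsing lives in `auto-reply/reply/directives.js`):

```javascript
// Hypothetical sketch: extract a reasoning-effort level from free-form text.
// Patterns and function name are illustrative, not the PR's real code.
function extractReasoningEffort(text) {
  const patterns = [
    /\/effort\s+(low|medium|high|none)\b/i,          // "/effort high"
    /\b(low|medium|high)\s+reasoning\b/i,            // "high reasoning"
    /\breasoning(?:\s+effort)?\s+(low|medium|high|none)\b/i, // "reasoning effort medium"
  ];
  for (const re of patterns) {
    const m = text.match(re);
    if (m) return m[1].toLowerCase();
  }
  return null; // no effort directive found
}

console.log(extractReasoningEffort("switch to codex high reasoning")); // "high"
console.log(extractReasoningEffort("/effort medium"));                 // "medium"
```

A plain-text message with no effort phrase returns `null`, so the directive layer can leave the session's current setting untouched.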
### 2. Auto-Discover Available Models
```bash
openclaw models sync --yes
```
Discovers all models available through your CLI backends and adds them to your allowlist. No more manual config editing when new models drop.
## How It Works
### Reasoning Effort Directive
New `/effort` directive (like `/thinking` or `/verbose`):
- `/effort high` — deep reasoning
- `/effort medium` — balanced
- `/effort low` — faster responses
- `/effort none` — disable
Also parses natural language:
- "high reasoning"
- "reasoning effort medium"
- "switch to codex low reasoning"
Session-persistent — set it once, it sticks until you change it.
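The file list below mentions a `normalizeReasoningEffort()` helper; a plausible sketch of such a normalizer is shown here. The alias table is an assumption for illustration, not the PR's actual mapping:

```javascript
// Hypothetical sketch of normalizeReasoningEffort(): map loose user input to a
// canonical effort level, or null when unrecognized. Aliases are assumptions.
const CANONICAL = new Set(["low", "medium", "high", "none"]);
const ALIASES = { minimal: "low", med: "medium", max: "high", off: "none" };

function normalizeReasoningEffort(value) {
  if (typeof value !== "string") return null;
  const v = value.trim().toLowerCase();
  if (CANONICAL.has(v)) return v;
  return ALIASES[v] ?? null;
}
```

Returning `null` for unrecognized input lets the caller decide whether to warn the user or fall back to the session default, which is exactly the "invalid effort levels stripped without feedback" edge case flagged in the review.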
### Model Sync Command
```bash
openclaw models sync # Discover all
openclaw models sync --provider codex # Just Codex models
openclaw models sync --dry-run # Preview changes
```
## Files Changed
**Model Sync Command** (3 files)
- `commands/models/sync.js` — NEW: main sync implementation
- `cli/models-cli.js` — register CLI command
- `commands/models.js` — export for CLI binding
**Reasoning Effort Directive** (7 files)
- `auto-reply/thinking.js` — `normalizeReasoningEffort()`
- `auto-reply/reply/directives.js` — `extractReasoningEffortDirective()`
- `auto-reply/reply/directive-handling.parse.js` — parsing chain
- `auto-reply/reply/directive-handling.impl.js` — session persistence
- `auto-reply/reply/get-reply-run.js` — pass to runner
- `auto-reply/reply/get-reply-directives-apply.js` — directive detection
- `auto-reply/reply/get-reply-directives-utils.js` — clear directives
**CLI Runner Integration** (4 files)
- `auto-reply/reply/agent-runner-execution.js` — build configOverrides for Codex
- `agents/cli-runner.js` — accept configOverrides param
- `agents/cli-runner/helpers.js` — add to CLI args
- `agents/cli-backends.js` — `configArg: "-c"` for Codex backend
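The runner integration above can be sketched as a helper that expands `configOverrides` into repeated `configArg` pairs on the CLI argv. This is an illustrative sketch, not the PR's real helper; the function name is assumed, and the quoting convention (JSON-stringified values, matching the `-c model_reasoning_effort="high"` form shown earlier) may differ from what a given CLI expects:

```javascript
// Hypothetical sketch: turn configOverrides into repeated "-c key=value"
// argv pairs using the backend's configArg. Names are illustrative.
function buildConfigArgs(configOverrides, configArg = "-c") {
  const args = [];
  for (const [key, value] of Object.entries(configOverrides ?? {})) {
    // JSON.stringify quotes string values: model_reasoning_effort="high"
    args.push(configArg, `${key}=${JSON.stringify(value)}`);
  }
  return args;
}

console.log(buildConfigArgs({ model_reasoning_effort: "high" }));
// ["-c", 'model_reasoning_effort="high"']
```

Keying the flag off the backend's `configArg` (rather than hardcoding `-c`) keeps the helper reusable for other CLI backends that take a different override flag.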
## Testing
```bash
# Test model sync
openclaw models sync --dry-run
# Test reasoning effort
/effort high
# Then send a message — check logs for: -c model_reasoning_effort="high"
```
## Why This Matters
OpenAI's reasoning models are powerful, but the UX for using them through OpenClaw is terrible. This PR makes it dead simple: say what you want in plain English and it just works.
---
*All changes are backward compatible. Existing configs and workflows unaffected.*
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR adds a new “reasoning effort” directive, persists it in session state, and threads it through the reply/runner pipeline as provider-specific CLI config overrides (notably for the Codex backend via a configurable `configArg`). It also extends CLI backend config to support passing arbitrary config overrides down into the CLI argv.
Main risk areas are correctness of the override value formatting and directive edge cases (invalid effort levels being stripped without user feedback, and no clear/unset path back to defaults). There’s also a very large new npm lockfile which may be unintended churn given the repo’s pnpm/bun workflow.
<h3>Confidence Score: 3/5</h3>
- Moderately safe to merge, but there are a couple of correctness edge cases that can make the new feature silently not work as intended.
- Core changes are localized and follow existing directive/session patterns, but the Codex config override formatting looks likely to be wrong (quoted effort token), and invalid effort directives can be stripped without feedback. The new package-lock.json is large and may be unintended churn.
- src/auto-reply/reply/agent-runner-execution.ts, src/auto-reply/reply/directives.ts, src/auto-reply/reply/directive-handling.impl.ts, package-lock.json
<!-- /greptile_comment -->