
#15606: LLM Task: add explicit thinking level wiring

by xadenryan · open · 2026-02-13 17:23

Labels: `docs`, `extensions: llm-task`, `size: S`
## Problem

`llm-task` could resolve and pass `provider/model`, but it had no explicit `thinking` input and did not forward a thinking level to embedded runs. As a result, workflow users could not intentionally set thinking behavior per `llm-task` call, and runs implicitly fell back to runner defaults.

## Summary of Solution

This PR adds explicit thinking-level support to `llm-task` and aligns validation behavior with existing OpenClaw patterns.

### What changed

- Added an optional `thinking` parameter to the `llm-task` tool input schema.
- Normalized and validated `thinking` values using the shared thinking helpers.
- Added model-aware validation errors for invalid values.
- Added explicit `xhigh` guardrails using existing support checks.
- Passed the validated thinking level through to the embedded runner as `thinkLevel`.
- Updated docs to include the new `thinking` parameter and example usage.

### Implementation details

- `extensions/llm-task/src/llm-task-tool.ts`
  - Added `thinking` to the tool parameters.
  - Added normalization/validation:
    - invalid value -> clear error with a supported-levels hint
    - unsupported `xhigh` -> explicit model-support error
  - Forwarded to the runner:
    - `runEmbeddedPiAgent(..., thinkLevel, ...)`
- `extensions/llm-task/src/llm-task-tool.test.ts`
  - Added tests for:
    - passing an explicit thinking override
    - alias normalization (`on` -> `low`)
    - invalid thinking rejection
    - unsupported `xhigh` rejection
    - omitted-thinking regression (`thinkLevel` not passed)
- Docs updated:
  - `extensions/llm-task/README.md`
  - `docs/tools/llm-task.md`
  - `docs/tools/lobster.md`

## Verification / Tests Performed

### Targeted unit tests

- Command run:
  - `pnpm test extensions/llm-task/src/llm-task-tool.test.ts`
- Result:
  - `1` test file passed
  - `13` tests passed
  - `0` failures

### Smoke Test Results

1. `thinking: "low"` path
   - Result payload: `{"ok":true}`
   - Gateway logs confirm the llm-task sub-run received thinking:
     - `provider=openai-codex model=gpt-5.3-codex thinking=low`
2. Invalid thinking (`"banana"`)
   - Result payload text:
     - `Invalid thinking level "banana". Use one of: off, minimal, low, medium, high, xhigh.`
   - This matches the new validation behavior.
3. Unsupported `xhigh` for an Anthropic model
   - Result payload text:
     - `Thinking level "xhigh" is only supported for ...`
   - This matches the new `xhigh` guardrail behavior.
4. Omitted thinking (backward compatibility)
   - Result payload: `{"ok":true}`
   - Gateway logs confirm the llm-task sub-run defaulted:
     - `provider=openai-codex model=gpt-5.3-codex thinking=off`

## Notes

- No plugin config schema changes were introduced in this PR.
- Existing `llm-task` calls without `thinking` remain backward-compatible.

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR adds an optional `thinking` parameter to the `llm-task` extension tool schema, validates and normalizes it using the shared `src/auto-reply/thinking.ts` helpers, rejects unsupported `xhigh`, and forwards the validated level to embedded agent runs via `runEmbeddedPiAgent({ thinkLevel })`. Docs and unit tests were updated to cover the new parameter and basic validation/forwarding behavior.

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk.
- Changes are localized to the llm-task extension, rely on existing shared thinking normalization/validation helpers, and are covered by targeted unit tests. The forwarded `thinkLevel` matches the embedded runner parameter (`RunEmbeddedPiAgentParams.thinkLevel`).
- No files require special attention.

<sub>Last reviewed commit: dad5fd0</sub>

<!-- /greptile_comment -->
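The normalization and guardrail flow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper name `resolveThinking`, the alias table, and the `modelSupportsXhigh` flag are stand-ins for the shared helpers in `src/auto-reply/thinking.ts` and the existing support checks.

```typescript
// Hypothetical sketch of the validation flow; names are illustrative,
// not the real OpenClaw helpers.
type ThinkLevel = "off" | "minimal" | "low" | "medium" | "high" | "xhigh";

const LEVELS: readonly ThinkLevel[] = ["off", "minimal", "low", "medium", "high", "xhigh"];
// Alias normalization, e.g. "on" -> "low" (per the PR's test list).
const ALIASES: Record<string, ThinkLevel> = { on: "low" };

function resolveThinking(
  input: string | undefined,
  modelSupportsXhigh: boolean,
): ThinkLevel | undefined {
  if (input === undefined) return undefined; // omitted: leave runner defaults intact
  const normalized = ALIASES[input] ?? (input as ThinkLevel);
  if (!LEVELS.includes(normalized)) {
    // Mirrors the "clear error with supported levels hint" behavior.
    throw new Error(`Invalid thinking level "${input}". Use one of: ${LEVELS.join(", ")}.`);
  }
  if (normalized === "xhigh" && !modelSupportsXhigh) {
    // Mirrors the explicit xhigh model-support guardrail.
    throw new Error(`Thinking level "xhigh" is not supported by this model.`);
  }
  return normalized;
}
```

Under this sketch, `resolveThinking("on", false)` yields `"low"`, `"banana"` raises the supported-levels error, and `"xhigh"` raises only when the model lacks support.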
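The backward-compatibility guarantee in the smoke tests (omitted `thinking` means `thinkLevel` is not passed at all) can be illustrated with a conditional-spread pattern. `RunParams` and `buildRunParams` here are hypothetical stand-ins, not the real `RunEmbeddedPiAgentParams` shape or `runEmbeddedPiAgent` call.

```typescript
// Hypothetical sketch of the forwarding step: thinkLevel reaches the embedded
// runner only when the caller supplied a thinking value.
interface RunParams {
  provider: string;
  model: string;
  thinkLevel?: string;
}

function buildRunParams(provider: string, model: string, thinkLevel?: string): RunParams {
  return {
    provider,
    model,
    // Conditional spread keeps the key absent entirely when thinking was omitted,
    // so the runner falls back to its own defaults.
    ...(thinkLevel !== undefined ? { thinkLevel } : {}),
  };
}
```

The conditional spread matters because an explicit `thinkLevel: undefined` key could still be observed by the runner, whereas an absent key cannot.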
