#16298: feat(xai): switch grok-4-1-fast variants by thinking level
Labels: gateway, agents, stale, size: M
## Summary
- Add thinking-aware model-family resolution for xAI `grok-4-1-fast` so runtime routing selects `grok-4-1-fast-reasoning` for thinking-on levels and `grok-4-1-fast-non-reasoning` for `/think off`.
- Integrate the resolver into reply/runtime session model resolution and status output so effective model selection is consistent across execution paths.
- Add targeted coverage for family detection, runtime routing behavior, and session model resolution fallbacks.
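
The routing rule in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `ModelFamilySketch`, `findFamilySketch`, `resolveVariantSketch`, and the thinking-level names other than `off` are hypothetical, and it assumes that the bare `grok-4-1-fast` name and both suffixed variants all count as members of the family.

```typescript
// Hypothetical sketch of thinking-aware variant selection.
// Names and shapes are illustrative assumptions, not the PR's actual API.
type ThinkLevelSketch = "off" | "low" | "medium" | "high";

interface ModelFamilySketch {
  base: string;          // family name, e.g. "grok-4-1-fast"
  reasoning: string;     // variant used when thinking is on
  nonReasoning: string;  // variant used for `/think off`
}

const GROK_4_1_FAST: ModelFamilySketch = {
  base: "grok-4-1-fast",
  reasoning: "grok-4-1-fast-reasoning",
  nonReasoning: "grok-4-1-fast-non-reasoning",
};

// A model belongs to a family if it is the base name or either variant (assumption).
function findFamilySketch(
  model: string,
  families: ModelFamilySketch[],
): ModelFamilySketch | undefined {
  return families.find(
    (f) => model === f.base || model === f.reasoning || model === f.nonReasoning,
  );
}

// Pick the variant for the given thinking level; models outside any family pass through.
function resolveVariantSketch(
  model: string,
  level: ThinkLevelSketch,
  families: ModelFamilySketch[] = [GROK_4_1_FAST],
): string {
  const family = findFamilySketch(model, families);
  if (!family) return model;
  return level === "off" ? family.nonReasoning : family.reasoning;
}
```

Under these assumptions, a session configured with any member of the family resolves to the same effective model for a given thinking level, which is the consistency the second bullet aims for.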
## Verification
- `pnpm test -- src/agents/model-selection-thinking.test.ts src/auto-reply/reply/model-selection.override-respected.test.ts src/gateway/session-utils.model-ref.test.ts`
- `pnpm tsgo`
- `pnpm build`
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Added thinking-aware model resolution for xAI `grok-4-1-fast` that automatically switches between `-reasoning` and `-non-reasoning` variants based on thinking level (`/think off` vs thinking-on).
- Introduced `model-families.ts` to define reasoning model families and `isReasoningFamilyAllowed` to check allowlist compatibility
- `resolveThinkingAwareModelRef` in `model-selection-thinking.ts` handles runtime routing
- Integrated into reply execution (`get-reply-run.ts`), model selection state (`model-selection.ts`), session resolution (`session-utils.ts`), and status output (`commands-status.ts`)
- Added comprehensive test coverage across family detection, runtime routing, session model resolution, and allowlist behavior
The implementation follows existing patterns and is covered by tests. One minor issue: the status command does not pass `allowedModelKeys` to `resolveThinkingAwareModelRef`, so it could display a model variant that allowlist restrictions would block. This is an edge case.
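
To illustrate why the missing allowlist argument matters, here is a minimal sketch. The function `applyAllowlistSketch`, the `provider/model` key format, and the fallback of keeping the configured model when the preferred variant is blocked are all assumptions made for illustration; the PR's actual fallback behavior is not described here.

```typescript
// Hypothetical sketch: how an allowlist can change the effective variant.
// The key format ("xai/<model>") and the fallback rule are assumptions.
function applyAllowlistSketch(
  configured: string,             // model key the session is configured with
  preferred: string,              // variant chosen from the thinking level
  allowedModelKeys?: Set<string>, // undefined or empty means "no restriction" (assumption)
): string {
  if (!allowedModelKeys || allowedModelKeys.size === 0) return preferred;
  // Keep the configured model when the preferred variant is not allowed.
  return allowedModelKeys.has(preferred) ? preferred : configured;
}

const allowed = new Set(["xai/grok-4-1-fast-reasoning"]);

// Execution path: passes the allowlist, so the blocked variant is not used.
const executed = applyAllowlistSketch(
  "xai/grok-4-1-fast-reasoning",
  "xai/grok-4-1-fast-non-reasoning",
  allowed,
);

// Status path without the allowlist: reports the blocked variant instead.
const displayed = applyAllowlistSketch(
  "xai/grok-4-1-fast-reasoning",
  "xai/grok-4-1-fast-non-reasoning",
);
```

In this sketch `executed` and `displayed` differ, which mirrors the display-only mismatch the review describes. Passing the same allowlist on both paths would make them agree.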
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge with one minor edge case to address
- The implementation is well-structured with comprehensive test coverage, follows existing codebase patterns, and correctly handles the core functionality. Score is 4 (not 5) due to the missing `allowedModelKeys` parameter in the status command's thinking-aware resolution, which could cause status output to show an incorrect model variant in edge cases involving allowlists. This is a display-only issue that wouldn't affect actual execution.
- src/auto-reply/reply/commands-status.ts requires attention for allowlist handling
<sub>Last reviewed commit: 07f51c4</sub>
<!-- /greptile_comment -->
Most Similar PRs
- #11561: fix: respect supportsReasoningEffort compat flag for xAI/Grok reaso... (by baxter-lindsaar · 2026-02-08 · 80.5%)
- #22797: Feat/auto thinking mode (by jrthib · 2026-02-21 · 79.0%)
- #20620: feat: add anthropic/claude-opus-4-6 to XHIGH_MODEL_REFS (by chungjchris · 2026-02-19 · 78.7%)
- #10998: fix(agents): pass session thinking/reasoning levels to session_stat... (by wony2 · 2026-02-07 · 78.3%)
- #16899: feat(config): per-agent and per-model thinking defaults (by jh280722 · 2026-02-15 · 78.0%)
- #7137: fix: add openai-codex/gpt-5.2 to XHIGH_MODEL_REFS (by sauerdaniel · 2026-02-02 · 77.3%)
- #21558: config: support agents.list[].thinkingDefault (by Uarmagan · 2026-02-20 · 77.0%)
- #19311: feat: add github-copilot gpt-5.3-codex with xhigh support (AI-assis... (by mrutunjay-kinagi · 2026-02-17 · 76.9%)
- #19384: Auto-reply: allow xhigh for OpenAI-compatible provider aliases (by 0x4007 · 2026-02-17 · 76.4%)
- #15606: LLM Task: add explicit thinking level wiring (by xadenryan · 2026-02-13 · 76.4%)