#16399: feat: auto-escalate thinking level based on context window usage

by cmfinlan open 2026-02-14 18:31

commands stale size: L

Cluster: Agent Thinking Defaults Enhancement

## Summary

This PR implements automatic thinking level escalation based on context window usage. As conversations grow and approach the context window limit, models with low thinking levels can become prone to confident hallucinations. This feature automatically increases the thinking level when the context window fills up, helping maintain response quality in long sessions.

## Problem

When using low thinking levels (e.g., "off", "minimal", "low") with models that have growing context:

- The model becomes increasingly prone to confident hallucinations as context fills
- Users may not realize they need to manually increase thinking levels for long conversations
- Session quality degrades over time without clear feedback

## Solution

Added opt-in `thinkingEscalation` configuration that automatically escalates thinking level when context window usage reaches configured thresholds.

### Configuration Example

```yaml
agents:
  defaults:
    thinkingEscalation:
      enabled: true
      thresholds:
        - atContextPercent: 50
          thinking: low
        - atContextPercent: 75
          thinking: medium
        - atContextPercent: 90
          thinking: high
```

### Key Behaviors

- **Opt-in**: Disabled by default; users must explicitly enable it
- **Only escalates**: Never downgrades thinking level within a session
- **Threshold-based**: Configure multiple thresholds at different context percentages
- **Highest applicable**: When multiple thresholds are met, uses the highest thinking level
- **Persistent**: Updates are persisted to session store

### Changes

1. **Types** (`src/config/types.agent-defaults.ts`): Already had `AgentThinkingEscalationConfig` and `AgentThinkingEscalationThreshold` types
2. **Zod Schema** (`src/config/zod-schema.agent-defaults.ts`): Already had validation for `thinkingEscalation`
3. **Escalation Logic** (`src/auto-reply/reply/thinking-escalation.ts`): New module implementing escalation evaluation
4. **Integration** (`src/auto-reply/reply/agent-runner.ts`): Integrated escalation check after context window evaluation
5. **Tests** (`src/auto-reply/reply/thinking-escalation.test.ts`): Comprehensive test coverage

## Testing

All new code is covered by tests:

- Disabled escalation scenarios
- Missing data handling
- Escalation at various thresholds
- No-downgrade guarantee
- Multiple threshold selection
- Session persistence

```
npx vitest run src/auto-reply/reply/thinking-escalation.test.ts
✓ 10 tests passed
```

## Checklist

- [x] TypeScript compiles without errors
- [x] Tests pass
- [x] Opt-in (disabled by default)
- [x] Only escalates, never downgrades
- [x] Minimal, focused diff
- [x] No dist changes

<h3>Greptile Summary</h3>

This PR adds opt-in automatic thinking level escalation based on context window usage, applied in two separate code paths: the auto-reply/messaging path (`thinking-escalation.ts` called from `agent-runner.ts`) and the CLI agent path (`session-store.ts`). The `session-store.ts` implementation includes provider/model validation via `listThinkingLevels()`, while the `thinking-escalation.ts` implementation does not (this was flagged in a prior thread). Types, Zod schema, and test coverage are included.

- The core logic is well-structured: threshold-based, only-escalate semantics, clamped percentages, and best-effort persistence.
- `THINKING_LEVEL_ORDER` is duplicated across three files; extracting it to a shared module would reduce drift risk.
- `session-store.test.ts` re-implements `computeTargetThinkingLevel` locally instead of testing the actual function, which could mask regressions if the real implementation changes.

<h3>Confidence Score: 4/5</h3>

- This PR is safe to merge: the feature is opt-in, well-guarded, and non-breaking.
- The escalation logic is correct and well-tested. The feature is opt-in (disabled by default), only escalates (never downgrades), and handles edge cases properly. Minor concerns: duplicated `THINKING_LEVEL_ORDER` constant across files, and test file re-implements rather than testing the actual function. The missing provider/model validation in `thinking-escalation.ts` was already flagged in a prior thread.
- `src/commands/agent/session-store.test.ts` re-implements the function under test locally rather than importing it, which could mask regressions.

<sub>Last reviewed commit: 4eb1db7</sub>
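
For readers who want to see the described semantics concretely (threshold-based, only-escalate, highest applicable level, clamped percentage), the following is a minimal illustrative sketch. It is not the code from this PR: the function name `escalateThinkingLevel`, its parameters, the local `ThinkingLevel` union, and the exact ordering in `LEVEL_ORDER` are assumptions made for the example. Only the `enabled`, `thresholds`, `atContextPercent`, and `thinking` field names come from the configuration example in the description.

```typescript
// Illustrative sketch only. Names and signatures are hypothetical,
// not the PR's actual implementation.
type ThinkingLevel = "off" | "minimal" | "low" | "medium" | "high";

// Assumed ordering from lowest to highest. The real THINKING_LEVEL_ORDER
// in the repository may contain different or additional levels.
const LEVEL_ORDER: readonly ThinkingLevel[] = ["off", "minimal", "low", "medium", "high"];

interface EscalationThreshold {
  atContextPercent: number;
  thinking: ThinkingLevel;
}

interface EscalationConfig {
  enabled?: boolean;
  thresholds?: EscalationThreshold[];
}

function rank(level: ThinkingLevel): number {
  return LEVEL_ORDER.indexOf(level);
}

function escalateThinkingLevel(
  config: EscalationConfig | undefined,
  current: ThinkingLevel,
  usedTokens: number | undefined,
  contextWindowTokens: number | undefined,
): ThinkingLevel {
  // Opt-in: do nothing unless explicitly enabled with at least one threshold.
  if (!config?.enabled || !config.thresholds?.length) return current;
  // Missing data: leave the level unchanged rather than guessing.
  if (usedTokens === undefined || !contextWindowTokens || contextWindowTokens <= 0) {
    return current;
  }
  // Clamp usage to the 0..100 range.
  const percent = Math.min(100, Math.max(0, (usedTokens / contextWindowTokens) * 100));

  // Pick the highest level among all thresholds that have been reached,
  // and never go below the current level (only-escalate).
  let target = current;
  for (const t of config.thresholds) {
    if (percent >= t.atContextPercent && rank(t.thinking) > rank(target)) {
      target = t.thinking;
    }
  }
  return target;
}

// Usage with the thresholds from the configuration example.
const exampleConfig: EscalationConfig = {
  enabled: true,
  thresholds: [
    { atContextPercent: 50, thinking: "low" },
    { atContextPercent: 75, thinking: "medium" },
    { atContextPercent: 90, thinking: "high" },
  ],
};

console.log(escalateThinkingLevel(exampleConfig, "off", 60_000, 100_000)); // "low"
console.log(escalateThinkingLevel(exampleConfig, "high", 60_000, 100_000)); // "high" (no downgrade)
```

Keeping the level ordering in one exported constant, as the review summary suggests, would let both code paths and their tests share the same comparison instead of re-declaring it.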