
#22387: fix: session_status context tracking undercount for cached providers

by 1ucian · open · 2026-02-21 03:33
Labels: agents, size: XS
## Problem

`session_status` reports dramatically undercounted context usage. For example, it shows `28k/1.0m (3%)` when the actual context is `185k/1.0m (19%)`. This affects all providers with prompt caching (Anthropic Claude especially), where `usage.input` is very small (1-10 tokens for cache hits) while `cache_read_input_tokens` holds the actual context size.

## Root Cause

Two issues compound:

1. **`session-status-tool.ts` sets `includeTranscriptUsage: false`** — The session entry's `totalTokens` is updated at the END of each turn, but `session_status` runs MID-turn (it's a tool call), so it always reads the previous turn's stale snapshot. The transcript has the real usage from the last API response.
2. **Performance concern** — `readUsageFromSessionLog` reads the entire session log file. For long sessions this is wasteful, since we only need the last usage entry.

## Fix

1. Enable `includeTranscriptUsage: true` for the session_status tool so it falls back to transcript-derived usage when the session entry is stale.
2. Optimize `readUsageFromSessionLog` to read only the last 8KB of the log file (tail read) instead of the entire file, mitigating the performance concern.

## Verification

Before fix: `Context: 28k/1.0m (3%)`
Session log shows: `cacheRead: 184186, input: 1, cacheWrite: 707` → actual context ~185k
Session store shows: `totalTokens: 165862` (from the previous turn)

After fix: `derivePromptTokens` correctly sums `input + cacheRead + cacheWrite` = ~185k

Related issues: #17799, #16079

---

<h3>Greptile Summary</h3>

Fixed `session_status` severely undercounting context usage for providers with prompt caching (particularly Anthropic Claude). The issue had two parts:

1. **Stale snapshot problem**: `session_status` runs mid-turn as a tool call, so it reads the session entry before it is updated with the current turn's usage. The fix enables `includeTranscriptUsage: true` in `session-status-tool.ts:384` to fall back to transcript-derived usage, which includes cache tokens.
2. **Performance optimization**: `readUsageFromSessionLog` now reads only the last 8KB of the log file (tail read) instead of the entire file, preventing performance degradation on long sessions.

The fix correctly addresses the root cause by ensuring `derivePromptTokens` sums `input + cacheRead + cacheWrite` tokens rather than just using the stale `input` count. The tail read optimization uses proper offset calculation and handles edge cases (small files, partial lines) correctly.

<h3>Confidence Score: 4/5</h3>

- This PR is safe to merge with low risk: it fixes a critical bug in usage reporting with minimal code changes and good test coverage.
- The changes are well-targeted and solve a real bug. The tail read optimization is sound, with proper edge case handling (offset calculation, partial-line skipping, buffer size handling). Existing tests verify the fix works correctly. Minor deduction because the tail read introduces a theoretical edge case if log files have extremely long lines (>8KB), though this is very unlikely in practice given the JSONL format.
- No files require special attention.

<sub>Last reviewed commit: e1343e7</sub>
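The token-summing behavior described in the verification section can be sketched in TypeScript. This is a minimal illustration, not the PR's actual code: the `Usage` shape here is an assumption, and only the idea that `derivePromptTokens` adds cache reads and writes to fresh input tokens comes from the PR description.

```typescript
// Hedged sketch: for cache-aware providers, the effective prompt size is
// fresh input tokens PLUS cache reads and writes -- usage.input alone is
// tiny on cache hits (often 1-10 tokens) and wildly undercounts context.
// The Usage interface below is illustrative, not the project's real type.
interface Usage {
  input: number;
  cacheRead?: number;
  cacheWrite?: number;
}

function derivePromptTokens(usage: Usage): number {
  return usage.input + (usage.cacheRead ?? 0) + (usage.cacheWrite ?? 0);
}

// With the numbers from the PR's verification log
// (input: 1, cacheRead: 184186, cacheWrite: 707), this yields ~185k,
// matching the expected context size rather than the stale 28k figure.
```

The design point is that cache-read tokens still occupy the context window; a status display that ignores them reports only the uncached slice of the prompt.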
