#10273: fix(agents): detect and auto-compact mid-run context overflow

by terryops open 2026-02-06 08:58

app: web-ui gateway agents stale

Cluster: Context Management Enhancements

## Summary

- **Mid-run context overflow detection**: When context overflow occurs *after* the initial prompt succeeds (during the agent's tool-call loop), it is now detected via `lastAssistantErrorMessage` from subscription events and triggers auto-compaction with retry. Previously only the `promptError` path caught overflow on the initial prompt.
- **Staged summarization fallback**: When the SDK's built-in `session.compact()` fails on oversized contexts, falls back to `summarizeInStages` which chunks messages into smaller pieces that fit within the context window.
- **Empty WS chat final fix**: When the agent uses only `tool_use` blocks or errors before producing text, synthetic assistant events are emitted so webchat clients don't receive `state=final` with no content. Gateway also sends corrective final/error broadcasts as a safety net.

## Changes

| Area | Files | What |
|------|-------|------|
| Runner | `run.ts`, `run/attempt.ts`, `run/types.ts` | Mid-run overflow detection + compaction retry loop |
| Compaction | `compact.ts` | Staged summarization fallback for oversized contexts |
| Subscription | `subscribe.ts`, `handlers.messages.ts`, `handlers.types.ts`, `handlers.lifecycle.ts` | Track `lastAssistantErrorMessage`, emit synthetic assistant events |
| Gateway | `server-chat.ts`, `server-methods/chat.ts`, `server-methods/types.ts`, `server.impl.ts` | `emptyFinalRuns` tracking, corrective final/error broadcasts |

## Test plan

- [x] `pnpm build`: no type errors
- [x] `pnpm check`: lint/format pass
- [x] `pnpm test`: 213 tests pass (35 files)
- [ ] Manual: trigger compaction in a long conversation and verify recovery

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

- Adds mid-run context overflow detection by propagating the last assistant error message from subscription events into the runner and triggering auto-compaction + retry.
- Extends session compaction with a staged summarization fallback when the SDK’s `session.compact()` fails on oversized contexts.
- Improves WS/webchat behavior for tool_use-only or error-only runs by emitting synthetic assistant events and adding a gateway “corrective final/error” broadcast path.
- Introduces `emptyFinalRuns` tracking in the gateway to detect empty finals and attempt post-hoc correction.

<h3>Confidence Score: 3/5</h3>

- This PR is likely safe to merge after addressing a few correctness issues in the gateway empty-final tracking and staged compaction credentials.
- Core overflow-detection/compaction retry logic is straightforward, but the new gateway correction mechanism can leak state and potentially emit multiple finals for a single run, which can break clients. The staged compaction fallback also has a plausible credential-resolution mismatch that can cause compaction to fail even when an API key is configured.
- `src/gateway/server-chat.ts`, `src/gateway/server-methods/chat.ts`, `src/agents/pi-embedded-runner/compact.ts`
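
For readers who want a concrete picture of the overflow-retry and staged-summarization flow described in the summary, here is a minimal sketch. It is not the code in this PR. Only `promptError` and `lastAssistantErrorMessage` are names taken from the description; the message shape, the token estimate, the error-matching regex, and every function name below are assumptions made for illustration, and the real logic is asynchronous whereas this sketch is synchronous to keep it short.

```typescript
// Hypothetical message shape; the SDK's real types are not shown in this PR.
interface ChatMessage {
  role: "user" | "assistant" | "tool";
  text: string;
}

// Assumed result of one run attempt. The field names come from the PR text.
interface AttemptResult {
  promptError?: string;
  lastAssistantErrorMessage?: string;
}

// Rough estimate (about 4 characters per token). An assumption, not the
// tokenizer the project uses.
function estimateTokens(message: ChatMessage): number {
  return Math.ceil(message.text.length / 4);
}

// Split messages into consecutive chunks that stay within the budget.
// A single message larger than the budget ends up in a chunk of its own.
function chunkByTokenBudget(messages: ChatMessage[], budget: number): ChatMessage[][] {
  const chunks: ChatMessage[][] = [];
  let current: ChatMessage[] = [];
  let used = 0;
  for (const message of messages) {
    const cost = estimateTokens(message);
    if (current.length > 0 && used + cost > budget) {
      chunks.push(current);
      current = [];
      used = 0;
    }
    current.push(message);
    used += cost;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Hypothetical summarizer: takes the running summary plus one chunk.
type Summarizer = (priorSummary: string, chunk: ChatMessage[]) => string;

// Staged summarization: fold over the chunks, carrying the summary forward.
function summarizeStaged(messages: ChatMessage[], budget: number, summarize: Summarizer): string {
  let summary = "";
  for (const chunk of chunkByTokenBudget(messages, budget)) {
    summary = summarize(summary, chunk);
  }
  return summary;
}

// Heuristic overflow check. The patterns are illustrative guesses, not the
// strings the PR actually matches.
function looksLikeContextOverflow(errorMessage: string | undefined): boolean {
  if (!errorMessage) return false;
  return /context (length|window)|too many tokens|prompt is too long/i.test(errorMessage);
}

// Try the primary compaction; if it throws, use the staged fallback.
function compactWithFallback(primary: () => void, fallback: () => void): "primary" | "staged" {
  try {
    primary();
    return "primary";
  } catch {
    fallback();
    return "staged";
  }
}

// Retry loop: compact and re-run while either error field signals overflow,
// bounded by maxCompactions so a persistent overflow cannot loop forever.
function runWithCompaction(
  attempt: () => AttemptResult,
  compact: () => void,
  maxCompactions = 2,
): AttemptResult {
  let result = attempt();
  for (let i = 0; i < maxCompactions; i++) {
    const overflow =
      looksLikeContextOverflow(result.promptError) ||
      looksLikeContextOverflow(result.lastAssistantErrorMessage);
    if (!overflow) break;
    compact();
    result = attempt();
  }
  return result;
}
```

Checking both error fields in one place means the initial-prompt path and the mid-run path share the same recovery logic. The bound on compactions is a design choice in this sketch, not a value stated in the PR.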