
#11970: feat: add model.compact config for dedicated compaction model

by meaadore1221-afk open 2026-02-08 16:41
docs channel: signal app: macos gateway scripts agents size: XL
#### Summary

Add `agents.defaults.model.compact` config option to specify a dedicated model for compaction summarization, with automatic fallback to primary on failure.

**Why not use fallback models?** Compaction quality directly affects context accuracy after summary: a degraded summary means the agent loses critical conversation context. Fallback models are typically cheaper/faster alternatives chosen for cost optimization, not quality. The compact model should be a capable model you trust for summarization, which may differ from both primary and fallbacks.

**Real-world scenario:** Using `google-gemini-cli/gemini-3-pro-preview` as primary: the CLI-based provider returns HTTP 400 on the summarization API even after recent updates. Auto-compaction failures are silent (no user notification), so the session keeps growing until it hits a hard context overflow. With `model.compact`, you can route compaction through a direct API provider (`myapi/gemini-3-pro-preview`) that supports summarization reliably, while keeping the CLI provider as primary for regular conversation.

lobster-biscuit

#### Use Cases

1. Primary model provider does not support summarization API (CLI wrappers, some proxy setups)
2. Want a specifically capable model for compaction to preserve context accuracy
3. Prevent silent auto-compaction failures that lead to eventual context overflow

#### Behavior Changes

- New config key: `agents.defaults.model.compact` (string, `provider/model` format)
- If unset or same as primary: no behavior change (identical to before)
- If set to a distinct model: compact model tried first → on failure → automatic fallback to primary
- Compaction no longer writes a model snapshot to the session file (fixes spurious "model changed" events when using a different model for compaction)

#### Existing Functionality Check

- [x] I searched the codebase for existing functionality.
- Searched for `model.compact`, `compaction.*model`: no existing compact model override
- Checked `model-fallback.ts`, `model-selection.ts`: fallback logic is for run-time failover, not compaction-specific
- Reviewed `imageModel` pattern: similar concept (dedicated model for a specific capability)

#### Tests

- `compact.test.ts`: 8 unit tests for `resolveCompactModelRef` covering all config permutations (no config, provider/model format, model-only format, whitespace, empty string, defaults)
- `run.overflow-compaction.test.ts`: existing 8 tests pass unchanged (auto-compaction in run loop)
- All 47 related test files (config + compact) pass: 321/321

#### Files Changed

| File | Change |
|---|---|
| `src/config/types.agent-defaults.ts` | Add `compact?: string` to `AgentModelListConfig` |
| `src/config/zod-schema.agent-defaults.ts` | Add `compact` to model Zod schema |
| `src/agents/pi-embedded-runner/compact.ts` | Add `resolveCompactModelRef`, wrapper with fallback, skip model snapshot during compact |
| `src/agents/pi-embedded-runner/compact.test.ts` | New: unit tests for compact model resolution |
| `docs/gateway/configuration.md` | Document `model.compact` config |
| `docs/concepts/compaction.md` | Add dedicated compaction model section |
| `CHANGELOG.md` | Add changelog entry |

**Sign-Off**

- Models used: Claude Opus 4.6
- Submitter effort: high (design + implementation + review + testing + docs)

Made with [Cursor](https://cursor.com)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR primarily adds a dedicated compaction model override via `agents.defaults.model.compact`, updates config types/schemas and compaction logic to resolve an alternate provider/model for summarization, and falls back to the caller’s primary model when the compact model fails. It also adjusts compaction behavior to avoid writing a transient model snapshot during compaction (preventing spurious “model changed” events).
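
The resolution-and-fallback behavior summarized above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `ModelRef`, `ModelConfig`, `parseModelRef`, `compactionCandidates`, and `summarizeWithFallback` are hypothetical names, the rule that a bare model name inherits a default provider is an assumption, and the helper is synchronous only to keep the sketch short.

```typescript
// Hypothetical shapes for illustration; not the PR's actual types.
interface ModelRef {
  provider: string;
  model: string;
}

interface ModelConfig {
  primary: string;  // "provider/model"
  compact?: string; // optional dedicated compaction model
}

// Parse "provider/model". Assumption: a bare "model" inherits defaultProvider.
// Only the first slash splits, so model ids containing "/" stay intact.
function parseModelRef(raw: string, defaultProvider: string): ModelRef | undefined {
  const trimmed = raw.trim();
  if (trimmed.length === 0) return undefined;
  const slash = trimmed.indexOf("/");
  if (slash === -1) return { provider: defaultProvider, model: trimmed };
  const provider = trimmed.slice(0, slash).trim();
  const model = trimmed.slice(slash + 1).trim();
  if (provider.length === 0 || model.length === 0) return undefined;
  return { provider, model };
}

function sameRef(a: ModelRef, b: ModelRef): boolean {
  return a.provider === b.provider && a.model === b.model;
}

// Ordered candidates: the compact model first (only if set and distinct),
// then the primary model as the fallback.
function compactionCandidates(config: ModelConfig, defaultProvider: string): ModelRef[] {
  const primary = parseModelRef(config.primary, defaultProvider);
  if (!primary) throw new Error("primary model is not configured");
  const compact = config.compact
    ? parseModelRef(config.compact, defaultProvider)
    : undefined;
  if (!compact || sameRef(compact, primary)) return [primary];
  return [compact, primary];
}

// Try each candidate in order; rethrow the last error if every one fails.
// Synchronous for brevity; a real summarization call would be asynchronous.
function summarizeWithFallback<T>(
  candidates: ModelRef[],
  summarize: (ref: ModelRef) => T,
): T {
  let lastError: unknown = new Error("no compaction candidates");
  for (const ref of candidates) {
    try {
      return summarize(ref);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

With `primary` set to `google-gemini-cli/gemini-3-pro-preview` and `compact` set to `myapi/gemini-3-pro-preview`, the candidates are the `myapi` model followed by the primary. An unset, blank, or identical `compact` value yields only the primary, which corresponds to the "no behavior change" case in the description.
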
In addition to the compaction feature, this diff includes a number of unrelated changes (new dev scripts, large local README notes, Signal enhancements/idle reminder/cron fixes, streaming buffering changes, docs/images). Those extra changes make it hard to review and increase merge risk because they introduce new behaviors far outside the stated PR scope.

<h3>Confidence Score: 2/5</h3>

- This PR has blocking issues due to a definite repeated-wrapping bug and several unrelated changes bundled into one diff.
- While the `model.compact` feature itself is reasonably contained, the change in `run/attempt.ts` will stack stream buffering wrappers across turns, which will reliably increase memory usage and alter streaming behavior in long-lived sessions. The PR also includes unrelated files (local README, destructive mac rebuild script, broad Signal/cron/idle reminder changes) which significantly expand surface area and should be split out before merging.
- src/agents/pi-embedded-runner/run/attempt.ts; README_AMENDED.md; scripts/rebuild-mac.sh

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
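
The repeated-wrapping concern flagged above is a general pattern, and one common guard is to always wrap from a remembered base function rather than from whatever is currently installed. The diff for `run/attempt.ts` is not shown here, so the following is only a sketch under assumptions: `Session`, `baseStreamFn`, `withBuffering`, `installNaive`, and `installGuarded` are hypothetical names, and the wrapper merely tags chunks so that the wrapper depth is observable.

```typescript
type StreamFn = (chunk: string) => string;

// Hypothetical session shape; the real runner's types may differ.
interface Session {
  streamFn: StreamFn;
  baseStreamFn?: StreamFn; // remembers the original, unwrapped function
}

// Stand-in for a buffering wrapper: it only tags the chunk so that
// each wrapper layer is visible in the output.
function withBuffering(inner: StreamFn): StreamFn {
  return (chunk) => "[buf]" + inner(chunk);
}

// Naive install: wraps whatever is current, so layers stack on every turn.
function installNaive(session: Session): void {
  session.streamFn = withBuffering(session.streamFn);
}

// Guarded install: always wraps the remembered base, so depth stays at one.
function installGuarded(session: Session): void {
  if (session.baseStreamFn === undefined) {
    session.baseStreamFn = session.streamFn;
  }
  session.streamFn = withBuffering(session.baseStreamFn);
}
```

Calling `installNaive` once per turn adds one layer each time, while `installGuarded` can be called any number of times and still leaves a single wrapper in place.
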
