#21279: Fix/sessions list cron model override

by altaywtf open 2026-02-19 21:37

size: M

## Summary

- Problem: `sessions_list` shows the agent default model (e.g. Opus) instead of the actual cron payload model override (e.g. Sonnet), making cost tracking unreliable
- Why it matters: Users running cron jobs on cheaper models see inflated cost estimates because the wrong model is reported
- What changed: Set `model`/`modelProvider` on the session entry **before** the agent run (not only after), wrapped in best-effort try-catch
- What did NOT change (scope boundary): Post-run model persistence is untouched; it still overwrites with the actual model used (e.g. if fallback kicked in)

## Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [x] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Closes #21057

## User-visible / Behavior Changes

`sessions_list` now correctly reflects the cron payload model override for failed and in-progress runs, instead of falling back to the agent default model.

## Security Impact (required)

- New permissions/capabilities? No
- Secrets/tokens handling changed? No
- New/changed network calls? No
- Command/tool execution surface changed? No
- Data access scope changed? No

## Repro + Verification

### Environment

- OS: Any
- Runtime/container: Node.js
- Model/provider: Any agent with a cron job that overrides `payload.model`
- Integration/channel (if any): Cron / isolated agent
- Relevant config (redacted): Agent defaulting to Opus with cron job specifying `model: "anthropic/claude-sonnet-4-6"`

### Steps

1. Configure an agent with default model `claude-opus-4-6`
2. Add a cron job with `payload.model: "anthropic/claude-sonnet-4-6"`
3. Trigger the cron job and call `sessions.list` while it runs or after it fails

### Expected

- `sessions_list` shows `claude-sonnet-4-6` for the cron session

### Actual

- `sessions_list` shows `claude-opus-4-6` (the agent default)

## Evidence

- [x] Failing test/log before + passing after
- [ ] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)

6 test cases covering: error path, race condition (point-in-time persist snapshots), disallowed model rejection, session-level `/model` override, persist failure resilience, and default model (no override).

## Human Verification (required)

- Verified scenarios: All 28 `src/cron/isolated-agent/` tests pass; new tests confirmed failing before fix and passing after
- Edge cases checked: Filesystem error on persist (best-effort), disallowed payload model early-return, session-level `/model` override, no payload override
- What you did **not** verify: End-to-end with a live cron job and real `sessions.list` call

## Compatibility / Migration

- Backward compatible? Yes
- Config/env changes? No
- Migration needed? No

## Failure Recovery (if this breaks)

- How to disable/revert this change quickly: Revert the 2-line addition in `src/cron/isolated-agent/run.ts` (lines 433-434)
- Files/config to restore: `src/cron/isolated-agent/run.ts`
- Known bad symptoms reviewers should watch for: Model field on cron sessions showing unexpected values in `sessions_list`

## Risks and Mitigations

- Risk: Pre-run persist writes a model that differs from the actual model used (e.g. if fallback triggers)
  - Mitigation: The post-run persist at line 523-524 overwrites with the actual model from `agentMeta` on success; the pre-run value is only visible during the run or after a failure

<h3>Greptile Summary</h3>

Fixed race condition where `sessions_list` showed incorrect model for cron jobs when the run failed or was still in progress. The model override is now persisted to the session entry before the agent run executes, ensuring it reflects the intended cron model rather than falling back to the agent default.

- Moved model and provider persistence from post-run telemetry block to pre-run, immediately after model resolution
- Added try-catch wrapper around pre-run persist to make it best-effort (filesystem errors shouldn't block the agent run)
- Added comprehensive test coverage for edge cases: run failures, race conditions during mid-run reads, disallowed models, session-level overrides, and persist failures

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk
- The fix is well-isolated to cron model persistence logic with comprehensive test coverage. The best-effort error handling ensures the change cannot break existing functionality. All 6 test cases pass and cover critical scenarios including failures, race conditions, and edge cases.
- No files require special attention

<sub>Last reviewed commit: 8eb6b13</sub>
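
For reviewers who want the ordering at a glance, here is a minimal sketch of the pre-run/post-run persist pattern described above. It is not the code from `src/cron/isolated-agent/run.ts`: `SessionEntry`, `ResolvedModel`, `Persist`, `RunAgent`, and `runWithPreRunPersist` are hypothetical names, and the sketch is synchronous for brevity even though a real filesystem persist is likely asynchronous.

```typescript
// Hypothetical shapes for illustration; not taken from the actual run.ts.
interface SessionEntry {
  model?: string;
  modelProvider?: string;
}

interface ResolvedModel {
  provider: string;
  model: string;
}

type Persist = (entry: SessionEntry) => void;
// Returns the model actually used, which may differ if a fallback kicked in.
type RunAgent = (resolved: ResolvedModel) => ResolvedModel;

function runWithPreRunPersist(
  entry: SessionEntry,
  resolved: ResolvedModel,
  persist: Persist,
  runAgent: RunAgent,
): void {
  // Pre-run: record the intended model so a listing taken mid-run or after
  // a failure sees the cron override rather than the agent default.
  entry.model = resolved.model;
  entry.modelProvider = resolved.provider;
  try {
    persist(entry);
  } catch {
    // Best-effort: a persist failure must not block the agent run.
  }

  // May throw; in that case the pre-run values stay on the entry.
  const used = runAgent(resolved);

  // Post-run: overwrite with the model that was actually used.
  entry.model = used.model;
  entry.modelProvider = used.provider;
  persist(entry);
}

// Usage: capture point-in-time snapshots and simulate a failing run.
const snapshots: SessionEntry[] = [];
const snapshotPersist: Persist = (e) => {
  snapshots.push({ ...e });
};
const sonnet: ResolvedModel = { provider: "anthropic", model: "claude-sonnet-4-6" };

let failed = false;
try {
  runWithPreRunPersist({}, sonnet, snapshotPersist, () => {
    throw new Error("simulated agent failure");
  });
} catch {
  failed = true;
}
```

After the simulated failure, `snapshots` holds a single entry carrying the Sonnet override, which is the state a `sessions_list` call would observe under this sketch. Copying the entry on each persist mirrors the "point-in-time persist snapshots" idea from the test list: a shared mutable reference would hide the ordering bug.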