#12996: feat(infra): Add session persistence with atomic writes and recovery
Labels: channel: slack, app: web-ui, gateway, agents, stale
Cluster: Session Management Enhancements
## Summary
Adds a session checkpoint/recovery system for handling interrupted sessions.
## Features
- Atomic JSON checkpoint writes every N messages (default: 10)
- `findInterruptedSessions()` on startup
- `formatResumePrompt()` for user notification
- Auto-cleanup of stale sessions older than 24h
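
For reviewers unfamiliar with the pattern, here is a minimal sketch of how an atomic JSON checkpoint write, the every-N-messages trigger, and the 24h staleness check could look. It is not the code from `src/infra/session-persistence.ts`: the `SessionCheckpoint` shape and the names `writeJsonAtomic`, `shouldCheckpoint`, `isStale`, and `checkpointPath` are assumptions made for illustration only.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical checkpoint shape; the real type in the PR may differ.
interface SessionCheckpoint {
  sessionId: string;
  messageCount: number;
  updatedAt: number; // epoch milliseconds
}

const CHECKPOINT_INTERVAL = 10; // "default 10" from the feature list
const STALE_AFTER_MS = 24 * 60 * 60 * 1000; // 24h cleanup threshold

// Hypothetical layout: one JSON file per session inside baseDir.
function checkpointPath(baseDir: string, sessionId: string): string {
  return path.join(baseDir, `${sessionId}.json`);
}

// Write to a temp file in the same directory, then rename it over the target.
// A rename within one filesystem replaces the target atomically on POSIX,
// so a reader sees either the old checkpoint or the new one, never a partial file.
function writeJsonAtomic(filePath: string, data: unknown): void {
  const tmpPath = `${filePath}.${process.pid}.tmp`;
  const fd = fs.openSync(tmpPath, "w");
  try {
    fs.writeSync(fd, JSON.stringify(data, null, 2));
    fs.fsyncSync(fd); // flush file contents to disk before the rename
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tmpPath, filePath);
}

// True on every Nth message (10th, 20th, ...), never on message 0.
function shouldCheckpoint(messageCount: number, interval = CHECKPOINT_INTERVAL): boolean {
  return messageCount > 0 && messageCount % interval === 0;
}

// True when the checkpoint was last updated more than 24h before `now`.
function isStale(cp: SessionCheckpoint, now: number = Date.now()): boolean {
  return now - cp.updatedAt > STALE_AFTER_MS;
}
```

Keeping the temp file next to the target matters because a rename across filesystems is not atomic. If the process dies before the rename, only a `.tmp` file is left behind and the previous checkpoint stays readable.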
## Files
- src/infra/session-persistence.ts
- src/infra/session-persistence.test.ts (10 tests)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR introduces an infra-level session persistence/recovery capability (periodic checkpoints with atomic JSON writes, interrupted-session discovery, and stale cleanup) alongside a broad set of agent/gateway/infra additions (model routing + approval workflow plumbing, cron/metrics/cache utilities, and new trading skills).
The main merge blocker is that multiple newly-added `skills/trading/*.mjs` scripts embed a hardcoded Finnhub API token as a runtime fallback when `FINNHUB_KEY` is not set. This is a committed secret and will be used by default in some environments, which is unsafe and should be removed before merging.
Beyond that, the session persistence module is self-contained and currently only referenced by its tests in this changeset; if it’s intended to be active in production, it still needs wiring into the gateway startup/shutdown and message-processing flow.
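
One possible way to address the hardcoded-token fallback flagged above is to fail fast when `FINNHUB_KEY` is unset instead of substituting a committed value. The sketch below is only an illustration: `requireEnv` is a hypothetical helper, not something from this PR, and it is written in TypeScript although the affected scripts are `.mjs` files.

```typescript
// Hypothetical helper: read a required environment variable or throw.
// The `env` parameter defaults to process.env and exists mainly so the
// function can be exercised without mutating the real environment.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    // No fallback token: a missing key is a configuration error.
    throw new Error(`Missing required environment variable ${name}`);
  }
  return value;
}
```

A script would then call `requireEnv("FINNHUB_KEY")` once at startup, so a missing key surfaces as an immediate error instead of a silent use of a shared token.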
<h3>Confidence Score: 2/5</h3>
- Not safe to merge until committed secret(s) are removed.
- Multiple new trading skill scripts include a hardcoded Finnhub API token fallback, which is a clear security issue. Other changes look structurally reasonable, but the presence of committed secrets is a hard blocker for merge.
- Files to review closely: skills/trading/news-sentiment.mjs, skills/trading/options-flow.mjs, skills/trading/portfolio-risk.mjs
<!-- /greptile_comment -->
Most Similar PRs
- #13889: feat: Slack channel cache, session cost alerts & checkpoint/recover... (by trevorgordon981 · 2026-02-11 · 75.2%)
- #15571: feat: infrastructure foundation — hooks, model failover, sessions, ... (by tangcruz · 2026-02-13 · 74.7%)
- #11250: fix: expand skills watcher ignore list and improve session repair l... (by zhangzhefang-github · 2026-02-07 · 74.6%)
- #13882: feat: Enhance session checkpoint system with better types and valid... (by trevorgordon981 · 2026-02-11 · 74.3%)
- #16244: feat(gateway): add session files API and external skill management (by wanquanY · 2026-02-14 · 73.6%)
- #14576: Fix/memory loss bugs (by ENCHIGO · 2026-02-12 · 72.9%)
- #12884: Feature/named persistent sessions (by dylanb · 2026-02-09 · 72.9%)
- #4664: fix: per-session metadata files to eliminate lock contention (by tsukhani · 2026-01-30 · 72.8%)
- #22568: fix(gateway): bump skills snapshot version on startup so sessions r... (by zwffff · 2026-02-21 · 72.3%)
- #16061: fix(sessions): tolerate invalid sessionFile metadata (by haoyifan · 2026-02-14 · 71.9%)