#13889: feat: Slack channel cache, session cost alerts & checkpoint/recovery system
channel: slack
stale
Cluster: Session Management Enhancements
## Summary
This PR adds three new systems:
### 1. Slack Channel Name Resolution Cache (`src/slack/channel-cache.ts`)
- TTL-based in-memory cache for channel name ↔ ID lookups
- Forward (name→ID) and reverse (ID→name) lookups with case-insensitive matching
- LRU eviction at configurable capacity, bulk loading via `setMany()`
- Automatic expiry with `prune()` and per-entry TTL validation
- Singleton default instance with `getDefaultChannelCache()` / `resetDefaultChannelCache()`
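The cache behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/slack/channel-cache.ts`: only `setMany()` and `prune()` are names taken from this description, while `ChannelCacheSketch`, `set()`, `getIdByName()`, `getNameById()`, the constructor parameters, and the default values are assumptions. The sketch also clears the old name mapping when a channel ID is stored again under a new name.

```typescript
// Hypothetical sketch: only setMany() and prune() are names from the PR text.
interface ChannelEntry {
  id: string;
  name: string;
  expiresAt: number; // epoch ms at which the entry stops being served
}

class ChannelCacheSketch {
  // A Map keeps insertion order, so its first key is the least recently used.
  private byId = new Map<string, ChannelEntry>();
  // Lowercased channel name -> channel ID, for case-insensitive lookups.
  private idByName = new Map<string, string>();
  private ttlMs: number;
  private maxEntries: number;
  private now: () => number;

  // Default TTL and capacity are assumed values, not taken from the PR.
  constructor(ttlMs = 300000, maxEntries = 1000, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  set(id: string, name: string): void {
    this.remove(id); // drops the old name mapping if the channel was renamed
    const key = name.toLowerCase();
    const previousOwner = this.idByName.get(key);
    if (previousOwner !== undefined) this.remove(previousOwner);
    this.byId.set(id, { id, name, expiresAt: this.now() + this.ttlMs });
    this.idByName.set(key, id);
    // LRU eviction: remove oldest entries until we are back under capacity.
    while (this.byId.size > this.maxEntries) {
      const oldest = this.byId.keys().next().value as string;
      this.remove(oldest);
    }
  }

  setMany(channels: Array<{ id: string; name: string }>): void {
    for (const c of channels) this.set(c.id, c.name);
  }

  getIdByName(name: string): string | undefined {
    const id = this.idByName.get(name.toLowerCase());
    return id === undefined ? undefined : this.touch(id)?.id;
  }

  getNameById(id: string): string | undefined {
    return this.touch(id)?.name;
  }

  // Removes every expired entry and returns how many were removed.
  prune(): number {
    let removed = 0;
    for (const entry of [...this.byId.values()]) {
      if (entry.expiresAt <= this.now()) {
        this.remove(entry.id);
        removed++;
      }
    }
    return removed;
  }

  // Validates the TTL on read and marks the entry as most recently used.
  private touch(id: string): ChannelEntry | undefined {
    const entry = this.byId.get(id);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.remove(id);
      return undefined;
    }
    this.byId.delete(id);
    this.byId.set(id, entry);
    return entry;
  }

  private remove(id: string): void {
    const entry = this.byId.get(id);
    if (!entry) return;
    this.byId.delete(id);
    this.idByName.delete(entry.name.toLowerCase());
  }
}
```

Injecting the clock through the constructor is a choice made for this sketch so that expiry can be exercised without real delays.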
### 2. Session Cost Alert Thresholds (`src/infra/session-cost-alerts.ts`)
- Configurable warning/critical thresholds on both cost (USD) and token count
- Per-session deduplication — each threshold fires at most once per session
- Async `onAlert` callback for plugging into notification systems (Slack, logs, etc.)
- Enable/disable toggle, per-session reset, and global reset
- Sensible defaults: warning at $0.50, critical at $2.00
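A rough sketch of the threshold and deduplication logic described above, under stated assumptions: `onAlert` and the $0.50 / $2.00 defaults come from this description, while `CostAlertSketch`, `evaluate()`, `check()`, `resetSession()`, `resetAll()`, and the config field names are hypothetical. Token thresholds have no default here because the description does not state one. The dedupe key combines level and metric, so a cost alert cannot suppress a token alert at the same level.

```typescript
// Hypothetical sketch: only onAlert and the cost defaults come from the PR text.
type AlertLevel = "warning" | "critical";
type AlertMetric = "cost" | "tokens";

interface CostAlert {
  sessionId: string;
  level: AlertLevel;
  metric: AlertMetric;
  value: number;
  threshold: number;
}

interface AlertConfig {
  enabled: boolean;
  warningCostUsd: number;
  criticalCostUsd: number;
  warningTokens?: number; // no default assumed
  criticalTokens?: number; // no default assumed
  onAlert?: (alert: CostAlert) => void | Promise<void>;
}

class CostAlertSketch {
  // sessionId -> set of "level:metric" keys that have already fired.
  private fired = new Map<string, Set<string>>();
  private config: AlertConfig;

  constructor(overrides: Partial<AlertConfig> = {}) {
    this.config = { enabled: true, warningCostUsd: 0.5, criticalCostUsd: 2.0, ...overrides };
  }

  // Synchronous core: returns only the alerts that fire for the first time.
  evaluate(sessionId: string, costUsd: number, tokens: number): CostAlert[] {
    if (!this.config.enabled) return [];
    const c = this.config;
    const rules: Array<[AlertLevel, AlertMetric, number, number | undefined]> = [
      ["warning", "cost", costUsd, c.warningCostUsd],
      ["critical", "cost", costUsd, c.criticalCostUsd],
      ["warning", "tokens", tokens, c.warningTokens],
      ["critical", "tokens", tokens, c.criticalTokens],
    ];
    let seen = this.fired.get(sessionId);
    if (!seen) {
      seen = new Set<string>();
      this.fired.set(sessionId, seen);
    }
    const alerts: CostAlert[] = [];
    for (const [level, metric, value, threshold] of rules) {
      const key = `${level}:${metric}`;
      if (threshold === undefined || value < threshold || seen.has(key)) continue;
      seen.add(key);
      alerts.push({ sessionId, level, metric, value, threshold });
    }
    return alerts;
  }

  // Async wrapper: awaits the optional callback for each newly fired alert.
  async check(sessionId: string, costUsd: number, tokens: number): Promise<CostAlert[]> {
    const alerts = this.evaluate(sessionId, costUsd, tokens);
    for (const alert of alerts) await this.config.onAlert?.(alert);
    return alerts;
  }

  resetSession(sessionId: string): void {
    this.fired.delete(sessionId);
  }

  resetAll(): void {
    this.fired.clear();
  }
}
```

Splitting the synchronous `evaluate()` from the async `check()` is a design choice of this sketch. It keeps the threshold logic testable without a notification backend.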
### 3. Session Checkpoint/Recovery (`src/infra/session-checkpoint.ts`)
- Periodic filesystem-based session state checkpointing
- Stores transcript length, active model/provider, cost, labels, pending tool calls
- Automatic pruning of old checkpoints (configurable retention, default 3)
- Minimum interval throttling to prevent excessive writes (default 30s)
- `findRecoverableSessions()` for startup recovery discovery
- Safe filesystem naming with session ID sanitization
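The naming, throttling, and retention rules above can be illustrated with a few pure helpers. This is a sketch under assumptions, not the contents of `src/infra/session-checkpoint.ts`: every function name, the `SessionCheckpoint` field names, the filename layout, and the 64-character cap are hypothetical. Only the 30s minimum interval and the retention default of 3 come from this description. Filesystem I/O is left out so the logic can be read in isolation.

```typescript
// Hypothetical sketch: helper names, field names, and the filename layout are assumptions.
interface SessionCheckpoint {
  sessionId: string;
  savedAtMs: number;
  transcriptLength: number;
  model: string;
  provider: string;
  costUsd: number;
  labels: string[];
  pendingToolCalls: string[];
}

// Replace anything outside [A-Za-z0-9_-] so the ID cannot escape the checkpoint directory.
function sanitizeSessionId(sessionId: string): string {
  const cleaned = sessionId.replace(/[^a-zA-Z0-9_-]/g, "_").slice(0, 64);
  return cleaned.length > 0 ? cleaned : "_";
}

// Zero-padded timestamp plus a sequence number: names sort chronologically,
// and two saves in the same millisecond get distinct names.
function checkpointFileName(sessionId: string, savedAtMs: number, seq: number): string {
  const ts = String(savedAtMs).padStart(15, "0");
  const n = String(seq).padStart(4, "0");
  return `${sanitizeSessionId(sessionId)}.${ts}.${n}.json`;
}

// Throttle: skip a save when the previous one is more recent than the minimum interval.
function shouldCheckpoint(
  lastSavedAtMs: number | undefined,
  nowMs: number,
  minIntervalMs = 30000,
): boolean {
  return lastSavedAtMs === undefined || nowMs - lastSavedAtMs >= minIntervalMs;
}

// Given one session's checkpoint file names, return the ones to delete so that
// only the newest `retain` remain.
function filesToPrune(fileNames: string[], retain = 3): string[] {
  const sorted = [...fileNames].sort(); // lexicographic order matches time order
  return sorted.slice(0, Math.max(0, sorted.length - retain));
}
```

A caller would combine these with directory listing and JSON read/write, for example skipping a save when `shouldCheckpoint()` returns `false` and deleting the names returned by `filesToPrune()` after each successful write.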
## Testing
Each of the three systems has its own test suite:
- `src/slack/channel-cache.test.ts` — 10 tests covering CRUD, TTL, eviction, bulk ops
- `src/infra/session-cost-alerts.test.ts` — 11 tests covering thresholds, dedup, callbacks, token triggers
- `src/infra/session-checkpoint.test.ts` — 8 tests covering save/load, pruning, recovery, throttling
## Motivation
These systems form foundational infrastructure for reliable, cost-aware, and resilient agent operation. They share common session lifecycle patterns and were developed together to ensure consistent design.
## Breaking Changes
None. All features are additive — new files only, no modifications to existing code.
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
Adds three new infra utilities:
- `src/slack/channel-cache.ts`: in-memory TTL cache for Slack channel ID/name lookups with a default singleton.
- `src/infra/session-cost-alerts.ts`: per-session cost/token threshold monitor with optional callback and default singleton.
- `src/infra/session-checkpoint.ts`: filesystem checkpoint store for persisting session state and discovering recoverable sessions.

Each module comes with unit tests.
Overall the changes are additive and well-covered by tests, but there are a few correctness edge cases (stale cache entries, dedupe key collisions, and checkpoint filename collisions) that should be addressed before merging.
<h3>Confidence Score: 3/5</h3>
- Mostly safe to merge, but there are a few correctness bugs that can cause stale lookups or lost checkpoints under realistic conditions.
- The PR is additive and tests cover the happy paths, but there are verified edge-case correctness issues: Slack channel rename can leave stale name mappings, cost alert dedupe can suppress thresholds with distinct token settings, and checkpoint filenames can collide on same-ms saves causing silent data loss.
- Files needing attention: `src/slack/channel-cache.ts`, `src/infra/session-cost-alerts.ts`, `src/infra/session-checkpoint.ts`
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #13882: feat: Enhance session checkpoint system with better types and valid... (by trevorgordon981 · 2026-02-11 · 86.7%)
- #13872: feat: Cost Optimization Suite - Session Management & Resource Effic... (by trevorgordon981 · 2026-02-11 · 86.2%)
- #13881: fix: Address Greptile feedback - test isolation and channel resolution (by trevorgordon981 · 2026-02-11 · 83.1%)
- #12997: feat(infra): Add query caching layer with TTL and LRU eviction (by trevorgordon981 · 2026-02-10 · 82.0%)
- #15571: feat: infrastructure foundation — hooks, model failover, sessions, ... (by tangcruz · 2026-02-13 · 80.2%)
- #23175: feat(security): runtime safety — transcript retention, tool call bu... (by ihsanmokhlisse · 2026-02-22 · 79.1%)
- #12954: feat(slack): Add channel name resolution with TTL cache (by trevorgordon981 · 2026-02-10 · 78.4%)
- #15050: fix: transcript corruption resilience — strip aborted tool_use bloc... (by yashchitneni · 2026-02-12 · 77.6%)
- #9012: fix(memory): resilient flush for large sessions [AI-assisted] (by cheenu1092-oss · 2026-02-04 · 77.3%)
- #4664: fix: per-session metadata files to eliminate lock contention (by tsukhani · 2026-01-30 · 76.9%)