#23175: feat(security): runtime safety — transcript retention, tool call budget, token reuse detection

by ihsanmokhlisse open 2026-02-22 02:54

size: M

## Summary

- **Problem:** Three operational safety gaps: (1) session transcripts accumulate forever with no cleanup, (2) a looping agent can execute unlimited tool calls burning API credits indefinitely, (3) the audit doesn't detect when the same token is reused across channels.
- **Why it matters:** Users report agents looping and wasting tokens (#r/AI_Agents complaints). Transcripts containing credentials pile up on disk. Token reuse means revoking one channel breaks all.
- **What changed:** Three new safety utilities + audit check. 21 tests. All standalone, no existing code modified.
- **What did NOT change:** No existing behavior modified. Utilities are ready to be wired into hooks/config.

## Change Type (select all)

- [x] Feature
- [x] Security hardening

## Scope (select all touched areas)

- [x] Auth / tokens
- [x] Memory / storage
- [x] Skills / tool execution

## Linked Issue/PR

- Related #7916 (credential security)
- Related #18245 (Credential Firewall)

## What was added

### 1. Transcript retention (`purgeOldTranscripts`)

Configurable auto-purge of old session JSONL files:

```typescript
purgeOldTranscripts(sessionsDir, maxAgeDays)
// Returns: { scannedFiles, deletedFiles, deletedPaths, errors }
```

- Only deletes `.jsonl` files (won't touch `sessions.json` or other state)
- `maxAgeDays=0` disables purging (safe default)
- Handles missing/empty directories gracefully
- Returns detailed results for logging

### 2. Tool call budget (`ToolCallBudget`)

Per-session tool call limit to prevent runaway agent loops:

```typescript
const budget = new ToolCallBudget(200);
budget.check(sessionKey);  // Throws ToolCallBudgetExceeded after 200 calls
budget.reset(sessionKey);  // Reset on new session
```

- Tracks separate sessions independently
- `limit=0` disables the budget (opt-in)
- `ToolCallBudgetExceeded` error includes limit + current count
- `reset()` per-session or `resetAll()` for cleanup

### 3. Channel token reuse detection (`collectChannelTokenReuseFindings`)

Audit finding that detects when the same plaintext token value is used across multiple channel configs:

```
⚠ credentials.token_reuse_across_channels (warn)
  Same token used at: channels.telegram.botToken, channels.discord.token
  Revoking one would break all.
```

- Checks: Telegram botToken, Discord token, Slack botToken/appToken, Signal password, Teams appPassword, Mattermost password
- Skips env var references (different channels can safely reference the same env var if needed)
- Returns per-duplicate finding with affected paths

## Security Impact (required)

- New permissions/capabilities? `No`
- Secrets/tokens handling changed? `No`
- New/changed network calls? `No`
- Command/tool execution surface changed? `No`
- Data access scope changed? `No` (transcript purge only deletes files in the sessions directory)

## Evidence

- [x] 21 new tests, all passing
- [x] 0 lint errors (oxlint)
- [x] 0 format issues (oxfmt)
- [x] Test breakdown: transcript retention (7), tool call budget (8), token reuse detection (6)
- [x] Transcript retention tests use real temp directories with actual file timestamps

## Human Verification (required)

- Verified: All 21 tests pass locally with real filesystem operations
- Edge cases: empty/missing directories, disabled retention (maxAgeDays=0), disabled budget (limit=0), env var references, empty channel configs
- What I did **not** verify: Runtime hook wiring (utilities only in this PR)

## Compatibility / Migration

- Backward compatible? `Yes` (new files only)
- Config/env changes? `No`
- Migration needed? `No`

## Failure Recovery (if this breaks)

- Zero risk: standalone utility modules

## Risks and Mitigations

- Risk: `purgeOldTranscripts` deletes user data
  - Mitigation: Only deletes `.jsonl` files in the sessions directory. Disabled by default (maxAgeDays=0). Returns detailed results so callers can log.
- Risk: Tool call budget blocks legitimate long operations
  - Mitigation: Disabled by default (limit=0). Configurable per-session. Reset on new session start.

## AI-Assisted

- [x] This PR was AI-assisted (Claude)
- [x] Fully tested (21 tests)
- [x] I understand what the code does

Made with [Cursor](https://cursor.com)

<h3>Greptile Summary</h3>

Adds three standalone runtime safety utilities to prevent operational issues: transcript retention (auto-purge old session files), tool call budget (prevents runaway agent loops), and channel token reuse detection (audit check for duplicate credentials).

## Changes

- New file `src/security/runtime-safety.ts` with three safety utilities (190 lines)
- Comprehensive test suite in `src/security/runtime-safety.test.ts` (241 lines, 21 tests)
- All utilities are opt-in with safe defaults (retention disabled at `maxAgeDays=0`, budget disabled at `limit=0`)

## Issues Found

- **Critical bug** in `collectChannelTokenReuseFindings`: incorrect field names for Mattermost (`botToken` not `password`) and Signal (no `password` field exists). This breaks the token reuse detection for these channels.

<h3>Confidence Score: 3/5</h3>

- Safe to merge after fixing the critical field name bug in token reuse detection
- The PR introduces useful safety utilities with comprehensive tests (21 tests covering edge cases). However, there's a critical bug where `collectChannelTokenReuseFindings` uses incorrect field names for Mattermost and Signal channels, causing the token reuse detection to fail for these channels. The other two utilities (transcript retention and tool call budget) appear solid with correct logic and good test coverage. Code is well-structured, properly typed, and follows repository conventions.
- Pay close attention to `src/security/runtime-safety.ts:152-157`: the channel field mapping needs correction before merge

<sub>Last reviewed commit: 64fb384</sub>
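
For readers who want to see the shape of the per-session budget described under "Tool call budget", here is a minimal sketch. It is not the code from `src/security/runtime-safety.ts`; the class bodies are reconstructed from the PR description only. One assumption worth flagging: the description says the error is thrown "after 200 calls", which this sketch reads as "the call that pushes the count past the limit throws". The field names `limit` and `count` on the error are also assumptions.

```typescript
// Illustrative sketch only; the PR's actual implementation may differ.
class ToolCallBudgetExceeded extends Error {
  readonly limit: number;
  readonly count: number;

  constructor(limit: number, count: number) {
    super(`Tool call budget exceeded: ${count} calls, limit ${limit}`);
    // Restore the prototype chain so instanceof works on older compile targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "ToolCallBudgetExceeded";
    this.limit = limit;
    this.count = count;
  }
}

class ToolCallBudget {
  private readonly limit: number;
  private readonly counts = new Map<string, number>();

  constructor(limit: number) {
    this.limit = limit;
  }

  // Record one tool call for the session and throw once the count passes the limit.
  check(sessionKey: string): void {
    if (this.limit <= 0) return; // limit=0 disables the budget
    const next = (this.counts.get(sessionKey) ?? 0) + 1;
    this.counts.set(sessionKey, next);
    if (next > this.limit) {
      throw new ToolCallBudgetExceeded(this.limit, next);
    }
  }

  // Forget the count for one session, e.g. when a new session starts.
  reset(sessionKey: string): void {
    this.counts.delete(sessionKey);
  }

  // Forget all counts.
  resetAll(): void {
    this.counts.clear();
  }
}
```

Keeping the counter in a `Map` keyed by session is what gives the "tracks separate sessions independently" property; a caller would typically invoke `check()` from whatever hook runs before each tool call.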
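
The retention behaviour listed under "Transcript retention" can be sketched along the same lines. Again, this is an assumption-laden illustration and not the PR's code: it assumes file age is judged by modification time (`mtime`), that only regular files directly inside `sessionsDir` are considered, and that errors are collected as strings. The `PurgeResult` type name is hypothetical; only the four result fields come from the description.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical result type; field names are taken from the PR description.
interface PurgeResult {
  scannedFiles: number;
  deletedFiles: number;
  deletedPaths: string[];
  errors: string[];
}

// Illustrative sketch only; the PR's actual implementation may differ.
function purgeOldTranscripts(sessionsDir: string, maxAgeDays: number): PurgeResult {
  const result: PurgeResult = { scannedFiles: 0, deletedFiles: 0, deletedPaths: [], errors: [] };

  // maxAgeDays=0 disables purging; a missing directory is treated as empty.
  if (maxAgeDays <= 0 || !fs.existsSync(sessionsDir)) return result;

  const cutoffMs = Date.now() - maxAgeDays * 24 * 60 * 60 * 1000;

  for (const entry of fs.readdirSync(sessionsDir, { withFileTypes: true })) {
    // Only .jsonl transcripts are candidates; sessions.json and other state are skipped.
    if (!entry.isFile() || !entry.name.endsWith(".jsonl")) continue;
    result.scannedFiles++;

    const filePath = path.join(sessionsDir, entry.name);
    try {
      // Assumption: "old" means last modified before the cutoff.
      if (fs.statSync(filePath).mtimeMs < cutoffMs) {
        fs.unlinkSync(filePath);
        result.deletedFiles++;
        result.deletedPaths.push(filePath);
      }
    } catch (err) {
      result.errors.push(`${filePath}: ${String(err)}`);
    }
  }
  return result;
}
```

Returning `deletedPaths` and `errors` instead of logging directly matches the stated mitigation that callers decide what to log.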
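
The field-mapping issue raised in the review is easier to reason about with the detection logic in view. The sketch below groups plaintext token values by the config paths that use them. It is a guess at the approach, not the PR's code: the `ChannelsConfig` and `TokenReuseFinding` shapes are hypothetical, the `${NAME}` form for env var references is an assumption, and the mapping deliberately lists only the Telegram, Discord, and Slack fields named in the description, since the review disputes the Mattermost and Signal field names.

```typescript
// Hypothetical shapes; the real config and finding types in the repository may differ.
type ChannelsConfig = Record<string, Record<string, unknown> | undefined>;

interface TokenReuseFinding {
  checkId: string;
  severity: string;
  paths: string[];
}

// Assumed channel -> secret field mapping, limited to fields the review does not dispute.
const TOKEN_FIELDS: Record<string, string[]> = {
  telegram: ["botToken"],
  discord: ["token"],
  slack: ["botToken", "appToken"],
};

// Assumption: env var references are written as "${NAME}".
const isEnvRef = (value: string): boolean => /^\$\{[A-Za-z_][A-Za-z0-9_]*\}$/.test(value);

// Illustrative sketch only; the PR's actual implementation may differ.
function collectChannelTokenReuseFindings(channels: ChannelsConfig): TokenReuseFinding[] {
  const pathsByToken = new Map<string, string[]>();

  for (const [channel, fields] of Object.entries(TOKEN_FIELDS)) {
    const config = channels[channel];
    if (!config) continue;
    for (const field of fields) {
      const value = config[field];
      // Only non-empty plaintext strings count; env var references are skipped.
      if (typeof value !== "string" || value === "" || isEnvRef(value)) continue;
      const paths = pathsByToken.get(value) ?? [];
      paths.push(`channels.${channel}.${field}`);
      pathsByToken.set(value, paths);
    }
  }

  // One finding per token value that appears at more than one path.
  return Array.from(pathsByToken.values())
    .filter((paths) => paths.length > 1)
    .map((paths): TokenReuseFinding => ({
      checkId: "credentials.token_reuse_across_channels",
      severity: "warn",
      paths,
    }));
}
```

With a table-driven mapping like this, a wrong field name fails silently: the lookup returns `undefined`, the entry is skipped, and no finding is produced. That is consistent with the failure mode the review describes for Mattermost and Signal.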