#20183: fix(memory): index reset/deleted session transcripts
gateway
size: M
Cluster: Memory Management Fixes
## Summary
- Problem: session indexing only considered files ending in `.jsonl`, so archived transcripts (`.jsonl.reset.*`, `.jsonl.deleted.*`) were excluded from memory/QMD recall.
- Why it matters: sessions reset/deleted via `/new` and pruning disappear from recall/search even though transcript content still exists on disk.
- What changed:
  - expanded session transcript discovery to include active + archived transcript patterns
  - triggered session transcript update events when archive renames happen (`reset`/`deleted`)
  - updated incremental session sync to always index newly seen session paths (while keeping hash-based dedupe for unchanged files)
  - added/updated tests for filename matching, file discovery, QMD export reuse behavior with archived session files, archive event emission, and dirty-mode archived-path indexing branches
- What did NOT change (scope boundary):
  - no changes to archive naming scheme or retention policy
  - no changes to non-session memory indexing behavior
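For reviewers who want a concrete picture of the filename rule, here is a minimal sketch of a matcher with the behavior described above. It is not the code in `src/memory/session-files.ts`: the names `ARCHIVE_REASONS`, `INDEXABLE_TRANSCRIPT`, and `isIndexableTranscript` are hypothetical, and requiring a non-empty suffix after the archive reason is an assumption that the real matcher may not share.

```typescript
// Hypothetical sketch; the real matcher in src/memory/session-files.ts may differ.
const ARCHIVE_REASONS = ["reset", "deleted"] as const;

// Accepts `<name>.jsonl` and `<name>.jsonl.<reason>.<suffix>` where <reason>
// is one of ARCHIVE_REASONS. Assumption: the suffix must be non-empty.
const INDEXABLE_TRANSCRIPT = new RegExp(
  `^.+\\.jsonl(?:\\.(?:${ARCHIVE_REASONS.join("|")})\\..+)?$`,
);

function isIndexableTranscript(fileName: string): boolean {
  return INDEXABLE_TRANSCRIPT.test(fileName);
}
```

Under these assumptions, active `.jsonl` files and `reset`/`deleted` archives match, while `.jsonl.bak.*` files and unrelated files do not.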
## Change Type (select all)
- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [ ] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [x] Memory / storage
- [x] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Closes #
- Related #
## User-visible / Behavior Changes
Archived session transcripts (`*.jsonl.reset.*`, `*.jsonl.deleted.*`) are now included in memory indexing/recall flows, and newly archived files are picked up without requiring a full reindex.
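The "picked up without requiring a full reindex" behavior depends on an update event firing when a transcript is renamed into its archived form. The sketch below illustrates that flow only. All names (`onTranscriptUpdate`, `archiveTranscript`), the injected `rename` callback, the `bak` reason, and the timestamp format are assumptions for illustration; this PR does not change the actual archive naming scheme, and the sketch does not document it.

```typescript
// Hypothetical sketch; names, reasons, and the timestamp format are assumptions.
type ArchiveReason = "reset" | "deleted" | "bak";
type TranscriptListener = (path: string) => void;

const transcriptListeners = new Set<TranscriptListener>();

// Registers a listener and returns a function that unregisters it.
function onTranscriptUpdate(listener: TranscriptListener): () => void {
  transcriptListeners.add(listener);
  return () => {
    transcriptListeners.delete(listener);
  };
}

// Renames through the injected `rename` callback, then notifies listeners
// only for reasons whose archives remain indexable.
function archiveTranscript(
  sessionFile: string,
  reason: ArchiveReason,
  rename: (from: string, to: string) => void,
  now: Date = new Date(),
): string {
  const stamp = now.toISOString().replace(/:/g, "-");
  const archived = `${sessionFile}.${reason}.${stamp}`;
  rename(sessionFile, archived);
  if (reason === "reset" || reason === "deleted") {
    transcriptListeners.forEach((listener) => listener(archived));
  }
  return archived;
}
```

A listener that marks the archived path dirty is then enough for the next incremental sync to index it.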
## Security Impact (required)
- New permissions/capabilities? (`No`)
- Secrets/tokens handling changed? (`No`)
- New/changed network calls? (`No`)
- Command/tool execution surface changed? (`No`)
- Data access scope changed? (`Yes`)
- If any `Yes`, explain risk + mitigation:
  - Scope now includes archived session transcripts that already exist under the same agent sessions directory.
  - Mitigation: index dedupe remains hash-based; path resolution remains constrained to the agent sessions directory.
## Repro + Verification
### Environment
- OS: macOS
- Runtime/container: Node 22 + pnpm
- Model/provider: memory backends (sqlite + qmd path)
- Integration/channel (if any): session transcript archive/reset flows
- Relevant config (redacted): default agent `main`
### Steps
1. Create session transcript(s), then archive/reset to produce `.jsonl.reset.*`/`.jsonl.deleted.*`.
2. Run memory sync/update path.
3. Observe indexed/exported session set.
### Expected
- Archived transcripts are included, and unchanged files are not re-indexed repeatedly.
### Actual
- Matches expected.
## Evidence
- [x] Failing test/log before + passing after
- [ ] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)
## Human Verification (required)
What you personally verified (not just CI), and how:
- Verified scenarios:
  - Full tests pass: `pnpm test`
  - Lint passes: `pnpm lint`
  - Targeted tests covering this change pass:
    - `src/memory/session-files.test.ts`
    - `src/memory/qmd-manager.test.ts`
    - `src/gateway/session-utils.fs.test.ts`
    - `src/memory/sync-session-files.test.ts`
    - `src/memory/manager.session-delta-archived.test.ts`
- Edge cases checked:
  - `.jsonl` active files still included
  - `.jsonl.bak.*` excluded
  - archived files included without duplicate reindex when unchanged
- What you did **not** verify:
  - Live production QMD instance behavior outside test harness
## Compatibility / Migration
- Backward compatible? (`Yes`)
- Config/env changes? (`No`)
- Migration needed? (`No`)
- If yes, exact upgrade steps:
## Failure Recovery (if this breaks)
- How to disable/revert this change quickly:
  - Revert commits `eb2e85404` and `7a27bbcf0`
- Files/config to restore:
  - `src/memory/session-files.ts`
  - `src/memory/manager-sync-ops.ts`
  - `src/memory/sync-session-files.ts`
  - `src/gateway/session-utils.fs.ts`
- Known bad symptoms reviewers should watch for:
  - archived transcripts missing from recall
  - excessive repeated indexing churn
## Risks and Mitigations
- Risk: additional archived transcripts increase indexed corpus size.
  - Mitigation: existing hash-based dedupe + stale-path cleanup remain in place.
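To make the mitigation concrete, the following sketch shows one way an incremental pass could combine "always index newly seen paths", hash-based dedupe, and stale-path cleanup. It is an illustration under stated assumptions, not the logic in `sync-session-files.ts` or `manager-sync-ops.ts`: the types, the `planIncrementalSync` name, and the rule that known paths are revisited only when flagged dirty are all hypothetical.

```typescript
// Hypothetical sketch; types, names, and the dirty-set semantics are assumptions.
interface DiscoveredFile {
  path: string;
  hash: string; // content hash computed at discovery time
}

interface SyncPlan {
  toIndex: string[]; // paths to (re)index in this pass
  stale: string[]; // indexed paths no longer present on disk
}

function planIncrementalSync(
  discovered: DiscoveredFile[],
  indexedHashes: Map<string, string>, // path -> hash stored at last index
  dirtyPaths: Set<string>, // paths flagged by update events
): SyncPlan {
  const toIndex = discovered
    .filter((file) => {
      const isNewPath = !indexedHashes.has(file.path);
      // Known paths are only revisited when an event marked them dirty.
      if (!isNewPath && !dirtyPaths.has(file.path)) return false;
      // Hash-based dedupe: skip files whose content has not changed.
      return indexedHashes.get(file.path) !== file.hash;
    })
    .map((file) => file.path);

  // Stale-path cleanup: anything indexed earlier but missing from discovery.
  const onDisk = new Set(discovered.map((file) => file.path));
  const stale: string[] = [];
  indexedHashes.forEach((_hash, path) => {
    if (!onDisk.has(path)) stale.push(path);
  });

  return { toIndex, stale };
}
```

In this sketch a newly archived file is indexed once because its path is unknown, and later passes skip it as long as its hash is unchanged, which is the property that bounds indexing churn.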
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR expands session memory indexing to include archived transcripts (`.jsonl.reset.*`, `.jsonl.deleted.*`) that were previously excluded because file discovery only matched `.jsonl`. It adds event emission when archives are created, ensures incremental sync picks up newly discovered paths via a DB lookup (`knownPaths`), and short-circuits delta-threshold checks for archived files in the batch processing pipeline. The changes are well-scoped, maintain backward compatibility, and include thorough test coverage across all modified paths.
- Expanded `listSessionFilesForAgent` to discover `reset`/`deleted` archived transcripts alongside active `.jsonl` files, while still excluding `.bak` archives
- `archiveFileOnDisk` now emits `emitSessionTranscriptUpdate` for `reset`/`deleted` reasons, allowing the memory manager's session listener to pick up newly archived files
- `processSessionDeltaBatch` bypasses delta-threshold checks for archived paths (archived files are complete snapshots, not incrementally growing)
- Both `syncSessionFiles` (standalone) and the class-method version in `MemoryManagerSyncOps` add a `knownPaths` DB lookup to ensure newly discovered files are indexed even during incremental syncs
- The `knownPaths` logic is duplicated between `sync-session-files.ts` and `manager-sync-ops.ts` — these serve different callers (testable standalone function vs. runtime class method) but may present a maintenance risk if the core logic diverges
- Test coverage added/updated across 5 test files covering filename matching, file discovery, event emission, delta batch processing, QMD export reuse, and incremental sync deduplication
<h3>Confidence Score: 4/5</h3>
- This PR is safe to merge with minimal risk — changes are well-tested and backward compatible
- Score reflects a solid implementation with thorough test coverage across all code paths. The core logic (regex matching, event emission, `knownPaths` optimization, archived path handling) is correct. Minor deductions for two points: the `knownPaths` logic is duplicated between `sync-session-files.ts` and `manager-sync-ops.ts`, which creates a maintenance surface, and the `isIndexableSessionTranscriptFileName` regex accepts files without timestamps (e.g., `foo.jsonl.reset`). The second point is a style note and is harmless in practice because `archiveFileOnDisk` always appends timestamps.
- `src/memory/manager-sync-ops.ts` — contains duplicated session sync logic with `sync-session-files.ts` that must be kept in sync manually
<sub>Last reviewed commit: 7a27bbc</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->