#7336: feat(channels): add reliability infrastructure
Cluster: Session Management Enhancements
- Add retry logic with exponential backoff and jitter
- Add idempotency store for deduplicating message delivery
- Add structured error handling with ChannelError types
- Add channel health monitoring infrastructure
- Add streaming response abstraction
- Extend typing signal API
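
To illustrate the first item, here is a minimal sketch of retry with exponential backoff and jitter. The names `BackoffOptions`, `computeBackoffDelay`, and `retryWithBackoff` are hypothetical and not taken from this PR, and the "full jitter" strategy (a uniform random delay below an exponentially growing ceiling) is an assumption about one reasonable approach, not a description of the actual implementation.

```typescript
// Hypothetical option names; the PR's actual retry helper may differ.
interface BackoffOptions {
  baseDelayMs: number; // ceiling for the delay before the first retry
  maxDelayMs: number; // upper bound on any single delay
  maxAttempts: number; // total attempts, including the first call
}

// "Full jitter": pick uniformly in [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
// `random` is injectable so the delay can be checked deterministically.
function computeBackoffDelay(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Retries `operation` until it succeeds, the error is not retryable,
// or `maxAttempts` is reached; the last error is rethrown.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  opts: BackoffOptions,
  isRetryable: (err: unknown) => boolean = () => true,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const isLastAttempt = attempt + 1 >= opts.maxAttempts;
      if (isLastAttempt || !isRetryable(err)) throw err;
      const delayMs = computeBackoffDelay(attempt, opts);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Jitter spreads retries from many clients over time so they do not all hit a recovering service at the same instant. The `isRetryable` hook is where a structured error type could decide which failures are worth retrying.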
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR introduces a set of reliability primitives for channel adapters: (1) a retry helper with exponential backoff + jitter, (2) an in-memory idempotency store and wrapper for deduplicating message handling, (3) structured channel error types/codes with basic detection/wrapping helpers, (4) a channel health registry/provider abstraction with optional periodic monitoring, (5) a streaming-response abstraction with throttled update handling, and (6) a unified typing-signal API with per-channel adapters and auto-stop.
These utilities live under `src/channels/{reliability,health,streaming}` and extend the existing typing module (`src/channels/typing.ts`) so channel implementations can share consistent retry/idempotency/error/stream/typing patterns rather than each adapter re-implementing them.
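
As a rough illustration of the throttled streaming idea in item (5), the sketch below buffers chunks, forwards the accumulated text at most once per interval, and always delivers the remaining text when the stream finishes. The class name `ThrottledStream` and its methods are hypothetical; the abstraction in `src/channels/streaming` may be shaped quite differently.

```typescript
// Hypothetical sketch; not the PR's actual streaming API.
class ThrottledStream {
  private buffer = "";
  private lastSentText = "";
  private lastFlushAt = Number.NEGATIVE_INFINITY;

  constructor(
    // Receives the full accumulated text, e.g. to edit a sent message in place.
    private readonly send: (fullText: string) => void,
    private readonly minIntervalMs: number,
    // Injectable clock so throttling can be tested without real timers.
    private readonly now: () => number = () => Date.now(),
  ) {}

  // Adds a chunk and forwards the text only if the throttle window has passed.
  append(chunk: string): void {
    this.buffer += chunk;
    if (this.now() - this.lastFlushAt >= this.minIntervalMs) {
      this.flush();
    }
  }

  // Bypasses the throttle so the last chunk is never dropped.
  finish(): void {
    this.flush();
  }

  private flush(): void {
    if (this.buffer === this.lastSentText) return; // nothing new to send
    this.send(this.buffer);
    this.lastSentText = this.buffer;
    this.lastFlushAt = this.now();
  }
}
```

The key design point is that `finish()` ignores the throttle window: a throttle that only sends on `append` can leave the tail of a response unsent if the final chunk arrives inside the window.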
<h3>Confidence Score: 2/5</h3>
- This PR adds useful infrastructure but has a few correctness/resource-leak issues that should be addressed before merge.
- The core abstractions are reasonable, but several edge cases can cause incorrect behavior under load: a time-of-check to time-of-use (TOCTOU) race in idempotency handling, the stream's final flush dropping the last chunk, health provider replacement leaking intervals, and `maxSize` not being enforced. The typing adapter contract is also ambiguous (`stopTyping` vs `supportsExplicitStop`). Fixing these should make the changes safe to merge.
- Files needing the most attention: src/channels/reliability/idempotency.ts, src/channels/streaming/stream-response.ts, src/channels/health/channel-health.ts
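
One common way to close a check-then-act gap in an in-memory idempotency store is to claim the key in a single synchronous step, before any `await`, and to enforce the size cap on every insert. The sketch below shows that pattern only; `IdempotencyStore`, `tryClaim`, `release`, and `handleOnce` are hypothetical names, and the code in `src/channels/reliability/idempotency.ts` may differ.

```typescript
// Hypothetical sketch of claim-before-run deduplication with a bounded store.
class IdempotencyStore {
  // key -> expiry timestamp (ms). Map iteration order is insertion order,
  // so the first key is always the oldest claim.
  private readonly seen = new Map<string, number>();

  constructor(
    private readonly maxSize: number,
    private readonly ttlMs: number,
  ) {}

  // Check and record in one synchronous step: no await separates the two,
  // so two concurrent handlers cannot both win the same key.
  tryClaim(key: string, now: number = Date.now()): boolean {
    const expiresAt = this.seen.get(key);
    if (expiresAt !== undefined && expiresAt > now) return false;
    this.seen.delete(key); // re-insert so ordering reflects the new claim
    this.seen.set(key, now + this.ttlMs);
    while (this.seen.size > this.maxSize) {
      const oldest = this.seen.keys().next().value;
      if (oldest === undefined) break;
      this.seen.delete(oldest); // evict the oldest claim to honor maxSize
    }
    return true;
  }

  // Drops a claim, e.g. after a failed handler, so a retry can be processed.
  release(key: string): void {
    this.seen.delete(key);
  }

  get size(): number {
    return this.seen.size;
  }
}

// Runs `handler` at most once per key; duplicates resolve to undefined.
async function handleOnce<T>(
  store: IdempotencyStore,
  key: string,
  handler: () => Promise<T>,
): Promise<T | undefined> {
  if (!store.tryClaim(key)) return undefined;
  try {
    return await handler();
  } catch (err) {
    store.release(key);
    throw err;
  }
}
```

Releasing the claim on failure is a design choice: it allows a redelivered message to be retried, at the cost of possibly repeating side effects the handler completed before it threw.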
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #13889: feat: Slack channel cache, session cost alerts & checkpoint/recover... (by trevorgordon981 · 2026-02-11 · 74.6%)
- #14741: feat: telegram resilience utilities (by kalachbeg · 2026-02-12 · 74.4%)
- #13820: feat(agents): retry empty-stream once before fallback (by Louise-Qiuqiu · 2026-02-11 · 74.0%)
- #12995: feat(infra): Add retry with exponential backoff for transient failures (by trevorgordon981 · 2026-02-10 · 73.4%)
- #13882: feat: Enhance session checkpoint system with better types and valid... (by trevorgordon981 · 2026-02-11 · 73.3%)
- #13881: fix: Address Greptile feedback - test isolation and channel resolution (by trevorgordon981 · 2026-02-11 · 73.3%)
- #7141: fix(telegram): unify network error detection to prevent poll crashes (by hclsys · 2026-02-02 · 72.9%)
- #6463: fix(telegram): improve timeout handling and prevent channel exits (by ai-fanatic · 2026-02-01 · 72.3%)
- #12999: feat(agents): Add streaming response metrics tracking (by trevorgordon981 · 2026-02-10 · 71.9%)
- #23727: Fix Telegram channel resolution drift across announce + message sen... (by SmithLabsLLC · 2026-02-22 · 71.8%)