#23794: fix(delivery): move permanently-failed queue entries to failed/ immediately
size: S
trusted-contributor
Cluster: Telegram Message Handling Fixes
## Summary
Detect non-recoverable delivery errors and move the affected queue entries to `failed/` immediately instead of retrying them up to `MAX_RETRIES` times.
## Problem
When a delivery fails with a permanent/structural error (e.g. `No conversation reference found`, `chat not found`, `bot was blocked`), the queue retries it on every gateway restart. This causes:
1. Stale entries accumulating indefinitely in `~/.openclaw/delivery-queue/`
2. Duplicate messages when previously-delivered messages are re-sent after restart
3. Unresolvable entries consuming the recovery time budget, blocking legitimate retries
Fixes #23777
## Changes
- **`delivery-queue.ts`**: Added `isPermanentDeliveryError()`, which matches known non-recoverable error patterns. During recovery, an entry that hits a permanent error is moved to `failed/` immediately (retryCount=0) rather than having its `retryCount` incremented.
- **`outbound.test.ts`**: Added tests for `isPermanentDeliveryError` (7 permanent + 5 transient patterns) and a recovery integration test confirming permanent errors trigger immediate move-to-failed.
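The recovery decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in `delivery-queue.ts`: `RecoveryAction`, `decideRecoveryAction`, the injected `isPermanent` predicate, and the `MAX_RETRIES` value of 5 are all hypothetical, and the sketch assumes that entries which exhaust their retries also end up in `failed/`.

```typescript
// Hypothetical sketch of the recovery decision. Only MAX_RETRIES, retryCount,
// and the failed/ destination are named in the PR; everything else is assumed.
type RecoveryAction =
  | { kind: "move-to-failed" }
  | { kind: "retry-later"; retryCount: number };

const MAX_RETRIES = 5; // assumed value; the PR does not state the real limit

function decideRecoveryAction(
  retryCount: number,
  errorMessage: string,
  isPermanent: (message: string) => boolean,
): RecoveryAction {
  // Permanent errors skip the retry budget entirely.
  if (isPermanent(errorMessage)) {
    return { kind: "move-to-failed" };
  }
  // Transient errors keep the existing behavior: count the attempt and
  // give up only once the budget is exhausted (assumed semantics).
  const nextRetryCount = retryCount + 1;
  return nextRetryCount >= MAX_RETRIES
    ? { kind: "move-to-failed" }
    : { kind: "retry-later", retryCount: nextRetryCount };
}
```

Keeping the decision as a pure function, separate from the file moves, is one way to make the permanent and transient paths easy to unit test.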
## Permanent error patterns detected
- `No conversation reference found` (Teams)
- `chat not found` (Telegram)
- `user not found`
- `bot was blocked by the user` (Telegram)
- `Forbidden: bot was kicked` (Telegram)
- `chat_id is empty`
- `recipient is not a valid`
- `Outbound not configured for channel`
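For illustration, a matcher over the patterns listed above might look like the sketch below. The pattern strings come from this list, but the regexes, the case-insensitive matching, the `PERMANENT_ERROR_PATTERNS` name, and the `unknown` parameter type are assumptions; the actual `isPermanentDeliveryError()` in `delivery-queue.ts` may be written differently.

```typescript
// Hypothetical sketch: the patterns mirror the list in this PR description,
// but the regex forms and the function signature are assumptions.
const PERMANENT_ERROR_PATTERNS: readonly RegExp[] = [
  /no conversation reference found/i, // Teams
  /chat not found/i, // Telegram
  /user not found/i,
  /bot was blocked by the user/i, // Telegram
  /forbidden: bot was kicked/i, // Telegram
  /chat_id is empty/i,
  /recipient is not a valid/i,
  /outbound not configured for channel/i,
];

function isPermanentDeliveryError(error: unknown): boolean {
  // Accept either an Error or a plain message string.
  const message = error instanceof Error ? error.message : String(error);
  // Patterns carry no `g` flag, so `test` keeps no state between calls.
  return PERMANENT_ERROR_PATTERNS.some((pattern) => pattern.test(message));
}
```

Under these assumptions, `isPermanentDeliveryError(new Error("Bad Request: chat not found"))` returns `true`, while a timeout message such as `"ETIMEDOUT"` returns `false` and stays on the normal retry path.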
## Testing
All 51 tests in `outbound.test.ts` pass, including 9 new tests.
🤖 AI-assisted (Claude) — fully tested
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
Adds immediate failure detection for non-recoverable delivery errors. When delivery fails with a permanent error (missing conversation reference, blocked bot, invalid recipient), the entry is moved to the `failed/` directory immediately instead of being retried up to `MAX_RETRIES` times.
- Prevents stale entries from accumulating indefinitely in the delivery queue
- Eliminates unnecessary retries for structural errors that will never succeed
- Maintains existing retry behavior for transient errors (network issues, timeouts, rate limits)
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk
- The implementation is well-designed with comprehensive test coverage (12 test cases including pattern matching and integration tests). The change follows existing error handling patterns in the codebase, uses clear regex patterns that match actual error messages from Telegram and MS Teams channels, and includes proper fallback behavior for move-to-failed errors. The logic is simple and isolated to the recovery path, reducing risk of unintended side effects.
- No files require special attention
<sub>Last reviewed commit: 1faee8e</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
#19284: fix(delivery): treat AbortErrors as failures for retry
by EdGuan · 2026-02-17
82.3%
#22993: fix(delivery): guard JSON.parse in failDelivery to prevent silent i...
by adhitShet · 2026-02-21
81.0%
#20329: Fix cron.run WS blocking and harden delivery recovery
by guirguispierre · 2026-02-18
74.7%
#22385: fix: improve delivery recovery logging with entry age and deferral ...
by derrickburns · 2026-02-21
73.6%
#17337: fix(delivery): keep route fields paired to channel during context m...
by Glucksberg · 2026-02-15
73.3%
#17953: fix(telegram): prevent silent message loss and duplicate messages i...
by zuyan9 · 2026-02-16
72.6%
#7141: fix(telegram): unify network error detection to prevent poll crashes
by hclsys · 2026-02-02
71.3%
#12936: fix(telegram): omit message_thread_id for private DM chats
by omair445 · 2026-02-09
71.3%
#20274: fix: add fallback delivery when stopSlackStream fails
by nova-openclaw-cgk · 2026-02-18
71.2%
#21195: fix: suppress orphaned tool_use/tool_result errors after session co...
by ruslansychov-git · 2026-02-19
71.0%