
#23794: fix(delivery): move permanently-failed queue entries to failed/ immediately

by aldoeliacim open 2026-02-22 18:05
size: S trusted-contributor
## Summary

Detect non-recoverable delivery errors and move them to `failed/` immediately instead of retrying up to `MAX_RETRIES` times.

## Problem

When a delivery fails with a permanent/structural error (e.g. `No conversation reference found`, `chat not found`, `bot was blocked`), the queue retries it on every gateway restart. This causes:

1. Stale entries accumulating indefinitely in `~/.openclaw/delivery-queue/`
2. Duplicate messages when previously-delivered messages are re-sent after restart
3. Recovery time budget exceeded by unresolvable entries, blocking legitimate retries

Fixes #23777

## Changes

- **`delivery-queue.ts`**: Added `isPermanentDeliveryError()` that matches known non-recoverable error patterns. During recovery, entries hitting permanent errors are moved to `failed/` immediately (retryCount=0) instead of incrementing retryCount.
- **`outbound.test.ts`**: Added tests for `isPermanentDeliveryError` (7 permanent + 5 transient patterns) and a recovery integration test confirming permanent errors trigger immediate move-to-failed.

## Permanent error patterns detected

- `No conversation reference found` (Teams)
- `chat not found` (Telegram)
- `user not found`
- `bot was blocked by the user` (Telegram)
- `Forbidden: bot was kicked` (Telegram)
- `chat_id is empty`
- `recipient is not a valid`
- `Outbound not configured for channel`

## Testing

All 51 tests in `outbound.test.ts` pass, including 9 new tests.

🤖 AI-assisted (Claude), fully tested

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

Adds immediate failure detection for non-recoverable delivery errors. When delivery fails with permanent errors (missing conversation references, blocked bots, invalid recipients), entries are moved to the `failed/` directory immediately instead of retrying up to `MAX_RETRIES` times.

- Prevents stale entries from accumulating indefinitely in the delivery queue
- Eliminates unnecessary retries for structural errors that will never succeed
- Maintains existing retry behavior for transient errors (network issues, timeouts, rate limits)

<h3>Confidence Score: 5/5</h3>

- This PR is safe to merge with minimal risk
- The implementation is well-designed with comprehensive test coverage (12 test cases including pattern matching and integration tests). The change follows existing error handling patterns in the codebase, uses clear regex patterns that match actual error messages from Telegram and MS Teams channels, and includes proper fallback behavior for move-to-failed errors. The logic is simple and isolated to the recovery path, reducing risk of unintended side effects.
- No files require special attention

<sub>Last reviewed commit: 1faee8e</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
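
For readers without the diff at hand, the permanent-error check described in this PR can be pictured roughly as follows. This is a minimal sketch, not the code from `delivery-queue.ts`: the patterns are taken from the list in the description, but case-insensitive matching is an assumption, and `nextAction`, `RecoveryAction`, and the retry-exhaustion rule are hypothetical names and behavior added for illustration.

```typescript
// Patterns copied from the PR description's "Permanent error patterns detected" list.
// Assumption: matching is case-insensitive; the actual regexes are not shown in the description.
const PERMANENT_ERROR_PATTERNS: readonly RegExp[] = [
  /no conversation reference found/i,
  /chat not found/i,
  /user not found/i,
  /bot was blocked by the user/i,
  /forbidden: bot was kicked/i,
  /chat_id is empty/i,
  /recipient is not a valid/i,
  /outbound not configured for channel/i,
];

// Returns true when the error message matches a known non-recoverable pattern.
function isPermanentDeliveryError(message: string): boolean {
  return PERMANENT_ERROR_PATTERNS.some((pattern) => pattern.test(message));
}

// Hypothetical type and helper: what recovery does with an entry whose delivery just failed.
type RecoveryAction = "move-to-failed" | "retry-later";

function nextAction(
  errorMessage: string,
  retryCount: number,
  maxRetries: number,
): RecoveryAction {
  // Permanent errors skip the retry budget entirely.
  if (isPermanentDeliveryError(errorMessage)) {
    return "move-to-failed";
  }
  // Assumption: transient errors are retried until the count reaches maxRetries.
  return retryCount + 1 >= maxRetries ? "move-to-failed" : "retry-later";
}
```

With this shape, a fresh entry (`retryCount` of 0) that fails with `Bad Request: chat not found` goes straight to `failed/`, while a timeout keeps its place in the queue until the retry budget runs out.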

Most Similar PRs