
#22096: fix(slack): traverse .original for Slack SDK errors; pass recipient_team_id to streaming

by maiclaw open 2026-02-20 18:11
channel: slack size: S
## Summary

Two related Slack stability fixes found while debugging production crashes.

### Bug 1: Gateway crashes on transient Slack SDK network errors

**Root cause:** `@slack/web-api`'s `WebAPIRequestError` wraps the original network error in `.original` (not `.cause`). The existing `isTransientNetworkError()` only traverses `.cause`, so errors like `ECONNRESET` or `ETIMEDOUT` wrapped by the Slack SDK were not recognised as transient, which caused `process.exit(1)` on routine reconnect failures.

**Fix:** Added a `getWrappedOriginal()` helper and made `isTransientNetworkError()` traverse `.original` as well as `.cause`.

### Bug 2: `missing_recipient_team_id` when streaming in DM threads

**Root cause:** `chat.startStream` requires `recipient_team_id` when streaming in a DM thread, but `client.chatStream()` in `startSlackStream()` never passed it. This caused streaming to fail with `slack-stream: streaming API call failed: Error: An API error occurred: missing_recipient_team_id` and fall back to non-streaming delivery.

**Fix:** Added an optional `recipientTeamId` to `StartSlackStreamParams` and forwarded it as `recipient_team_id`. Updated `dispatchPreparedSlackMessage` to pass `ctx.teamId`.

## Tests

Added unit tests for the `.original` traversal case in `unhandled-rejections.test.ts`.

### Greptile Summary

This PR fixes two production stability issues in the Slack integration. The first fix prevents gateway crashes by correctly traversing the `@slack/web-api` SDK's `.original` error property to detect transient network errors (like `ECONNRESET`, `ETIMEDOUT`) that were previously causing `process.exit(1)`. The second fix resolves `missing_recipient_team_id` errors when streaming in DM threads by passing the team ID parameter required by Slack's streaming API.

### Confidence Score: 5/5

- This PR is safe to merge with minimal risk.
- Both fixes address well-documented production crashes with clear root causes. The implementation is clean, follows existing patterns (`ctx.teamId || undefined` matches usage in other files), and includes unit tests for the error traversal logic. The changes are minimal and focused on the specific bugs.
- No files require special attention.

Last reviewed commit: c08e01b
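For reviewers who want to see the shape of the two fixes without opening the diff, here is a minimal sketch of both ideas. It is not the code in this PR: `isTransientNetworkErrorSketch`, `buildStreamArgsSketch`, and the list of transient codes are hypothetical names and values chosen for illustration. The only things taken from the description are that the Slack SDK wraps the network error in `.original`, and that `recipientTeamId` is forwarded as `recipient_team_id` using the `ctx.teamId || undefined` pattern.

```typescript
// Hypothetical sketch, not the PR's implementation.
// Assumed set of error codes treated as transient; the real list may differ.
const TRANSIENT_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "EAI_AGAIN"]);

type ErrorLike = { code?: unknown; cause?: unknown; original?: unknown };

function asErrorLike(value: unknown): ErrorLike | undefined {
  return typeof value === "object" && value !== null ? (value as ErrorLike) : undefined;
}

// Walks both `.cause` (standard error chaining) and `.original`
// (where the Slack SDK stores the wrapped network error).
function isTransientNetworkErrorSketch(err: unknown): boolean {
  const seen = new Set<unknown>(); // guards against cyclic error chains
  const queue: unknown[] = [err];
  while (queue.length > 0) {
    const current = asErrorLike(queue.shift());
    if (!current || seen.has(current)) continue;
    seen.add(current);
    if (typeof current.code === "string" && TRANSIENT_CODES.has(current.code)) {
      return true;
    }
    queue.push(current.cause, current.original);
  }
  return false;
}

type StreamArgsSketch = { channel: string; thread_ts: string; recipient_team_id?: string };

// Forwards the team ID only when one is known; an empty string counts as
// unknown, mirroring the `ctx.teamId || undefined` pattern.
function buildStreamArgsSketch(channel: string, threadTs: string, teamId?: string): StreamArgsSketch {
  const recipientTeamId = teamId || undefined;
  return {
    channel,
    thread_ts: threadTs,
    ...(recipientTeamId ? { recipient_team_id: recipientTeamId } : {}),
  };
}

// Usage: a network error wrapped the way the description says the SDK wraps it.
const networkError = Object.assign(new Error("socket hang up"), { code: "ECONNRESET" });
const slackError = Object.assign(new Error("A request error occurred"), {
  code: "slack_webapi_request_error",
  original: networkError,
});
console.log(isTransientNetworkErrorSketch(slackError)); // true
console.log(buildStreamArgsSketch("D123", "1700000000.000100", "T456"));
```

A check that only follows `.cause` would stop at `slackError`, find a non-transient code, and return `false`, which matches the crash path described under Bug 1.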
