← Back to PRs

#16736: fix: stagger multi-account channel startup to avoid Discord rate limits

by rm289 open 2026-02-15 02:17 View on GitHub →
gateway stale size: XS
## Summary - When 10+ Discord bots are configured on a single gateway, restarting the gateway causes some bots to fail with `Failed to resolve Discord application id` due to Discord API rate limiting - Root cause: `startChannel()` in `server-channels.ts` uses `Promise.all()` to connect all bot accounts simultaneously, flooding Discord's `/oauth2/applications/@me` endpoint - Fix: replace parallel startup with sequential startup, adding a 2-second delay between each account connection ## Details The gateway's `startChannel()` function iterates over all configured accounts for a channel type (e.g., Discord) and calls `startAccount()` for each one. Previously this used `Promise.all(accountIds.map(...))` which fired all connections in parallel. For users with many Discord bots (13 in the case that surfaced this), this triggers Discord's rate limit on the OAuth2 application ID lookup (`fetchDiscordApplicationId()` in `src/discord/probe.ts`). That function has a 4-second timeout — when rate-limited, the call times out, returns `undefined`, and `monitorDiscordProvider()` throws `"Failed to resolve Discord application id"`, stopping that bot permanently. The fix replaces `Promise.all()` with a sequential `for` loop and a 2-second `setTimeout` between each account start (skipped for the first account). This spaces out the Discord API calls enough to stay under rate limits. **Impact:** - Gateway startup with N accounts takes ~2*(N-1) seconds longer (e.g., ~24s for 13 bots instead of instant) - Applies to all channel types, not just Discord, though Discord is the only one likely to have many accounts - No behavior change for single-account setups (no delay added for the first account) - `return` statements converted to `continue` to match the new `for` loop control flow ## Test plan - [x] Build succeeds (`pnpm build`) - [x] Existing gateway server tests pass (38/41 pass; 3 pre-existing failures from missing `https-proxy-agent` dep) - [x] Deploy to production gateway with 13 Discord bots + 1 Slack bot - [x] Restart gateway and verify all 13 Discord bots reach `running` state - [x] Verify Slack channel unaffected 🤖 Generated with [Claude Code](https://claude.com/claude-code) <!-- greptile_comment --> <h3>Greptile Summary</h3> This PR contains two independent changes: (1) a fix for Discord rate limiting during multi-account gateway startup, and (2) accumulated thinking/reasoning across all assistant messages in an agent run. - **Staggered channel startup** (`server-channels.ts`): Replaces `Promise.all()` parallel startup with a sequential `for` loop and 2-second delay between accounts, preventing Discord API rate limit failures when many bots restart simultaneously. The `lastAbort` heuristic provides basic cancellation support during the delay window. - **Accumulated reasoning** (agent files): Collects thinking blocks from ALL assistant messages in a run (not just the final one), preserving reasoning from intermediate tool-call messages. Threaded cleanly through subscribe state → attempt result → payload builder with proper fallback. - **Operational scripts** (`deploy-patch.sh`, `compact-all-sessions.sh`): New internal deployment and session compaction scripts with hardcoded infrastructure details. The compaction script has a shell injection vector where session keys are interpolated unsafely into shell commands. - **Gitignore**: Adds `AGENTS-LOCAL.md` to `.gitignore`. <h3>Confidence Score: 4/5</h3> - This PR is safe to merge with one minor security concern in an operational script. - The core logic changes (staggered startup and accumulated reasoning) are well-implemented and tested in production. The accumulated reasoning change is clean plumbing with proper fallbacks. The main concern is a shell injection vector in the compact-all-sessions script, though it's an internal operational tool with low exploitation risk. - `scripts/compact-all-sessions.sh` has a shell injection risk on line 87 where session keys are interpolated into shell commands. <sub>Last reviewed commit: c4a919c</sub> <!-- greptile_other_comments_section --> <!-- /greptile_comment -->

Most Similar PRs