#20551: fix(googlechat): prevent health monitor restart loop and add JWT verification logging

by FredCat32 open 2026-02-19 02:44

channel: googlechat size: XS

Cluster: Webhook Configuration and Resilience

**fix(googlechat): await abortSignal in startAccount to stop restart loop + add detailed JWT verification failure logging**

Two related bugs in the Google Chat channel (introduced ~2026.2.15):

1. **Infinite health monitor restart loop**
   `startAccount()` resolves immediately after synchronous webhook registration (`startGoogleChatMonitor()` returns its cleanup function synchronously). The gateway interprets this as "channel stopped" → auto-restarts every 5-10s (up to 10 attempts), repeatedly registering and unregistering webhook targets → incoming requests get 401 during the gaps.
   **Fix**: Await `abortSignal` after registration to keep the promise pending until explicit shutdown (standard webhook channel lifecycle pattern).

2. **Silent / undiagnosable JWT verification failures**
   When JWT verification fails, the error reason was hard to debug because `verification.reason` was referenced outside the for-loop scope where `verification` is declared → `ReferenceError` on failure, no useful logs.
   **Fix**: Capture `lastVerifyReason` outside the loop and log it on no-target matches (plus `console.error` for clarity).

**Key differences from similar PR #20554**

- #20554 provides a clean, minimal loop fix (great!).
- This PR builds on the same abortSignal await pattern **and adds structured JWT failure logging** (with last reason capture) to make auth issues visible in logs, which is critical for production debugging (e.g., wrong audience, expired signature, issuer mismatch). No overlap in logging changes.

**Changes**

- `extensions/googlechat/src/channel.ts`: Add abortSignal await block after `startGoogleChatMonitor()`.
- `extensions/googlechat/src/monitor.ts`: Introduce `let lastVerifyReason: string | undefined;`, capture inside loop, log on failure.

**Testing**

- Manually verified with live Google Workspace account + Tailscale Funnel exposure.
- Bot starts once, stays "Running" (no restart spam in logs).
- Sent test DMs → processed correctly.
- Simulated JWT fail (wrong audience) → clear log: `[googlechat] JWT verification failed: invalid-audience` (or similar reason).

**Closes** #13856 (related webhook event handling)

**Related** #20502 / #20121 (restart loop reports; happy to close if this supersedes)

**Greptile Feedback Note**

Greptile flagged a scope issue at monitor.ts:238 (`verification.reason` out of scope) and minor formatting.

- The intent was to fix exactly that by moving reason capture outside the loop.
- If the current diff still has a lingering reference at 238, it's an oversight in the partial refactor: the variable `lastVerifyReason` is meant to replace all direct `verification.reason` accesses.
- Happy to address formatting (indentation at ~line 211, extra blank lines) in a follow-up commit if requested.

**Security / Backward Compat**

- No new permissions, network calls, or config changes.
- Fully backward compatible.
- Risk: None beyond standard webhook exposure (already present).

Ready for review/merge, especially if the logging addition is desired for better observability.
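
**Sketch: keeping `startAccount()` pending until abort**

For reviewers who want the lifecycle fix from item 1 in isolation, here is a minimal sketch of the pattern. It is not the actual diff: `waitForAbort`, `startMonitorStub`, and `startAccountSketch` are hypothetical names, and the stub only stands in for `startGoogleChatMonitor()` by registering synchronously and returning a cleanup function.

```typescript
// Hypothetical helper: resolves once the given signal is aborted.
function waitForAbort(signal: AbortSignal): Promise<void> {
  return new Promise<void>((resolve) => {
    if (signal.aborted) {
      resolve();
      return;
    }
    signal.addEventListener("abort", () => resolve(), { once: true });
  });
}

// Stand-in for startGoogleChatMonitor(): registers synchronously and
// returns a cleanup function. The log array exists only for illustration.
function startMonitorStub(log: string[]): () => void {
  log.push("registered");
  return () => {
    log.push("unregistered");
  };
}

// Assumed shape of the fix: the returned promise stays pending until
// shutdown, so a supervisor does not read an early resolve as "stopped".
async function startAccountSketch(
  signal: AbortSignal,
  log: string[],
): Promise<void> {
  const cleanup = startMonitorStub(log);
  try {
    await waitForAbort(signal);
  } finally {
    cleanup();
  }
}
```

With this shape the webhook target is registered once and unregistered only when the signal fires, which is the behavior the restart-loop fix is after.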
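
**Sketch: capturing the last verification reason outside the loop**

The scoping fix from item 2 can be illustrated the same way. This is a simplified sketch under assumptions, not the code in `monitor.ts`: the `Target` and `VerifyResult` types, the synchronous `verify` callback, the `findVerifiedTarget` name, and the `"unknown"` fallback are all hypothetical. Only `lastVerifyReason` and the log prefix come from this PR description.

```typescript
// Hypothetical types for illustration only.
type VerifyResult = { ok: true } | { ok: false; reason?: string };

interface Target {
  id: string;
}

// Tries each target in order. `verification` is scoped to the loop body,
// so the failure reason is copied into a variable declared outside it.
function findVerifiedTarget(
  targets: Target[],
  verify: (target: Target) => VerifyResult,
  logError: (message: string) => void,
): Target | undefined {
  let lastVerifyReason: string | undefined;

  for (const target of targets) {
    const verification = verify(target);
    if (verification.ok) {
      return target;
    }
    lastVerifyReason = verification.reason;
  }

  // No target matched: report the last captured reason instead of touching
  // the out-of-scope `verification` variable.
  logError(
    `[googlechat] JWT verification failed: ${lastVerifyReason ?? "unknown"}`,
  );
  return undefined;
}
```

Passing `console.error` as `logError` would mirror the logging described above; the callback is injected here only so the sketch can be exercised without a real logger.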