
#22605: fix(msteams): keep provider promise pending until abort to stop auto-restart loop

by OpakAlex open 2026-02-21 11:11
Labels: docs, channel: msteams, gateway, size: S
## Summary

Describe the problem and fix in 2–5 bullets:

- **Problem:** MS Teams provider logs "starting provider (port 3978)" then immediately "auto-restart attempt N/10 in Xs" in a loop; health monitor may log "restarting (reason: stopped)" and reset the attempt counter.
- **Why it matters:** The channel never stays "running"; the gateway keeps restarting the provider until "giving up after 10 restart attempts."
- **What changed:** `monitorMSTeamsProvider` returns a promise that stays pending until abort + shutdown (not on first listen); added gateway-lifecycle doc and a server-channels test that a pending startAccount does not trigger auto-restart.
- **What did NOT change (scope boundary):** No gateway or health-monitor logic changes; no config/API; other channels unchanged.

## Change Type (select all)

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [x] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [x] Integrations
- [x] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Closes #
- Related #

## User-visible / Behavior Changes

MS Teams channel no longer enters an auto-restart loop; provider stays running until the user stops the channel or the gateway exits. New doc for extension authors: gateway channel lifecycle (startAccount contract).

## Security Impact (required)

- New permissions/capabilities? **No**
- Secrets/tokens handling changed? **No**
- New/changed network calls? **No**
- Command/tool execution surface changed? **No**
- Data access scope changed? **No**
- If any `Yes`, explain risk + mitigation: N/A

## Repro + Verification

### Environment

- OS: any
- Runtime/container: Node 22+
- Integration/channel: msteams enabled with valid credentials
- Relevant config (redacted): channels.msteams.enabled, webhook.port, credentials

### Steps

1. Enable MS Teams channel and start the gateway.
2. Watch logs for "msteams" and "auto-restart".

### Expected

One "starting provider (port 3978)" and "msteams provider started on port 3978"; no repeated "auto-restart attempt N/10" unless the provider actually crashes.

### Actual (before fix)

"starting provider (port 3978)" followed immediately by "auto-restart attempt 1/10 in 5s", then cycle repeats with backoff up to 10 attempts.

## Evidence

- [x] New test: `server-channels.test.ts`: "does not auto-restart when startAccount promise stays pending" (startAccount returns never-resolving promise → one call, running stays true).
- [x] Spec: `docs/channels/gateway-lifecycle.md`: startAccount promise contract and MS Teams fix.

## Human Verification (required)

- **Verified scenarios:** Unit test passes; code review of promise lifecycle (pending until abort, then resolve after shutdown).
- **Edge cases checked:** No abort signal → promise never resolves (documented); normal stop uses abort.
- **What you did not verify:** Live gateway with real MS Teams app (no credentials in env).

## Compatibility / Migration

- Backward compatible? **Yes**
- Config/env changes? **No**
- Migration needed? **No**
- If yes, exact upgrade steps: N/A

## Failure Recovery (if this breaks)

- How to disable/revert: Disable msteams channel or revert this commit.
- Files/config to restore: None.
- Known bad symptoms: If abort listener did not fire, stopping the channel could hang; shutdown is still invoked on abort so behavior unchanged.

## Risks and Mitigations

- **Risk:** Callers that awaited the old return value for immediate use could break.
  - **Mitigation:** Gateway only awaits for lifecycle (stopChannel); it does not use the resolved value. No such callers in repo.
- **Risk:** None otherwise.
  - **Mitigation:** N/A

### Greptile Summary

Fixes the MS Teams provider auto-restart loop by making `monitorMSTeamsProvider` return a promise that stays pending while the server is running, matching the gateway's `startAccount` contract. Previously, the promise resolved immediately after `expressApp.listen()`, causing the gateway to treat the channel as "exited" and enter a restart loop.

- **`extensions/msteams/src/monitor.ts`**: Wraps the return value in a `Promise` that stays pending until the abort signal fires and shutdown completes. Without an abort signal, returns a never-resolving promise.
- **`src/gateway/server-channels.test.ts`**: Adds a test confirming that a pending `startAccount` promise does not trigger auto-restart.
- **`docs/channels/gateway-lifecycle.md`**: New documentation describing the `startAccount` promise contract for extension channel plugins.

### Confidence Score: 5/5

- This PR is safe to merge: it's a targeted, well-tested fix that correctly aligns the MS Teams provider with the gateway's startAccount promise contract.
- The change is minimal and focused: it wraps the existing return value in a pending promise (with abort-based resolution), which is the documented correct pattern. The fix is backed by a new test case. No gateway or health-monitor logic was changed. Early return paths for disabled/unconfigured channels are guarded by the gateway's own checks. The abort signal is always freshly created by the gateway, so there's no risk of pre-aborted signals. No new dependencies, no API changes, backward compatible.
- No files require special attention

Last reviewed commit: 0dc50ed
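
The promise lifecycle described above (pending while the server runs, resolved only after abort and shutdown) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `extensions/msteams/src/monitor.ts`: `runUntilAbort` and `ServerLike` are hypothetical names, and the pre-aborted branch is an extra defensive step that the PR does not claim to include.

```typescript
// Hypothetical sketch: names and shapes here are illustrative, not the PR's API.

// Minimal stand-in for whatever the provider uses to shut its server down.
interface ServerLike {
  close(): Promise<void>;
}

/**
 * Returns a promise that stays pending while the server is running and
 * resolves only after the abort signal fires and shutdown has completed.
 * Without an abort signal the promise never settles.
 */
function runUntilAbort(
  server: ServerLike,
  abortSignal?: AbortSignal,
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (!abortSignal) {
      // No signal: stay pending forever so the caller never sees an "exit".
      return;
    }
    const onAbort = (): void => {
      // Settle only once shutdown has actually finished (or failed).
      server.close().then(resolve, reject);
    };
    if (abortSignal.aborted) {
      // Assumption: handle a pre-aborted signal defensively.
      onAbort();
      return;
    }
    abortSignal.addEventListener("abort", onAbort, { once: true });
  });
}

// Usage sketch: the promise settles only after abort() and close().
async function demo(): Promise<string[]> {
  const events: string[] = [];
  const server: ServerLike = {
    close: async () => {
      events.push("closed");
    },
  };
  const controller = new AbortController();
  const running = runUntilAbort(server, controller.signal).then(() => {
    events.push("settled");
  });
  controller.abort();
  await running;
  return events;
}
```

A supervisor that treats a settled promise as "channel exited" sees nothing from this function until the caller aborts, which is the behavior the new `server-channels.test.ts` case is described as checking.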
