
#19463: fix: suppress undici TLS setSession crash instead of exiting

by abbudjoe open 2026-02-17 20:49
cli size: M
## Summary

Suppresses the known undici TLS `setSession` null pointer crash (`TypeError: Cannot read properties of null (reading 'setSession')`) instead of killing the gateway process. The error is logged as a warning and the process continues normally.

## Problem

The gateway crashes when undici attempts TLS session resumption on a socket whose internal `_handle` has already been destroyed. This is a race condition in undici's HTTP/1.1 connection pool: when a TLS socket closes, undici immediately tries to reconnect via `_resume()`, calling `tls.connect()` with a cached session, but `this._handle` is already null.

This crash:

- Kills the gateway process (restart takes 2-5s depending on orchestrator)
- Drops all in-flight requests silently
- Disconnects IM channels (Telegram, Discord, etc.)
- Can trigger crash loops under heavy HTTPS traffic

Tracked in nodejs/undici#3869. Affects Node 22+ with undici 7.x.

## Why it's safe to suppress

- The socket was already closing, so no in-flight data is lost
- undici creates a fresh connection on the next request automatically
- No application state is corrupted
- The error is purely in the connection lifecycle, not in request/response handling

## Changes

### `src/infra/unhandled-rejections.ts`

- **`isUndiciTlsSessionBug(err)`**: Narrowly detects this specific crash by checking:
  1. Error is a `TypeError`
  2. Message contains `reading 'setSession'`
  3. Stack trace contains both `TLSSocket.setSession` AND `undici`

  All three conditions are required to avoid false positives.
- **`installUncaughtExceptionHandler()`**: Centralized handler that suppresses the TLS bug with `console.warn` while preserving `process.exit(1)` for all other uncaught exceptions.
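For readers who want a concrete picture of the two functions described above, here is a minimal sketch of how the three-condition check and the handler could look. It is not the code from this PR: `handleUncaughtException` and its injected `warn`/`exit` parameters are hypothetical names added only so the decision logic can be exercised in isolation, and the log wording is an assumption.

```typescript
// Illustrative sketch only; the real code in src/infra/unhandled-rejections.ts may differ.

/** Returns true only when all three conditions from the description hold. */
export function isUndiciTlsSessionBug(err: unknown): boolean {
  // 1. Must be a TypeError (this also rejects null, undefined, and non-Error values).
  if (!(err instanceof TypeError)) return false;
  // 2. Message must contain the exact substring from the known crash.
  if (!err.message.includes("reading 'setSession'")) return false;
  // 3. Stack must mention both TLSSocket.setSession and undici.
  const stack = typeof err.stack === "string" ? err.stack : "";
  return stack.includes("TLSSocket.setSession") && stack.includes("undici");
}

// Hypothetical helper (not named in the PR): warn and exit are injected so the
// suppress-or-exit decision can be tested without touching the real process.
export function handleUncaughtException(
  err: unknown,
  warn: (msg: string) => void,
  exit: (code: number) => void,
): void {
  if (isUndiciTlsSessionBug(err)) {
    warn(`Suppressed undici TLS setSession crash: ${(err as Error).message}`);
    return; // keep the process alive
  }
  exit(1); // every other uncaught exception stays fatal
}

export function installUncaughtExceptionHandler(): void {
  process.on("uncaughtException", (err) => {
    handleUncaughtException(
      err,
      (msg) => console.warn(msg),
      (code) => {
        console.error("Uncaught exception:", err);
        process.exit(code);
      },
    );
  });
}
```

Requiring the stack to mention both `TLSSocket.setSession` and `undici` is what keeps a `setSession` failure from any other TLS client fatal.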
### Entry points (`src/index.ts`, `src/cli/run-main.ts`, `src/macos/relay.ts`)

- Replaced inline `process.on("uncaughtException", ...)` handlers with the centralized `installUncaughtExceptionHandler()`
- No behavioral change for non-TLS errors

### Tests

- **`uncaught-exception.test.ts`**: 7 unit tests for `isUndiciTlsSessionBug()` covering exact match, path variations, wrong error types, wrong messages, non-undici stacks, null inputs, and missing stacks
- **`uncaught-exception.handler.test.ts`**: 3 integration tests verifying the handler suppresses the TLS bug without exiting, still exits on unknown exceptions, and still exits on unrelated TypeErrors

All 10 new tests + 10 existing tests pass.

## AI Disclosure

- [x] AI-assisted (built with OpenClaw/Claude)
- [x] Fully tested (unit + integration, all passing)

Fixes #16206
Fixes #19168
Ref #16335

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

This PR fixes a real, well-documented Node 22 + undici 7.x race condition (`TypeError: Cannot read properties of null (reading 'setSession')`) that was killing the gateway process. The approach (narrowly detecting the specific bug by requiring all three of: `TypeError`, the exact `reading 'setSession'` message substring, and both `TLSSocket.setSession` and `undici` in the stack) is conservative and unlikely to suppress unrelated errors. The refactoring to a centralized `installUncaughtExceptionHandler()` is clean and consistent with the existing `installUnhandledRejectionHandler()` pattern.

Key observations:

- **Duplicate registration risk**: `installUncaughtExceptionHandler` uses `process.on(...)` with no idempotency guard. If called more than once (which can occur in tests or if entry points change), duplicate listeners accumulate, causing double-logging and multiple `process.exit(1)` calls on a single non-TLS exception. The existing `installUnhandledRejectionHandler` has the same design, but this PR compounds the risk by adding a second unguarded installer.
- **Test teardown gap**: `uncaught-exception.handler.test.ts` installs the handler in `beforeAll` with no corresponding `afterAll` cleanup to `removeListener`. Combined with the existing `unhandled-rejections.fatal-detection.test.ts` doing the same for the rejection handler, both listeners remain live for the entire Vitest worker. If another test file later emits `uncaughtException`, the (now restored real) `process.exit` inside the handler would actually kill the test worker.
- **Detection coverage is appropriate**: The three-condition check is tight enough to avoid false positives while covering the known crash signature. Tests cover the main paths (exact match, path variations, wrong type, wrong message, non-undici stacks, null inputs, missing stack).

<h3>Confidence Score: 4/5</h3>

- Safe to merge; the core fix is correct and narrowly scoped, with one non-critical architectural concern around idempotency of the handler installer.
- The bug being fixed is real and well-understood, the detection heuristic is appropriately conservative, and the entry-point changes are straightforward refactors. The main concern (lack of an idempotency guard on `installUncaughtExceptionHandler`) is a pre-existing pattern in the codebase (same issue exists on `installUnhandledRejectionHandler`) and does not cause a regression in the normal single-call path. The test teardown issue is minor and consistent with existing test files.
- src/infra/unhandled-rejections.ts (idempotency guard), src/infra/uncaught-exception.handler.test.ts (handler not removed in afterAll)

<sub>Last reviewed commit: af1b9f5</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
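One possible way to address the two review observations above (duplicate registration and missing test teardown) is an installer that registers at most one listener and returns an uninstall function. This is a sketch under assumptions, not code from this PR: `installUncaughtExceptionHandlerOnce` and the module-level `installed` variable are hypothetical names.

```typescript
// Hypothetical idempotent installer; not part of the PR as submitted.
type UncaughtListener = (err: unknown) => void;

// Tracks the single registered listener so repeated installs are no-ops.
let installed: UncaughtListener | null = null;

export function installUncaughtExceptionHandlerOnce(
  handler: UncaughtListener,
): () => void {
  if (installed === null) {
    installed = handler;
    process.on("uncaughtException", installed);
  }
  // A second call, even with a different handler, leaves the first one in place.

  // The returned function removes the listener; calling it again does nothing.
  return () => {
    if (installed !== null) {
      process.removeListener("uncaughtException", installed);
      installed = null;
    }
  };
}
```

In a test file, the returned function could be called from `afterAll` so the listener does not outlive the suite in a shared worker.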
