
#19823: fix(browser): stability improvements for headless Chrome

by Milofax open 2026-02-18 06:33
size: S
## Summary

- Clean stale Chrome singleton lock files before launch to prevent "already running" errors
- Evict stuck browser pages before Playwright connection (fixes hung sessions)
- Also evict stuck pages on cached (reused) connections
- Disable WebGL in headless mode to prevent GPU renderer hangs

## Context

These fixes address several browser stability issues encountered in production with headless Chrome (Playwright). Each fix targets a different failure mode:

1. **Stale singleton files**: after unclean shutdowns, Chrome leaves lock files that prevent new launches
2. **Stuck pages**: pages from previous sessions can remain in "loading" state indefinitely, blocking new connections
3. **Cached connections**: the stuck page issue also affected reused browser connections
4. **WebGL hangs**: GPU-accelerated rendering in headless mode can cause the renderer process to hang

## Test plan

- [x] Existing browser tests pass
- [ ] Verify in headless environment with multiple concurrent sessions
- [ ] Test recovery after forced Chrome kill (singleton cleanup)

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

This PR adds four defensive measures to improve headless Chrome stability with Playwright:

- **Stale singleton cleanup**: Removes Chrome's `SingletonLock`, `SingletonSocket`, and `SingletonCookie` files before launch to prevent "already running" errors after unclean shutdowns. Well-tested with three unit tests covering happy path, no-op, and selectivity.
- **Stuck page eviction**: Probes all browser pages via raw CDP `Runtime.evaluate("1")` before Playwright's `connectOverCDP()` and closes unresponsive pages. This prevents the entire connection from hanging when any page's renderer is stuck.
- **Cached connection eviction**: Applies the same stuck-page eviction to reused (cached) Playwright connections, not just fresh ones.
- **WebGL disabled in headless**: Adds `--disable-webgl` to headless Chrome args to avoid SwiftShader software rendering hangs on heavy pages.

All new CDP operations are best-effort (wrapped in `.catch(() => {})`), following existing patterns in the codebase. The `evictStuckPagesViaCdp` helper is called on every `connectBrowser` invocation (both cached and fresh paths), which adds minor latency for health-checking but provides continuous protection against stuck pages.

<h3>Confidence Score: 4/5</h3>

- This PR is safe to merge: all changes are defensive, best-effort, and follow established patterns.
- Score of 4 reflects that all code changes are logically correct, follow existing patterns (best-effort `.catch(() => {})`, `withCdpSocket`, `fetchJson`), and include good test coverage for the singleton cleanup. Deducted 1 point because the CDP eviction functions lack unit tests and the eviction-on-every-cached-call adds latency worth monitoring in production.
- Pay attention to `src/browser/pw-session.ts`: the `evictStuckPagesViaCdp` function runs on every `connectBrowser` call (including cached path) and lacks unit test coverage.

<sub>Last reviewed commit: 262be2d</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
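
For readers without the diff open, the stale singleton cleanup described in the summary could look roughly like the sketch below. It is not the code from this PR: the function name `cleanStaleSingletonFiles`, its signature, and its return value are hypothetical, and only the three file names come from the summary. It assumes the Chrome user data directory is known and that no live Chrome process is using it.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// File names as listed in the PR summary.
const SINGLETON_FILES = ["SingletonLock", "SingletonSocket", "SingletonCookie"];

// Hypothetical helper: removes leftover singleton files from a Chrome user
// data directory and returns the names it removed. Best effort: a file that
// is missing or cannot be removed is skipped rather than treated as an error.
function cleanStaleSingletonFiles(userDataDir: string): string[] {
  const removed: string[] = [];
  for (const name of SINGLETON_FILES) {
    const target = path.join(userDataDir, name);
    try {
      // unlink removes the entry itself, so it also works if the entry is a
      // symlink whose target no longer exists.
      fs.unlinkSync(target);
      removed.push(name);
    } catch {
      // Not present or not removable: ignore and continue.
    }
  }
  return removed;
}
```

Calling this immediately before launching Chrome keeps the window between cleanup and launch small. Every other file in the profile directory is left untouched, which matches the "selectivity" case mentioned in the review.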

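
The stuck page eviction relies on a probe that must answer within a bounded time. A minimal sketch of that pattern follows. It is not the PR's `evictStuckPagesViaCdp`: the `PageTarget` shape, the `probe` and `close` callbacks, and the 2000 ms default are assumptions made for illustration. In the PR, the probe is described as a raw CDP `Runtime.evaluate("1")` call.

```typescript
// Hypothetical abstraction of a browser page: how to probe it and close it.
interface PageTarget {
  id: string;
  probe: () => Promise<unknown>; // e.g. a CDP Runtime.evaluate round trip
  close: () => Promise<void>;
}

// Resolves true if the probe succeeds within timeoutMs, false if it
// rejects, throws, or does not settle in time.
async function isResponsive(
  probe: () => Promise<unknown>,
  timeoutMs: number,
): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<boolean>((resolve) => {
    timer = setTimeout(() => resolve(false), timeoutMs);
  });
  try {
    return await Promise.race([probe().then(() => true), timeout]);
  } catch {
    return false;
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Probes all pages concurrently, closes the unresponsive ones on a
// best-effort basis, and returns their ids in input order.
async function evictStuckPages(
  targets: PageTarget[],
  timeoutMs = 2000, // assumed default, not taken from the PR
): Promise<string[]> {
  const checks = await Promise.all(
    targets.map(async (target) => ({
      target,
      ok: await isResponsive(target.probe, timeoutMs),
    })),
  );
  const evicted: string[] = [];
  for (const { target, ok } of checks) {
    if (ok) continue;
    await target.close().catch(() => {}); // best effort, as in the PR
    evicted.push(target.id);
  }
  return evicted;
}
```

Probing concurrently bounds the added latency at roughly one timeout per call rather than one per page, which matters if the check runs on every cached connection as the review notes.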