#15163: fix(errors): classify connection errors as retryable failover reason
Labels: agents, stale, size: S
## Summary
- Add `connection` error patterns (`ECONNREFUSED`, `ECONNRESET`, `socket hang up`, `fetch failed`, `APIConnectionError`, etc.) to `ERROR_PATTERNS`
- Wire `isConnectionErrorMessage()` into `classifyFailoverReason()` so that connection errors are classified as `"timeout"` (retryable), which lets the failover mechanism retry on a different provider
- Return a friendly user-facing message instead of leaking raw error text to channels
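A minimal sketch of how the classification above might fit together. The function names come from this PR, but the bodies, the `FailoverReason` type, and the `CONNECTION_PATTERNS` constant are assumptions for illustration, not the actual implementation. Only the patterns named in the summary are included.

```typescript
// Hypothetical sketch: names mirror the PR summary, bodies are assumed.
type FailoverReason = "timeout" | null;

// Only the patterns explicitly named in the summary; the real
// `connection` bucket in ERROR_PATTERNS contains more entries.
const CONNECTION_PATTERNS: RegExp[] = [
  /ECONNREFUSED/i,
  /ECONNRESET/i,
  /socket hang up/i,
  /fetch failed/i,
  /APIConnectionError/i,
];

// True when the raw error text matches any connection pattern.
function isConnectionErrorMessage(raw: string): boolean {
  if (!raw) return false;
  return CONNECTION_PATTERNS.some((pattern) => pattern.test(raw));
}

// Connection errors reuse the existing "timeout" reason so the
// failover mechanism treats them as retryable on another provider.
function classifyFailoverReason(raw: string): FailoverReason {
  if (isConnectionErrorMessage(raw)) return "timeout";
  return null; // other reasons are omitted from this sketch
}
```

Reusing `"timeout"` rather than adding a new reason keeps the set of failover reason types unchanged, so callers that already handle `"timeout"` need no changes.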
## Test plan
- [x] New `isConnectionErrorMessage` test suite (3 tests: SDK message, common patterns, negative cases)
- [x] New `formatAssistantErrorText` tests for connection error messages (2 tests)
- [x] All 269 existing tests pass
- [x] `pnpm build && pnpm check` clean
Fixes #15083
> 🤖 Generated with [Claude Code](https://claude.com/claude-code)
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This change adds a new `connection` bucket to the agent error pattern matcher, exposes it via `isConnectionErrorMessage()`, and uses it in two places:
- `formatAssistantErrorText()` now rewrites connection-related failures (e.g., `ECONNREFUSED`, `ECONNRESET`, `socket hang up`, `fetch failed`, `APIConnectionError`) into a stable, user-friendly message.
- `classifyFailoverReason()` treats those connection errors as retryable transport failures by mapping them to the existing `"timeout"` failover reason.
New Vitest suites cover both the message classifier and the formatting behavior.
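The formatting side can be sketched roughly as follows. This is an illustration under stated assumptions: the function name `formatAssistantErrorTextSketch`, the `CONNECTION_RE` constant, and the friendly message wording are all hypothetical, since the PR does not quote the actual user-facing text or the real signature of `formatAssistantErrorText()`.

```typescript
// Placeholder wording: the actual user-facing message is not quoted in the PR.
const FRIENDLY_CONNECTION_MESSAGE =
  "The model provider could not be reached. Please try again shortly.";

// Assumed subset of the connection patterns listed in the review summary.
const CONNECTION_RE =
  /ECONNREFUSED|ECONNRESET|socket hang up|fetch failed|APIConnectionError/i;

// Hypothetical stand-in for formatAssistantErrorText(): connection
// failures become a stable message, so raw error text is not sent to channels.
function formatAssistantErrorTextSketch(raw: string): string {
  if (CONNECTION_RE.test(raw)) return FRIENDLY_CONNECTION_MESSAGE;
  return raw; // assumption: other errors are handled elsewhere
}
```

For example, `formatAssistantErrorTextSketch("connect ECONNRESET 10.0.0.5:443")` returns the placeholder message, while an unrelated error string passes through unchanged.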
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk.
- Changes are localized to error classification/formatting, are covered by new unit tests, and preserve existing failover reason types by mapping connection errors to the already-supported "timeout" category.
- No files require special attention.
<sub>Last reviewed commit: 932a781</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
## Most Similar PRs
- #21033: fix(failover): classify connection errors as timeout for model fail... (by zerone0x · 2026-02-19 · 85.0% similar)
- #5031: fix: add network connection error codes to failover classifier (by shayan919293 · 2026-01-30 · 82.8% similar)
- #21516: fix: classify connection errors as timeout for model failover (#20931) (by echoVic · 2026-02-20 · 81.7% similar)
- #12314: fix: treat HTTP 5xx server errors as failover-worthy (by hsssgdtc · 2026-02-09 · 79.2% similar)
- #19077: fix(agents): trigger model failover on connection-refused and netwo... (by ayanesakura · 2026-02-17 · 78.6% similar)
- #4036: fix: include cause detail in agent connection error diagnostic (by anajuliabit · 2026-01-29 · 78.5% similar)
- #15815: Fallback LLM doesn't trigger if primary model is local (by shihanqu · 2026-02-13 · 77.9% similar)
- #22359: fix(agents): classify overloaded service errors as timeout (by AIflow-Labs · 2026-02-21 · 76.9% similar)
- #10178: fix: trigger fallback when model resolution fails with unknown model (by Yida-Dev · 2026-02-06 · 75.8% similar)
- #11170: fix: classify subscription quota limit errors as rate_limit for fai... (by Yida-Dev · 2026-02-07 · 75.1% similar)