
#11821: fix(auth): trigger failover on 401 status code from expired OAuth tokens

by AnonO6 open 2026-02-08 11:03
docs agents stale
#### Summary

When an OAuth/claude-token setup token expires, the bot receives a 401 `authentication_error` and crashes instead of failing over to the configured direct API key. The failover logic for prompt errors relied solely on string matching (`classifyFailoverReason(errorText)`), which can miss 401 errors when the HTTP status code is only on the error object, not in the message text.

Closes #11674

lobster-biscuit

#### Repro Steps

1. Configure both `anthropic:claude-token` (OAuth) and `anthropic:default` (API key) auth profiles
2. Wait for the OAuth token to expire
3. Send a message: the bot crashes with a raw `authentication_error` instead of falling back to the API key

#### Root Cause

The prompt error handling path in `src/agents/pi-embedded-runner/run.ts` used `classifyFailoverReason(errorText)`, which only matches against known string patterns (e.g., "authentication", "401", "unauthorized"). If the error text doesn't contain these exact strings but the HTTP error object has `status: 401`, the failover is not triggered.

The fix: use `resolveFailoverReasonFromError(promptError)`, which checks the error object's `status`/`statusCode` property first (catching 401, 402, 403, 408, 429), then falls back to string matching. This mirrors what already works for `resolveFailoverReasonFromError` in the model fallback path.
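The status-first lookup described under Root Cause can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `src/agents/failover-error.ts`: the `Sketch`-suffixed function names, the reason labels, and the keyword patterns are hypothetical. Only the list of status codes (401, 402, 403, 408, 429) and the status-then-text order come from the description above.

```typescript
// Hypothetical sketch: names, reason labels, and patterns are assumptions,
// not the repository's actual implementation.
type FailoverReason = "auth" | "billing" | "timeout" | "rate_limit";

// Read a numeric `status` or `statusCode` off an unknown error value.
function getStatusCodeSketch(err: unknown): number | undefined {
  if (typeof err !== "object" || err === null) return undefined;
  const candidate = err as { status?: unknown; statusCode?: unknown };
  for (const value of [candidate.status, candidate.statusCode]) {
    if (typeof value === "number") return value;
  }
  return undefined;
}

// String-only classification: returns null when no keyword is present.
function classifyByTextSketch(text: string): FailoverReason | null {
  const lower = text.toLowerCase();
  if (/authentication|unauthorized|\b401\b/.test(lower)) return "auth";
  if (/rate limit|\b429\b/.test(lower)) return "rate_limit";
  return null;
}

// Check the status code first, then fall back to text matching.
function resolveReasonSketch(err: unknown): FailoverReason | null {
  switch (getStatusCodeSketch(err)) {
    case 401:
    case 403:
      return "auth";
    case 402:
      return "billing";
    case 408:
      return "timeout";
    case 429:
      return "rate_limit";
  }
  const message = err instanceof Error ? err.message : String(err);
  return classifyByTextSketch(message);
}

// An expired-token error whose message carries no recognizable keyword.
const expired = Object.assign(new Error("token expired"), { status: 401 });
console.log(classifyByTextSketch(expired.message)); // logs null
console.log(resolveReasonSketch(expired)); // logs auth
```

In this sketch the text-only classifier returns `null` for the expired-token error, while the status-aware resolver still reports an auth failure, which is the gap the PR describes.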
#### Behavior Changes

- Prompt errors with HTTP 401/403 status codes now trigger auth profile failover, even if the error message text doesn't match known patterns
- The model fallback throw path also uses the improved status-code-aware reason detection
- No change to behavior when string patterns already match (backward compatible)

#### Codebase and GitHub Search

- Reviewed `resolveFailoverReasonFromError` in `src/agents/failover-error.ts`: already handles 401/403 via status code
- Reviewed `classifyFailoverReason` in `src/agents/pi-embedded-helpers/errors.ts`: string-only matching
- Confirmed `getStatusCode` extracts `status`/`statusCode` from error objects
- No existing PRs for #11674

#### Tests

- Added 401 status test to `src/agents/failover-error.test.ts` (now 7 tests, all pass)
- Added `coerces 401 auth errors with status code even without matching message text` test
- Updated mock in `run.overflow-compaction.test.ts` to include new import
- All 82 pi-embedded-runner + failover tests pass

**Sign-Off**

- Models used: Claude (AI-assisted)
- Submitter effort: Traced error flow through prompt handling, identified string-only matching gap, implemented status-code-aware fix
- Agent notes: The `resolveFailoverReasonFromError` function already existed and handled 401 correctly; the bug was that the prompt error path wasn't using it. Minimal change: 3 lines of logic + 1 import.

Made with [Cursor](https://cursor.com)

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

<h3>Greptile Summary</h3>

This PR fixes auth-profile failover for prompt submission errors when expired OAuth tokens return an HTTP 401/403 on the error object but the message text doesn’t contain recognizable auth keywords.
It does this by using `resolveFailoverReasonFromError(promptError)` (status-code-aware) in `src/agents/pi-embedded-runner/run.ts`, falling back to the existing string-based `classifyFailoverReason(errorText)`, and adds a regression test in `src/agents/failover-error.test.ts` to cover 401 status handling. The changes integrate with the existing failover infrastructure (`FailoverError`, `resolveFailoverStatus`, auth profile rotation) by ensuring prompt-error handling uses the same error-object inspection logic already used elsewhere, so configured direct API key fallbacks can activate when OAuth tokens expire.

<h3>Confidence Score: 3/5</h3>

- This PR is close to mergeable but has a logic guard that can trigger failover on non-failover errors.
- The intended fix (status-code-aware failover classification) is sound and covered by tests, but the new `promptFailoverReason !== null` checks can evaluate true for `undefined`, which changes control flow and can cause incorrect auth rotation / FailoverError throwing. After tightening that guard, risk should be low.
- src/agents/pi-embedded-runner/run.ts

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
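The guard concern raised in the review can be illustrated in isolation. The sketch below assumes the resolved reason may be `undefined` as well as `null`; the `Reason` type and both function names are hypothetical and do not come from `run.ts`.

```typescript
// Hypothetical sketch of the guard pitfall; names are assumptions.
type Reason = "auth" | "rate_limit";

// Strict comparison against null only: `undefined` passes the check.
function strictNullGuard(reason: Reason | null | undefined): boolean {
  return reason !== null;
}

// Tightened guard: rejects both null and undefined.
function tightenedGuard(reason: Reason | null | undefined): boolean {
  return reason !== null && reason !== undefined;
}

console.log(strictNullGuard(undefined)); // logs true: would enter the failover branch
console.log(tightenedGuard(undefined)); // logs false
console.log(tightenedGuard("auth")); // logs true
```

A truthiness check would behave the same as `tightenedGuard` here, since every value of this hypothetical `Reason` type is a non-empty string.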
