#4097: fix: classify AWS SSO token errors as auth for model fallback (AI-assisted)
agents
Cluster: Model Fallback and Error Handling
## Summary
AWS SSO auth failures (common with Amazon Bedrock when SSO expires) were not recognized as auth/failover errors, which prevented model fallbacks from triggering.
This change adds three AWS SSO-specific substrings to the auth failover patterns, so these errors are classified as `auth` and trigger model fallback instead of surfacing as "Agent failed before reply".
## Changes
- Add SSO-related patterns to `ERROR_PATTERNS.auth` in `src/agents/pi-embedded-helpers/errors.ts`:
  - `"sso session token"`
  - `"error loading sso token"`
  - `"was not found or is invalid"`
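The patterns listed above could be exercised roughly as follows. This is a minimal sketch, not the code in `errors.ts`: the names `AUTH_PATTERNS_SKETCH` and `looksLikeSsoAuthError` are hypothetical, case-insensitive substring matching is an assumption about how the patterns are applied, and the sample message is illustrative rather than a verbatim AWS SDK error.

```typescript
// Hypothetical stand-in for the SSO entries added to ERROR_PATTERNS.auth.
// The real object in errors.ts may be shaped differently.
const AUTH_PATTERNS_SKETCH: readonly string[] = [
  "sso session token",
  "error loading sso token",
  "was not found or is invalid",
];

// Assumption: patterns are matched as case-insensitive substrings.
function looksLikeSsoAuthError(message: string): boolean {
  const lowered = message.toLowerCase();
  return AUTH_PATTERNS_SKETCH.some((pattern) => lowered.includes(pattern));
}

// Illustrative message only, not a verbatim AWS SDK log line.
const sample = "Error loading SSO Token: the SSO session token has expired";
console.log(looksLikeSsoAuthError(sample)); // true
console.log(looksLikeSsoAuthError("rate limit exceeded")); // false
```

Under these assumptions, a message that matches any one of the three substrings would be treated as an auth error and become eligible for fallback.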
## Testing
- ✅ `pnpm install --frozen-lockfile`
- ✅ `pnpm run lint`
- ✅ `pnpm run test`
## AI Disclosure
This PR was AI-assisted. The change is minimal, reviewed, and tested locally.
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
<h3>Greptile Summary</h3>
This PR expands the auth/failover error classification by adding AWS SSO-related substrings to `ERROR_PATTERNS.auth` in `src/agents/pi-embedded-helpers/errors.ts`, so common Bedrock/AWS SDK SSO credential resolution failures are treated as `auth` and trigger model fallback rather than surfacing as a generic agent failure.
The change fits into the existing error-handling pipeline by influencing `isAuthErrorMessage()` and therefore `classifyFailoverReason()`, which drives whether the system attempts model failover on certain categories of errors.
<h3>Confidence Score: 4/5</h3>
- This PR is likely safe to merge; it is a small, low-risk change to string matching that affects failover classification.
- Only adds three string patterns to an existing auth error matcher; main risk is accidental overmatching causing unnecessary failover, not crashes or security regressions.
- Files needing attention: `src/agents/pi-embedded-helpers/errors.ts` (pattern specificity and false positives)
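The overmatching risk noted above can be illustrated with a small sketch. It assumes plain case-insensitive substring matching. The helper name `matchesBroadPattern` and both messages are made up for illustration and are not taken from `errors.ts` or from real AWS output.

```typescript
// The broadest of the three added substrings. It contains no SSO-specific word.
const BROAD_PATTERN = "was not found or is invalid";

// Hypothetical helper; assumes case-insensitive substring matching.
function matchesBroadPattern(message: string): boolean {
  return message.toLowerCase().includes(BROAD_PATTERN);
}

// Made-up messages for illustration only.
const ssoFailure = "Token for profile dev was not found or is invalid";
const unrelated = "The requested file was not found or is invalid JSON";

console.log(matchesBroadPattern(ssoFailure)); // true (the intended match)
console.log(matchesBroadPattern(unrelated)); // true (a potential false positive)
```

If such false positives turned out to matter in practice, one option would be to require an additional SSO-related keyword alongside the broad substring. That is a design suggestion, not something this PR does.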
<!-- /greptile_comment -->
## Most Similar PRs
- #11821: fix(auth): trigger failover on 401 status code from expired OAuth t... (by AnonO6 · 2026-02-08 · 79.3%)
- #10178: fix: trigger fallback when model resolution fails with unknown model (by Yida-Dev · 2026-02-06 · 78.9%)
- #17531: fix(auth): sync Codex CLI credentials into auth profile store and c... (by sauerdaniel · 2026-02-15 · 77.6%)
- #15815: Fallback LLM doesn't trigger if primary model is local (by shihanqu · 2026-02-13 · 77.5%)
- #7229: fix: add network error resilience to agentic loop failover (by ai-fanatic · 2026-02-02 · 77.1%)
- #6464: fix: trigger model failover on malformed tool-call JSON (by ai-fanatic · 2026-02-01 · 76.9%)
- #12314: fix: treat HTTP 5xx server errors as failover-worthy (by hsssgdtc · 2026-02-09 · 76.7%)
- #21033: fix(failover): classify connection errors as timeout for model fail... (by zerone0x · 2026-02-19 · 76.6%)
- #21491: fix: classify Google 503 UNAVAILABLE as transient failover [AI-assi... (by ZPTDclaw · 2026-02-20 · 76.2%)
- #21152: fix(agents): throw FailoverError for unknown model so fallback chai... (by Mellowambience · 2026-02-19 · 75.7%)