#21561: runner: add usage preflight guard for near-limit requests
agents
size: L
## Summary
- Problem: requests sent when usage is near the quota limit can fail late, which creates poor UX.
- Fix: add a fail-open usage preflight guard (warn/block thresholds) before prompt send.
- Runner behavior: blocked preflight returns user-facing `payloads[].isError=true` and skips overflow-compaction fallback.
- Scope boundary: no surrogate sanitization changes in this PR; no public contract expansion for `EmbeddedPiRunMeta.error.kind`.
- AI-assisted disclosure: AI-assisted implementation, then manual review and manual test verification.
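The runner behavior described above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the code in `run.ts`: only the `UsagePreflightError` name and the `payloads[].isError` field come from this PR; `RunPayload`, `RunResult`, `runWithPreflight`, `isOverflow`, and `compactAndRetry` are hypothetical names chosen for the example, and the real runner is presumably asynchronous.

```typescript
// Error type named in this PR; the constructor shape here is an assumption.
class UsagePreflightError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "UsagePreflightError";
    // Keeps `instanceof` working when compiled to older JS targets.
    Object.setPrototypeOf(this, UsagePreflightError.prototype);
  }
}

// Hypothetical result shapes; only `payloads[].isError` is taken from the PR text.
interface RunPayload {
  text: string;
  isError?: boolean;
}
interface RunResult {
  payloads: RunPayload[];
}

// Hypothetical wrapper around one attempt plus the overflow-compaction fallback.
function runWithPreflight(
  attempt: () => RunResult,
  isOverflow: (err: unknown) => boolean,
  compactAndRetry: () => RunResult,
): RunResult {
  try {
    return attempt();
  } catch (err) {
    if (err instanceof UsagePreflightError) {
      // Blocked preflight: return a user-facing error and skip compaction entirely.
      return { payloads: [{ text: err.message, isError: true }] };
    }
    if (isOverflow(err)) {
      // Only context-overflow failures enter the compaction retry path.
      return compactAndRetry();
    }
    throw err;
  }
}
```

The ordering is the point of the sketch: the preflight check is tested before the overflow check, so a blocked request can never be mistaken for an overflow and retried.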
## Quick Review (3-5 min)
1. `src/agents/pi-embedded-runner/usage-preflight.ts`: threshold logic + fail-open cache/timeout behavior.
2. `src/agents/pi-embedded-runner/run/attempt.ts`: where preflight is invoked.
3. `src/agents/pi-embedded-runner/run.ts`: early return for `UsagePreflightError` as user-facing error payload.
4. Tests: warning-only, hard-block, fail-open, and no compaction fallback on preflight block.
## Change Type (select all)
- [x] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [x] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Closes #
- Related #21557
## User-visible / Behavior Changes
- Low remaining quota can emit warning logs.
- Critically low remaining quota can proactively block a request before the provider call.
- If usage telemetry is unavailable/unreliable, guard fails open (no proactive block).
- Blocked preflight returns explicit error payload and does not enter overflow-compaction retry path.
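As a rough illustration of the warn/block/fail-open decision listed above, the sketch below evaluates a set of usage windows. It is not the logic in `usage-preflight.ts`: the `UsageWindow` shape, the `evaluatePreflight` name, and the threshold values (10% and 2% remaining) are assumptions made for the example.

```typescript
type PreflightDecision = "allow" | "warn" | "block";

// Hypothetical usage window: fraction of quota still available, from 0 to 1.
interface UsageWindow {
  remainingFraction: number;
}

// Hypothetical thresholds; the values are illustrative only.
interface Thresholds {
  warnBelow: number;
  blockBelow: number;
}
const EXAMPLE_THRESHOLDS: Thresholds = { warnBelow: 0.1, blockBelow: 0.02 };

function isReliable(w: UsageWindow): boolean {
  const r = w.remainingFraction;
  return Number.isFinite(r) && r >= 0 && r <= 1;
}

function evaluatePreflight(
  windows: UsageWindow[] | undefined,
  thresholds: Thresholds = EXAMPLE_THRESHOLDS,
): PreflightDecision {
  // Fail open: missing or implausible telemetry never blocks a request.
  if (!windows || windows.length === 0 || !windows.every(isReliable)) {
    return "allow";
  }
  // The tightest window decides the outcome.
  const lowest = Math.min(...windows.map((w) => w.remainingFraction));
  if (lowest <= thresholds.blockBelow) return "block";
  if (lowest <= thresholds.warnBelow) return "warn";
  return "allow";
}
```

In this sketch a single unreliable window disables the guard for the whole request, which errs toward letting requests through rather than blocking on bad data.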
## Security Impact (required)
- New permissions/capabilities? (`Yes/No`) No
- Secrets/tokens handling changed? (`Yes/No`) No
- New/changed network calls? (`Yes/No`) Yes
- Command/tool execution surface changed? (`Yes/No`) No
- Data access scope changed? (`Yes/No`) No
- If any `Yes`, explain risk + mitigation:
  - Risk: pre-send usage fetch adds dependency/latency.
  - Mitigation: short timeout, small TTL cache, and fail-open semantics.
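One way to combine the three mitigations (timeout, TTL cache, fail-open) is sketched below. This is an assumption-laden illustration, not the PR's implementation: `createCachedUsageLookup`, `UsageSnapshot`, `UsageFetcher`, and the default `timeoutMs`/`ttlMs` values are all hypothetical.

```typescript
// Hypothetical snapshot returned by a usage endpoint.
interface UsageSnapshot {
  remainingFraction: number;
}
type UsageFetcher = () => Promise<UsageSnapshot>;

function createCachedUsageLookup(
  fetchUsage: UsageFetcher,
  opts = { timeoutMs: 1500, ttlMs: 30000 }, // illustrative defaults
  now: () => number = Date.now,
) {
  let cached: { value: UsageSnapshot; expiresAt: number } | undefined;

  // Resolves to a snapshot, or undefined when usage data is unavailable.
  return async function lookup(): Promise<UsageSnapshot | undefined> {
    // Serve from the cache while the entry is still fresh.
    if (cached && cached.expiresAt > now()) return cached.value;

    let timer: ReturnType<typeof setTimeout> | undefined;
    try {
      const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(
          () => reject(new Error("usage lookup timed out")),
          opts.timeoutMs,
        );
      });
      // Whichever settles first wins: the fetch or the timeout.
      const value = await Promise.race([fetchUsage(), timeout]);
      cached = { value, expiresAt: now() + opts.ttlMs };
      return value;
    } catch {
      // Fail open: errors and timeouts yield "no data" and are not cached.
      return undefined;
    } finally {
      clearTimeout(timer);
    }
  };
}
```

A caller would treat `undefined` as "no usage data" and skip the guard, so the added network call can slow a request by at most `timeoutMs` and cannot fail it.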
## Repro + Verification
### Environment
- OS: macOS (Apple Silicon)
- Runtime/container: Node 22 / pnpm 10
- Integration/channel: N/A (runner-level tests with usage mocks)
### Steps
1. Mock remaining quota windows at warning and critical thresholds.
2. Run preflight evaluation and runner flow.
3. Verify warn-only behavior, hard-block behavior, and fail-open when usage API data is unavailable.
4. Verify blocked preflight does not continue into overflow-compaction fallback.
### Expected
- Warn near low quota.
- Block only at critical conditions.
- Fail open on telemetry outage.
- Return clear user-facing blocked payload.
### Actual
- Matches expected.
## Evidence
- [x] Failing test/log before + passing after
- [x] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)
## Human Verification (required)
- `corepack pnpm vitest run src/agents/pi-embedded-runner/usage-preflight.test.ts src/agents/pi-embedded-runner/run.overflow-compaction.test.ts`
- `corepack pnpm oxlint --type-aware` on touched files
- Manually verified after the follow-up patch that preserves the existing error-context formatting behavior in `run.ts`.
- Not verified here: full repo `corepack pnpm tsgo`, due to pre-existing unrelated TS2742 baseline failures.
## Compatibility / Migration
- Backward compatible? (`Yes/No`) Yes
- Config/env changes? (`Yes/No`) No
- Migration needed? (`Yes/No`) No
- If yes, exact upgrade steps:
## Failure Recovery (if this breaks)
- How to disable/revert quickly: revert commits `61eac88536b9cf976d08d780419f6d59d36d0027` and `5c759c117d9852dbe1e5cd07ab826b8d635d4af2`.
- Files/config to restore:
  - `src/agents/pi-embedded-runner/usage-preflight.ts`
  - `src/agents/pi-embedded-runner/run/attempt.ts`
  - `src/agents/pi-embedded-runner/run.ts`
## Risks and Mitigations
- Risk: threshold heuristics may be too strict or too lenient per provider/account.
- Mitigation: bounded heuristics, fail-open behavior, focused boundary tests, and linked policy issue #21557 for maintainer tuning.