#21561: runner: add usage preflight guard for near-limit requests

by VontaJamal open 2026-02-20 03:09

agents size: L

## Summary

- Problem: near-quota requests can fail late and create poor UX.
- Fix: add a fail-open usage preflight guard (warn/block thresholds) before prompt send.
- Runner behavior: blocked preflight returns user-facing `payloads[].isError=true` and skips overflow-compaction fallback.
- Scope boundary: no surrogate sanitization changes in this PR; no public contract expansion for `EmbeddedPiRunMeta.error.kind`.
- AI-assisted disclosure: AI-assisted implementation, then manual review and manual test verification.

## Quick Review (3-5 min)

1. `src/agents/pi-embedded-runner/usage-preflight.ts`: threshold logic + fail-open cache/timeout behavior.
2. `src/agents/pi-embedded-runner/run/attempt.ts`: where preflight is invoked.
3. `src/agents/pi-embedded-runner/run.ts`: early return for `UsagePreflightError` as user-facing error payload.
4. Tests: warning-only, hard-block, fail-open, and no compaction fallback on preflight block.

## Change Type (select all)

- [x] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra

## Scope (select all touched areas)

- [x] Gateway / orchestration
- [ ] Skills / tool execution
- [x] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra

## Linked Issue/PR

- Closes #
- Related #21557

## User-visible / Behavior Changes

- Low remaining quota can emit warning logs.
- Critically low remaining quota can proactively block a request before the provider call.
- If usage telemetry is unavailable/unreliable, the guard fails open (no proactive block).
- Blocked preflight returns an explicit error payload and does not enter the overflow-compaction retry path.

## Security Impact (required)

- New permissions/capabilities? (`Yes/No`) No
- Secrets/tokens handling changed? (`Yes/No`) No
- New/changed network calls? (`Yes/No`) Yes
- Command/tool execution surface changed? (`Yes/No`) No
- Data access scope changed? (`Yes/No`) No
- If any `Yes`, explain risk + mitigation:
  - Risk: pre-send usage fetch adds dependency/latency.
  - Mitigation: short timeout, small TTL cache, and fail-open semantics.

## Repro + Verification

### Environment

- OS: macOS (Apple Silicon)
- Runtime/container: Node 22 / pnpm 10
- Integration/channel: N/A (runner-level tests with usage mocks)

### Steps

1. Mock remaining quota windows at warning and critical thresholds.
2. Run preflight evaluation and runner flow.
3. Verify warn-only behavior, hard-block behavior, and fail-open when usage API data is unavailable.
4. Verify blocked preflight does not continue into overflow-compaction fallback.

### Expected

- Warn near low quota.
- Block only at critical conditions.
- Fail open on telemetry outage.
- Return clear user-facing blocked payload.

### Actual

- Matches expected.

## Evidence

- [x] Failing test/log before + passing after
- [x] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)

## Human Verification (required)

- `corepack pnpm vitest run src/agents/pi-embedded-runner/usage-preflight.test.ts src/agents/pi-embedded-runner/run.overflow-compaction.test.ts`
- `corepack pnpm oxlint --type-aware` on touched files
- Manual verification after follow-up patch to keep existing error-context formatting behavior in `run.ts`.
- Not verified here: full repo `corepack pnpm tsgo`, due to pre-existing unrelated TS2742 baseline failures.

## Compatibility / Migration

- Backward compatible? (`Yes/No`) Yes
- Config/env changes? (`Yes/No`) No
- Migration needed? (`Yes/No`) No
- If yes, exact upgrade steps:

## Failure Recovery (if this breaks)

- How to disable/revert quickly: revert commits `61eac88536b9cf976d08d780419f6d59d36d0027` and `5c759c117d9852dbe1e5cd07ab826b8d635d4af2`.
- Files/config to restore:
  - `src/agents/pi-embedded-runner/usage-preflight.ts`
  - `src/agents/pi-embedded-runner/run/attempt.ts`
  - `src/agents/pi-embedded-runner/run.ts`

## Risks and Mitigations

- Risk: threshold heuristics may be too strict or too lenient per provider/account.
  - Mitigation: bounded heuristics, fail-open behavior, focused boundary tests, and linked policy issue #21557 for maintainer tuning.
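
## Illustrative Sketch (not the PR's code)

For reviewers who want a mental model before opening the diff, the warn/block/fail-open behavior described in the Summary can be sketched roughly as below. Only `UsagePreflightError` and `payloads[].isError` come from this PR description; the type shapes, the threshold values, the `text` payload field, and the names `evaluateUsagePreflight`, `runUsagePreflight`, `toBlockedResult`, and `fetchUsage` are assumptions made for illustration. The TTL cache is omitted.

```typescript
// Hypothetical shapes; the PR's real types and thresholds are not shown here.
type UsageWindow = { label: string; remainingPercent: number };

type PreflightDecision =
  | { action: "allow" }
  | { action: "warn"; message: string }
  | { action: "block"; message: string };

// Assumed threshold values, for illustration only.
const WARN_REMAINING_PERCENT = 10;
const BLOCK_REMAINING_PERCENT = 1;

class UsagePreflightError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "UsagePreflightError";
  }
}

// Pure threshold check: missing or malformed telemetry means "allow" (fail open).
function evaluateUsagePreflight(windows: UsageWindow[] | undefined): PreflightDecision {
  const valid = (windows ?? []).filter((w) => Number.isFinite(w.remainingPercent));
  if (valid.length === 0) {
    return { action: "allow" };
  }
  // The window with the least remaining quota decides the outcome.
  const tightest = valid.reduce((a, b) => (b.remainingPercent < a.remainingPercent ? b : a));
  const detail = `${tightest.label} window has ${tightest.remainingPercent}% remaining`;
  if (tightest.remainingPercent <= BLOCK_REMAINING_PERCENT) {
    return { action: "block", message: `Request blocked before send: ${detail}` };
  }
  if (tightest.remainingPercent <= WARN_REMAINING_PERCENT) {
    return { action: "warn", message: `Usage is near its limit: ${detail}` };
  }
  return { action: "allow" };
}

// Fetch usage with a short timeout. A rejection or a timeout leaves `windows`
// undefined, so the guard fails open instead of blocking the request.
async function runUsagePreflight(
  fetchUsage: () => Promise<UsageWindow[]>,
  timeoutMs = 500,
): Promise<void> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<undefined>((resolve) => {
    timer = setTimeout(() => resolve(undefined), timeoutMs);
  });
  let windows: UsageWindow[] | undefined;
  try {
    windows = await Promise.race([fetchUsage(), timeout]);
  } catch {
    windows = undefined; // telemetry outage: do not block
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
  const decision = evaluateUsagePreflight(windows);
  if (decision.action === "block") throw new UsagePreflightError(decision.message);
  if (decision.action === "warn") console.warn(decision.message);
}

// Runner side: a blocked preflight becomes a user-facing error payload
// rather than entering any retry path. The `text` field is an assumption.
function toBlockedResult(err: UsagePreflightError) {
  return { payloads: [{ text: err.message, isError: true }] };
}
```

Keeping the threshold check pure and separate from the fetch makes the boundary cases (warn-only, hard-block, fail-open) testable without network mocks; the runner would catch `UsagePreflightError` before its overflow-compaction handling and return the payload from `toBlockedResult`.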