#22610: feat(agent): auto-refresh tool schema on unknown tool invocation
channel: msteams
agents
size: M
## Summary
Describe the problem and fix in 2–5 bullets:
- Problem: embedded runner could fail hard on `unknown tool` when tool schema was stale/out-of-sync.
- Why it matters: user-visible agent turns can fail even when the tool is actually available after schema refresh.
- What changed:
  - add unknown-tool detection in the embedded run loop and perform **exactly one** retry with `refreshToolSchema=true`;
  - plumb the refresh flag through run → attempt → tool construction → plugin resolution;
  - add schema before/after diagnostic logging (fingerprint + counts);
  - harden plugin refresh behavior to bypass the cache read and update the cache with the refreshed registry;
  - fix Windows-only path classification in the MSTeams helper (the root-relative path `\\tmp\\file.txt` is now treated as local), with a regression test.
- What did NOT change (scope boundary):
  - no retry for non-unknown-tool errors;
  - no infinite retry loop (single retry only);
  - no new permissions/capabilities/tools exposed.
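
The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in `run.ts`: only the `refreshToolSchema` flag name comes from this PR, while `Attempt`, `AttemptResult`, `isUnknownToolError`, and `runWithSchemaRefresh` are hypothetical names, and the message-based detection rule is an assumption.

```typescript
// Hypothetical types; only the `refreshToolSchema` flag name comes from this PR.
type AttemptOptions = { refreshToolSchema: boolean };
type AttemptResult =
  | { ok: true; output: string }
  | { ok: false; error: Error };
type Attempt = (opts: AttemptOptions) => Promise<AttemptResult>;

// Assumed detection rule: the error message mentions "unknown tool".
function isUnknownToolError(error: Error): boolean {
  return /unknown tool/i.test(error.message);
}

async function runWithSchemaRefresh(attempt: Attempt): Promise<AttemptResult> {
  const first = await attempt({ refreshToolSchema: false });
  // Success, or an error that is not an unknown-tool error: no retry.
  if (first.ok || !isUnknownToolError(first.error)) {
    return first;
  }
  // Exactly one retry with a refreshed schema. Its result is returned as-is,
  // even if it fails again, so the loop cannot repeat.
  return attempt({ refreshToolSchema: true });
}
```

Returning the second attempt's result directly, rather than looping with a counter, is one simple way to make the single-retry cap structural.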
## Change Type (select all)
- [x] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [x] Gateway / orchestration
- [x] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [x] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Closes #
- Related #22610
## User-visible / Behavior Changes
- Agent can self-heal a stale tool schema on unknown-tool failure with one refresh+retry.
- MSTeams local media fallback now correctly handles Windows root-relative paths.
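
For illustration, the Windows path rule might look something like the sketch below. `isLocalPath` is a hypothetical name and the exact set of checks is an assumption; the real logic lives in `extensions/msteams/src/media-helpers.ts`. The ordering (URL and data-URL checks before any path rule) follows what this PR describes.

```typescript
// Hypothetical helper, not the actual MSTeams implementation.
function isLocalPath(value: string): boolean {
  const s = value.trim();
  // URL and data-URL checks run first so they are never classified as local.
  if (/^(?:https?|data):/i.test(s)) return false;
  // POSIX absolute paths.
  if (s.startsWith("/")) return true;
  // Windows drive-letter paths such as C:\Users\me\a.png or C:/Users/me/a.png.
  if (/^[a-zA-Z]:[\\\/]/.test(s)) return true;
  // Windows root-relative paths such as \tmp\file.txt (no drive letter).
  if (s.startsWith("\\")) return true;
  return false;
}
```

Note that the string literal `"\\tmp\\file.txt"` in TypeScript denotes the path `\tmp\file.txt`, which is the root-relative case the regression test covers.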
## Security Impact (required)
- New permissions/capabilities? (`Yes/No`) No
- Secrets/tokens handling changed? (`Yes/No`) No
- New/changed network calls? (`Yes/No`) No
- Command/tool execution surface changed? (`Yes/No`) No (same tools, refresh path only)
- Data access scope changed? (`Yes/No`) No
- If any `Yes`, explain risk + mitigation:
## Repro + Verification
### Environment
- OS: macOS local + GitHub Actions Linux/Windows
- Runtime/container: Node 22.x, pnpm 10.23.0
- Model/provider: N/A
- Integration/channel (if any): MSTeams helper tests
- Relevant config (redacted): default test configs (`vitest.e2e.config.ts`, `vitest.unit.config.ts`)
### Steps
1. Trigger unknown-tool scenario in embedded runner.
2. Observe one-time schema refresh retry.
3. Validate plugin cache refresh behavior and MSTeams Windows local-path handling.
### Expected
- Unknown-tool failure triggers one refresh+retry only.
- Refreshed schema is visible in diagnostics and reused on subsequent turns.
- Windows root-relative path is treated as local in MSTeams media handling.
### Actual
- Matches expected.
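
As a rough idea of what "fingerprint + counts" diagnostics could look like, here is a small sketch. All names (`ToolSchema`, `snapshotSchema`, `describeRefresh`) and the log format are hypothetical, and the FNV-1a hash over UTF-16 code units is a stand-in chosen to keep the example dependency-free; the PR does not state which hash it uses.

```typescript
// Hypothetical shapes; the real tool schema type in the repo may differ.
type ToolSchema = { name: string; parameters: unknown };
type SchemaSnapshot = { fingerprint: string; toolCount: number };

// 32-bit FNV-1a over UTF-16 code units, used here only as a cheap stand-in hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

function snapshotSchema(tools: ToolSchema[]): SchemaSnapshot {
  // Sort by name so that registration order does not change the fingerprint.
  // JSON.stringify is still sensitive to key order inside `parameters`.
  const canonical = [...tools]
    .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0))
    .map((t) => `${t.name}:${JSON.stringify(t.parameters)}`)
    .join("\n");
  return { fingerprint: fnv1a(canonical), toolCount: tools.length };
}

function describeRefresh(before: SchemaSnapshot, after: SchemaSnapshot): string {
  const changed = before.fingerprint !== after.fingerprint;
  return (
    `tool schema refresh: before=${before.fingerprint} (${before.toolCount} tools) ` +
    `after=${after.fingerprint} (${after.toolCount} tools) changed=${changed}`
  );
}
```

Logging a short fingerprint alongside the tool count makes it possible to tell "same tools, different parameters" apart from "tool added or removed" without dumping the full schema.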
## Evidence
Attach at least one:
- [x] Failing test/log before + passing after
- [x] Trace/log snippets
- [ ] Screenshot/recording
- [ ] Perf numbers (if relevant)
Evidence links/examples:
- Windows failure before: run `22256120199`, job `64387264957` (msteams assertion failure)
- Windows pass after fix: run `22256400090`, job `64387879699`
- Local verification:
  - `pnpm vitest run --config vitest.e2e.config.ts src/agents/pi-embedded-runner/run.unknown-tool-schema-refresh.e2e.test.ts`
  - `pnpm vitest run --config vitest.unit.config.ts src/plugins/tools.optional.test.ts`
  - `pnpm vitest run extensions/msteams/src/media-helpers.test.ts extensions/msteams/src/messenger.test.ts`
  - `pnpm check`
## Human Verification (required)
What you personally verified (not just CI), and how:
- Verified scenarios:
  - unknown-tool detection and single retry path;
  - schema refresh plumbing through plugin loader;
  - MSTeams Windows root-relative local-path classification.
- Edge cases checked:
  - non-unknown prompt errors do not trigger retry fallback;
  - retry is capped at one;
  - cache refresh updates subsequent reads.
- What you did **not** verify:
  - end-to-end behavior behind the unrelated upstream baseline failures in the `check`/`check-docs` jobs.
## Compatibility / Migration
- Backward compatible? (`Yes/No`) Yes
- Config/env changes? (`Yes/No`) No
- Migration needed? (`Yes/No`) No
- If yes, exact upgrade steps:
## Failure Recovery (if this breaks)
- How to disable/revert this change quickly:
  - revert commits in this PR (feature + hardening + msteams fix).
- Files/config to restore:
  - `src/agents/pi-embedded-runner/run.ts`
  - `src/plugins/loader.ts`
  - `src/plugins/tools.ts`
  - `extensions/msteams/src/media-helpers.ts`
- Known bad symptoms reviewers should watch for:
  - repeated retries on non-unknown errors;
  - stale schema not updating after refresh;
  - MSTeams local media treated as remote on Windows.
## Risks and Mitigations
List only real risks for this PR. Add/remove entries as needed. If none, write `None`.
- Risk: false-positive unknown-tool detection could trigger unnecessary retry.
  - Mitigation: strict detection + prompt-error precedence + single retry guard.
- Risk: refresh path could bypass cache without persisting new registry.
  - Mitigation: explicit refresh mode with cache update + loader tests.
- Risk: Windows path rule broadening could misclassify some strings.
  - Mitigation: URL/data URL checks remain first; regression tests added.
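
The cache-related mitigation can be pictured with a small sketch: a refresh skips the cache read but still writes the reloaded registry back, so later non-refresh reads see the new data. `Registry`, `LoadOptions`, and `createRegistryCache` are hypothetical names for illustration; the actual loader in `src/plugins/loader.ts` may be structured differently.

```typescript
// Hypothetical shapes, not the plugin loader's real types.
type Registry = { tools: string[] };
type LoadOptions = { refresh?: boolean };

function createRegistryCache(load: () => Registry) {
  let cached: Registry | undefined;
  return {
    get(opts: LoadOptions = {}): Registry {
      // Normal mode: serve the cached registry when one exists.
      if (!opts.refresh && cached !== undefined) {
        return cached;
      }
      // Refresh mode (or cold cache): skip the cache read, reload,
      // then persist the result so subsequent reads reuse it.
      const fresh = load();
      cached = fresh;
      return fresh;
    },
  };
}
```

Without the write-back step, each turn after a refresh would fall back to the stale entry, which is the symptom listed under Failure Recovery.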