#23542: fix/hf inference
Cluster: Model Configuration Fixes
## Summary
- **Problem:** Using a model ref with a `huggingface/` prefix caused errors: the HF Inference client was given the full prefixed string, even though the API expects only a Hub-style model id plus optional tags.
- **Why it matters:** Users choosing Hugging Face inference hit failures or wrong API behavior; the client must receive only the Hub id and tags (e.g. `:cheapest`, `:fastest`).
- **What changed:** Added `modelIdForHfInferenceClient()` to strip a leading `huggingface/` before calling the HF client; Pi embedded runner uses it when resolving Hugging Face models so the resolved Model `id` never includes the prefix. Removed duplicate `buildHuggingfaceProvider` in `models-config.providers.ts`. Added tests for the new helper.
- **What did NOT change (scope boundary):** Other providers, auth, onboarding, and tag parsing (`:cheapest`, `:fastest`, `:provider`, etc.) are unchanged; only the string sent to the HF inference client is normalized.
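The helper's intended behavior, as described above, can be sketched as follows. This is a minimal illustration, not the actual implementation in `src/agents/huggingface-models.ts`:

```typescript
// Hedged sketch of the helper described in the summary. Assumption: the
// prefix is matched case-insensitively (per the review notes) and routing
// tags such as ":cheapest" / ":fastest" ride along untouched.
function modelIdForHfInferenceClient(ref: string): string {
  // Strip only a leading "huggingface/" so the HF Inference client
  // receives a Hub-style id like "org/model" plus any tags.
  return ref.replace(/^huggingface\//i, "");
}

console.log(modelIdForHfInferenceClient("huggingface/org/model:cheapest")); // "org/model:cheapest"
console.log(modelIdForHfInferenceClient("org/model")); // "org/model" (unchanged)
```

A ref written without the prefix passes through unchanged, which matches the scope boundary above: only the string handed to the HF inference client is normalized.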
## Change Type (select all)
- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [ ] Gateway / orchestration
- [x] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Closes https://github.com/openclaw/openclaw/issues/23481
- Related: https://github.com/openclaw/openclaw/pull/23475
## User-visible / Behavior Changes
- **Before:** Prefixed refs (e.g. `huggingface/org/model`) could error or send the prefix to the HF API.
- **After:** Same refs work; the client receives only the Hub-style id and tags (e.g. `org/model:cheapest`). No change to non-HF providers or to how users write refs in config/CLI.
## Security Impact (required)
- New permissions/capabilities? **No**
- Secrets/tokens handling changed? **No**
- New/changed network calls? **No**
- Command/tool execution surface changed? **No**
- Data access scope changed? **No**
- If any Yes, explain risk + mitigation: **N/A**
## Repro + Verification
### Environment
- OS: (e.g. macOS 15.4 / Ubuntu 24.04 / Windows 11)
- Runtime/container: Node 22+ / pnpm dev
- Model/provider: Hugging Face Inference (HF_TOKEN or HUGGINGFACE_HUB_TOKEN)
- Integration/channel (if any): N/A (inference/model resolution)
- Relevant config (redacted): `models.providers.huggingface` and/or onboarding with Hugging Face selected
### Steps
1. Set HF token (e.g. `HF_TOKEN` or `HUGGINGFACE_HUB_TOKEN`).
2. Use a model ref that includes the `huggingface/` prefix (e.g. `huggingface/mistralai/Mistral-7B-Instruct-v0.3` or a path that previously sent the prefix).
3. Run an agent/inference flow that resolves the model and calls the HF inference client.
### Expected
- No error from the prefix; HF API is called with only the Hub-style model id (and tags); inference works.
### Actual (before fix)
- Error when using the prefix, or the HF client received the full prefixed string and the API failed or behaved incorrectly.
## Evidence
Attach at least one:
- [ ] Failing test/log before + passing after (unit tests for `modelIdForHfInferenceClient`; Pi embedded runner model resolution with HF)
- [ ] Trace/log snippets
- [x] Screenshot/recording (to be added)
- [ ] Perf numbers (if relevant)
## Human Verification (required)
- **Verified scenarios:** Unit tests for `modelIdForHfInferenceClient` (strip prefix, preserve tags, no change when no prefix). Model resolution path in Pi embedded runner uses `normalizedModelId` for Hugging Face so resolved Model `id` has no prefix.
- **Edge cases checked:** Ref with prefix + tags; ref without prefix; duplicate provider definition removed so build/lint pass.
- **What you did not verify:** A currently failing test (unrelated to this fix); I will keep it tracked going forward.
## Compatibility / Migration
- Backward compatible? **Yes**
- Config/env changes? **No**
- Migration needed? **No**
- If yes, exact upgrade steps: N/A
## Failure Recovery (if this breaks)
- **How to disable/revert:** Revert this PR; prefixed refs may fail again and the client may receive the prefix.
- **Files/config to restore:** `src/agents/huggingface-models.ts`, `src/agents/pi-embedded-runner/model.ts`, `src/agents/models-config.providers.ts`, `src/agents/huggingface-models.test.ts`.
- **Known bad symptoms:** HF inference errors or “invalid model id” from the API; regression in model resolution when provider is Hugging Face.
## Risks and Mitigations
- **Risk:** A caller elsewhere might rely on the resolved Model `id` still containing `huggingface/` for display or routing.
**Mitigation:** Normalization is applied only for the Hugging Face provider in the Pi embedded runner resolution path; display/CLI can still show the user’s original ref; HF API contract requires no prefix.
- **Risk:** No other risks identified for this change.
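The provider scoping described in the mitigation can be sketched as below. The names (`Provider`, `ResolvedModel`, `resolveModelId`) are illustrative, not the project's actual `resolveModel()` types or signature:

```typescript
// Hedged sketch: normalization applies only when the provider is
// Hugging Face; every other provider's ref passes through verbatim.
type Provider = "huggingface" | "openai" | string;

interface ResolvedModel {
  provider: Provider;
  id: string; // Hub-style id for Hugging Face, verbatim ref otherwise
}

function resolveModelId(provider: Provider, ref: string): ResolvedModel {
  const id =
    provider === "huggingface"
      ? ref.replace(/^huggingface\//i, "") // strip the prefix for HF only
      : ref; // non-HF providers are untouched
  return { provider, id };
}
```

Because the conditional keys on the provider, a caller that displays the user's original ref is unaffected; only the resolved `id` handed to the HF client loses the prefix.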
<h3>Greptile Summary</h3>
This PR fixes a bug where Hugging Face Inference API calls were failing when model references included the `huggingface/` prefix. The fix adds a `modelIdForHfInferenceClient()` helper that strips the prefix before sending model IDs to the HF client, while preserving routing tags like `:cheapest` and `:fastest`.
Key changes:
- Added `modelIdForHfInferenceClient()` in `src/agents/huggingface-models.ts:28` to normalize model IDs for the HF API
- Updated `resolveModel()` in `src/agents/pi-embedded-runner/model.ts:62` to apply normalization for HuggingFace provider
- Added comprehensive test coverage for mixed-case prefixes and tag preservation
- Most file changes (68/72 files) are from merge conflict resolution, not the core fix
The implementation correctly handles edge cases including mixed-case prefixes and preserves routing tags. The normalization is scoped to only the HuggingFace provider to avoid impacting other providers.
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk
- The core fix is well-tested and narrowly scoped to the HuggingFace provider. The implementation correctly handles edge cases (mixed-case prefixes, routing tags, empty strings). Most changed files are from merge conflict resolution rather than new logic. The fix addresses a real bug without introducing behavioral changes to other providers.
- No files require special attention
<sub>Last reviewed commit: 1ec89f4</sub>
## Most Similar PRs
- #23136: fix: lookupContextTokens should handle provider/model refs (patchguardio · 2026-02-22 · 78.9%)
- #23568: fix(agents): preserve multi-segment model IDs in splitModelRef (arosstale · 2026-02-22 · 78.8%)
- #23286: fix: use configured model in llm-slug-generator instead of hardcoded … (wsman · 2026-02-22 · 77.0%)
- #21998: fix(models): prioritize exact model-id match over fuzzy scoring (#2... (lailoo · 2026-02-20 · 76.9%)
- #14744: fix(context): key MODEL_CACHE by provider/modelId to prevent collis... (lailoo · 2026-02-12 · 76.7%)
- #23816: fix(agents): model fallback skipped during session overrides and pr... (ramezgaberiel · 2026-02-22 · 76.3%)
- #3322: fix: merge provider config api into registry model (nulone · 2026-01-28 · 75.4%)
- #11198: fix(models): strip @profile suffix from model selection (mcaxtr · 2026-02-07 · 75.2%)
- #15632: fix: use provider-qualified key in MODEL_CACHE for context window l... (linwebs · 2026-02-13 · 75.1%)
- #13626: fix(model): propagate provider model properties in fallback resolution (mcaxtr · 2026-02-10 · 75.0%)