#23542: fix/hf inference
Cluster: Model Configuration Fixes
## Summary
- **Problem:** Using a model ref with a `huggingface/` prefix caused errors: the HF Inference client was given the full prefixed string, even though the API expects only a Hub-style model id plus optional tags.
- **Why it matters:** Users choosing Hugging Face inference hit failures or wrong API behavior; the client must receive only the Hub id and tags (e.g. `:cheapest`, `:fastest`).
- **What changed:** Added `modelIdForHfInferenceClient()` to strip a leading `huggingface/` before calling the HF client; Pi embedded runner uses it when resolving Hugging Face models so the resolved Model `id` never includes the prefix. Removed duplicate `buildHuggingfaceProvider` in `models-config.providers.ts`. Added tests for the new helper.
- **What did NOT change (scope boundary):** Other providers, auth, onboarding, and tag parsing (`:cheapest`, `:fastest`, `:provider`, etc.) are unchanged; only the string sent to the HF inference client is normalized.
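The helper's intended behavior, as described above, can be sketched as follows. This is a minimal illustration, not the actual implementation in `src/agents/huggingface-models.ts`:

```typescript
// Hedged sketch of the helper described in the summary. Assumption: the
// prefix is matched case-insensitively (per the review notes) and routing
// tags such as ":cheapest" / ":fastest" ride along untouched.
function modelIdForHfInferenceClient(ref: string): string {
  // Strip only a leading "huggingface/" so the HF Inference client
  // receives a Hub-style id like "org/model" plus any tags.
  return ref.replace(/^huggingface\//i, "");
}

console.log(modelIdForHfInferenceClient("huggingface/org/model:cheapest")); // "org/model:cheapest"
console.log(modelIdForHfInferenceClient("org/model")); // "org/model" (unchanged)
```

A ref written without the prefix passes through unchanged, which matches the scope boundary above: only the string handed to the HF inference client is normalized.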
## Change Type (select all)
- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Docs
- [ ] Security hardening
- [ ] Chore/infra
## Scope (select all touched areas)
- [ ] Gateway / orchestration
- [x] Skills / tool execution
- [ ] Auth / tokens
- [ ] Memory / storage
- [ ] Integrations
- [ ] API / contracts
- [ ] UI / DX
- [ ] CI/CD / infra
## Linked Issue/PR
- Closes https://github.com/openclaw/openclaw/issues/23481
- Related: https://github.com/openclaw/openclaw/pull/23475
## User-visible / Behavior Changes
- **Before:** Prefixed refs (e.g. `huggingface/org/model`) could error or send the prefix to the HF API.
- **After:** Same refs work; the client receives only the Hub-style id and tags (e.g. `org/model:cheapest`). No change to non-HF providers or to how users write refs in config/CLI.
## Security Impact (required)
- New permissions/capabilities? **No**
- Secrets/tokens handling changed? **No**
- New/changed network calls? **No**
- Command/tool execution surface changed? **No**
- Data access scope changed? **No**
- If any Yes, explain risk + mitigation: **N/A**
## Repro + Verification
### Environment
- OS: (e.g. macOS 15.4 / Ubuntu 24.04 / Windows 11)
- Runtime/container: Node 22+ / pnpm dev
- Model/provider: Hugging Face Inference (HF_TOKEN or HUGGINGFACE_HUB_TOKEN)
- Integration/channel (if any): N/A (inference/model resolution)
- Relevant config (redacted): `models.providers.huggingface` and/or onboarding with Hugging Face selected
### Steps
1. Set HF token (e.g. `HF_TOKEN` or `HUGGINGFACE_HUB_TOKEN`).
2. Use a model ref that includes the `huggingface/` prefix (e.g. `huggingface/mistralai/Mistral-7B-Instruct-v0.3` or a path that previously sent the prefix).
3. Run an agent/inference flow that resolves the model and calls the HF inference client.
### Expected
- No error from the prefix; HF API is called with only the Hub-style model id (and tags); inference works.
### Actual (before fix)
- Error when using the prefix, or the HF client received the full prefixed string and the API failed or behaved incorrectly.
## Evidence
Attach at least one:
- [ ] Failing test/log before + passing after (unit tests for `modelIdForHfInferenceClient`; Pi embedded runner model resolution with HF)
- [ ] Trace/log snippets
- [x] Screenshot/recording (to be added)
- [ ] Perf numbers (if relevant)
## Human Verification (required)
- **Verified scenarios:** Unit tests for `modelIdForHfInferenceClient` (strip prefix, preserve tags, no change when no prefix). Model resolution path in Pi embedded runner uses `normalizedModelId` for Hugging Face so resolved Model `id` has no prefix.
- **Edge cases checked:** Ref with prefix + tags; ref without prefix; duplicate provider definition removed so build/lint pass.
- **What you did not verify:** A currently failing test (unrelated to this fix); I will keep it tracked going forward.
## Compatibility / Migration
- Backward compatible? **Yes**
- Config/env changes? **No**
- Migration needed? **No**
- If yes, exact upgrade steps: N/A
## Failure Recovery (if this breaks)
- **How to disable/revert:** Revert this PR; prefixed refs may fail again and the client may receive the prefix.
- **Files/config to restore:** `src/agents/huggingface-models.ts`, `src/agents/pi-embedded-runner/model.ts`, `src/agents/models-config.providers.ts`, `src/agents/huggingface-models.test.ts`.
- **Known bad symptoms:** HF inference errors or “invalid model id” from the API; regression in model resolution when provider is Hugging Face.
## Risks and Mitigations
- **Risk:** A caller elsewhere might rely on the resolved Model `id` still containing `huggingface/` for display or routing.
**Mitigation:** Normalization is applied only for the Hugging Face provider in the Pi embedded runner resolution path; display/CLI can still show the user’s original ref; HF API contract requires no prefix.
- **Risk:** No other risks identified for this change.
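The provider scoping described in the mitigation can be sketched as below. The names (`Provider`, `ResolvedModel`, `resolveModelId`) are illustrative, not the project's actual `resolveModel()` types or signature:

```typescript
// Hedged sketch: normalization applies only when the provider is
// Hugging Face; every other provider's ref passes through verbatim.
type Provider = "huggingface" | "openai" | string;

interface ResolvedModel {
  provider: Provider;
  id: string; // Hub-style id for Hugging Face, verbatim ref otherwise
}

function resolveModelId(provider: Provider, ref: string): ResolvedModel {
  const id =
    provider === "huggingface"
      ? ref.replace(/^huggingface\//i, "") // strip the prefix for HF only
      : ref; // non-HF providers are untouched
  return { provider, id };
}
```

Because the conditional keys on the provider, a caller that displays the user's original ref is unaffected; only the resolved `id` handed to the HF client loses the prefix.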
<h3>Greptile Summary</h3>
This PR fixes a bug where Hugging Face Inference API calls were failing when model references included the `huggingface/` prefix. The fix adds a `modelIdForHfInferenceClient()` helper that strips the prefix before sending model IDs to the HF client, while preserving routing tags like `:cheapest` and `:fastest`.
Key changes:
- Added `modelIdForHfInferenceClient()` in `src/agents/huggingface-models.ts:28` to normalize model IDs for the HF API
- Updated `resolveModel()` in `src/agents/pi-embedded-runner/model.ts:62` to apply normalization for HuggingFace provider
- Added comprehensive test coverage for mixed-case prefixes and tag preservation
- Most file changes (68/72 files) are from merge conflict resolution, not the core fix
The implementation correctly handles edge cases including mixed-case prefixes and preserves routing tags. The normalization is scoped to only the HuggingFace provider to avoid impacting other providers.
<h3>Confidence Score: 5/5</h3>
- This PR is safe to merge with minimal risk
- The core fix is well-tested and narrowly scoped to the HuggingFace provider. The implementation correctly handles edge cases (mixed-case prefixes, routing tags, empty strings). Most changed files are from merge conflict resolution rather than new logic. The fix addresses a real bug without introducing behavioral changes to other providers.
- No files require special attention
<sub>Last reviewed commit: 1ec89f4</sub>
## Most Similar PRs
- #23136: fix: lookupContextTokens should handle provider/model refs (patchguardio · 2026-02-22 · 78.9%)
- #23568: fix(agents): preserve multi-segment model IDs in splitModelRef (arosstale · 2026-02-22 · 78.8%)
- #23286: fix: use configured model in llm-slug-generator instead of hardcoded … (wsman · 2026-02-22 · 77.0%)
- #21998: fix(models): prioritize exact model-id match over fuzzy scoring (#2... (lailoo · 2026-02-20 · 76.9%)
- #14744: fix(context): key MODEL_CACHE by provider/modelId to prevent collis... (lailoo · 2026-02-12 · 76.7%)
- #23816: fix(agents): model fallback skipped during session overrides and pr... (ramezgaberiel · 2026-02-22 · 76.3%)
- #3322: fix: merge provider config api into registry model (nulone · 2026-01-28 · 75.4%)
- #11198: fix(models): strip @profile suffix from model selection (mcaxtr · 2026-02-07 · 75.2%)
- #15632: fix: use provider-qualified key in MODEL_CACHE for context window l... (linwebs · 2026-02-13 · 75.1%)
- #13626: fix(model): propagate provider model properties in fallback resolution (mcaxtr · 2026-02-10 · 75.0%)