← Back to PRs

#19710: feat(nvidia): expand NIM model catalog from 3 to 38 models

by 88plug open 2026-02-18 03:28 View on GitHub →
agents size: M
## Summary Expands the NVIDIA NIM free-tier model catalog from the current 3 legacy models to a comprehensive 38-model catalog, verified against the live NIM API as of 2026-02-18. ### Changes - **Remove 1 dead model**: `nvidia/mistral-nemo-minitron-8b-8k-instruct` (returns 404) - **Add 14 new models**: GPT-OSS 120B/20B, Kimi K2.5, Kimi K2 Instruct 0905, MiniMax M2.1, Qwen3 Next 80B, Magistral Small, Ministral 14B, Step 3.5 Flash, Seed OSS 36B, Nemotron Nano 9B v2, Phi-4 Mini Flash Reasoning, Gemma 3n E4B/E2B - **Fix 16 spec fields**: corrected maxTokens, contextWindow, and vision input support for DeepSeek V3.2, Mistral Large/Medium/Small, QwQ-32B, Kimi K2/K2-Thinking - **Update tests**: catalog size (≥35), vision models (≥5), and specific model assertions ### Model categories (38 total) | Category | Count | Examples | |----------|-------|---------| | Flagship chat | 18 | DeepSeek V3.2, Llama 3.3 70B, Kimi K2.5, GPT-OSS 120B | | Reasoning | 4 | QwQ-32B, Kimi K2 Thinking, Phi-4 Mini Flash Reasoning | | Vision | 5 | Llama 3.2 90B/11B Vision, Gemma 3n E4B/E2B, Phi-4 Multimodal | | Compact/code | 3 | Llama 3.2 3B, Nemotron Nano 8B, Devstral 2 123B | | Legacy | 1 | Nemotron 70B Instruct | | Mixed (vision-capable chat) | 7 | DeepSeek V3.2, Mistral Large/Medium/Small | ### Verification method All models verified via live NIM API calls. Context windows and max tokens confirmed via error-trigger technique (sending max_tokens=9999999 to elicit server-reported limits). ### What did NOT change No changes to provider connection logic, authentication, or API routing. Only the static model catalog entries and their spec fields were updated. ## Change Type (select all) - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Docs - [ ] Security hardening - [ ] Chore/infra ## Scope (select all touched areas) - [ ] Gateway / orchestration - [ ] Skills / tool execution - [ ] Auth / tokens - [ ] Memory / storage - [x] Integrations - [ ] API / contracts - [ ] UI / DX - [ ] CI/CD / infra ## Linked Issue/PR N/A — new feature expanding existing catalog, no prior issue. ## User-visible / Behavior Changes - Users now have access to 38 NIM models (up from 3) when using the NVIDIA provider - Dead model `mistral-nemo-minitron-8b-8k-instruct` removed (was returning 404) - Corrected context windows and max tokens for existing models ## Security Impact (required) - New permissions/capabilities? `No` - Secrets/tokens handling changed? `No` - New/changed network calls? `No` — same NIM API endpoint, just more model entries - Command/tool execution surface changed? `No` - Data access scope changed? `No` ## Repro + Verification ### Environment - OS: Linux - Runtime/container: Node.js 22+ - Model/provider: NVIDIA NIM (free-tier API) ### Steps 1. Configure NVIDIA NIM as provider with a valid API key 2. Select any of the new models (e.g., `deepseek-ai/deepseek-v3.1`) 3. Send a chat message ### Expected - Model resolves correctly with proper context window and max tokens - Chat completion succeeds ### Actual (before fix) - Only 3 legacy models available, many with incorrect specs ## Evidence - [x] Failing test/log before + passing after - [x] Trace/log snippets - [ ] Screenshot/recording - [ ] Perf numbers (if relevant) ## Test plan - [x] `vitest run models-config.providers.nvidia.test.ts` — all 10 tests pass - [x] Verified model count ≥ 35 - [x] Vision models ≥ 5 with correct input arrays - [x] Reasoning models ≥ 3 - [x] Provider builds with correct baseUrl and API type ## Human Verification (required) - Verified scenarios: Multiple NIM models tested with live API calls — `deepseek-ai/deepseek-v3.1` and `kimi-k2-instruct-0905` confirmed working with real chat completions. Gateway correctly loads and registers all catalog entries. All 5 agent types initialize successfully with NIM models. - Edge cases checked: Dead model removal (404 confirmed), vision model input arrays, reasoning model flags, context window limits - What you did **not** verify: Every single model end-to-end (38 models × full conversation). Verified representative samples from each category. ## Compatibility / Migration - Backward compatible? `Yes` - Config/env changes? `No` - Migration needed? `No` ## Failure Recovery (if this breaks) - How to disable/revert this change quickly: `git revert <commit>` - Files/config to restore: `src/agents/models-config.providers.ts`, test file - Known bad symptoms reviewers should watch for: Model resolution failures, incorrect context window truncation ## Risks and Mitigations - Risk: NIM API could deprecate or rename models - Mitigation: Models verified against live API as of 2026-02-18. Specs confirmed via error-trigger technique. ## AI-assisted This PR was AI-assisted. The code is understood, tested, and verified against live API.

Most Similar PRs