#21182: feat(litellm): enhance LiteLLM provider with model discovery and prompt caching
Labels: docs, cli, commands, agents · Size: XL · Cluster: Model Provider Integrations
## Summary
Builds on the base LiteLLM onboarding from PR #12823 with the remaining enhancements:
- Interactive model discovery from the `/v1/models` and `/model/info` endpoints (see the sketch after this list)
- Automatic Claude model detection with `anthropic-messages` API support
- Prompt caching support for Anthropic models via the LiteLLM proxy
- CLI flags `--litellm-base-url` and `--litellm-model` for non-interactive setup
- Auto-detection of context window and `maxTokens` from the LiteLLM proxy
- Robust error handling with retry flows for the API key and base URL
- Enhanced documentation with prompt caching and multi-model examples
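As a sketch of the discovery flow: the proxy's `/v1/models` route returns OpenAI-format model IDs, while `/model/info` adds per-model metadata such as token limits. The TypeScript below is illustrative only; the `DiscoveredModel` shape and `discoverModels` helper are hypothetical, and the `/model/info` field names (`model_name`, `model_info.max_input_tokens`, `model_info.max_output_tokens`) are assumptions based on recent LiteLLM proxy versions, not this repository's actual code.

```ts
interface DiscoveredModel {
  id: string;
  contextWindow?: number;
  maxTokens?: number;
  isClaude: boolean; // Claude models are routed to the anthropic-messages API
}

async function discoverModels(baseUrl: string, apiKey: string): Promise<DiscoveredModel[]> {
  const headers = { Authorization: `Bearer ${apiKey}` };

  // /v1/models lists model IDs in OpenAI format: { data: [{ id }] }
  const modelsRes = await fetch(`${baseUrl}/v1/models`, { headers });
  if (!modelsRes.ok) throw new Error(`model fetch failed: ${modelsRes.status}`);
  const ids: string[] = (await modelsRes.json()).data.map((m: { id: string }) => m.id);

  // /model/info is LiteLLM-specific and may be absent; keep discovery best-effort
  const info: Record<string, { max_input_tokens?: number; max_output_tokens?: number }> = {};
  try {
    const infoRes = await fetch(`${baseUrl}/model/info`, { headers });
    if (infoRes.ok) {
      for (const entry of (await infoRes.json()).data) {
        info[entry.model_name] = entry.model_info ?? {};
      }
    }
  } catch {
    // network or proxy-version failure: fall back to bare model IDs
  }

  return ids.map((id) => ({
    id,
    contextWindow: info[id]?.max_input_tokens,
    maxTokens: info[id]?.max_output_tokens,
    isClaude: /claude/i.test(id), // naive detection; the real check may differ
  }));
}
```

When the fetch fails, the onboarding flow prompts the user to re-enter the API key or base URL, or to cancel, rather than aborting outright.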
## Test plan
- [x] Verify LiteLLM onboarding flow with interactive model selection
- [x] Test non-interactive setup with the `--litellm-base-url` and `--litellm-model` flags (see the sketch after this plan)
- [x] Validate prompt caching with Anthropic models through LiteLLM proxy
- [x] Run existing test suites (`models-config.providers.litellm.test.ts`, `onboard-non-interactive.litellm.test.ts`, `cache-ttl.test.ts`)
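For the non-interactive path, a rough sketch of what the flags feed into is below. All names here (`LiteLLMFlags`, `AuthProfile`, `onboardLiteLLM`, the JSON file path) are hypothetical stand-ins for the CLI's actual onboarding code; the one grounded detail is that the base URL is persisted in the auth profile's metadata so the provider can later be resolved implicitly.

```ts
import { writeFile } from "node:fs/promises";

// Hypothetical shapes; the real CLI's profile store differs.
interface LiteLLMFlags {
  litellmBaseUrl?: string; // --litellm-base-url
  litellmModel?: string;   // --litellm-model
}

interface AuthProfile {
  provider: "litellm";
  model: string;
  apiKey: string;
  metadata: { baseUrl: string }; // base URL lives in profile metadata
}

async function onboardLiteLLM(flags: LiteLLMFlags, apiKey: string): Promise<void> {
  if (!flags.litellmBaseUrl || !flags.litellmModel) {
    throw new Error("non-interactive setup requires both --litellm-base-url and --litellm-model");
  }
  const profile: AuthProfile = {
    provider: "litellm",
    model: flags.litellmModel,
    apiKey,
    metadata: { baseUrl: flags.litellmBaseUrl },
  };
  // Illustrative persistence only; the real profile store is elsewhere.
  await writeFile("litellm-profile.json", JSON.stringify(profile, null, 2));
}
```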
### Greptile Summary
Enhances LiteLLM provider with interactive model discovery and prompt caching support for Anthropic models through the LiteLLM proxy.
**Key improvements:**
- Interactive model discovery from the `/v1/models` and `/model/info` endpoints, with automatic context window and `maxTokens` detection
- Automatic detection of Claude models, with `anthropic-messages` API support enabling prompt caching
- Retry flow when the model fetch fails (re-enter the API key or base URL, or cancel)
- CLI flags `--litellm-base-url` and `--litellm-model` for non-interactive setup
- Base URL stored in auth profile metadata for implicit provider resolution
- Prompt caching enabled automatically for Claude models via LiteLLM (roughly 10x cost savings on cached context; see the request sketch after this list)
- Comprehensive test coverage for new functionality
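To illustrate how caching reaches Anthropic models through the proxy: LiteLLM accepts OpenAI-format requests and passes `cache_control` content blocks through to Anthropic, which bills cached reads at a fraction of the normal input rate. The sketch below assumes a typical proxy deployment (the model alias and endpoint path are placeholders), not this repository's client code.

```ts
async function cachedCompletion(
  baseUrl: string,
  apiKey: string,
  systemPrompt: string,
  userMsg: string,
) {
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4", // hypothetical alias for a Claude model on the proxy
      messages: [
        {
          role: "system",
          content: [
            {
              type: "text",
              text: systemPrompt,
              // cache breakpoint: content up to here is cached by Anthropic
              cache_control: { type: "ephemeral" },
            },
          ],
        },
        { role: "user", content: userMsg },
      ],
    }),
  });
  if (!res.ok) throw new Error(`completion failed: ${res.status}`);
  return res.json();
}
```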
### Confidence Score: 5/5
- This PR is safe to merge with minimal risk
- The implementation follows established patterns in the codebase, includes comprehensive test coverage (3 new test files), properly handles error cases with retry flows, and maintains backward compatibility. The changes are well-structured, type-safe, and align with the existing provider integration patterns.
- No files require special attention
Last reviewed commit: 85f41ec
## Most Similar PRs

| PR | Author | Date | Similarity |
| --- | --- | --- | --- |
| #11525: docs: Add LiteLLM provider documentation | shin-bot-litellm | 2026-02-07 | 79.9% |
| #6484: docs: add LiteLLM + Nebius integration guide | demianarc | 2026-02-01 | 78.6% |
| #6559: Fix LiteLLM reasoning-tag handling + fallback to `<think>` content | Najia-afk | 2026-02-01 | 76.4% |
| #18896: CLI: add vLLM configure command | franciscojavierarceo | 2026-02-17 | 75.6% |
| #4793: hooks: use configured model for slug generator | yoyooyooo | 2026-01-30 | 75.2% |
| #16290: fix: add field-level validation for custom LLM provider config | superlowburn | 2026-02-14 | 74.8% |
| #18867: fix: route slug generator LLM call through configured provider | Celegormhenry | 2026-02-17 | 74.8% |
| #15574: fix(hooks): use configured model for llm slug generation (#15510) | TsekaLuk | 2026-02-13 | 74.4% |
| #7568: feat(agents): add LM Studio auto-discovery and provider support | sjseo298 | 2026-02-03 | 74.2% |
| #23286: fix: use configured model in llm-slug-generator instead of hardcoded … | wsman | 2026-02-22 | 74.2% |