#21182: feat(litellm): enhance LiteLLM provider with model discovery and prompt caching
Labels: docs, cli, commands, agents · Size: XL · Cluster: Model Provider Integrations
## Summary
Builds on the base LiteLLM onboarding from PR #12823 with the remaining enhancements:
- Interactive model discovery from the `/v1/models` and `/model/info` endpoints (see the sketch after this list)
- Automatic Claude model detection with `anthropic-messages` API support
- Prompt caching support for Anthropic models via the LiteLLM proxy
- CLI flags `--litellm-base-url` and `--litellm-model` for non-interactive setup
- Auto-detection of context window and `maxTokens` from the LiteLLM proxy
- Robust error handling with retry flows for the API key and base URL
- Enhanced documentation with prompt caching and multi-model examples
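As a sketch of the discovery flow: the proxy's `/v1/models` route returns OpenAI-format model IDs, while `/model/info` adds per-model metadata such as token limits. The TypeScript below is illustrative only; the `DiscoveredModel` shape and `discoverModels` helper are hypothetical, and the `/model/info` field names (`model_name`, `model_info.max_input_tokens`, `model_info.max_output_tokens`) are assumptions based on recent LiteLLM proxy versions, not this repository's actual code.

```ts
interface DiscoveredModel {
  id: string;
  contextWindow?: number;
  maxTokens?: number;
  isClaude: boolean; // Claude models are routed to the anthropic-messages API
}

async function discoverModels(baseUrl: string, apiKey: string): Promise<DiscoveredModel[]> {
  const headers = { Authorization: `Bearer ${apiKey}` };

  // /v1/models lists model IDs in OpenAI format: { data: [{ id }] }
  const modelsRes = await fetch(`${baseUrl}/v1/models`, { headers });
  if (!modelsRes.ok) throw new Error(`model fetch failed: ${modelsRes.status}`);
  const ids: string[] = (await modelsRes.json()).data.map((m: { id: string }) => m.id);

  // /model/info is LiteLLM-specific and may be absent; keep discovery best-effort
  const info: Record<string, { max_input_tokens?: number; max_output_tokens?: number }> = {};
  try {
    const infoRes = await fetch(`${baseUrl}/model/info`, { headers });
    if (infoRes.ok) {
      for (const entry of (await infoRes.json()).data) {
        info[entry.model_name] = entry.model_info ?? {};
      }
    }
  } catch {
    // network or proxy-version failure: fall back to bare model IDs
  }

  return ids.map((id) => ({
    id,
    contextWindow: info[id]?.max_input_tokens,
    maxTokens: info[id]?.max_output_tokens,
    isClaude: /claude/i.test(id), // naive detection; the real check may differ
  }));
}
```

When the fetch fails, the onboarding flow prompts the user to re-enter the API key or base URL, or to cancel, rather than aborting outright.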
## Test plan
- [x] Verify LiteLLM onboarding flow with interactive model selection
- [x] Test non-interactive setup with the `--litellm-base-url` and `--litellm-model` flags (see the sketch after this plan)
- [x] Validate prompt caching with Anthropic models through LiteLLM proxy
- [x] Run existing test suites (`models-config.providers.litellm.test.ts`, `onboard-non-interactive.litellm.test.ts`, `cache-ttl.test.ts`)
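For the non-interactive path, a rough sketch of what the flags feed into is below. All names here (`LiteLLMFlags`, `AuthProfile`, `onboardLiteLLM`, the JSON file path) are hypothetical stand-ins for the CLI's actual onboarding code; the one grounded detail is that the base URL is persisted in the auth profile's metadata so the provider can later be resolved implicitly.

```ts
import { writeFile } from "node:fs/promises";

// Hypothetical shapes; the real CLI's profile store differs.
interface LiteLLMFlags {
  litellmBaseUrl?: string; // --litellm-base-url
  litellmModel?: string;   // --litellm-model
}

interface AuthProfile {
  provider: "litellm";
  model: string;
  apiKey: string;
  metadata: { baseUrl: string }; // base URL lives in profile metadata
}

async function onboardLiteLLM(flags: LiteLLMFlags, apiKey: string): Promise<void> {
  if (!flags.litellmBaseUrl || !flags.litellmModel) {
    throw new Error("non-interactive setup requires both --litellm-base-url and --litellm-model");
  }
  const profile: AuthProfile = {
    provider: "litellm",
    model: flags.litellmModel,
    apiKey,
    metadata: { baseUrl: flags.litellmBaseUrl },
  };
  // Illustrative persistence only; the real profile store is elsewhere.
  await writeFile("litellm-profile.json", JSON.stringify(profile, null, 2));
}
```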
### Greptile Summary
Enhances LiteLLM provider with interactive model discovery and prompt caching support for Anthropic models through the LiteLLM proxy.
**Key improvements:**
- Interactive model discovery from the `/v1/models` and `/model/info` endpoints, with automatic context window and `maxTokens` detection
- Automatic detection of Claude models, with `anthropic-messages` API support enabling prompt caching
- Retry flow when the model fetch fails (re-enter the API key or base URL, or cancel)
- CLI flags `--litellm-base-url` and `--litellm-model` for non-interactive setup
- Base URL stored in auth profile metadata for implicit provider resolution
- Prompt caching enabled automatically for Claude models via LiteLLM (roughly 10x cost savings on cached context; see the request sketch after this list)
- Comprehensive test coverage for new functionality
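To illustrate how caching reaches Anthropic models through the proxy: LiteLLM accepts OpenAI-format requests and passes `cache_control` content blocks through to Anthropic, which bills cached reads at a fraction of the normal input rate. The sketch below assumes a typical proxy deployment (the model alias and endpoint path are placeholders), not this repository's client code.

```ts
async function cachedCompletion(
  baseUrl: string,
  apiKey: string,
  systemPrompt: string,
  userMsg: string,
) {
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4", // hypothetical alias for a Claude model on the proxy
      messages: [
        {
          role: "system",
          content: [
            {
              type: "text",
              text: systemPrompt,
              // cache breakpoint: content up to here is cached by Anthropic
              cache_control: { type: "ephemeral" },
            },
          ],
        },
        { role: "user", content: userMsg },
      ],
    }),
  });
  if (!res.ok) throw new Error(`completion failed: ${res.status}`);
  return res.json();
}
```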
### Confidence Score: 5/5
- This PR is safe to merge with minimal risk
- The implementation follows established patterns in the codebase, includes comprehensive test coverage (3 new test files), properly handles error cases with retry flows, and maintains backward compatibility. The changes are well-structured, type-safe, and align with the existing provider integration patterns.
- No files require special attention
Last reviewed commit: 85f41ec
## Most Similar PRs

| PR | Author | Date | Similarity |
| --- | --- | --- | --- |
| #11525: docs: Add LiteLLM provider documentation | shin-bot-litellm | 2026-02-07 | 79.9% |
| #6484: docs: add LiteLLM + Nebius integration guide | demianarc | 2026-02-01 | 78.6% |
| #6559: Fix LiteLLM reasoning-tag handling + fallback to `<think>` content | Najia-afk | 2026-02-01 | 76.4% |
| #18896: CLI: add vLLM configure command | franciscojavierarceo | 2026-02-17 | 75.6% |
| #4793: hooks: use configured model for slug generator | yoyooyooo | 2026-01-30 | 75.2% |
| #16290: fix: add field-level validation for custom LLM provider config | superlowburn | 2026-02-14 | 74.8% |
| #18867: fix: route slug generator LLM call through configured provider | Celegormhenry | 2026-02-17 | 74.8% |
| #15574: fix(hooks): use configured model for llm slug generation (#15510) | TsekaLuk | 2026-02-13 | 74.4% |
| #7568: feat(agents): add LM Studio auto-discovery and provider support | sjseo298 | 2026-02-03 | 74.2% |
| #23286: fix: use configured model in llm-slug-generator instead of hardcoded … | wsman | 2026-02-22 | 74.2% |