#16543: [AI-assisted] feat(usage): support cache-hit differentiated pricing
stale
size: S
Cluster: Model Configuration Fixes
## Summary
Add support for cache-hit differentiated pricing in usage calculation.
## Problem
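Several providers bill cached input tokens at a different rate than uncached input, which the usage calculation did not account for:
- DeepSeek: cache-hit price is $0.028/M vs $0.28/M (cache miss)
- Kimi: cache-hit price is $0.10/M vs $0.60/M (cache miss)
- MiniMax: cache-write cost was missing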
## Solution
Modified `estimateUsageCost()` to calculate the input cost as:
- `cacheMissInput * inputPrice + cacheHitInput * cacheHitPrice`
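
As a rough illustration of the formula above, here is a minimal sketch using the DeepSeek prices from the Problem section. The `InputUsage` type and `estimateInputCost` name are hypothetical and are not the PR's actual code; prices are assumed to be USD per million tokens.

```typescript
// Hypothetical shape for illustration only; not the type used in the PR.
interface InputUsage {
  cacheMissInput: number; // input tokens not served from cache
  cacheHitInput: number; // input tokens served from cache
}

// Prices are assumed to be USD per million tokens.
function estimateInputCost(
  usage: InputUsage,
  inputPrice: number,
  cacheHitPrice: number,
): number {
  const total =
    usage.cacheMissInput * inputPrice + usage.cacheHitInput * cacheHitPrice;
  return total / 1_000_000;
}

// DeepSeek prices from the Problem section: $0.28/M miss, $0.028/M hit.
const deepseekCost = estimateInputCost(
  { cacheMissInput: 200_000, cacheHitInput: 800_000 },
  0.28,
  0.028,
);

// Same token volume billed entirely at the cache-miss price, for comparison.
const flatCost = estimateInputCost(
  { cacheMissInput: 1_000_000, cacheHitInput: 0 },
  0.28,
  0.028,
);

console.log(deepseekCost.toFixed(4), flatCost.toFixed(4));
```

With 80% of the input served from cache, the differentiated estimate comes to about $0.0784, versus $0.28 when every token is billed at the cache-miss price.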
## Files Changed
- `src/utils/usage-format.ts` - Core calculation logic
- `src/infra/session-cost-usage.ts` - Pass provider/model to cost function
## Testing
- [x] Gateway restarted successfully
- [ ] Need to verify with actual usage data
## Notes
- AI-assisted: Yes (ShadowAI)
- Testing: Lightly tested (needs real usage data verification)
<!-- greptile_comment -->
<h3>Greptile Summary</h3>
This PR modifies `estimateUsageCost()` in `src/utils/usage-format.ts` to apply provider-specific cache-hit pricing for DeepSeek, Moonshot (Kimi), and MiniMax, and updates two call sites in `src/infra/session-cost-usage.ts` to pass `provider` and `model` to the function.
- **Provider key mismatch**: The hardcoded `provider === "deepseek"` check will not match any built-in provider, since DeepSeek models are served under provider keys like `"qianfan"`, `"venice"`, `"together"`, etc. Users of DeepSeek via built-in providers won't receive the intended cache-hit pricing.
- **Inverted moonshot pricing**: Since Moonshot's default `cost.input` is `0`, the cache-hit logic charges $0.10/M for cache hits while cache-miss tokens are free — the opposite of the intended behavior.
- **Double-counting** (flagged in prior review): Cache read tokens are counted both in `inputCost` (line 101) and `cacheReadCost` (line 105), leading to overcharging.
- **No tests**: The existing test passes by coincidence but no new tests cover the provider-specific pricing paths.
- The `model` parameter is accepted but unused in the function body.
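
To make the double-counting and inverted-pricing points concrete, here is a small sketch. It assumes (this is not confirmed by the diff) that the reported `input` count already includes cache-read tokens; all names are hypothetical and the function bodies are not the PR's code.

```typescript
// Hypothetical token counts; `input` is ASSUMED to include cache-read tokens.
interface TokenCounts {
  input: number;
  cacheRead: number;
}

const PER_MILLION = 1_000_000;

// Double-counting: cache-read tokens are billed at the input price
// (as part of `input`) and then billed again at the cache-read price.
function doubleCounted(
  t: TokenCounts,
  inputPrice: number,
  cacheReadPrice: number,
): number {
  return (t.input * inputPrice + t.cacheRead * cacheReadPrice) / PER_MILLION;
}

// One possible fix: subtract cache reads before applying the input price.
function splitOnce(
  t: TokenCounts,
  inputPrice: number,
  cacheReadPrice: number,
): number {
  const miss = Math.max(0, t.input - t.cacheRead);
  return (miss * inputPrice + t.cacheRead * cacheReadPrice) / PER_MILLION;
}

const counts: TokenCounts = { input: 1_000_000, cacheRead: 800_000 };

const overcharged = doubleCounted(counts, 0.28, 0.028);
const billedOnce = splitOnce(counts, 0.28, 0.028);

// Inverted pricing: with an input price of 0 and a hardcoded $0.10/M
// cache-hit price, only the cached tokens end up costing anything.
const invertedMoonshot = splitOnce(counts, 0, 0.1);

console.log(overcharged.toFixed(4), billedOnce.toFixed(4), invertedMoonshot.toFixed(4));
```

Under these assumptions the double-counted estimate is about $0.3024 versus $0.0784 when each token is billed once, and the Moonshot case charges about $0.08 for cache hits while the 200,000 uncached tokens cost nothing.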
Consider using the existing `ModelCostConfig.cacheRead` field for cache-hit pricing (which is already configured per-model) rather than hardcoding per-provider prices in the calculation function.
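
A minimal sketch of that suggestion follows. `CostConfigSketch` is a hypothetical stand-in for `ModelCostConfig`: only `input` and `cacheRead` are mentioned in this review, and the remaining fields are assumptions. The usage type also assumes `input` excludes cached tokens, so nothing is billed twice.

```typescript
// Hypothetical stand-in for ModelCostConfig (USD per million tokens).
// Only `input` and `cacheRead` are named in the review; the rest is assumed.
interface CostConfigSketch {
  input: number;
  output: number;
  cacheRead?: number;
  cacheWrite?: number;
}

// Hypothetical usage shape; `input` is ASSUMED to exclude cached tokens.
interface UsageSketch {
  input: number;
  output: number;
  cacheRead?: number;
  cacheWrite?: number;
}

function estimateCostFromConfig(
  usage: UsageSketch,
  cost: CostConfigSketch,
): number {
  // Fall back to the plain input price when no cache price is configured,
  // so the calculation needs no per-provider branches.
  const cacheReadPrice = cost.cacheRead ?? cost.input;
  const cacheWritePrice = cost.cacheWrite ?? cost.input;
  const total =
    usage.input * cost.input +
    usage.output * cost.output +
    (usage.cacheRead ?? 0) * cacheReadPrice +
    (usage.cacheWrite ?? 0) * cacheWritePrice;
  return total / 1_000_000;
}

const sampleUsage: UsageSketch = { input: 200_000, output: 0, cacheRead: 800_000 };

// Input prices from the PR description; the output price is a placeholder.
const withCachePrice = estimateCostFromConfig(sampleUsage, {
  input: 0.28,
  output: 0,
  cacheRead: 0.028,
});
const withoutCachePrice = estimateCostFromConfig(sampleUsage, {
  input: 0.28,
  output: 0,
});

console.log(withCachePrice.toFixed(4), withoutCachePrice.toFixed(4));
```

Because the price comes from the model's own config, the same code path would apply regardless of which provider key serves the model, which sidesteps the `provider === "deepseek"` mismatch described above.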
<h3>Confidence Score: 1/5</h3>
- This PR has correctness issues that will produce incorrect cost calculations for users
- Score of 1 reflects multiple logic issues: provider key mismatches that make the new code unreachable for built-in providers, inverted pricing for moonshot, double-counting of cache read tokens (flagged in prior review), and no test coverage for the new paths. The cost calculation directly affects user billing visibility.
- src/utils/usage-format.ts requires significant rework to address the double-counting and provider key matching issues
<sub>Last reviewed commit: 5183ebb</sub>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
Most Similar PRs
- #17602: fix: correct MiniMax M2.5 pricing constants (75x overcharge) (by ghostllm · 2026-02-15 · 73.7%)
- #19114: feat(usage): add default model costs for session_status (by Clawborn · 2026-02-17 · 71.0%)
- #14744: fix(context): key MODEL_CACHE by provider/modelId to prevent collis... (by lailoo · 2026-02-12 · 70.9%)
- #13877: perf: Comprehensive performance optimizations - caching, model rout... (by trevorgordon981 · 2026-02-11 · 69.7%)
- #15632: fix: use provider-qualified key in MODEL_CACHE for context window l... (by linwebs · 2026-02-13 · 69.5%)
- #6960: feat: Add kimi-coding provider support (by YYW0228 · 2026-02-02 · 68.8%)
- #17604: fix(context): use getAvailable() to prevent cross-provider model ID... (by aldoeliacim · 2026-02-16 · 67.9%)
- #21076: feat(quota): unify provider quota tracking and usage UI across prov... (by romeroej2 · 2026-02-19 · 67.6%)
- #13215: fix: pass agentId to loadCostUsageSummary in /usage cost command (by veast · 2026-02-10 · 67.4%)
- #13895: fix(usage): exclude cache tokens from context-window accounting (by zerone0x · 2026-02-11 · 67.3%)