
#16543: [AI-assisted] feat(usage): support cache-hit differentiated pricing

by OwenJiong24 open 2026-02-14 21:04
stale size: S
## Summary

Add support for cache-hit differentiated pricing in usage calculation.

## Problem

- DeepSeek: cache hit price is $0.028/M vs $0.28/M (cache miss)
- Kimi: cache hit price is $0.10/M vs $0.60/M
- MiniMax: cache write cost was missing

## Solution

Modified `estimateUsageCost()` to calculate:

- `cacheMissInput * inputPrice + cacheHitInput * cacheHitPrice`

## Files Changed

- `src/utils/usage-format.ts` - Core calculation logic
- `src/infra/session-cost-usage.ts` - Pass provider/model to cost function

## Testing

- [x] Gateway restarted successfully
- [ ] Need to verify with actual usage data

## Notes

- AI-assisted: Yes (ShadowAI)
- Testing: Lightly tested (needs real usage data verification)

<!-- greptile_comment -->

<h3>Greptile Summary</h3>

This PR modifies `estimateUsageCost()` in `src/utils/usage-format.ts` to apply provider-specific cache-hit pricing for DeepSeek, Moonshot (Kimi), and MiniMax, and updates two call sites in `src/infra/session-cost-usage.ts` to pass `provider` and `model` to the function.

- **Provider key mismatch**: The hardcoded `provider === "deepseek"` check will not match any built-in provider, since DeepSeek models are served under provider keys like `"qianfan"`, `"venice"`, `"together"`, etc. Users of DeepSeek via built-in providers won't receive the intended cache-hit pricing.
- **Inverted moonshot pricing**: Since Moonshot's default `cost.input` is `0`, the cache-hit logic charges $0.10/M for cache hits while cache-miss tokens are free, the opposite of the intended behavior.
- **Double-counting** (flagged in prior review): Cache read tokens are counted both in `inputCost` (line 101) and `cacheReadCost` (line 105), leading to overcharging.
- **No tests**: The existing test passes by coincidence but no new tests cover the provider-specific pricing paths.
- The `model` parameter is accepted but unused in the function body.

Consider using the existing `ModelCostConfig.cacheRead` field for cache-hit pricing (which is already configured per-model) rather than hardcoding per-provider prices in the calculation function.

<h3>Confidence Score: 1/5</h3>

- This PR has correctness issues that will produce incorrect cost calculations for users
- Score of 1 reflects multiple logic issues: provider key mismatches that make the new code unreachable for built-in providers, inverted pricing for moonshot, double-counting of cache read tokens (flagged in prior review), and no test coverage for the new paths. The cost calculation directly affects user billing visibility.
- `src/utils/usage-format.ts` requires significant rework to address the double-counting and provider key matching issues

<sub>Last reviewed commit: 5183ebb</sub>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
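To make the double-counting and config-driven pricing points concrete, here is a minimal sketch of how a cost estimate could split input tokens into cache misses and cache hits using a per-model `cacheRead` price. It is not the repository's code: `CostConfigSketch`, `UsageSketch`, and `estimateCostSketch` are hypothetical names, the real `ModelCostConfig` shape may differ, and the sketch assumes that `usage.input` already includes cache-hit tokens (if the provider reports them separately, the subtraction should be dropped).

```typescript
// Hypothetical shapes for illustration only; the real ModelCostConfig may differ.
interface CostConfigSketch {
  input: number;      // USD per 1M uncached (cache-miss) input tokens
  output: number;     // USD per 1M output tokens
  cacheRead: number;  // USD per 1M cache-hit input tokens
  cacheWrite: number; // USD per 1M cache-write tokens
}

interface UsageSketch {
  input: number;       // ASSUMPTION: total prompt tokens, cache hits included
  output: number;
  cacheRead?: number;  // tokens served from cache
  cacheWrite?: number; // tokens written to cache
}

const TOKENS_PER_MILLION = 1000000;

function estimateCostSketch(usage: UsageSketch, cost: CostConfigSketch): number {
  // Clamp so a malformed report cannot produce negative cache-miss tokens.
  const cacheHit = Math.min(usage.cacheRead ?? 0, usage.input);
  // Each input token is billed exactly once: as a miss or as a hit.
  const cacheMiss = usage.input - cacheHit;
  const total =
    cacheMiss * cost.input +
    cacheHit * cost.cacheRead +
    (usage.cacheWrite ?? 0) * cost.cacheWrite +
    usage.output * cost.output;
  return total / TOKENS_PER_MILLION;
}

// Input prices taken from the PR description for DeepSeek; output and
// cache-write prices are set to 0 here only to isolate the input side.
const deepseekLike: CostConfigSketch = {
  input: 0.28,
  output: 0,
  cacheRead: 0.028,
  cacheWrite: 0,
};

// 600k cache-miss tokens at $0.28/M plus 400k cache-hit tokens at $0.028/M.
const estimate = estimateCostSketch(
  { input: 1000000, output: 0, cacheRead: 400000 },
  deepseekLike,
);
console.log(estimate.toFixed(4));
```

Because the prices come from the config object rather than from a `provider === "..."` branch, the same function would apply to a DeepSeek model served under any provider key, provided that model's config carries the right `cacheRead` value.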
