- Baseline
- $0.01634
- Candidate
- $0.01627
- Delta
- -$0.00007
Step 1 Provider and Model
Choose the model row used for both baseline and candidate retrieval assumptions.Step 2 Quick Mode
Set the baseline scenario first, then tune chunk assumptions.Evaluate chunk compression before changing reranking or model tier.
Step 3 Advanced Assumptions
Adjust baseline retrieval and advanced cost assumptions only after Quick Mode is close.Show advanced inputs
Scenario actions
Copy scenario URL
Paste into ChatGPT or Claude, or share with a teammate.
Save and track this scenario
Track pricing drift on this scenario and get an email if the latest result changes.
How tracking works
After you click Save and track, we carry this exact calculator state into the tracked-scenarios page so you can sign in and confirm the save.
We save your assumptions and the pricing snapshot used for this result.
When a newer pricing snapshot lands, we recompute the same scenario, show what changed, and email you if the latest result moved.
1 tracked scenario free, then $12/mo or $120/yr for up to 25 tracked scenarios.
Headline metric
Candidate chunk plan lowers costTotal cost delta per user / month: $-0.0081
Candidate retrieval tokens / request: 1,320 vs baseline 1,680.
Cost delta / user / month
$-0.0081Retrieval token delta / request
-360Break-even delta
$-0.0081Monthly cost delta
$-5.26Totals
Baseline vs candidate totals under the same retrieval assumptions.- Baseline
- $1.9607
- Candidate
- $1.9526
- Delta
- -$0.0081
- Baseline
- 96.0%
- Candidate
- 96.0%
- Delta
- +0.0%
- Baseline
- $1.9607
- Candidate
- $1.9526
- Delta
- -$0.0081
| Metric | Baseline | Candidate | Delta |
|---|---|---|---|
| Cost per request | $0.01634 | $0.01627 | -$0.00007 |
| Cost per user/month | $1.9607 | $1.9526 | -$0.0081 |
| Gross margin % | 96.0% | 96.0% | +0.0% |
| Break-even price | $1.9607 | $1.9526 | -$0.0081 |
Component Breakdown
Baseline and candidate components are computed independently, then differenced.- Baseline
- $0.114
- Candidate
- $0.114
- Delta
- $0
- Baseline
- $0.0504
- Candidate
- $0.0396
- Delta
- -$0.0108
- Baseline
- $2.4
- Candidate
- $2.4
- Delta
- $0
- Baseline
- $0
- Candidate
- $0
- Delta
- $0
- Baseline
- $0.0018
- Candidate
- $0.0018
- Delta
- $0
- Baseline
- $-0.6536
- Candidate
- $-0.6508
- Delta
- +$0.0027
- Baseline
- $0.048
- Candidate
- $0.048
- Delta
- $0
| Component | Baseline | Candidate | Delta |
|---|---|---|---|
| GenerationModel input/output token spend for requests. | $0.114 | $0.114 | $0 |
| RetrievalExtra model input spend from retrieved context chunks. | $0.0504 | $0.0396 | -$0.0108 |
| RerankingReranker cost based on docs scored per request. | $2.4 | $2.4 | $0 |
| Embeddings IngestionAmortized per-user share of the fixed monthly corpus embedding refresh cost. | $0 | $0 | $0 |
| Vector DbVector database query cost across all requests. | $0.0018 | $0.0018 | $0 |
| CacheSavings from cache hits. Negative means lower total cost. | $-0.6536 | $-0.6508 | +$0.0027 |
| InfraNon-model infra overhead per request. | $0.048 | $0.048 | $0 |
Sensitivity RankingDelta in total cost if one variable increases by 10%.
| Variable | Cost delta % |
|---|---|
| Requests Per User MonthUser activity level per month. | 10.00% |
| Rerank DocsDocs reranked per request. | 9.22% |
| Cache Hit RateFraction of requests served by cache. | -3.33% |
| Output TokensGenerated tokens per request. | 0.32% |
| Retrieved ChunksRetrieved chunk count per request. | 0.15% |
| Tokens Per ChunkAverage chunk size in tokens. | 0.15% |
| Input TokensPrompt-side tokens per request. | 0.12% |
| Vector Queries Per RequestVector query count per request. | 0.01% |
| Monthly Active UsersActive-user estimate used to amortize fixed monthly embedding refresh. | -0.00% |
Assumptions and Units
Explicit assumptions keep this comparison reproducible.- CurrencyUSD
- Token unittoken
- Pricing snapshot2026-04-09
- Selected model rowOpenAI/GPT-5 Mini
- Comparison ruleOnly retrieval chunk assumptions change; non-retrieval inputs stay shared
- Volume basisFixed monthly terms and business totals use monthly active users as the denominator
Recommended Next Step
Use this section to translate chunk deltas into the next quality and infrastructure checks.Validate infra and retrieval assumptions, then confirm quality on sampled traffic.
Compare infra providers
View Infra RecommendationsSources and Snapshot
Pricing comes from the current dated snapshot.Active Pricing Row
Candidate
OpenAI / GPT-5 Mini
- Input tokens$0.25 / 1M
- Output tokens$2 / 1M
Shared retrieval defaults
- Embedding input$0.02 / 1M
- Rerank docs$1 / 1K
- Snapshot date: 2026-04-09
- Source links and update notes: Pricing Snapshot Reference
Continue Analysis
Move to the next tool or guide without losing your current scenario.Switch tools
- AI Workflow Cost
- Break-even Price
- Compare Model Costs
- Rerank Cost
- Cache Savings
- Prompt Overhead
- RAG or Long Prompt
- Indexing Cost
- Browse all tools
Read guides