Step 1 Baseline and Candidate Models
Set the current model as baseline and the proposed model as candidate.
Step 2 Quick Mode
Set shared usage and pricing assumptions before tuning advanced variables.
Steady usage, moderate context, and balanced quality-cost tradeoffs.
Step 3 Advanced Assumptions
Tune retrieval, reranking, embeddings, vector, caching, and infra.
Decision Signal
Margin impact is small. Switching to Claude Sonnet 4.6 changes gross margin by -0.2% and cost per user by +$0.0598.
- Baseline cost / user / month: $1.7618
- Candidate cost / user / month: $1.8216
- Margin delta: -0.2%
- Monthly cost impact: +$38.9
Top Cost Drivers
Largest baseline sensitivity shifts when each variable is increased by 10%.
Totals
Baseline vs candidate totals and deltas under the same usage assumptions. Monthly gross profit impact uses 650 active users.
| Metric | Baseline | Candidate | Delta |
|---|---|---|---|
| Cost per request | $0.01958 | $0.02024 | +$0.00067 |
| Cost per user/month | $1.7618 | $1.8216 | +$0.0598 |
| Gross margin % | 95.5% | 95.3% | -0.2% |
| Break-even price | $1.7618 | $1.8216 | +$0.0598 |
| Monthly gross profit impact | | | -$38.9 |
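The last rows of the table can be reproduced from the per-user costs and the 650 active users stated above. The following is a minimal sketch of that arithmetic, not the calculator's actual code. The subscription price is not given on this page, so `price_per_user` is a hypothetical input; 39.0 is used only because it happens to be consistent with the reported margins.

```python
def gross_margin_pct(price_per_user: float, cost_per_user: float) -> float:
    """Gross margin as a percentage of the price charged per user."""
    return (1 - cost_per_user / price_per_user) * 100


def monthly_profit_impact(baseline_cost: float, candidate_cost: float,
                          active_users: int) -> float:
    """Change in monthly gross profit after switching; negative means profit falls."""
    return -(candidate_cost - baseline_cost) * active_users


baseline_cost = 1.7618   # USD per user per month, from the table
candidate_cost = 1.8216  # USD per user per month, from the table
active_users = 650       # stated under Totals
price_per_user = 39.0    # hypothetical: not stated on this page

impact = monthly_profit_impact(baseline_cost, candidate_cost, active_users)
print(f"Monthly gross profit impact: {impact:+.1f} USD")
print(f"Baseline margin:  {gross_margin_pct(price_per_user, baseline_cost):.1f}%")
print(f"Candidate margin: {gross_margin_pct(price_per_user, candidate_cost):.1f}%")
```

Because the break-even price equals the cost per user per month, any price above $1.8216 keeps both models at a positive gross margin under these assumptions.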
Component Breakdown (USD/user/month)
Each component is computed independently, then summed for both models.
| Component | Description | Baseline | Candidate | Delta |
|---|---|---|---|---|
| Generation | Model input/output token spend for requests. | $0.6345 | $0.675 | +$0.0405 |
| Retrieval | Extra model input spend from retrieved context chunks. | $0.225 | $0.27 | +$0.045 |
| Reranking | Reranker cost based on docs scored per request. | $1.62 | $1.62 | $0 |
| Embeddings Ingestion | Amortized per-user share of the fixed monthly corpus embedding refresh cost. | $0 | $0 | $0 |
| Vector DB | Vector database query cost across all requests. | $0.0014 | $0.0014 | $0 |
| Cache | Savings from cache hits. Negative means lower total cost. | -$0.7551 | -$0.7807 | -$0.0257 |
| Infra | Non-model infra overhead per request. | $0.036 | $0.036 | $0 |
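As a quick check that the components add up to the totals reported earlier, the table values can be summed directly. This is a small sketch using only the figures shown above; the dictionary keys are illustrative names, not identifiers from the calculator.

```python
# (baseline, candidate) in USD per user per month, copied from the table.
components = {
    "generation": (0.6345, 0.675),
    "retrieval": (0.225, 0.27),
    "reranking": (1.62, 1.62),
    "embeddings_ingestion": (0.0, 0.0),
    "vector_db": (0.0014, 0.0014),
    "cache": (-0.7551, -0.7807),  # savings, so negative
    "infra": (0.036, 0.036),
}

baseline_total = sum(b for b, _ in components.values())
candidate_total = sum(c for _, c in components.values())

print(f"Baseline total:  ${baseline_total:.4f}")
print(f"Candidate total: ${candidate_total:.4f}")
print(f"Delta:           ${candidate_total - baseline_total:+.4f}")
```

The candidate sum may differ from the reported $1.8216 in the last digit, since the displayed components are already rounded.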
Sensitivity Ranking
Baseline sensitivity: cost change when one variable is increased by 10%.
| Variable | Description | Delta cost % |
|---|---|---|
| Requests Per User Month | User activity level per month. | 10.0% |
| Rerank Docs | Docs reranked per request. | 6.4% |
| Cache Hit Rate | Fraction of requests served by cache. | -4.3% |
| Output Tokens | Generated tokens per request. | 1.7% |
| Retrieved Chunks | Retrieved chunk count per request. | 0.9% |
| Tokens Per Chunk | Average chunk size in tokens. | 0.9% |
| Input Tokens | Prompt-side tokens per request. | 0.8% |
| Vector Queries Per Request | Vector query count per request. | 0.0% |
| Monthly Active Users | Active-user estimate used to amortize fixed monthly embedding refresh. | -0.0% |
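A one-at-a-time sensitivity ranking of this kind can be sketched as follows: bump each variable by 10%, recompute the cost, and sort by the size of the relative change. The cost model below is a simplified stand-in, and the usage values are illustrative placeholders rather than the calculator's actual inputs; only the token and rerank prices come from the baseline pricing snapshot on this page.

```python
from typing import Callable, Dict

Params = Dict[str, float]

# Baseline prices from the pricing snapshot on this page.
INPUT_PRICE_PER_M = 2.5
OUTPUT_PRICE_PER_M = 15.0
RERANK_PRICE_PER_K = 1.0


def cost_per_user_month(p: Params) -> float:
    """Simplified, hypothetical cost model (ignores vector DB, infra, embeddings)."""
    generation = (p["input_tokens"] * INPUT_PRICE_PER_M
                  + p["output_tokens"] * OUTPUT_PRICE_PER_M) / 1e6
    retrieval = (p["retrieved_chunks"] * p["tokens_per_chunk"]
                 * INPUT_PRICE_PER_M / 1e6)
    reranking = p["rerank_docs"] * RERANK_PRICE_PER_K / 1e3
    per_request = generation + retrieval + reranking
    # Assumption: a cache hit avoids the full per-request cost.
    return p["requests_per_user_month"] * per_request * (1 - p["cache_hit_rate"])


def sensitivity(cost_fn: Callable[[Params], float], params: Params,
                bump: float = 0.10) -> Dict[str, float]:
    """Percent change in cost when each variable alone is raised by `bump`."""
    base = cost_fn(params)
    changes = {}
    for name, value in params.items():
        bumped = dict(params)
        bumped[name] = value * (1 + bump)
        changes[name] = (cost_fn(bumped) / base - 1) * 100
    # Largest absolute effect first.
    return dict(sorted(changes.items(), key=lambda kv: abs(kv[1]), reverse=True))


# Illustrative placeholder inputs, not values taken from the calculator.
params = {
    "requests_per_user_month": 90,
    "input_tokens": 900,
    "output_tokens": 320,
    "retrieved_chunks": 5,
    "tokens_per_chunk": 200,
    "rerank_docs": 18,
    "cache_hit_rate": 0.3,
}

ranked = sensitivity(cost_per_user_month, params)
for name, pct in ranked.items():
    print(f"{name:26s} {pct:+.1f}%")
```

In a model where every component scales with request volume, a 10% rise in requests per user raises cost by 10%, while a higher cache hit rate lowers it, which mirrors the signs in the table.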
Assumptions and Units
- Currency: USD
- Token unit: token
- Pricing snapshot: 2026-04-09
- Baseline model: OpenAI/GPT-5.4
- Candidate model: Anthropic/Claude Sonnet 4.6
- Comparison rule: Usage, retrieval, and fixed monthly terms stay shared across both runs
Recommended Next Step
Use this section to turn your model switch results into the next infrastructure or routing checks. If retrieval or hosting is driving cost, check infra options first, then come back to re-check the switch math.
Sources and Snapshot
Pricing comes from the current dated snapshot.
Active Pricing Rows
Baseline
OpenAI / GPT-5.4
- Input tokens: $2.5 / 1M
- Output tokens: $15 / 1M
Candidate
Anthropic / Claude Sonnet 4.6
- Input tokens: $3 / 1M
- Output tokens: $15 / 1M
Shared retrieval defaults
- Embedding input: $0.02 / 1M
- Rerank docs: $1 / 1K
- Snapshot date: 2026-04-09
- Source links and update notes: Pricing Snapshot Reference
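To see how these per-million rates turn into a per-request generation cost, here is a minimal sketch. The prices are the ones listed above; the token counts in the usage lines are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# USD per 1M tokens, from the pricing rows above.
PRICES_PER_M = {
    "OpenAI/GPT-5.4": {"input": 2.5, "output": 15.0},
    "Anthropic/Claude Sonnet 4.6": {"input": 3.0, "output": 15.0},
}


def generation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Token spend in USD for a single request."""
    price = PRICES_PER_M[model]
    return (input_tokens * price["input"]
            + output_tokens * price["output"]) / 1_000_000


# Hypothetical request size: 1,000 input tokens and 500 output tokens.
baseline = generation_cost("OpenAI/GPT-5.4", 1000, 500)
candidate = generation_cost("Anthropic/Claude Sonnet 4.6", 1000, 500)
print(f"Baseline:  ${baseline:.4f} per request")
print(f"Candidate: ${candidate:.4f} per request")
```

Since both models share the same output price in this snapshot, the difference between them comes entirely from input tokens, which is why the Generation and Retrieval components are the only ones that move in the breakdown.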