Compare Model Costs

What happens to cost and margin if I switch models?

How this tool works

This tool runs the same deterministic workflow economics model twice, once for the baseline and once for the candidate, then reports the cost and margin deltas so you can decide whether an agent, copilot, or assistant-model migration is financially justified.

How It Works

  1. Select baseline and candidate model rows from the pricing snapshot.
  2. Keep usage assumptions identical across both runs.
  3. Compare cost per user, gross margin, and break-even price deltas.
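
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's implementation: the class and function names are hypothetical, the prices are taken from the Active Pricing Rows section, and the usage values (90 requests, 900 input tokens, 320 output tokens) are assumptions chosen so the output lines up with the Generation row of the component breakdown. Only the generation component is modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceRow:
    name: str
    input_per_1m: float   # USD per 1M input tokens
    output_per_1m: float  # USD per 1M output tokens


@dataclass(frozen=True)
class Usage:
    # Hypothetical usage assumptions, shared by both runs (step 2).
    requests_per_user_month: float
    input_tokens: float
    output_tokens: float


def generation_cost_per_user(row: PriceRow, usage: Usage) -> float:
    """Token spend per user per month for one pricing row."""
    per_request = (
        usage.input_tokens * row.input_per_1m
        + usage.output_tokens * row.output_per_1m
    ) / 1_000_000
    return per_request * usage.requests_per_user_month


# Step 1: pick the two pricing rows from the snapshot.
baseline = PriceRow("OpenAI/GPT-5.4", 2.5, 15.0)
candidate = PriceRow("Anthropic/Claude Sonnet 4.6", 3.0, 15.0)

# Step 2: one usage object, reused unchanged for both runs.
usage = Usage(requests_per_user_month=90, input_tokens=900, output_tokens=320)

# Step 3: run the same function twice and compare.
cost_baseline = generation_cost_per_user(baseline, usage)
cost_candidate = generation_cost_per_user(candidate, usage)
delta = cost_candidate - cost_baseline

print(f"baseline  ${cost_baseline:.4f}")
print(f"candidate ${cost_candidate:.4f}")
print(f"delta     {delta:+.4f}")
```

Because the usage object is shared, the whole delta comes from the price difference between the two rows.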

Formula

margin_delta_pct = margin_candidate_pct - margin_baseline_pct

cost_delta_per_user = cost_candidate - cost_baseline
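
As a worked example of the two formulas, here is a minimal Python sketch that plugs in the figures reported in the Totals section of this page. The function names mirror the formula names and are illustrative only.

```python
def cost_delta_per_user(cost_baseline: float, cost_candidate: float) -> float:
    """Candidate cost minus baseline cost, in USD per user per month."""
    return cost_candidate - cost_baseline


def margin_delta_pct(margin_baseline_pct: float, margin_candidate_pct: float) -> float:
    """Candidate margin minus baseline margin, in percentage points."""
    return margin_candidate_pct - margin_baseline_pct


# Figures as displayed in the Totals section.
print(f"{cost_delta_per_user(1.7618, 1.8216):+.4f}")  # USD per user per month
print(f"{margin_delta_pct(95.5, 95.3):+.1f}")         # percentage points
```

A positive cost delta means the candidate is more expensive; a negative margin delta means the switch reduces gross margin.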

Assumptions and Units

  • Currency: USD
  • Token unit: token
  • Both runs share the same usage and retrieval assumptions
  • Pricing source: daily pricing snapshot in repo, no runtime scraping

Related resources

AI Workflow Cost, Break-even Price, Retrieval Cost, Rerank Cost, Cache Savings, Prompt Overhead, RAG or Long Prompt, Indexing Cost, Codebase Copilot Cost per Developer, AI Analyst Workflow Cost per Analyst, Model Selection: Quality vs Unit Cost, How Model Switching Changes Margin.

Pricing snapshot: 2026-04-09
Baseline: OpenAI/GPT-5.4
Candidate: Anthropic/Claude Sonnet 4.6

Step 1 Baseline and Candidate Models

Set the current model as baseline and the proposed model as candidate.

Step 2 Quick Mode

Set shared usage and pricing assumptions before tuning advanced variables.

Steady usage, moderate context, and balanced quality-cost tradeoffs.

Suggested defaults are transparent and editable.

Step 3 Advanced Assumptions

Tune retrieval, reranking, embeddings, vector, caching, and infra.

Scenario actions

Save and track this scenario

Track pricing drift on this scenario and get an email if the latest result changes.

How tracking works

After you click Save and track, we carry this exact calculator state into the tracked-scenarios page so you can sign in and confirm the save.

We save your assumptions and the pricing snapshot used for this result.

When a newer pricing snapshot lands, we recompute the same scenario, show what changed, and email you if the latest result moved.

1 tracked scenario free, then $12/mo or $120/yr for up to 25 tracked scenarios.

Decision Signal

Margin impact is small

Switching to Claude Sonnet 4.6 changes gross margin by -0.2 percentage points and cost per user per month by +$0.0598.

Baseline cost / user / month

$1.7618

Candidate cost / user / month

$1.8216

Margin delta

-0.2%

Monthly cost impact

+$38.9

Top Cost Drivers

Largest baseline sensitivity shifts when each variable is increased by 10%.

  • Requests Per User Month: 10.0%
  • Rerank Docs: 6.4%
  • Cache Hit Rate: -4.3%

Totals

Baseline vs candidate totals and deltas under the same usage assumptions.

Monthly gross profit impact uses 650 active users.

Metric                        Baseline   Candidate   Delta
Cost per request              $0.01958   $0.02024    +$0.00067
Cost per user/month           $1.7618    $1.8216     +$0.0598
Gross margin %                95.5%      95.3%       -0.2%
Break-even price              $1.7618    $1.8216     +$0.0598
Monthly gross profit impact                          -$38.9
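
The Totals figures can be cross-checked with a little arithmetic. The sketch below rests on assumptions the page does not state: that the monthly gross profit impact is the negative cost delta multiplied by the 650 active users (price and user count held fixed), and that gross margin is (price - cost) / price. The $39 price is hypothetical; it was picked only because it reproduces the displayed margins.

```python
ACTIVE_USERS = 650           # stated in the Totals section
PRICE_PER_USER_MONTH = 39.0  # hypothetical; not stated on the page

cost_baseline = 1.7618   # USD per user per month, as displayed
cost_candidate = 1.8216  # USD per user per month, as displayed


def margin_pct(cost_per_user: float, price: float = PRICE_PER_USER_MONTH) -> float:
    """Gross margin as a percentage of price (assumed definition)."""
    return (price - cost_per_user) / price * 100


# Higher cost at a fixed price lowers gross profit by the same amount.
cost_delta = cost_candidate - cost_baseline
monthly_gross_profit_impact = -cost_delta * ACTIVE_USERS

print(f"monthly impact   {monthly_gross_profit_impact:+.1f} USD")
print(f"baseline margin  {margin_pct(cost_baseline):.1f}%")
print(f"candidate margin {margin_pct(cost_candidate):.1f}%")
```

The break-even price row needs no extra calculation: in the displayed figures it equals the cost per user per month for each model.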

Component Breakdown (USD/user/month)

Each component is computed independently then summed for both models.

Component              Baseline   Candidate   Delta      Description
Generation             $0.6345    $0.675      +$0.0405   Model input/output token spend for requests.
Retrieval              $0.225     $0.27       +$0.045    Extra model input spend from retrieved context chunks.
Reranking              $1.62      $1.62       $0         Reranker cost based on docs scored per request.
Embeddings Ingestion   $0         $0          $0         Amortized per-user share of the fixed monthly corpus embedding refresh cost.
Vector DB              $0.0014    $0.0014     $0         Vector database query cost across all requests.
Cache                  -$0.7551   -$0.7807    -$0.0257   Savings from cache hits. Negative means lower total cost.
Infra                  $0.036     $0.036      $0         Non-model infra overhead per request.
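
To check that the components add up to the cost per user shown in Totals, the short Python sketch below sums the displayed values for each model. The dictionary keys are illustrative names, and because the inputs are already rounded to four decimals, a sum can differ from the displayed total in the last digit.

```python
# Component values in USD per user per month, as displayed above.
baseline = {
    "generation": 0.6345,
    "retrieval": 0.225,
    "reranking": 1.62,
    "embeddings_ingestion": 0.0,
    "vector_db": 0.0014,
    "cache": -0.7551,  # savings, so it enters the sum as a negative
    "infra": 0.036,
}
candidate = {
    "generation": 0.675,
    "retrieval": 0.27,
    "reranking": 1.62,
    "embeddings_ingestion": 0.0,
    "vector_db": 0.0014,
    "cache": -0.7807,
    "infra": 0.036,
}

total_baseline = sum(baseline.values())
total_candidate = sum(candidate.values())

# Per-component deltas show where the switch changes cost.
deltas = {name: candidate[name] - baseline[name] for name in baseline}

print(f"baseline total  ${total_baseline:.4f}")
print(f"candidate total ${total_candidate:.4f}")
for name, delta in deltas.items():
    print(f"{name:22s} {delta:+.4f}")
```

Only generation, retrieval, and cache move between the two models; reranking, embeddings, vector database, and infra are shared terms and contribute a zero delta.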

Sensitivity Ranking

Baseline sensitivity: cost change when one variable is increased by 10%.

Variable                     Delta cost %   Description
Requests Per User Month      10.0%          User activity level per month.
Rerank Docs                  6.4%           Docs reranked per request.
Cache Hit Rate               -4.3%          Fraction of requests served by cache.
Output Tokens                1.7%           Generated tokens per request.
Retrieved Chunks             0.9%           Retrieved chunk count per request.
Tokens Per Chunk             0.9%           Average chunk size in tokens.
Input Tokens                 0.8%           Prompt-side tokens per request.
Vector Queries Per Request   0.0%           Vector query count per request.
Monthly Active Users         -0.0%          Active-user estimate used to amortize fixed monthly embedding refresh.
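
A one-at-a-time sensitivity ranking like the one above can be sketched as follows. Everything in this example is an assumption: the input values are hypothetical figures back-solved from the displayed baseline results (not the tool's actual inputs), the cost function is a simplified stand-in that omits vector database and embedding costs, and the cache is modeled as a flat (1 - hit rate) multiplier on all per-request spend.

```python
# Hypothetical baseline inputs, back-solved from the displayed results.
BASE = {
    "requests_per_user_month": 90,
    "input_tokens": 900,
    "output_tokens": 320,
    "retrieved_chunks": 5,
    "tokens_per_chunk": 200,
    "rerank_docs": 18,
    "cache_hit_rate": 0.30,
}

INPUT_PRICE = 2.5 / 1_000_000   # USD per input token (baseline pricing row)
OUTPUT_PRICE = 15 / 1_000_000   # USD per output token (baseline pricing row)
RERANK_PRICE = 1 / 1_000        # USD per reranked doc (shared retrieval default)
INFRA_PER_REQUEST = 0.0004      # hypothetical infra overhead per request


def cost_per_user(v: dict) -> float:
    """Simplified monthly cost per user; assumed model, not the tool's."""
    per_request = (
        v["input_tokens"] * INPUT_PRICE
        + v["output_tokens"] * OUTPUT_PRICE
        + v["retrieved_chunks"] * v["tokens_per_chunk"] * INPUT_PRICE
        + v["rerank_docs"] * RERANK_PRICE
        + INFRA_PER_REQUEST
    )
    return v["requests_per_user_month"] * per_request * (1 - v["cache_hit_rate"])


def sensitivity_pct(variable: str, bump: float = 0.10) -> float:
    """Percent change in cost when one variable rises by `bump`, others fixed."""
    bumped = dict(BASE, **{variable: BASE[variable] * (1 + bump)})
    return (cost_per_user(bumped) / cost_per_user(BASE) - 1) * 100


# Rank variables by the absolute size of their effect on cost.
ranking = sorted(
    ((name, sensitivity_pct(name)) for name in BASE),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, pct in ranking:
    print(f"{name:26s} {pct:+.1f}%")
```

Under these assumptions the top three entries line up with the table: requests per user scale cost one-for-one, reranked docs come next because reranking is the largest component, and a higher cache hit rate reduces cost.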

Assumptions and Units

  • Currency: USD
  • Token unit: token
  • Pricing snapshot: 2026-04-09
  • Baseline model: OpenAI/GPT-5.4
  • Candidate model: Anthropic/Claude Sonnet 4.6
  • Comparison rule: Usage, retrieval, and fixed monthly terms stay shared across both runs

Recommended Next Step

Use this section to turn your model switch results into the next infrastructure or routing checks.

If retrieval or hosting is driving cost, check infra options first, then come back to re-check the switch math.

Sources and Snapshot

Pricing comes from the current dated snapshot.

Active Pricing Rows

Baseline

OpenAI / GPT-5.4

  • Input tokens: $2.5 / 1M
  • Output tokens: $15 / 1M

Candidate

Anthropic / Claude Sonnet 4.6

  • Input tokens: $3 / 1M
  • Output tokens: $15 / 1M

Shared retrieval defaults

  • Embedding input: $0.02 / 1M
  • Rerank docs: $1 / 1K