How Model Switching Changes Margin

Evaluate model migrations in margin terms, not just by benchmark gains, especially when a single workflow calls the model repeatedly.

Question

How do I quantify margin impact when switching from model A to model B?

Quick answer

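Formula: margin_delta_pct = margin_candidate_pct - margin_baseline_pct (a difference of two percentages, so read it in percentage points)
Formula: cost_delta_per_user = cost_candidate - cost_baseline (negative means the candidate is cheaper)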

Assumption: keep usage, retrieval, and caching assumptions identical across both runs.
Assumption: price both models from the same pricing snapshot.
Assumption: evaluate quality risk separately from pure cost/margin math.

Example: if baseline margin is 74.0% and candidate margin is 79.5%, margin delta is +5.5 pts.
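
The two formulas are simple enough to check by hand, but a small Python sketch makes the sign conventions explicit. The function names mirror the formula names above; the cost figures in the second call are made-up values for illustration only.

```python
def margin_delta_pct(margin_baseline_pct: float, margin_candidate_pct: float) -> float:
    """Candidate margin minus baseline margin, in percentage points."""
    return margin_candidate_pct - margin_baseline_pct


def cost_delta_per_user(cost_baseline: float, cost_candidate: float) -> float:
    """Candidate cost minus baseline cost; negative means the candidate is cheaper."""
    return cost_candidate - cost_baseline


# The example from the text: 74.0% baseline, 79.5% candidate.
delta = margin_delta_pct(74.0, 79.5)
print(f"{delta:+.1f} pts")  # prints "+5.5 pts"

# Hypothetical monthly costs per user, for illustration only.
print(cost_delta_per_user(cost_baseline=5.00, cost_candidate=3.50))
```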

Comparison Method

Keep request volume and token profile constant.
Run model A assumptions, then model B assumptions.
Compare monthly cost delta per active user.
Convert cost delta into margin points at your current price.
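
The four steps above can be sketched as follows. This is a minimal illustration, not a definitive model: it assumes cost is driven only by input and output tokens, and that margin is defined as (price - cost) / price. The Usage and ModelPrice types, the token prices, and the plan price are all hypothetical and do not come from any vendor's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Request volume and token profile, held constant across both runs."""
    requests_per_user_month: int
    input_tokens_per_request: int
    output_tokens_per_request: int


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical per-token pricing, in USD per million tokens."""
    usd_per_million_input: float
    usd_per_million_output: float


def monthly_cost_per_user(usage: Usage, price: ModelPrice) -> float:
    per_request = (
        usage.input_tokens_per_request * price.usd_per_million_input
        + usage.output_tokens_per_request * price.usd_per_million_output
    ) / 1_000_000
    return usage.requests_per_user_month * per_request


def margin_pct(price_per_user: float, cost_per_user: float) -> float:
    # Assumed definition: margin as a share of the price charged to the user.
    return (price_per_user - cost_per_user) / price_per_user * 100.0


usage = Usage(requests_per_user_month=400,
              input_tokens_per_request=2000,
              output_tokens_per_request=500)
model_a = ModelPrice(usd_per_million_input=2.00, usd_per_million_output=8.00)
model_b = ModelPrice(usd_per_million_input=0.50, usd_per_million_output=2.00)
plan_price = 20.0  # hypothetical monthly price per active user

cost_a = monthly_cost_per_user(usage, model_a)
cost_b = monthly_cost_per_user(usage, model_b)
cost_delta = cost_b - cost_a
margin_delta_pts = margin_pct(plan_price, cost_b) - margin_pct(plan_price, cost_a)

print(f"cost delta per user: {cost_delta:+.2f} USD/month")
print(f"margin delta: {margin_delta_pts:+.1f} pts")
```

Because the same usage object feeds both runs, any difference in the result comes from pricing alone, which is the point of holding volume and token profile constant.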

Common Switching Mistake

A cheaper candidate model can still hurt margin if lower answer quality increases retries, fallback traffic, or human escalation. Count the unit-cost savings only after you account for any extra workflow steps the lower-cost model triggers in production.
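
One way to see this effect is to compute an expected cost per task that includes retries and escalation. The sketch below assumes a deliberately simple model: each attempt succeeds independently with a fixed probability, the workflow retries up to a fixed number of attempts, and a task that fails every attempt goes to a human at a flat cost. The function name and every number are hypothetical.

```python
def effective_cost_per_task(unit_cost: float,
                            success_rate: float,
                            max_attempts: int,
                            escalation_cost: float) -> float:
    """Expected cost of one task, including retries and human escalation."""
    fail = 1.0 - success_rate
    # Attempt k+1 happens only if the previous k attempts all failed.
    expected_attempts = sum(fail ** k for k in range(max_attempts))
    # Escalation happens only if every allowed attempt failed.
    p_escalate = fail ** max_attempts
    return unit_cost * expected_attempts + p_escalate * escalation_cost


# Hypothetical figures: the candidate is 40% cheaper per call but succeeds less often.
baseline = effective_cost_per_task(unit_cost=0.010, success_rate=0.95,
                                   max_attempts=2, escalation_cost=0.50)
candidate = effective_cost_per_task(unit_cost=0.006, success_rate=0.80,
                                    max_attempts=2, escalation_cost=0.50)

print(f"baseline:  {baseline:.5f} USD per task")
print(f"candidate: {candidate:.5f} USD per task")
```

Under these made-up numbers the cheaper model ends up costing more per completed task, because escalations dominate the saved unit cost.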

When the Stronger Model Is the Wrong Default

A stronger model can reduce retries and still be too expensive as the default route. Use it where one failed pass is costly, then route routine work to the cheapest model that preserves the needed success rate.
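
A rough way to reason about routing is a blended cost per task, weighted by the share of traffic sent to the stronger model. This sketch assumes the per-task costs already include any retries or escalation; the function name and the figures are hypothetical.

```python
def blended_cost_per_task(strong_share: float,
                          strong_cost: float,
                          cheap_cost: float) -> float:
    """Average cost per task when strong_share of traffic uses the stronger model."""
    if not 0.0 <= strong_share <= 1.0:
        raise ValueError("strong_share must be between 0 and 1")
    return strong_share * strong_cost + (1.0 - strong_share) * cheap_cost


# Hypothetical per-task costs in USD.
all_strong = blended_cost_per_task(1.0, strong_cost=0.030, cheap_cost=0.005)
routed = blended_cost_per_task(0.2, strong_cost=0.030, cheap_cost=0.005)

print(f"strong model as default: {all_strong:.3f} USD per task")
print(f"20% routed to strong:    {routed:.3f} USD per task")
```

The blend only holds if the cheaper model preserves the needed success rate on the work routed to it, so the share is something to measure rather than assume.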

The margin question is whether the quality gain offsets higher unit price, cap burn, or extra workflow steps.