Prompt Overhead

What does extra prompt and context text cost me?

What this tool does

This estimator compares a baseline and a target assumption for non-task prompt tokens under shared workload inputs, then reports the cost and break-even deltas from trimming prompt overhead in an agent or assistant flow. Bigger context windows do not make repeated prompt overhead free.

How It Works

  1. Set provider/model plus workload assumptions.
  2. Set baseline non-task prompt tokens and target trimmed value.
  3. Review cost delta, overhead-share shift, and monthly savings.

Formula

baseline_input_tokens = task_tokens + baseline_non_task_tokens

candidate_input_tokens = task_tokens + target_non_task_tokens

cost_delta = cost_candidate - cost_baseline
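These definitions can be checked with a short sketch. The token counts below (580 task tokens, 420 baseline and 180 target non-task tokens) are illustrative values chosen to be consistent with the headline figures further down the page (a -240 token delta and a 42.0% to 23.7% overhead-share shift); the $0.25 / 1M input price comes from the pricing snapshot.

```python
PRICE_INPUT_PER_M = 0.25  # USD per 1M input tokens (snapshot: OpenAI / GPT-5 Mini)

def input_cost(task_tokens: int, non_task_tokens: int) -> float:
    """Input-side model spend per request, in USD."""
    return (task_tokens + non_task_tokens) * PRICE_INPUT_PER_M / 1_000_000

def overhead_share(task_tokens: int, non_task_tokens: int) -> float:
    """Fraction of input tokens that are non-task prompt overhead."""
    return non_task_tokens / (task_tokens + non_task_tokens)

TASK, BASELINE_OVERHEAD, TARGET_OVERHEAD = 580, 420, 180  # illustrative values

token_delta = TARGET_OVERHEAD - BASELINE_OVERHEAD        # -240 tokens/request
cost_delta = input_cost(TASK, TARGET_OVERHEAD) - input_cost(TASK, BASELINE_OVERHEAD)
share_before = overhead_share(TASK, BASELINE_OVERHEAD)   # 0.420
share_after = overhead_share(TASK, TARGET_OVERHEAD)      # ~0.237
```

Note this is the input side only; the full estimator also nets out cache hits and other components, so its per-request delta is smaller than the raw token-price product.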

Assumptions and Units

  • Currency: USD
  • Token unit: token
  • Task tokens stay constant between baseline and candidate runs
  • Pricing source: daily pricing snapshot in repo, no runtime scraping

Related resources: Retrieval Cost, Rerank Cost, Cache Savings, RAG or Long Prompt, Indexing Cost, Context Bloat in RAG, How Many Tokens Per Request?, How To Choose Chunk Size and Chunk Count, RAG Cost Components Explained.

Pricing snapshot: 2026-04-20
Provider: OpenAI
Model: GPT-5 Mini

Step 1 Provider and Model

Select the pricing row used for baseline and candidate prompt-overhead scenarios.

Step 2 Quick Mode

Set baseline overhead assumptions before advanced tuning.

Measure overhead from repeated policies, role prompts, and long history blocks.

Step 3 Advanced Assumptions

Tune retrieval and infra assumptions after Quick Mode is calibrated.

Scenario actions

Copy scenario URL: paste the link into ChatGPT or Claude, or share it with a teammate.

Save and track this scenario

Track pricing drift on this scenario and get an email if the latest result changes.

How tracking works

After you click Save and track, we carry this exact calculator state into the tracked-scenarios page so you can sign in and confirm the save.

We save your assumptions and the pricing snapshot used for this result.

When a newer pricing snapshot lands, we recompute the same scenario, show what changed, and email you if the latest result moved.

1 tracked scenario free, then $12/mo or $120/yr for up to 25 tracked scenarios.
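The recompute-and-compare behavior described above can be sketched as follows. The scenario and snapshot field names here are hypothetical, not the tool's actual schema, and the cost function is simplified to the input side only.

```python
def monthly_input_cost(scenario: dict, snapshot: dict) -> float:
    """Monthly input-side spend for one user under a given pricing snapshot."""
    tokens = scenario["task_tokens"] + scenario["non_task_tokens"]
    per_request = tokens * snapshot["input_usd_per_m"] / 1_000_000
    return per_request * scenario["requests_per_user_month"]

def result_moved(scenario: dict, saved: dict, latest: dict, tol: float = 1e-9) -> bool:
    """Recompute the saved scenario under the latest snapshot; True if the result changed."""
    return abs(monthly_input_cost(scenario, latest) - monthly_input_cost(scenario, saved)) > tol

scenario = {"task_tokens": 580, "non_task_tokens": 420, "requests_per_user_month": 120}
saved = {"input_usd_per_m": 0.25}
latest = {"input_usd_per_m": 0.30}  # hypothetical newer snapshot
```

When `result_moved` returns True for a tracked scenario, that is the condition under which the email notification described above would fire.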

Headline metric

Target prompt overhead lowers cost: overhead share moves from 42.0% to 23.7%.

  • Cost delta per user/month: -$0.0054
  • Monthly savings: $3.51
  • Input token delta per request: -240
  • Break-even delta: -$0.0054

Totals

Baseline vs candidate totals under the same prompt-overhead assumptions.
Metric                 Baseline    Candidate   Delta
Cost per request       $0.01627    $0.01623    -$0.00004
Cost per user/month    $1.9526     $1.9472     -$0.0054
Gross margin %         96.0%       96.0%       +0.0%
Break-even price       $1.9526     $1.9472     -$0.0054
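The margin and break-even rows follow from the cost figures. Break-even price is simply the cost per user/month (the price at which gross margin hits zero); computing the 96.0% margin requires a selling price per user/month, which the page does not show, so the $48.80 below is a hypothetical value consistent with that margin.

```python
def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of the selling price."""
    return (price - cost) / price

PRICE_PER_USER_MONTH = 48.80   # hypothetical; not an input shown on this page
baseline_cost = 1.9526
candidate_cost = 1.9472

margin_baseline = gross_margin(PRICE_PER_USER_MONTH, baseline_cost)   # ~0.960
break_even = candidate_cost   # pricing below this makes the margin negative
```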

Component Breakdown

Baseline and candidate components are computed independently, then differenced.
  • Generation: model input/output token spend for requests.
  • Retrieval: extra model input spend from retrieved context chunks.
  • Reranking: reranker cost based on docs scored per request.
  • Embeddings Ingestion: amortized per-user share of the fixed monthly corpus embedding refresh cost.
  • Vector DB: vector database query cost across all requests.
  • Cache: savings from cache hits; negative means lower total cost.
  • Infra: non-model infra overhead per request.

Component             Baseline   Candidate   Delta
Generation            $0.114     $0.1068     -$0.0072
Retrieval             $0.0396    $0.0396     $0
Reranking             $2.40      $2.40       $0
Embeddings Ingestion  $0         $0          $0
Vector DB             $0.0018    $0.0018     $0
Cache                 -$0.6508   -$0.649     +$0.0018
Infra                 $0.048     $0.048      $0
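The "computed independently, then differenced" rule can be sketched with the component figures above; only the components touched by the prompt trim (generation and cache) change, and the per-component deltas sum to the headline per-user/month delta.

```python
baseline = {
    "generation": 0.114, "retrieval": 0.0396, "reranking": 2.40,
    "embeddings_ingestion": 0.0, "vector_db": 0.0018,
    "cache": -0.6508, "infra": 0.048,
}
# Only the components affected by the prompt trim differ; the rest carry over.
candidate = {**baseline, "generation": 0.1068, "cache": -0.649}

delta = {k: round(candidate[k] - baseline[k], 4) for k in baseline}
total_delta = round(sum(candidate.values()) - sum(baseline.values()), 4)  # -0.0054
```

The `total_delta` of -0.0054 matches the headline cost delta per user/month.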
Sensitivity Ranking

Delta in total cost if one variable increases by 10%.

  • Requests Per User Month (user activity level per month): 10.00%
  • Rerank Docs (docs reranked per request): 9.24%
  • Cache Hit Rate (fraction of requests served by cache): -3.33%
  • Output Tokens (generated tokens per request): 0.32%
  • Retrieved Chunks (retrieved chunk count per request): 0.15%
  • Tokens Per Chunk (average chunk size in tokens): 0.15%
  • Input Tokens (prompt-side tokens per request): 0.09%
  • Vector Queries Per Request (vector query count per request): 0.01%
  • Monthly Active Users (active-user estimate used to amortize the fixed monthly embedding refresh): -0.00%
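The ranking above can be reproduced in miniature by bumping one variable at a time and recomputing. The toy cost function below is an assumption (generation spend only, linear in requests and tokens), not the estimator's full model, and the workload numbers are illustrative.

```python
PRICE_IN, PRICE_OUT = 0.25 / 1e6, 2.0 / 1e6  # USD per token, from the snapshot

def toy_cost(v: dict) -> float:
    """Simplified monthly cost: generation spend only."""
    per_request = v["input_tokens"] * PRICE_IN + v["output_tokens"] * PRICE_OUT
    return v["requests_per_user_month"] * per_request

def sensitivity(base: dict, cost_fn, bump: float = 0.10) -> dict:
    """Percent change in total cost when each variable is raised by `bump`, ranked by magnitude."""
    c0 = cost_fn(base)
    deltas = {k: (cost_fn({**base, k: v * (1 + bump)}) - c0) / c0 * 100
              for k, v in base.items()}
    return dict(sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True))

base = {"requests_per_user_month": 120, "input_tokens": 1000, "output_tokens": 350}
ranked = sensitivity(base, toy_cost)
```

Because the toy cost scales linearly with request count, bumping requests moves total cost by exactly 10%, which is why that variable tops the real ranking as well.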

Assumptions and Units

Explicit assumptions keep this comparison reproducible.
  • Currency: USD
  • Token unit: token
  • Pricing snapshot: 2026-04-20
  • Selected model row: OpenAI / GPT-5 Mini
  • Comparison rule: task tokens stay fixed; only non-task prompt overhead changes
  • Volume basis: monthly savings and fixed monthly terms use monthly active users as the denominator

Recommended Next Step

Use the deltas above to decide which prompt and infrastructure checks to run next.

Confirm infrastructure options and the trim plan before applying prompt-template changes globally.

Sources and Snapshot

Pricing comes from the current dated snapshot.

Active Pricing Row

Candidate: OpenAI / GPT-5 Mini

  • Input tokens: $0.25 / 1M
  • Output tokens: $2 / 1M

Shared retrieval defaults

  • Embedding input: $0.02 / 1M
  • Rerank docs: $1 / 1K
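As a sanity check on the retrieval defaults, the $1 / 1K rerank price reproduces the $2.40 baseline Reranking component for a plausible workload. The request and doc counts below are illustrative values chosen to match that figure, not inputs read from the page.

```python
RERANK_USD_PER_K_DOCS = 1.0   # from the shared retrieval defaults
EMBED_USD_PER_M = 0.02        # embedding input price, per 1M tokens

# Illustrative workload: 120 requests/user/month, 20 docs reranked per request.
requests, docs_per_request = 120, 20
rerank_cost = requests * docs_per_request * RERANK_USD_PER_K_DOCS / 1_000
# 2,400 docs scored at $1 / 1K -> $2.40/user/month
```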