Indexing Cost

What does turning my docs into embeddings cost each month?

How this tool works

This calculator compares baseline and candidate monthly embedding refresh plans, then separates the fixed monthly refresh delta from the per-user economics you need for pricing and rollout decisions.

How It Works

  1. Choose the pricing row used for embedding refresh assumptions.
  2. Set baseline and candidate embedding tokens refreshed each month.
  3. Review fixed monthly cost, amortized per-user impact, and total cost share before changing indexing cadence.

Formula

embedding_ingestion_cost_monthly = embedding_ingestion_tokens_monthly * embedding_input_price_per_token

amortized_cost_per_user_month = embedding_ingestion_cost_monthly / monthly_active_users

Assumptions and Units

  • Currency: USD
  • Token unit: token
  • Embedding ingestion is treated as a fixed monthly refresh term
  • Pricing source: daily pricing snapshot in repo, no runtime scraping

Example Scenario

If monthly refresh scope drops from 1.2B to 400M tokens, compare the fixed monthly savings and the smaller per-user impact at the configured active-user count.

Related resources: Retrieval Cost, Rerank Cost, Cache Savings, Prompt Overhead, RAG or Long Prompt, RAG Cost Components Explained, How To Choose Chunk Size and Chunk Count, How Much Does an AI Agent Cost?.

Pricing snapshot: 2026-04-29Provider: OpenAIModel: GPT-5 Mini

Step 1 Provider and Model

Choose the pricing row used for monthly embedding refresh assumptions.

Step 2 Quick Mode

Set baseline and candidate monthly refresh scope before tuning the wider workflow.

Compare full refresh scope against narrower docs updates.

Optional Advanced assumptions

Tune the wider workflow only after the monthly refresh numbers look realistic.
Show advanced inputs

Scenario actions

Copy scenario URL

Paste into ChatGPT or Claude, or share with a teammate.

Save and track this scenario

Track pricing drift on this scenario and get an email if the latest result changes.

How tracking works

After you click Save and track, we carry this exact calculator state into the tracked-scenarios page so you can sign in and confirm the save.

We save your assumptions and the pricing snapshot used for this result.

When a newer pricing snapshot lands, we recompute the same scenario, show what changed, and email you if the latest result moved.

1 tracked scenario free, then $12/mo or $120/yr for up to 25 tracked scenarios.

Headline metric

Candidate refresh plan lowers fixed monthly cost

The candidate refresh plan changes fixed monthly embedding cost by -$8.00 and adds -$0.01 per active user at the current scale.

Baseline embedding cost / month

$12.00

Candidate embedding cost / month

$4.00

Amortized delta / user / month

-$0.01

Break-even delta / user / month

-$0.01

Totals

Compare the fixed monthly refresh term and its amortized per-user impact.
Embedding tokens / month
Baseline
600,000,000
Candidate
200,000,000
Delta
-400,000,000
Embedding cost / month
Baseline
$12.00
Candidate
$4.00
Delta
-$8.00
Embedding share of modeled cost
Baseline
2.3%
Candidate
0.8%
Delta
-1.5%
Amortized embedding cost / user / month
Baseline
$0.015
Candidate
$0.005
Delta
-$0.01
MetricBaselineCandidateDelta
Embedding tokens / month600,000,000200,000,000-400,000,000
Embedding cost / month$12.00$4.00-$8.00
Embedding share of modeled cost2.3%0.8%-1.5%
Amortized embedding cost / user / month$0.015$0.005-$0.01

Component Breakdown

Full workflow context helps show whether refresh is material or just noise.
GenerationModel input/output token spend for requests.
Baseline
$0.0435
Candidate
$0.0435
Delta
$0
RetrievalExtra model input spend from retrieved context chunks.
Baseline
$0.0198
Candidate
$0.0198
Delta
$0
RerankingReranker cost based on docs scored per request.
Baseline
$0.96
Candidate
$0.96
Delta
$0
Embeddings IngestionAmortized per-user share of the fixed monthly corpus embedding refresh cost.
Baseline
$0.015
Candidate
$0.005
Delta
-$0.01
Vector DbVector database query cost across all requests.
Baseline
$0.0009
Candidate
$0.0009
Delta
$0
CacheSavings from cache hits. Negative means lower total cost.
Baseline
$-0.3972
Candidate
$-0.3972
Delta
$0
InfraNon-model infra overhead per request.
Baseline
$0.021
Candidate
$0.021
Delta
$0
ComponentBaselineCandidateDelta
GenerationModel input/output token spend for requests.$0.0435$0.0435$0
RetrievalExtra model input spend from retrieved context chunks.$0.0198$0.0198$0
RerankingReranker cost based on docs scored per request.$0.96$0.96$0
Embeddings IngestionAmortized per-user share of the fixed monthly corpus embedding refresh cost.$0.015$0.005-$0.01
Vector DbVector database query cost across all requests.$0.0009$0.0009$0
CacheSavings from cache hits. Negative means lower total cost.$-0.3972$-0.3972$0
InfraNon-model infra overhead per request.$0.021$0.021$0
Sensitivity RankingBaseline sensitivity: cost change when one variable is increased by 10%.
VariableDelta cost %
Requests Per User MonthUser activity level per month.9.8%
Rerank DocsDocs reranked per request.9.0%
Cache Hit RateFraction of requests served by cache.-6.0%
Output TokensGenerated tokens per request.0.3%
Monthly Active UsersActive-user estimate used to amortize fixed monthly embedding refresh.-0.2%
Retrieved ChunksRetrieved chunk count per request.0.2%
Tokens Per ChunkAverage chunk size in tokens.0.2%
Input TokensPrompt-side tokens per request.0.1%
Vector Queries Per RequestVector query count per request.0.0%

Assumptions and Units

Explicit assumptions to keep refresh-cost planning reproducible and auditable.
  • CurrencyUSD
  • Token unittoken
  • Pricing snapshot2026-04-29
  • Selected model rowOpenAI / GPT-5 Mini
  • Fixed monthly termEmbedding refresh is modeled as a fixed monthly cost
  • AmortizationPer-user impact divides the fixed monthly refresh term by monthly active users

Recommended Next Step

Use this section to turn the refresh-cost estimate into an indexing plan.

If refresh cost is material, review infra and indexing constraints before widening your update cadence.

Sources and Snapshot

Pricing comes from the current dated snapshot.

Active Pricing Row

Active pricing row

OpenAI / GPT-5 Mini

  • Input tokens$0.25 / 1M
  • Output tokens$2 / 1M

Shared retrieval defaults

  • Embedding input$0.02 / 1M
  • Rerank docs$1 / 1K