Cost per request
$0.03109Decision Signal
BaselineCurrent gross margin under this sample scenario is 91.4%.
This sample assumes 500 active users, 80 requests/user/month, 600 prompt tokens, 250 output tokens, GPT-5 Mini, $29/user/month pricing, and optional retrieval layers still turned on.
High margin here is driven partly by the sample pricing assumption. Try your own price to validate realism.
Step 1 Provider and Model
Switch model assumptions using prices from the selected snapshot.Step 2 Quick Mode
Use plain-language assumptions first. Open Advanced assumptions only if needed.Baseline starts from the generic sample scenario before workflow-specific presets.
- Need help estimating inputs? Read token sizing and reranking basics.
Step 3 Advanced Assumptions
Tune retrieval, reranking, embeddings, vector, caching, and infra.Show advanced inputs
Only adjust these once your Quick Mode assumptions feel realistic.
Scenario Share URL
Share this link to load these exact assumptions. Pricing uses the published snapshot shown on the page.Paste into ChatGPT or Claude to discuss this scenario.
Cost / user / month
$2.487Gross margin
91.4%Estimated monthly AI cost
$1,243.52Estimated monthly gross profit
$13,256.48Top Cost Drivers
Most sensitive variables when each is moved up by 10%.Totals
Summary metrics for monthly unit economics and margin.Cost per user/month
$2.487Gross margin
91.4%Break-even price
$2.487| Cost per request | $0.03109 |
| Cost per user/month | $2.487 |
| Gross margin % | 91.4% |
| Break-even price | $2.487 |
Component Breakdown (USD/user/month)
Each cost component is computed independently and summed.Largest cost block: reranking, not generation.
Generation
$0.052Retrieval
$0.0144Reranking
$2.4Embeddings Ingestion
$0Vector Db
$0.0006Cache
$-0Infra
$0.02| GenerationModel input/output token spend for requests. | $0.052 |
| RetrievalExtra model input spend from retrieved context chunks. | $0.0144 |
| RerankingReranker cost based on docs scored per request. | $2.4 |
| Embeddings IngestionAmortized per-user share of the fixed monthly corpus embedding refresh cost. | $0 |
| Vector DbVector database query cost across all requests. | $0.0006 |
| CacheSavings from cache hits. Negative means lower total cost. | $-0 |
| InfraNon-model infra overhead per request. | $0.02 |
Sensitivity RankingChange in total cost when one variable is increased by 10%.
| Variable | Delta cost % |
|---|---|
| Requests Per User MonthUser activity level per month. | 10.0% |
| Rerank DocsDocs reranked per request. | 9.7% |
| Output TokensGenerated tokens per request. | 0.2% |
| Retrieved ChunksRetrieved chunk count per request. | 0.1% |
| Tokens Per ChunkAverage chunk size in tokens. | 0.1% |
| Input TokensPrompt-side tokens per request. | 0.0% |
| Vector Queries Per RequestVector query count per request. | 0.0% |
| Monthly Active UsersActive-user estimate used to amortize fixed monthly embedding refresh. | -0.0% |
| Cache Hit RateFraction of requests served by cache. | 0.0% |
Assumptions and Units
Explicit assumptions to keep outputs reproducible and auditable.- CurrencyUSD
- Token unittoken
- Pricing snapshot2026-03-15
- Selected model rowOpenAI/GPT-5 Mini
- Monthly users500 active users for business totals and fixed-term amortization
- Embedding refreshAmortized per user from the fixed monthly corpus refresh term
- Cache componentNegative value means cost savings
Recommended Next Step
Use these links to lower top cost drivers without guessing.Optimize the biggest modeled cost driver first. Compare infra only after model, retrieval, reranking, or context changes stop being the better lever.
Optimize cost drivers first
LLM Model Cost ComparisonReranking Cost CalculatorContext Window Cost CalculatorCompare infra providers
View Infra RecommendationsSources and Snapshot
Pricing comes from a daily snapshot generated by batch workflows.Active Pricing Row
Selected model
OpenAI / GPT-5 Mini
- Input tokens$0.25 / 1M
- Output tokens$2 / 1M
Shared retrieval defaults
- Embedding input$0.02 / 1M
- Rerank docs$1 / 1K
- Snapshot date: 2026-03-15
- Source links and update notes: Pricing Snapshot Reference
Continue Analysis
Move to the next tool or guide without losing your current scenario.Switch tools
- AI Break-even Price Calculator
- LLM Model Cost Comparison
- RAG Retrieval Cost Calculator
- Reranking Cost Calculator
- Cache Savings Simulator
- Context Window Cost Calculator
- RAG vs Long-Context Calculator
- Embedding Ingestion Cost Calculator
- Browse all tools
Read guides