Agent Run Cost
Agent run cost is the unit cost of one completed agent execution after planning calls, tool-call context, retries, retrieval, cache, and infrastructure assumptions are included.
Question
How do I calculate cost per AI agent run?
Quick answer
Formula: refresh_share_usd = refresh_monthly_usd / monthly_agent_runs
Formula: cost_per_agent_run = generation + retrieval + reranking + refresh_share + vector_db + cache + infra
Formula: margin_pct = (run_price_or_chargeback - cost_per_agent_run) / run_price_or_chargeback * 100
- Assumption: one completed run can contain multiple model requests.
- Assumption: tool outputs and retrieved context are measured as prompt-side tokens.
- Assumption: fixed refresh work is amortized across monthly agent runs.
Example: if generation, retrieval, reranking, refresh share, vector, cache, and infra sum to $0.18, one completed agent run costs $0.18 before any human approval or tool-use fee is added.
Scenario Table
| Scenario | What changes | Expected effect |
|---|---|---|
| More tool steps | Model requests per completed run increase | Cost rises even if each individual model call is cheap |
| Heavier context | Prompt tokens include more tool output, traces, retrieved chunks, or state | Input-side spend and cache sensitivity move faster |
| Better cache reuse | Repeated runs reuse prior context or response blocks | Per-run cost drops without changing model quality |
Where This Fits
Use Agent Run Cost when the decision unit is one completed agent execution. Use Coding Agent Cost per Task when accepted coding output and rework are the denominator. Use Break-even Price when you already know the blended user-month cost and need the required price.
Related reads
Open companion tools: Agent Run Cost, Coding Agent Cost per Task, AI Ticket Cost
Turn run cost into price: Break-even Price
Related reads: How To Price AI Agent Usage With Credits, Caps, and Margin, How Many Tokens Per Request?
Run the Agent Run Cost Tool
Model per-run cost from model calls, context, retrieval, cache, infra, and refresh assumptions.
Open Agent Run Cost