AI Coding Agent Cost

What does an AI coding agent cost per developer each month?

What this tool models

This tool models the full workflow cost behind an AI coding agent or coding assistant: generation, code retrieval, reranking, embedding refresh, vector lookups, cache savings, and infrastructure overhead, priced against a daily pricing snapshot. Use it when a coding workflow looks cheap in demos but turns expensive once longer prompts, repository context, and premium-model routing show up at team scale.

How It Works

  1. Pick the coding model and set the monthly request volume per active developer.
  2. Model repository context, reranking, vector lookups, and fixed embedding refresh as explicit cost blocks.
  3. Compare cost per developer, break-even price, and gross margin before you widen rollout or change routing.

Formulas

embedding_ingestion_share = embedding_ingestion_monthly / monthly_active_users

cost_per_developer_month = generation + retrieval + reranking + embedding_ingestion_share + vector_db + cache + infra

gross_margin_pct = (price_per_user_month - cost_per_developer_month) / price_per_user_month * 100
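These formulas can be sketched directly in Python. The component values below come from the worked scenario later on this page; the 220 active developers and $49/user price are inferred from the page's totals, not stated inputs:

```python
def cost_per_developer_month(generation, retrieval, reranking,
                             embedding_ingestion_monthly, monthly_active_users,
                             vector_db, cache, infra):
    # Fixed monthly embedding refresh is amortized across active developers
    embedding_share = embedding_ingestion_monthly / monthly_active_users
    # Cache is negative when savings reduce total cost
    return (generation + retrieval + reranking + embedding_share
            + vector_db + cache + infra)

def gross_margin_pct(price_per_user_month, cost):
    return (price_per_user_month - cost) / price_per_user_month * 100

cost = cost_per_developer_month(0.7854, 0.2541, 1.98, 0.0, 220,
                                0.002, -0.6756, 0.0495)
margin = gross_margin_pct(49.0, cost)  # ~95.1%
```

Note that the embedding share here is $0 because the scenario's fixed monthly refresh term is zero; with a non-zero refresh, the amortized share grows as the developer count shrinks.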

Assumptions and Units

  • Currency: USD
  • Token unit: token
  • Cache component is negative when savings reduce total cost
  • Embedding refresh is amortized across active developers before it appears in per-developer totals
  • Pricing source: daily pricing snapshot in repo, no runtime scraping

Example Scenario

Use this for Claude Code, Codex, Cursor, or a repo-aware internal coding assistant where one premium coding turn can be cheap in isolation but expensive after repeated follow-ups and repository context are included.

FAQs

What should I change first if margin looks weak? Usually request volume, prompt/context size, and premium-model share move faster than small list-price differences.

Is this only for internal tools? No. Use it for internal coding assistants or commercial developer products as long as the unit you care about is active developers.

Related resources: AI Workflow Cost, Break-even Price, Compare Model Costs, Retrieval Cost, Rerank Cost, Cache Savings, Prompt Overhead, RAG or Long Prompt, Indexing Cost, AI Coding Agent Cost per Developer, Internal AI Assistant Cost per Employee, Model Selection: Quality vs Unit Cost, How Model Switching Changes Margin, How Many Tokens Per Request?, RAG Cost Components Explained.

Use this when

Pricing snapshot: 2026-04-12
Provider: OpenAI
Model: GPT-5.3 Codex

Decision Signal

Healthy

Current gross margin is 95.1%. Use this signal to decide whether pricing and costs are ready before you scale usage.

Step 1 Provider and Model

Switch model assumptions using prices from the selected snapshot.

Step 2 Quick Mode

Use plain-language assumptions first. Open Advanced assumptions only if needed.
Starting point: Apply the generic baseline or a more conservative downside scenario. If neither is selected, you are working from custom inputs.
Quick workflow: Start with the closest workflow shape, then fine-tune the assumptions below.

Repo-aware coding workflows where active developer usage, context load, and model routing move unit cost fast.

Step 3 Advanced Assumptions

Tune retrieval, reranking, embeddings, vector, caching, and infra.
Show advanced inputs

Only adjust these once your Quick Mode assumptions feel realistic.

Scenario actions

Copy scenario URL

Paste into ChatGPT or Claude, or share with a teammate.

Save and track this scenario

Track pricing drift on this scenario and get an email if the latest result changes.

How tracking works

After you click Save and track, we carry this exact calculator state into the tracked-scenarios page so you can sign in and confirm the save.

We save your assumptions and the pricing snapshot used for this result.

When a newer pricing snapshot lands, we recompute the same scenario, show what changed, and email you if the latest result moved.

1 tracked scenario free, then $12/mo or $120/yr for up to 25 tracked scenarios.

Cost / user / month

$2.3954

Gross margin

95.1%

Estimated monthly AI cost

$526.98

Estimated monthly gross profit

$10,253.02
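These headline numbers scale linearly from the per-user figures. A minimal sketch, assuming 220 monthly active developers and a $49/user list price (both inferred from the totals above, not stated inputs):

```python
def business_totals(cost_per_user, price_per_user, monthly_active_users):
    # Scale per-user unit economics to team-level monthly totals
    monthly_cost = cost_per_user * monthly_active_users
    monthly_profit = (price_per_user - cost_per_user) * monthly_active_users
    return monthly_cost, monthly_profit

monthly_cost, monthly_profit = business_totals(2.3954, 49.0, 220)
# monthly_cost ≈ $527, monthly_profit ≈ $10,253
```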

Top Cost Drivers

Most sensitive variables when each is moved up by 10%.
  • Requests Per User Month: 10.0%
  • Rerank Docs: 6.4%
  • Cache Hit Rate: -2.8%

Totals

Summary metrics for monthly unit economics and margin.
  • Cost per request: $0.02178
  • Cost per user/month: $2.3954
  • Gross margin %: 95.1%
  • Break-even price: $2.3954

Component Breakdown (USD/user/month)

Each cost component is computed independently and summed.

Largest cost block: reranking, not generation.

  • Generation: Model input/output token spend for requests. $0.7854
  • Retrieval: Extra model input spend from retrieved context chunks. $0.2541
  • Reranking: Reranker cost based on docs scored per request. $1.98
  • Embeddings Ingestion: Amortized per-user share of the fixed monthly corpus embedding refresh cost. $0
  • Vector DB: Vector database query cost across all requests. $0.002
  • Cache: Savings from cache hits. Negative means lower total cost. -$0.6756
  • Infra: Non-model infra overhead per request. $0.0495
Sensitivity Ranking

Change in total cost when one variable is increased by 10%.

  • Requests Per User Month (user activity level per month): 10.0%
  • Rerank Docs (docs reranked per request): 6.4%
  • Cache Hit Rate (fraction of requests served by cache): -2.8%
  • Output Tokens (generated tokens per request): 1.8%
  • Retrieved Chunks (retrieved chunk count per request): 0.8%
  • Tokens Per Chunk (average chunk size in tokens): 0.8%
  • Input Tokens (prompt-side tokens per request): 0.8%
  • Vector Queries Per Request (vector query count per request): 0.0%
  • Monthly Active Users (active-user estimate used to amortize fixed monthly embedding refresh): -0.0%
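The ranking above can be reproduced with a one-at-a-time bump: re-cost the scenario with each input raised 10% and compare against the baseline. A minimal sketch with a toy two-input cost function (the coefficients are illustrative, not the calculator's internals):

```python
def sensitivity(base_inputs, cost_fn, bump=0.10):
    # Raise each input 10% in isolation; report % change in total cost
    base = cost_fn(base_inputs)
    deltas = {}
    for name, value in base_inputs.items():
        bumped = dict(base_inputs, **{name: value * (1 + bump)})
        deltas[name] = (cost_fn(bumped) - base) / base * 100
    return dict(sorted(deltas.items(), key=lambda kv: -abs(kv[1])))

def toy_cost(i):
    # Illustrative: fixed per-request model spend plus $0.001 per reranked doc
    return i["requests_per_user_month"] * (0.012 + 0.001 * i["rerank_docs"])

ranked = sensitivity({"requests_per_user_month": 110, "rerank_docs": 18},
                     toy_cost)
```

Requests scale the whole bill, so they always land at 10.0%; every other variable moves only its own cost share, which is why rerank docs sit lower despite being the largest component.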

Assumptions and Units

Explicit assumptions to keep outputs reproducible and auditable.
  • Currency: USD
  • Token unit: token
  • Pricing snapshot: 2026-04-12
  • Selected model row: OpenAI/GPT-5.3 Codex
  • Volume basis: Business totals and fixed monthly terms use monthly active users as the denominator
  • Embedding refresh: Amortized per user from the fixed monthly corpus refresh term
  • Cache component: Negative value means cost savings

Recommended Next Step

Use these links to lower top cost drivers without guessing.

Optimize the biggest modeled cost driver first. Compare infra only after model, retrieval, reranking, or context changes stop being the better lever.

Sources and Snapshot

Pricing comes from the current dated snapshot.

Active Pricing Row

Selected model

OpenAI / GPT-5.3 Codex

  • Input tokens: $1.75 / 1M
  • Output tokens: $14 / 1M

Shared retrieval defaults

  • Embedding input: $0.02 / 1M
  • Rerank docs: $1 / 1K
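A hedged sketch of how these pricing rows turn into a per-request cost; the 1,000 input and 280 output token counts are illustrative assumptions, not the calculator's hidden inputs:

```python
INPUT_PRICE = 1.75 / 1_000_000   # USD per input token (snapshot row above)
OUTPUT_PRICE = 14 / 1_000_000    # USD per output token
RERANK_PRICE = 1 / 1_000         # USD per reranked document (shared default)

def per_request_cost(input_tokens, output_tokens, rerank_docs):
    generation = input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE
    reranking = rerank_docs * RERANK_PRICE
    return generation + reranking

# Illustrative turn: 1,000 prompt tokens, 280 output tokens, 18 reranked docs
cost = per_request_cost(1_000, 280, 18)
```

Even at this per-request scale, reranking ($0.018 for 18 docs) outweighs the model token spend, which matches the component breakdown above where reranking, not generation, is the largest cost block.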