How To Choose Chunk Size and Chunk Count
Chunk size and chunk count directly affect retrieval token overhead and answer quality in RAG systems and agent workflows.
Question
How do chunk size and chunk count affect agent retrieval cost?
Quick answer
Formula: retrieval_tokens_per_request = retrieved_chunks * tokens_per_chunk
- Assumption: retrieved chunks are sent to the model as input, so more retrieval tokens mean higher input token spend per request.
- Validate any change to chunk size or chunk count against recall quality.
- Compare p50 and p90 retrieval patterns rather than relying on a single average.
Example: 6 chunks at 220 tokens add 1,320 retrieval tokens per request.
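The formula and the p50/p90 comparison can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names, the sample chunk counts, and the nearest-rank percentile method are all chosen for the example and do not come from the calculators.

```python
def retrieval_tokens_per_request(retrieved_chunks: int, tokens_per_chunk: int) -> int:
    """retrieval_tokens_per_request = retrieved_chunks * tokens_per_chunk."""
    return retrieved_chunks * tokens_per_chunk


def nearest_rank_percentile(values: list[int], pct: int) -> int:
    """Nearest-rank percentile of a non-empty list, with pct an integer in 1..100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, clamped to at least rank 1.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Worked example from the text: 6 chunks at 220 tokens each.
example_tokens = retrieval_tokens_per_request(6, 220)
print(example_tokens)  # 1320

# Hypothetical per-request chunk counts (illustrative, not measured data).
chunks_per_request = [3, 4, 4, 5, 5, 6, 6, 7, 8, 8]
p50_chunks = nearest_rank_percentile(chunks_per_request, 50)
p90_chunks = nearest_rank_percentile(chunks_per_request, 90)

# Token overhead at the median request and at the heavier tail.
p50_tokens = retrieval_tokens_per_request(p50_chunks, 220)
p90_tokens = retrieval_tokens_per_request(p90_chunks, 220)
print(p50_tokens, p90_tokens)
```

Looking at both percentiles shows how much heavier the tail requests are than the typical one, which a single average would hide.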
Starting Heuristics
- Chunk size: 150-350 tokens is a common baseline.
- Chunks per request: 3-8 is a common range for product workflows.
- Increase chunks only when recall failures are visible in evaluations.
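To see how much the baseline ranges matter, the following sketch tabulates retrieval tokens across a few chunk sizes and chunk counts taken from the heuristics. The price constant is a hypothetical placeholder, not a real model rate, and the function name is an assumption made for this example; substitute your own values.

```python
# Sample points inside the baseline ranges from the heuristics.
CHUNK_SIZES = (150, 250, 350)   # tokens per chunk
CHUNK_COUNTS = (3, 5, 8)        # chunks retrieved per request

# Hypothetical input price in USD per million tokens (assumed, for illustration only).
PRICE_PER_MILLION_INPUT_TOKENS = 1.00


def retrieval_cost_table(sizes, counts, price_per_million):
    """Return (chunk_size, chunk_count, retrieval_tokens, cost_usd) for each combination."""
    rows = []
    for size in sizes:
        for count in counts:
            tokens = size * count
            cost = tokens * price_per_million / 1_000_000
            rows.append((size, count, tokens, cost))
    return rows


table = retrieval_cost_table(CHUNK_SIZES, CHUNK_COUNTS, PRICE_PER_MILLION_INPUT_TOKENS)
for size, count, tokens, cost in table:
    print(f"{size:>3} tokens x {count} chunks = {tokens:>4} retrieval tokens, ~${cost:.6f} per request")
```

Across these ranges the overhead spans from 450 tokens (3 chunks of 150) to 2,800 tokens (8 chunks of 350) per request, which is why increases are worth gating on visible recall failures.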