ML//RAG//chunking strategy
- How you split documents for embedding and retrieval.
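Fixed-size chunks ignore semantic boundaries and can cut a sentence or paragraph in half. Recursive splitting (paragraphs → sentences) usually works better: split on the coarsest boundary first, and fall back to finer ones only when a piece is still too large.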
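A minimal sketch of recursive splitting, to make the "coarsest boundary first" idea concrete. The function name `recursive_split`, the `max_chars` limit, and the separator list are assumptions for illustration, not any particular library's API. Sizes are counted in characters here; a real pipeline would more likely count tokens.

```python
def recursive_split(text, max_chars=200, separators=("\n\n", ". ", " ")):
    """Split on the coarsest separator first; recurse into pieces that are still too long."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No boundaries left to try: fall back to a hard fixed-size cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        # Greedily pack neighbouring pieces into one chunk while they fit.
        candidate = current + sep + piece if current else piece
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # This piece alone is too long: retry with the finer separators.
            # Note: the separator itself is dropped at every split point.
            chunks.extend(recursive_split(piece, max_chars, finer))
    if current:
        chunks.append(current)
    return chunks


text = (
    "First paragraph. It has two sentences.\n\n"
    "Second paragraph is a bit longer and has more words in it. "
    "It also has two sentences."
)
for chunk in recursive_split(text, max_chars=60):
    print(repr(chunk))
```

The short first paragraph stays whole, while the longer second paragraph is broken at its sentence boundary instead of at an arbitrary character offset.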
Overlap between adjacent chunks keeps context that would otherwise be lost at chunk edges, at the cost of some duplicated text in the index.
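Overlap can be sketched as a sliding window whose step is smaller than its size. The helper `chunk_with_overlap` and its parameters `size` and `overlap` are hypothetical names for this note, and whitespace-separated words stand in for real tokens.

```python
def chunk_with_overlap(tokens, size, overlap):
    """Return windows of `size` tokens where each window repeats the last `overlap` tokens of the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    if not tokens:
        return []
    step = size - overlap
    # Stop once a new window would contain nothing but already-seen tokens.
    last_start = max(len(tokens) - overlap, 1)
    return [tokens[i:i + size] for i in range(0, last_start, step)]


words = "a b c d e f g h i j".split()
for window in chunk_with_overlap(words, size=4, overlap=1):
    print(window)
```

With `size=4` and `overlap=1`, each window starts on the last word of the previous one, so a phrase that straddles a boundary appears intact in at least one chunk as long as it is no longer than the overlap plus one token.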
Chunk size trades precision against recall: small chunks match a query more precisely but lose surrounding context, while large chunks keep that context but dilute the match with unrelated text.