ML//RAG//chunking strategy

- How you split documents for embedding and retrieval.


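Fixed-size chunks ignore semantic boundaries, so a cut can land mid-sentence or mid-idea. Recursive splitting (paragraphs → sentences) tries the coarsest separator first and falls back to a finer one only when a piece is still too large, so chunks tend to end at natural breaks.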
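A minimal sketch of the recursive idea, not any particular library's implementation. The function name `recursive_split`, the character-based limit `max_chars`, and the separator list are all assumptions chosen for illustration; real splitters often count tokens instead and use better sentence detection.

```python
def recursive_split(text, max_chars=200, separators=("\n\n", ". ", " ")):
    """Split on the coarsest separator first; recurse to finer ones only for oversized pieces.

    Note: a separator is dropped wherever a chunk boundary falls on it
    (e.g. a sentence-final ". "), which is acceptable for a sketch.
    """
    text = text.strip()
    if not text:
        return []
    if len(text) <= max_chars:
        return [text]
    if not separators:
        # Nothing finer to try: fall back to a hard fixed-size cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        if not piece.strip():
            continue  # skip empty pieces left by repeated separators
        candidate = current + sep + piece if current else piece
        if len(candidate) <= max_chars:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # This piece alone is too large: retry with the next finer separator.
            chunks.extend(recursive_split(piece, max_chars, finer))
    if current:
        chunks.append(current)
    return chunks


doc = (
    "First paragraph is short.\n\n"
    "Second paragraph has two sentences. "
    "Here is the second one, which is a bit longer."
)
for chunk in recursive_split(doc, max_chars=60):
    print(repr(chunk))
```

The first paragraph fits and stays whole; the second is too long for the limit, so only that paragraph is split further at the sentence level.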

Overlap between adjacent chunks repeats the text around each boundary, so content that straddles an edge still appears whole in at least one chunk.
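A small sketch of overlap as a sliding window, assuming the text is already tokenized (whitespace-split words here). `chunk_with_overlap`, `size`, and `overlap` are hypothetical names for illustration; the window advances by `size - overlap` each step.

```python
def chunk_with_overlap(tokens, size=5, overlap=2):
    """Return windows of `size` tokens, each sharing `overlap` tokens with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reached the end of the input
    return chunks


words = "the quick brown fox jumps over the lazy dog today".split()
for chunk in chunk_with_overlap(words, size=4, overlap=2):
    print(" ".join(chunk))
```

Larger overlap means more duplicated text to embed and store, so it is a cost to tune rather than a free fix.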

Chunk size trades precision vs recall: small chunks match a query more precisely but lose surrounding context, while large chunks keep that context but dilute the match with unrelated text.