ML//RAG//TF-IDF

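Term Frequency–Inverse Document Frequency (TF-IDF): a classic term-weighting scheme for document ranking. The IDF component dates to Karen Spärck Jones's 1972 paper on term specificity.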

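TF: how often a term appears in a document. IDF: how rare the term is across all documents. Their product is the term's weight in that document; a document's score for a query is typically the sum of those weights over the query terms.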
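The weighting above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `tf_idf_scores`, the whitespace tokenizer, the length-normalized TF, and the unsmoothed `log(N / df)` IDF are all assumptions chosen for brevity. Real systems use many variants of each.

```python
import math
from collections import Counter

def tf_idf_scores(docs, query):
    """Score each document by summing TF * IDF over the query terms.

    Assumptions (illustrative only): lowercase whitespace tokenization,
    TF = count / document length, IDF = log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(tokens)
            idf = math.log(n_docs / df[term])
            score += tf * idf
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "bm25 ranks documents by relevance",
]
scores = tf_idf_scores(docs, "cat mat")
# Rank document indices from best to worst match.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

The first document should rank highest here because it contains both query terms, and "mat" is rarer across the collection than "cat", so it contributes more weight.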

Still a strong baseline for keyword search. BM25 is essentially TF-IDF with term-frequency saturation and document-length normalization added.
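To make the saturation and length-normalization ideas concrete, here is a hedged sketch of the widely used Okapi BM25 scoring formula. The function name `bm25_scores`, the whitespace tokenizer, and the defaults `k1=1.5`, `b=0.75` are illustrative assumptions (values in that range are common, but they are tunable), and the smoothed IDF shown is one of several variants in use.

```python
import math
from collections import Counter

def bm25_scores(docs, query, k1=1.5, b=0.75):
    """Okapi BM25 sketch: k1 controls TF saturation, b controls length normalization.

    Assumptions (illustrative only): non-empty corpus, lowercase whitespace
    tokenization, smoothed IDF = log(1 + (N - df + 0.5) / (df + 0.5)).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs or 1.0
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Documents longer than average get a larger denominator (lower score).
        norm = k1 * (1 - b + b * len(tokens) / avg_len)
        score = 0.0
        for term in query.lower().split():
            f = counts[term]
            if f == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # The ratio f * (k1 + 1) / (f + norm) approaches k1 + 1 as f grows:
            # repeated occurrences help, with diminishing returns.
            score += idf * f * (k1 + 1) / (f + norm)
        scores.append(score)
    return scores

docs = ["cat dog", "cat cat cat cat dog", "fish bird"]
# b=0 switches off length normalization so only saturation is visible.
saturated = bm25_scores(docs, "cat", b=0.0)
print(saturated)
```

With `b=0.0`, the second document mentions "cat" four times as often as the first but scores less than twice as high, which is the saturation effect that raw TF-IDF lacks.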

TF-IDF is part of the "boring IR" stack (TF-IDF, BM25, inverted indices) that production RAG still depends on.