ML//RAG//TF-IDF
Term Frequency–Inverse Document Frequency: classic document-ranking scheme; the IDF component dates to 1972 (Karen Spärck Jones).
TF: how often a term appears in a document. IDF: how rare the term is across all documents. Product = score.
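A minimal sketch of the TF × IDF product described above, assuming pre-tokenized documents, length-normalized term frequency, and an unsmoothed natural-log IDF. Many variants exist (raw counts, smoothed IDF, and so on); the function name `tf_idf_scores` and the toy corpus are illustrative only.

```python
import math
from collections import Counter


def tf_idf_scores(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: tf-idf score} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))

    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            # TF: share of this document's tokens that are `term`.
            # IDF: log(N / df); 0 for a term present in every document.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]
scores = tf_idf_scores(docs)
```

In this toy corpus, "cat" outscores "the" in the first document even though "the" occurs more often there, because "the" also appears in a second document and so has a lower IDF.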
Still a strong baseline for keyword search. BM25 is essentially TF-IDF with term-frequency saturation (repeated occurrences add less and less) and document-length normalization.
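To make the saturation and length-normalization point concrete, here is a rough sketch of one common BM25 formulation. Assumptions: a non-empty corpus of pre-tokenized documents, the non-negative IDF variant log(1 + (N - df + 0.5) / (df + 0.5)), and the commonly used defaults k1 = 1.5 and b = 0.75. The name `bm25_score` is illustrative, not a library API.

```python
import math
from collections import Counter


def bm25_score(query: list[str], doc: list[str], docs: list[list[str]],
               k1: float = 1.5, b: float = 0.75) -> float:
    """Score one tokenized document against a tokenized query."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    counts = Counter(doc)
    # Length normalization: > 1 for longer-than-average documents.
    norm = 1 - b + b * len(doc) / avg_len

    score = 0.0
    for term in set(query):
        df = sum(1 for d in docs if term in d)
        if df == 0:
            continue  # term unseen in the corpus contributes nothing
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        tf = counts[term]
        # Saturation: as tf grows, this factor approaches (k1 + 1).
        score += idf * tf * (k1 + 1) / (tf + k1 * norm)
    return score
```

Setting b = 0 switches off length normalization, and a larger k1 makes the term-frequency contribution saturate more slowly.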
Part of the "boring IR" stack (TF-IDF, BM25, inverted indices) that production RAG still depends on.