ML//Evaluation//cross-validation

Estimating out-of-sample performance by rotating which fold is held out, then averaging.


Estimating out-of-sample performance by rotating which fold is held out, then averaging.

Grouped (session-level) CV: a whole group goes entirely into train or test, never both, to avoid leakage from near-duplicate rows.

Stratify the folds so rare positives do not all pile into one.

The effective sample size is the number of independent groups, not the number of rows.