ML//Training//dataset//data contamination

- When benchmark test data leaks into training data.


When benchmark test data leaks into training data.

The model memorized the answers instead of learning the skill.

Hard to detect at scale — how much of GPT-4's MMLU score is memorization?