ML//benchmark//MMLU

2023-07-12

Massive Multitask Language Understanding: 57 subjects from elementary math to professional law.

Four-option multiple choice, scored by accuracy. GPT-4: ~86% (reported 5-shot), approaching the estimated human expert level (~90%).
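Scoring sketch, since a single headline percentage hides how it was aggregated: a minimal example, assuming predictions are already reduced to answer letters. The function name `mmlu_accuracy` and the `(subject, predicted, gold)` record layout are hypothetical, not taken from any official harness.

```python
from collections import defaultdict

def mmlu_accuracy(records):
    """Score multiple-choice predictions.

    records: iterable of (subject, predicted_letter, gold_letter) tuples.
    Returns (per_subject_accuracy, micro_average, macro_average).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subject, pred, gold in records:
        total[subject] += 1
        # Compare letters case-insensitively, ignoring stray whitespace.
        correct[subject] += int(pred.strip().upper() == gold.strip().upper())
    if not total:
        raise ValueError("no records to score")
    per_subject = {s: correct[s] / total[s] for s in total}
    # Micro: every question counts equally. Macro: every subject counts equally.
    micro = sum(correct.values()) / sum(total.values())
    macro = sum(per_subject.values()) / len(per_subject)
    return per_subject, micro, macro
```

Micro and macro averages can differ noticeably when subjects have very different question counts, so it is worth noting which one a reported score uses.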

Heavy contamination concerns: some questions appear verbatim in training data, so high scores may partly reflect memorization.
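Verbatim-overlap sketch: one rough way to flag a possibly contaminated question is to measure how many of its word n-grams also occur in a sample of training text. This is an illustrative heuristic only, not the method of any particular study; the helper names and the default `n=8` window are assumptions.

```python
import re

def normalize(text):
    """Lowercase, replace non-alphanumerics with spaces, split into tokens."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()

def ngrams(tokens, n):
    """Return the set of all contiguous n-token windows."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_fraction(question, corpus_text, n=8):
    """Fraction of the question's n-grams that also appear in corpus_text.

    Returns 0.0 when the question has fewer than n tokens.
    """
    question_grams = ngrams(normalize(question), n)
    if not question_grams:
        return 0.0
    corpus_grams = ngrams(normalize(corpus_text), n)
    return len(question_grams & corpus_grams) / len(question_grams)
```

A fraction near 1.0 suggests the question text appears almost verbatim in the sample; paraphrased leaks would not be caught by this check.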