ML//benchmark//MATH

- Competition-level math problems drawn from contests such as AMC 10, AMC 12, and AIME, each tagged with a subject and a difficulty level from 1 to 5.


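Frontier variants: MATH level 5 (the hardest of the five difficulty levels), FrontierMath (a separate benchmark of original, unpublished research-level problems; they have known, automatically checkable answers rather than being unsolved)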
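
Pulling out the level 5 subset can be sketched as below. This is a minimal illustration, assuming each record carries a `level` field with values like `"Level 5"`, as in the released MATH JSON files; the sample records and the `hardest_subset` helper are hypothetical.

```python
# Hypothetical records mimicking the MATH layout (problem, level, type, solution).
records = [
    {"problem": "What is 1 + 1?", "level": "Level 1",
     "type": "Prealgebra", "solution": "The sum is $\\boxed{2}$."},
    {"problem": "A made-up hard problem.", "level": "Level 5",
     "type": "Number Theory", "solution": "The answer is $\\boxed{7}$."},
    {"problem": "Another made-up problem.", "level": "Level 3",
     "type": "Algebra", "solution": "The answer is $\\boxed{4}$."},
]


def hardest_subset(items, level="Level 5"):
    """Return only the problems tagged with the given difficulty level."""
    return [r for r in items if r["level"] == level]


level5 = hardest_subset(records)
print(len(level5), "of", len(records), "problems are level 5")
```

Reporting accuracy on this subset separately keeps the easier levels from masking how a model does on the hardest problems.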

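One of the benchmarks most often cited as evidence that chain-of-thought prompting and reasoning models actually help: reported accuracy jumped dramatically once models were prompted or trained to work through problems step by step.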