ML//benchmark//IFEval
- Instruction Following Evaluation — tests precise formatting, length, and structural constraint compliance.
Not about knowledge or reasoning — purely about doing exactly what was asked.
Important because most real-world LLM usage depends on following instructions as given. Extended by Multi-IF and MultiChallenge.
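To make "doing exactly what was asked" concrete, here is a minimal sketch of how formatting, length, and structural constraints can be checked programmatically. It is not the benchmark's own code: the function names (`max_words`, `bullet_count`, `is_json`, `follows_all`) and the sample response are hypothetical, chosen only for illustration.

```python
import json
import re

# Hypothetical checkers for illustration; not the benchmark's actual implementation.

def max_words(response: str, limit: int) -> bool:
    """Length constraint: at most `limit` whitespace-separated tokens."""
    return len(response.split()) <= limit

def bullet_count(response: str, n: int) -> bool:
    """Structural constraint: exactly `n` lines that start with '-' or '*'."""
    bullets = [ln for ln in response.splitlines()
               if re.match(r"^\s*[-*]\s+\S", ln)]
    return len(bullets) == n

def is_json(response: str) -> bool:
    """Formatting constraint: the whole response parses as JSON."""
    try:
        json.loads(response)
    except json.JSONDecodeError:
        return False
    return True

def follows_all(response: str, checks) -> bool:
    """A response passes only if every constraint in the prompt is met."""
    return all(check(response) for check in checks)

# Assumed prompt: "List three fruits as bullets, in at most 10 words."
response = "- apples\n- pears\n- plums"
checks = [lambda r: max_words(r, 10), lambda r: bullet_count(r, 3)]
print(follows_all(response, checks))
```

Each check is a pure function of the response text, so no judgment about content quality or correctness is needed, which matches the note above that the benchmark is not about knowledge or reasoning.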