ML//scaling laws//Chinchilla

2022-05-05

DeepMind (Hoffmann et al., 2022): most LLMs were undertrained relative to their size.

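Given a fixed compute budget, the compute-optimal model is smaller and trained on more tokens than earlier scaling laws (Kaplan et al., 2020) suggested: parameters and training tokens should grow in roughly equal proportion, which works out to about 20 tokens per parameter.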
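Rough sketch of what that allocation looks like in numbers. Assumptions (mine, not spelled out in these notes): training cost is approximated as C ≈ 6·N·D FLOPs (N parameters, D tokens), and the optimum is taken as a flat D ≈ 20·N rule of thumb rather than the paper's fitted curves. The function name `compute_optimal` is made up for illustration.

```python
import math


def compute_optimal(flops, tokens_per_param=20.0):
    """Split a FLOP budget into (n_params, n_tokens).

    Assumes C = 6 * N * D and D = tokens_per_param * N, so
    C = 6 * tokens_per_param * N**2. Both are approximations,
    not the fitted scaling law itself.
    """
    if flops <= 0 or tokens_per_param <= 0:
        raise ValueError("flops and tokens_per_param must be positive")
    n_params = math.sqrt(flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


# A budget of roughly 5.88e23 FLOPs lands near 70B parameters and 1.4T tokens.
n_params, n_tokens = compute_optimal(5.88e23)
print(f"params ~ {n_params:.3g}, tokens ~ {n_tokens:.3g}")
```

Under these assumptions both N and D grow with the square root of compute, so a 4× larger budget doubles each instead of going mostly into parameters.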

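Chinchilla (70B parameters, 1.4T tokens) outperformed Gopher (280B parameters, 300B tokens) at the same compute budget, with 4× fewer parameters and over 4× more training tokens.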
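Quick sanity check on that comparison. Assumptions: the C ≈ 6·N·D FLOP approximation (not an exact accounting of either training run), and the token counts reported in the paper (1.4T for Chinchilla, 300B for Gopher). The helper name `train_flops` is hypothetical.

```python
def train_flops(n_params, n_tokens):
    """Approximate training FLOPs as 6 * N * D (a common rule of thumb)."""
    return 6.0 * n_params * n_tokens


# Parameter and token counts as reported by Hoffmann et al. (2022).
chinchilla_params, chinchilla_tokens = 70e9, 1.4e12
gopher_params, gopher_tokens = 280e9, 300e9

chinchilla_flops = train_flops(chinchilla_params, chinchilla_tokens)
gopher_flops = train_flops(gopher_params, gopher_tokens)

token_ratio = chinchilla_tokens / gopher_tokens          # a bit over 4.5
compute_ratio = chinchilla_flops / gopher_flops          # close to 1
chinchilla_tpp = chinchilla_tokens / chinchilla_params   # tokens per parameter
gopher_tpp = gopher_tokens / gopher_params

print(f"token ratio   ~ {token_ratio:.2f}")
print(f"compute ratio ~ {compute_ratio:.2f}")
print(f"tokens/param: Chinchilla ~ {chinchilla_tpp:.1f}, Gopher ~ {gopher_tpp:.2f}")
```

Under this approximation the two budgets come out within about 20% of each other, while tokens per parameter differ by roughly 20 versus 1. That gap is the sense in which the larger model was undertrained.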

Shifted the industry from "bigger model" to "more data, right-sized model".