ML//Inference//Sampling//top-p
Nucleus (top-p) sampling: rank tokens by probability and keep the smallest set whose cumulative probability reaches p (e.g. 0.95), then renormalize and sample from that set.
Adaptive: when the model is confident, only a few tokens make the cut; when it is uncertain, many do.
Generally preferred over top-k because it adapts to the shape of the distribution, whereas top-k always keeps a fixed number of tokens.
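
A minimal sketch of the idea in plain Python, assuming the model's output is already a list of probabilities that sums to 1. The names top_p_filter and sample_top_p are made up for illustration and are not taken from any library.

```python
import random

def top_p_filter(probs, p=0.95):
    """Return {token_index: renormalized_prob} for the nucleus of `probs`."""
    # Rank token indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:  # stop once the kept set covers at least p
            break
    # Renormalize so the kept probabilities sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

def sample_top_p(probs, p=0.95, rng=random):
    """Draw one token index from the nucleus of `probs`."""
    nucleus = top_p_filter(probs, p)
    tokens = list(nucleus)
    weights = [nucleus[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical distributions: one peaked, one flat.
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]
print(len(top_p_filter(confident)))  # 1 token kept
print(len(top_p_filter(uncertain)))  # 4 tokens kept
```

With the same p, the peaked distribution keeps one token and the flat one keeps all four, which is the adaptive behavior described above. Real implementations usually apply this to logits on tensors, but the cutoff logic is the same.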