ML//Inference//Sampling//top-k
- Only consider the k most probable next tokens; zero out the rest and renormalize before sampling.
k=1 is greedy decoding; k=50 is a common choice. Cuts off the long tail of unlikely tokens.
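
Rough sketch of the idea in plain Python, assuming the model hands back raw logits as a list of floats. The names `top_k_filter` and `sample_top_k` are made up for this note, not from any library.

```python
import math
import random


def top_k_filter(logits, k):
    """Return probabilities where only the k highest-logit tokens are nonzero."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the vocabulary size")
    # Indices of the k largest logits (ties keep the lower index first).
    keep = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the kept logits only; subtract the max for numerical stability.
    top = max(logits[i] for i in keep)
    weights = {i: math.exp(logits[i] - top) for i in keep}
    total = sum(weights.values())
    # Tokens outside the top k get probability 0; the rest sum to 1.
    return [weights.get(i, 0.0) / total for i in range(len(logits))]


def sample_top_k(logits, k, rng=random):
    """Draw one token index from the top-k filtered distribution."""
    probs = top_k_filter(logits, k)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -1.0]  # toy 4-token vocabulary
    print(top_k_filter(toy_logits, 2))   # last two entries are 0.0
    print(sample_top_k(toy_logits, 1))   # k=1 always picks the argmax
```

With k=1 the kept set is a single token, so sampling collapses to argmax (greedy). Real implementations usually do the same thing on tensors by masking logits outside the top k to -inf before the softmax.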