ML//neural network//activation function//GELU
- Gaussian Error Linear Unit (GELU): a smooth approximation to ReLU
GELU(x) = x * Φ(x), where Φ is the CDF of the standard normal (Gaussian) distribution.
Used in Transformer models such as BERT and GPT; often reported to work slightly better than ReLU on language tasks.
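
The formula x * Φ(x) can be sketched in plain Python as below. This is a minimal illustration, not any framework's implementation. It assumes Φ is computed through the error function as Φ(x) = 0.5 * (1 + erf(x / sqrt(2))). The function names `gelu`, `gelu_tanh`, and `relu` are chosen here for illustration, and the tanh-based variant is a widely used approximation that this note does not itself mention.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    # Phi(x) expressed through the error function.
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * phi


def gelu_tanh(x: float) -> float:
    """Tanh-based approximation of GELU (an addition, not from the note)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def relu(x: float) -> float:
    """ReLU, included only for side-by-side comparison."""
    return max(0.0, x)


if __name__ == "__main__":
    # Print the three functions at a few sample points.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  relu={relu(x):.4f}  "
              f"gelu={gelu(x):.4f}  gelu_tanh={gelu_tanh(x):.4f}")
```

Unlike ReLU, GELU is slightly negative for small negative inputs and approaches ReLU as |x| grows, which is what "smooth approximation" refers to here.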