ML//neural network//activation function//GELU

- Gaussian Error Linear Unit — smooth approximation to ReLU


GELU(x) = x * Φ(x), where Φ is the standard normal (Gaussian) CDF: Φ(x) = 0.5 * (1 + erf(x / sqrt(2))).
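
A minimal sketch of the formula in plain Python, using only the standard library. The function names `gelu`, `gelu_tanh`, and `relu` are illustrative choices, not taken from any particular framework. The tanh variant is a widely used approximation that is not mentioned in this note; it is included only for comparison.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return x * phi


def gelu_tanh(x: float) -> float:
    """Tanh-based approximation of GELU (an addition, not from the note)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def relu(x: float) -> float:
    """ReLU, shown for comparison."""
    return max(0.0, x)


if __name__ == "__main__":
    # Compare the three functions at a few sample points.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(
            f"x={x:+.1f}  gelu={gelu(x):+.4f}  "
            f"tanh approx={gelu_tanh(x):+.4f}  relu={relu(x):+.4f}"
        )
```

Unlike ReLU, GELU is slightly negative for small negative inputs and only approaches 0 as x goes to minus infinity; for large positive x it approaches x.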

Used in Transformer models such as BERT and GPT; reported to perform slightly better than ReLU on language tasks.