ML//neural network//optimizer

2017-12-03

How to update weights after computing gradients.

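SGD: multiply the gradient by the learning rate and subtract the result from the weight. Simple, but often slow to converge, especially when the loss surface is much steeper in some directions than others.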
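
A minimal sketch of the SGD rule in plain Python, assuming weights and gradients are flat lists of floats. The name sgd_step and the toy loss f(w) = w^2 are made up for illustration and do not come from any library.

```python
def sgd_step(weights, grads, lr=0.1):
    """Return new weights after one plain SGD update: w <- w - lr * g."""
    return [w - lr * g for w, g in zip(weights, grads)]


# Toy problem (assumption): minimize f(w) = w^2, whose gradient is 2w.
w = [1.0]
for _ in range(3):
    g = [2.0 * w[0]]          # gradient at the current weight
    w = sgd_step(w, g, lr=0.1)
```

Each step here shrinks w by the same factor (1 - 2 * lr), so progress depends entirely on how the learning rate was picked.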

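Momentum: accumulate a decaying sum of past gradients and step along it, like a ball rolling downhill. Gradients that keep pointing the same way build up speed; gradients that flip sign partly cancel.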
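
A rough sketch of one common momentum formulation (velocity = beta * velocity + gradient, then step along the velocity). Other variants scale the gradient by (1 - beta) or fold the learning rate into the velocity; the name momentum_step and the default beta = 0.9 are assumptions for illustration.

```python
def momentum_step(weights, grads, velocity, lr=0.1, beta=0.9):
    """One SGD-with-momentum update on flat lists of floats.

    velocity holds the decaying sum of past gradients and must be
    carried over between calls (start it at all zeros).
    """
    new_velocity = [beta * v + g for v, g in zip(velocity, grads)]
    new_weights = [w - lr * v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


# Two steps with the same gradient: the second step is larger than the first.
w, v = [1.0], [0.0]
w, v = momentum_step(w, [1.0], v)   # velocity 1.0, moves by 0.1
w, v = momentum_step(w, [1.0], v)   # velocity 1.9, moves by 0.19
```

The growing step under a constant gradient is the "ball picking up speed" part of the analogy.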

Adam: adapts the step size per parameter using running averages of the gradient and of its square. A common default choice.
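
A minimal sketch of a single Adam update in plain Python, using the commonly quoted defaults (lr = 0.001, beta1 = 0.9, beta2 = 0.999, eps = 1e-8). The function name adam_step and the list-based state are assumptions for illustration; real libraries store this state inside the optimizer object.

```python
import math


def adam_step(weights, grads, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on flat lists of floats.

    m, v: running averages of the gradient and squared gradient
          (start both at all zeros).
    t:    1-based step count, used for bias correction.
    """
    new_m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grads)]
    new_v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grads)]
    # Bias correction: the averages start at zero and are too small early on.
    m_hat = [mi / (1 - beta1 ** t) for mi in new_m]
    v_hat = [vi / (1 - beta2 ** t) for vi in new_v]
    new_w = [w - lr * mh / (math.sqrt(vh) + eps)
             for w, mh, vh in zip(weights, m_hat, v_hat)]
    return new_w, new_m, new_v


# Two parameters whose gradients differ by a factor of 10,000.
w, m, v = adam_step([1.0, 1.0], [0.01, 100.0], [0.0, 0.0], [0.0, 0.0], t=1)
```

On this first step both weights move by roughly lr despite the very different gradient sizes, which is what "adaptive per parameter" means in practice.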