ML//neural network//loss landscape//saddle point

A point in the loss landscape where the gradient is zero but which is neither a local minimum nor a local maximum. The surface curves downward in some directions and upward in others, like the center of a horse saddle.


In high-dimensional spaces, saddle points are expected to vastly outnumber true local minima. A zero-gradient point is a strict local minimum only if the curvature is positive in every direction; a saddle point needs just one direction of negative curvature alongside at least one of positive curvature. With millions of dimensions, the chance that every curvature at a given zero-gradient point is positive is, heuristically, vanishingly small.
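
The counting argument above can be illustrated with a toy model. The sketch below assumes each curvature direction is independently positive or negative with probability 1/2, which is a simplification for illustration and not a claim about real loss surfaces. The function names `fraction_all_positive` and `exact_all_positive` are hypothetical.

```python
import random


def exact_all_positive(n_dims):
    """Probability that n_dims independent fair sign flips are all positive."""
    return 0.5 ** n_dims


def fraction_all_positive(n_dims, n_trials=20000, seed=0):
    """Monte Carlo estimate of the same probability (toy model, not a real loss)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # A trial counts as a "minimum" only if every direction curves upward.
        if all(rng.random() < 0.5 for _ in range(n_dims)):
            hits += 1
    return hits / n_trials


for n in (1, 2, 5, 10, 20):
    print(n, fraction_all_positive(n), exact_all_positive(n))
```

Under this assumption the fraction of "all positive" points halves with every added dimension, so it is already about one in a million at 20 dimensions.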

Mathematically, a (non-degenerate) saddle point is a zero-gradient point whose Hessian matrix has both positive and negative eigenvalues. Optimizers such as Adam often get past saddle points faster than vanilla SGD: momentum carries the iterate through flat regions where the gradient is near zero, and Adam's per-parameter scaling enlarges steps along directions where gradients are small.
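
As a minimal sketch of the eigenvalue test, the code below classifies a zero-gradient point of a two-variable function from the signs of its Hessian eigenvalues. The toy loss x^2 - y^2, the finite-difference step `h`, the tolerance `tol`, and all function names are assumptions made for illustration. The test is meaningful only at a point where the gradient is already zero.

```python
import math


def loss(x, y):
    # Toy loss with a saddle at the origin (illustrative assumption).
    return x**2 - y**2


def hessian_2d(f, x, y, h=1e-4):
    """Approximate the Hessian entries (fxx, fxy, fyy) with central differences."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy


def symmetric_eigenvalues(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], smallest first."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return mean - radius, mean + radius


def classify(eigs, tol=1e-6):
    """Classify a zero-gradient point from its two Hessian eigenvalues."""
    lo, hi = eigs
    if lo > tol:
        return "local minimum"
    if hi < -tol:
        return "local maximum"
    if lo < -tol and hi > tol:
        return "saddle point"
    return "degenerate (inconclusive)"


eigs = symmetric_eigenvalues(*hessian_2d(loss, 0.0, 0.0))
print(eigs, classify(eigs))
```

For a real network the Hessian is far too large to form explicitly, so this direct approach only scales to small toy problems.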