Neural Networks: Hessians - Shubham Shukla
Hessians! Any use? Uses of Hessians: determination of the kinetic constants of decomposition reactions; edge detection in digital image processing (DIP), which relies on abrupt changes in gray levels; object recognition in robot vision.
Hessians – Machine Learning? Non-linear optimization algorithms for training use second-order derivatives of the error function. The Hessian is also useful for retraining a feed-forward neural network (FFNN) after a small change in the training data, for Laplace approximations in Bayesian neural networks, and in network 'pruning' algorithms.
Why approximate Hessians? Number of parameters: W (weights and biases). Exact evaluation of the Hessian costs O(W²) per pattern. Approximations provide an easy way to reduce this to O(W), while still giving a reasonably good estimate of H for the domain at hand.
Diagonal Approximation Some applications of the Hessian require its inverse, inv(H). A good approximation: set the off-diagonal elements to zero. The second derivatives on the RHS can then be found recursively by back-propagation, as sketched below.
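A sketch of the standard formulas, following the usual diagonal-approximation derivation (e.g. Bishop, PRML §5.4.1), with assumed notation: z_i is the activation feeding weight w_{ji}, a_j is the pre-activation of unit j, and h is the activation function. The diagonal elements for pattern n are

\[
\frac{\partial^2 E_n}{\partial w_{ji}^2} = \frac{\partial^2 E_n}{\partial a_j^2}\, z_i^2 ,
\]

and the second derivatives with respect to the activations satisfy the back-propagation recursion

\[
\frac{\partial^2 E_n}{\partial a_j^2}
= h'(a_j)^2 \sum_k \sum_{k'} w_{kj}\, w_{k'j}\,
  \frac{\partial^2 E_n}{\partial a_k\,\partial a_{k'}}
+ h''(a_j) \sum_k w_{kj}\, \frac{\partial E_n}{\partial a_k} .
\]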
Diagonal Approximation (2) Neglecting the off-diagonal elements of this recursion gives the simplified form below. This is of order O(W), compared with O(W²) for the full Hessian. Problem: in practice Hessians are typically strongly non-diagonal, so this approximation should be used with care.
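A sketch of the simplified recursion, assuming the same notation as above:

\[
\frac{\partial^2 E_n}{\partial a_j^2}
\simeq h'(a_j)^2 \sum_k w_{kj}^2\, \frac{\partial^2 E_n}{\partial a_k^2}
+ h''(a_j) \sum_k w_{kj}\, \frac{\partial E_n}{\partial a_k} .
\]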
Outer Product Approximation Well suited to regression problems, which use a sum-of-squares error function. The Hessian matrix then takes the form sketched below.
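A sketch of the relevant formulas, assuming a single-output network with sum-of-squares error E = (1/2) Σ_n (y_n - t_n)² (the multi-output case is analogous):

\[
\mathbf{H} = \nabla\nabla E
= \sum_{n} \nabla y_n\, (\nabla y_n)^{\mathrm T}
+ \sum_{n} (y_n - t_n)\, \nabla\nabla y_n .
\]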
Outer Product Approximation (2) Eliminate the second-order derivative term on the RHS. For a well-trained system, y_n ≈ t_n, so this term vanishes. More generally (from 1.5.5), the optimal y_n is the conditional average of the targets, E[t|x], so the term averages out either way. This yields the Levenberg-Marquardt approximation:
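A sketch of the resulting approximation, with b_n ≡ ∇y_n (equivalently ∇a_n for a linear output unit):

\[
\mathbf{H} \simeq \sum_{n} \mathbf{b}_n \mathbf{b}_n^{\mathrm T} .
\]

A minimal NumPy sketch of this accumulation, assuming a hypothetical helper output_gradient(w, x) that returns the gradient of the network output for one pattern (names are illustrative, not from the original slides):

import numpy as np

def outer_product_hessian(w, X, output_gradient):
    """Approximate H = sum_n b_n b_n^T from per-pattern output gradients."""
    W = w.size
    H = np.zeros((W, W))
    for x in X:
        b = output_gradient(w, x)   # b_n: gradient of the output y_n w.r.t. the weights
        H += np.outer(b, b)         # accumulate the rank-one outer product
    return H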
Inverse Hessians The outer product approximation allows a sequential procedure for building up the Hessian one data point at a time, and the Woodbury identity then updates its inverse. Both are sketched below.
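A sketch of the sequential build-up and the matrix identity used (standard forms, notation assumed): after L data points,

\[
\mathbf{H}_{L} = \sum_{n=1}^{L} \mathbf{b}_n \mathbf{b}_n^{\mathrm T},
\qquad
\mathbf{H}_{L+1} = \mathbf{H}_{L} + \mathbf{b}_{L+1} \mathbf{b}_{L+1}^{\mathrm T},
\]

and the Woodbury (here Sherman-Morrison) identity for a rank-one update is

\[
\left(\mathbf{M} + \mathbf{v}\mathbf{v}^{\mathrm T}\right)^{-1}
= \mathbf{M}^{-1}
- \frac{\mathbf{M}^{-1}\mathbf{v}\,\mathbf{v}^{\mathrm T}\mathbf{M}^{-1}}
       {1 + \mathbf{v}^{\mathrm T}\mathbf{M}^{-1}\mathbf{v}} .
\]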
Inverse Hessians (2) Substituting H_L for M and b_{L+1} for v gives the update below. The sequential procedure continues until L+1 = N, i.e. until all data points have been processed. Initialize with H_0 = αI for small α (the result is then the inverse of H + αI).
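The substitution yields the sequential inverse-Hessian update

\[
\mathbf{H}_{L+1}^{-1}
= \mathbf{H}_{L}^{-1}
- \frac{\mathbf{H}_{L}^{-1}\mathbf{b}_{L+1}\,\mathbf{b}_{L+1}^{\mathrm T}\mathbf{H}_{L}^{-1}}
       {1 + \mathbf{b}_{L+1}^{\mathrm T}\mathbf{H}_{L}^{-1}\mathbf{b}_{L+1}} .
\]

A minimal NumPy sketch of the procedure, assuming the per-pattern gradient vectors b_n are supplied as a list (function name and default alpha are illustrative, not from the original slides):

import numpy as np

def sequential_inverse_hessian(bs, alpha=1e-3):
    """Build inv(H + alpha*I) by rank-one Sherman-Morrison updates, H = sum_n b_n b_n^T."""
    W = bs[0].size
    H_inv = np.eye(W) / alpha                        # inverse of the initialization H_0 = alpha * I
    for b in bs:
        Hb = H_inv @ b                               # H_L^{-1} b_{L+1}
        H_inv -= np.outer(Hb, Hb) / (1.0 + b @ Hb)   # rank-one update of the inverse
    return H_inv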
When Perfection Matters! The Hessian can also be evaluated exactly, by extending the back-propagation approach used to evaluate the first-order derivatives. Consider a network with two layers of weights. We define:
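A sketch of the standard definitions, assuming the usual index conventions (i, i' for inputs, j, j' for hidden units, k, k' for outputs), with E_n the per-pattern error and a_k the output-unit pre-activations:

\[
\delta_k \equiv \frac{\partial E_n}{\partial a_k},
\qquad
M_{kk'} \equiv \frac{\partial^2 E_n}{\partial a_k\,\partial a_{k'}} .
\]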
When Perfection Matters! (2) Both weights in the second layer; both weights in the first layer. The corresponding expressions are sketched below.
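Sketches of the standard two-layer expressions, assuming inputs x_i, hidden pre-activations a_j, hidden outputs z_j = h(a_j), and I_{jj'} the Kronecker delta. Both weights in the second layer:

\[
\frac{\partial^2 E_n}{\partial w_{kj}^{(2)}\,\partial w_{k'j'}^{(2)}}
= z_j\, z_{j'}\, M_{kk'} .
\]

Both weights in the first layer:

\[
\frac{\partial^2 E_n}{\partial w_{ji}^{(1)}\,\partial w_{j'i'}^{(1)}}
= x_i\, x_{i'}\, h''(a_j)\, I_{jj'} \sum_k w_{kj}^{(2)}\, \delta_k
+ x_i\, x_{i'}\, h'(a_j)\, h'(a_{j'}) \sum_k \sum_{k'} w_{kj}^{(2)}\, w_{k'j'}^{(2)}\, M_{kk'} .
\]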
When Perfection Matters! (3) One weight in each layer:
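A sketch of the mixed-layer expression, with the same assumed notation:

\[
\frac{\partial^2 E_n}{\partial w_{ji}^{(1)}\,\partial w_{kj'}^{(2)}}
= x_i\, h'(a_j)
\left\{ \delta_k\, I_{jj'} + z_{j'} \sum_{k'} w_{k'j}^{(2)}\, M_{kk'} \right\} .
\]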