Computing Gradient Vector and Jacobian Matrix in Arbitrarily Connected Neural Networks
Author: Bogdan M. Wilamowski, Fellow, IEEE, Nicholas J. Cotton, Okyay Kaynak, Fellow, IEEE, and Günhan Dündar
Source: IEEE INDUSTRIAL ELECTRONICS MAGAZINE
Date: 2012/3/28
Presenter: 林哲緯
Outline • Numerical Analysis Methods • Neural Network Architectures • NBN Algorithm
Minimization problem: Newton's method
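In its standard form, Newton's method minimizes a cost E(x) using both the gradient and the Hessian:

    x_{k+1} = x_k - H_k^{-1} \nabla E(x_k), \qquad H_k = \nabla^2 E(x_k)

It converges very fast near the minimum, but it requires second derivatives and a matrix inversion at every step.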
Minimization problem: Steepest descent method
http://www.nd.com/NSBook/NEURAL%20AND%20ADAPTIVE%20SYSTEMS14_Adaptive_Linear_Systems.html
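The steepest descent update uses only the gradient and a learning constant α:

    x_{k+1} = x_k - \alpha \, \nabla E(x_k)

Each step is cheap, but convergence becomes very slow close to the minimum.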
Least-squares problem: Gauss–Newton algorithm
http://en.wikipedia.org/wiki/Gauss%E2%80%93Newton_algorithm
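For a least-squares cost E(w) = (1/2) \sum_i e_i(w)^2, the Gauss–Newton step approximates the Hessian by JᵀJ, where J is the Jacobian of the error vector e:

    w_{k+1} = w_k - (J_k^T J_k)^{-1} J_k^T e_k

so only first-order derivatives are needed.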
Levenberg–Marquardt algorithm
• Combines the advantages of the Gauss–Newton algorithm and the steepest descent method
  • Far from the minimum, it behaves like the steepest descent method
  • Close to the minimum, it behaves like the Gauss–Newton algorithm
• It finds a local minimum, not the global minimum
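The Levenberg–Marquardt update blends these two behaviors through the learning parameter μ:

    w_{k+1} = w_k - (J_k^T J_k + \mu I)^{-1} J_k^T e_k

A large μ gives a (scaled) steepest descent step, while μ → 0 recovers the Gauss–Newton step.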
Levenberg–Marquardt algorithm
• Advantage
  • The error is only linearized, so just first-order derivatives (the Jacobian) are needed; the true second-order Hessian is never computed
• Disadvantage
  • The Jacobian must be computed and (JᵀJ + μI) inverted at every iteration, which becomes prohibitive for large networks and many training patterns
Outline • Numerical Analysis Methods • Neural Network Architectures • NBN Algorithm
Weight updating rule
• First-order algorithm vs. second-order algorithm (see the update rules below)
• Network architectures: MLP, FCN, ACN
• Notation: α: learning constant; g: gradient vector; J: Jacobian matrix; μ: learning parameter; I: identity matrix; e: error vector
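In standard form, and with the notation above, the two update rules are usually written as

    first-order (error backpropagation):  w_{k+1} = w_k - \alpha \, g_k
    second-order (Levenberg–Marquardt):   w_{k+1} = w_k - (J_k^T J_k + \mu I)^{-1} g_k, \qquad g_k = J_k^T e_k

the second rule being the same Levenberg–Marquardt update shown earlier.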
Forward & Backward Computation
Forward: 12345, 21345, 12435, or 21435
Backward: 54321, 54312, 53421, or 53412
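Any order in which every neuron is processed after all of the neurons feeding it is a valid forward order, and its reverse is a valid backward order. A minimal sketch of how such an order can be obtained for an arbitrarily connected network, assuming a five-neuron topology consistent with the four forward orders listed above (the connection table is an illustrative assumption, not taken from the slide's figure):

    from graphlib import TopologicalSorter  # standard library, Python 3.9+

    # Each neuron maps to the neurons whose outputs feed it (network inputs omitted):
    # neurons 3 and 4 read neurons 1 and 2, and neuron 5 reads neurons 3 and 4.
    predecessors = {1: [], 2: [], 3: [1, 2], 4: [1, 2], 5: [3, 4]}

    forward_order = list(TopologicalSorter(predecessors).static_order())
    backward_order = list(reversed(forward_order))

    print(forward_order)   # e.g. [1, 2, 3, 4, 5] -- one of the four valid forward orders
    print(backward_order)  # e.g. [5, 4, 3, 2, 1] -- the matching backward order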
Jacobian matrix
Rows: patterns × outputs; Columns: weights
p = number of training patterns, no = number of outputs
Rows = 2 × 1 = 2, Columns = 8
Jacobian size = 2 × 8
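Concretely, each row of J holds the derivatives of one output error, for one training pattern, with respect to all weights. For the example above (2 patterns, 1 output, 8 weights) the layout is

    J = \begin{bmatrix}
        \partial e_{11}/\partial w_1 & \cdots & \partial e_{11}/\partial w_8 \\
        \partial e_{21}/\partial w_1 & \cdots & \partial e_{21}/\partial w_8
        \end{bmatrix}

where e_{pm} denotes the error of output m for pattern p.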
Outline • Numerical Analysis Methods • Neural Network Architectures • NBN Algorithm
Direct Computation of Quasi-Hessian Matrix and Gradient Vector
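The key identity behind this step is that the quasi-Hessian Q = JᵀJ and the gradient g = Jᵀe can be accumulated one Jacobian row at a time,

    Q = \sum_{p=1}^{P} \sum_{m=1}^{M} j_{pm}^T j_{pm}, \qquad g = \sum_{p=1}^{P} \sum_{m=1}^{M} j_{pm}^T e_{pm},

where j_{pm} is the row of J for pattern p and output m, so the full (P·M) × N Jacobian never has to be stored. A minimal Python sketch of this accumulation (function and variable names are illustrative, not from the paper):

    import numpy as np

    def accumulate_q_and_g(jacobian_rows, errors, n_weights):
        # jacobian_rows yields one 1-D row j_pm (length n_weights) per
        # pattern/output pair; errors yields the matching scalar errors e_pm.
        # Only a single row is held in memory, never the full Jacobian.
        Q = np.zeros((n_weights, n_weights))
        g = np.zeros(n_weights)
        for j_pm, e_pm in zip(jacobian_rows, errors):
            Q += np.outer(j_pm, j_pm)   # adds the j_pm^T j_pm term
            g += j_pm * e_pm            # adds the j_pm^T e_pm term
        return Q, g

The Levenberg–Marquardt step is then computed as w_{k+1} = w_k - (Q + μI)^{-1} g without ever forming J.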
Conclusion
• The memory requirement for quasi-Hessian matrix and gradient vector computation is decreased by (P × M) times (P patterns, M outputs)
• The method can be applied to arbitrarily connected neural networks
• Two procedures are provided:
  • a backpropagation process (single output)
  • a process without backpropagation (multiple outputs)