
Computational Optimization


Presentation Transcript


  1. Computational Optimization Steepest Descent 2/8

  2. Most Obvious Algorithm • The negative gradient always points downhill. • Head in the direction of the negative gradient. • Use linesearch to decide how far to go. • Repeat.

  3. Steepest Descent Algorithm • Start with x_0 • For k = 1, …, K • If x_k is optimal then stop • Perform exact or backtracking linesearch to determine x_{k+1} = x_k + α_k p_k, where p_k = −∇f(x_k) is the steepest-descent direction (a code sketch follows)
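
A minimal Python sketch of this loop. The stopping tolerance `tol`, the iteration cap `K`, and the generic `linesearch` callable are illustrative assumptions, not part of the slides:

```python
import numpy as np

def steepest_descent(f, grad, x0, linesearch, K=1000, tol=1e-8):
    """Minimize f by repeatedly stepping along the negative gradient."""
    x = np.asarray(x0, dtype=float).copy()
    for k in range(K):
        g = grad(x)
        if np.linalg.norm(g) <= tol:       # "x_k is optimal": gradient (nearly) zero
            break
        p = -g                             # steepest-descent direction
        alpha = linesearch(f, grad, x, p)  # exact or backtracking linesearch
        x = x + alpha * p                  # x_{k+1} = x_k + alpha_k * p_k
    return x
```

Any of the linesearch rules from the following slides (for example, the backtracking routine sketched after slide 9) can be plugged in as `linesearch`.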

  4. How far should we step? Linesearch: x_{k+1} = x_k + α_k p_k • Fixed: α_k := c > 0, a constant • Decreasing • Exact • Approximate conditions: • Armijo • Wolfe • Goldstein

  5. Key Idea: Sufficient Decrease The step can’t go to 0 unless the gradient goes to 0.

  6. Armijo Condition With g(α) := f(x_k + α p_k), accept step lengths α for which g(α) ≤ g(0) + c_1 α g'(0). [Figure: plot of g(α) together with the lines g(0) + α g'(0) and g(0) + c_1 α g'(0).]

  7. Curvature Condition Make sure the step uses up most of the available decrease: require g'(α) ≥ c_2 g'(0), i.e. ∇f(x_k + α p_k)'p_k ≥ c_2 ∇f(x_k)'p_k.

  8. Wolfe Conditions For 0 < c_1 < c_2 < 1, require both sufficient decrease, f(x_k + α p_k) ≤ f(x_k) + c_1 α ∇f(x_k)'p_k, and curvature, ∇f(x_k + α p_k)'p_k ≥ c_2 ∇f(x_k)'p_k. A solution exists for any descent direction if f is bounded below along the linesearch ray. (Lemma 3.1) A checker is sketched below.
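
A sketch of how the two Wolfe conditions can be checked at a trial step; the values of c1 and c2 below are typical textbook defaults, an assumption rather than something fixed by the slide:

```python
import numpy as np

def satisfies_wolfe(f, grad, x, p, alpha, c1=1e-4, c2=0.9):
    """Check the (weak) Wolfe conditions for step length alpha along direction p."""
    slope0 = grad(x) @ p                    # g'(0); negative for a descent direction
    x_new = x + alpha * p
    sufficient_decrease = f(x_new) <= f(x) + c1 * alpha * slope0   # Armijo condition
    curvature = grad(x_new) @ p >= c2 * slope0                     # curvature condition
    return sufficient_decrease and curvature
```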

  9. Backtracking Search • Key point: Stepsize cannot be allowed to go to 0 unless the gradient is going to 0. Must have sufficient decrease. • Fix an initial step ᾱ > 0, a shrink factor ρ ∈ (0, 1), and a constant c ∈ (0, 1); starting from ᾱ, repeatedly multiply the step by ρ until sufficient decrease holds (see the sketch below).
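
A minimal backtracking (Armijo) linesearch in the same spirit. Here mu plays the role of the sufficient-decrease constant and beta the shrink factor; the initial trial step of 1 and the cap on shrinks are illustrative assumptions:

```python
def backtracking_linesearch(f, grad, x, p, mu=1e-4, beta=0.5, alpha0=1.0, max_shrinks=50):
    """Shrink alpha until the Armijo sufficient-decrease condition holds."""
    slope0 = grad(x) @ p                  # g'(0) = grad f(x)' p, negative for a descent direction
    alpha = alpha0
    for _ in range(max_shrinks):
        if f(x + alpha * p) <= f(x) + mu * alpha * slope0:
            return alpha                  # sufficient decrease achieved
        alpha *= beta                     # otherwise backtrack
    return alpha
```

This routine can be passed as the `linesearch` argument of the steepest-descent sketch after slide 3.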

  10. Armijo Condition (revisited) Accept steps α with g(α) ≤ g(0) + c_1 α g'(0), where g(α) := f(x_k + α p_k). [Figure: g(α) versus the lines g(0) + α g'(0) and g(0) + c_1 α g'(0).]

  11. Steepest Descent Algorithm • Start with x_0 • For k = 1, …, K • If x_k is optimal then stop • Perform backtracking linesearch to determine x_{k+1} = x_k + α_k p_k

  12. Convergence Analysis • We usually analyze the quadratic problem • Equivalent to more general problems: let x* solve min f(x) = 1/2 x'Qx − b'x, and let y = x − x*, so x = y + x*; then the objective reduces to a pure quadratic in y (expansion below).
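
Filling in the substitution (a standard computation, assuming Q is symmetric positive definite, so that Qx* = b at the minimizer):

```latex
f(y + x^*) = \tfrac{1}{2}(y + x^*)^{T} Q (y + x^*) - b^{T}(y + x^*)
           = \tfrac{1}{2} y^{T} Q y + y^{T}(Q x^* - b) + f(x^*)
           = \tfrac{1}{2} y^{T} Q y + f(x^*).
```

So minimizing f over x is the same as minimizing the pure quadratic (1/2) y'Qy over y, whose minimizer is y = 0.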

  13. Convergence Rate • Consider the case of a constant stepsize α, so x_{k+1} = x_k − α(Qx_k − b) • Square both sides (take squared norms of the error recursion, shown below)
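
The algebra behind those two bullets (standard for the quadratic model, assuming a constant step α and Qx* = b):

```latex
x_{k+1} - x^* = x_k - \alpha (Q x_k - b) - x^* = (I - \alpha Q)(x_k - x^*),
\qquad
\|x_{k+1} - x^*\|^2 \le \Big(\max_i |1 - \alpha \lambda_i|\Big)^{2} \|x_k - x^*\|^2,
```

where the λ_i are the eigenvalues of Q.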

  14. Convergence Rate…

  15. Best Step
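
For the quadratic model there are two standard notions of the "best" step, either of which fits here (which one the slide intended is an assumption): the exact-linesearch step, and the constant stepsize that minimizes the contraction factor above:

```latex
\alpha_k^{\text{exact}} = \frac{g_k^{T} g_k}{g_k^{T} Q\, g_k}, \quad g_k = Q x_k - b,
\qquad\qquad
\alpha^{\text{const}} = \frac{2}{\lambda_{\min} + \lambda_{\max}},
\ \text{giving factor}\ \frac{\lambda_{\max} - \lambda_{\min}}{\lambda_{\max} + \lambda_{\min}} .
```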

  16. Condition number determines convergence rate • Linear convergence, since the error shrinks by a fixed factor at each iteration (see below) • Condition number: κ = 1 is best
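
Written in terms of the condition number κ (for the best constant step; the exact-linesearch rate of Theorem 3.3 below involves the same ratio):

```latex
\frac{\lambda_{\max} - \lambda_{\min}}{\lambda_{\max} + \lambda_{\min}}
= \frac{\kappa - 1}{\kappa + 1},
\qquad \kappa = \frac{\lambda_{\max}(Q)}{\lambda_{\min}(Q)} .
```

So κ = 1 gives a factor of 0 (convergence in one step), and the factor approaches 1, i.e. progress stalls, as κ grows.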

  17. Well-Conditioned Problem x^2 + y^2 Condition number = 1/1 = 1 Steepest descent converges in one step, like Newton's method

  18. Ill-Conditioned Problem 50(x-10)^2 + y^2 Condition number = 50/1 = 50 Steepest descent ZIGZAGS! (demonstrated below)
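
A small numeric illustration of the zigzag on this exact problem, using the exact-linesearch step for a quadratic; the starting point is an arbitrary choice:

```python
import numpy as np

# Ill-conditioned quadratic from the slide: f(x, y) = 50(x - 10)^2 + y^2
Q = np.diag([100.0, 2.0])          # Hessian; eigenvalue ratio 100/2 = 50
x_star = np.array([10.0, 0.0])     # minimizer

def grad(z):
    return Q @ (z - x_star)

z = np.array([0.0, 5.0])           # arbitrary starting point
for k in range(8):
    g = grad(z)
    alpha = (g @ g) / (g @ Q @ g)  # exact linesearch step for a quadratic
    z = z - alpha * g
    print(k, z)                    # consecutive steps alternate direction: the zigzag
```

On the well-conditioned problem x^2 + y^2 (Q = 2I), the same code reaches the minimizer in a single step.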

  19. Linear Convergence for our fixed stepsize

  20. Theorem 3.3 (NW) Let {x_k} be the sequence generated by steepest descent with exact linesearch applied to min f(x) = 1/2 x'Qx − b'x, where Q is p.d. with eigenvalues 0 < λ_1 ≤ … ≤ λ_n. Then for any x_0 the sequence converges to the unique minimizer x*, and ||x_{k+1} − x*||_Q^2 ≤ ((λ_n − λ_1)/(λ_n + λ_1))^2 ||x_k − x*||_Q^2, where ||x||_Q^2 = x'Qx. That is, the method converges linearly.

  21. Condition Number Officially: κ(A) = ||A|| · ||A⁻¹||. But this is the ratio of the max and min eigenvalues, λ_max(A)/λ_min(A), if A is p.d. and symmetric.

  22. Theorem 3.4 • Suppose f is twice continuously differentiable and the steepest-descent sequence converges to a point x* satisfying the SOSC, with Hessian eigenvalues λ_1 ≤ … ≤ λ_n at x*. Let r be any scalar with (λ_n − λ_1)/(λ_n + λ_1) < r < 1. Then for all k sufficiently large, f(x_{k+1}) − f(x*) ≤ r² [f(x_k) − f(x*)].

  23. Other possible directions • Any direction p_k satisfying ∇f(x_k)'p_k < 0 is a descent direction. Which ones will work? (An example family is given below.)
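
One standard family, included here as an illustration: any p_k = −B_k⁻¹∇f(x_k) with B_k symmetric positive definite is a descent direction, since

```latex
\nabla f(x_k)^{T} p_k = -\,\nabla f(x_k)^{T} B_k^{-1} \nabla f(x_k) < 0
\quad \text{whenever } \nabla f(x_k) \neq 0 .
```

Steepest descent (B_k = I) and Newton's method with a positive definite Hessian are the two familiar special cases.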

  24. Corollary: Zoutendijk's Theorem For any descent directions p_k and stepsizes satisfying the Wolfe conditions, if f is bounded below, continuously differentiable, and has a Lipschitz continuous gradient, then Σ_k cos²θ_k ||∇f(x_k)||² < ∞, where cos θ_k = −∇f(x_k)'p_k / (||∇f(x_k)|| ||p_k||).

  25. Convergence Theorem If for all iterations k the angle condition cos θ_k ≥ δ > 0 holds, then Zoutendijk's sum forces ||∇f(x_k)|| → 0. Steepest descent has cos θ_k = 1, so steepest descent converges! Many other variations.

  26. Theorem: Convergence of Gradient-Related Methods (i) Assume the level set S = {x : f(x) ≤ f(x_0)} is bounded. (ii) Let ∇f be Lipschitz continuous on S, i.e. ||∇f(x) − ∇f(y)|| ≤ L ||x − y|| for all x, y ∈ S and some fixed finite L. Consider the sequence generated by x_{k+1} = x_k + α_k p_k.

  27. Theorem 10.2 Nash and Sofer (iii) Let the directions p_k satisfy the descent condition ∇f(x_k)'p_k < 0, (iv) be gradient related, −∇f(x_k)'p_k / (||∇f(x_k)|| ||p_k||) ≥ ε > 0, and be bounded in norm relative to the gradient: ||p_k|| ≥ m ||∇f(x_k)|| for some m > 0.

  28. Theorem 10.2 • Let the stepsize α_k be the first element of the sequence {1, 1/2, 1/4, 1/8, …} satisfying the sufficient-decrease condition f(x_k + α p_k) ≤ f(x_k) + μ α ∇f(x_k)'p_k, with 0 < μ < 1. Then lim_{k→∞} ∇f(x_k) = 0.

  29. Steepest Descent • An obvious choice of direction is p_k = −∇f(x_k) • Called steepest descent because, among all unit-norm directions, it gives the most negative directional derivative (see below) • Clearly satisfies the gradient-related requirement of Theorem 10.2
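
Why "steepest": over unit directions, the directional derivative is minimized by the normalized negative gradient (a Cauchy–Schwarz argument):

```latex
\min_{\|p\| = 1} \nabla f(x_k)^{T} p = -\,\|\nabla f(x_k)\|,
\qquad \text{attained at } p = -\frac{\nabla f(x_k)}{\|\nabla f(x_k)\|} .
```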

  30. Steepest Descent Summary • Simple algorithm • Inexpensive per iteration • Only requires first derivative information • Global convergence to local minimum • Linear convergence – may be slow • Convergence rate depends on condition number • May zigzag
