
Optimization



Presentation Transcript


  1. Optimization 吳育德

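  2. Unconstrained Minimization Def: $f(x)$, $x \in \mathbb{R}^n$, is said to be differentiable at a point $x^*$ if it is defined in a neighborhood $N$ around $x^*$ and if, for $x^* + h \in N$, there exists a vector $a$ independent of $h$ such that $f(x^* + h) = f(x^*) + \langle a, h \rangle + o(h)$, where $\lim_{\|h\| \to 0} o(h)/\|h\| = 0$. The vector $a$ is called the gradient of $f(x)$ evaluated at $x^*$, denoted $\nabla f(x^*)$. The term $\langle a, h \rangle$ is called the 1st variation.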

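  3. Unconstrained Minimization Note: if $f(x)$ is twice differentiable, then $f(x^* + h) = f(x^*) + \langle \nabla f(x^*), h \rangle + \frac{1}{2} \langle h, F(x^*) h \rangle + o(\|h\|^2)$, where $F(x)$ is an $n \times n$ symmetric matrix, called the Hessian of $f(x)$, with entries $F_{ij}(x) = \partial^2 f(x) / \partial x_i \partial x_j$. Here $\langle \nabla f(x^*), h \rangle$ is the 1st variation and $\frac{1}{2} \langle h, F(x^*) h \rangle$ is the 2nd variation.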

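  4. Directional derivatives Let $w$ be a direction vector of unit norm, $\|w\| = 1$. Now consider $g(r) = f(x^* + r w)$, which is a function of the scalar $r$. Def: The directional derivative of $f(x)$ in the direction $w$ (unit norm) at $x^*$ is defined as $D_w f(x^*) = \frac{d}{dr} f(x^* + r w) \big|_{r=0} = \lim_{r \to 0} \frac{f(x^* + r w) - f(x^*)}{r}$.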

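  5. Directional derivatives Example: Let $w = e_i$, the $i$-th unit coordinate vector. Then $D_{e_i} f(x^*) = \partial f(x^*) / \partial x_i$, i.e. the partial derivative of $f$ at $x^*$ w.r.t. $x_i$ is the directional derivative of $f(x)$ in the direction $e_i$. Interpretation of the directional derivative $D_w f(x^*)$: Consider $f(x^* + r w) = f(x^*) + r \langle \nabla f(x^*), w \rangle + o(r)$. Then $D_w f(x^*) = \langle \nabla f(x^*), w \rangle$. The directional derivative along a direction $w$ ($\|w\| = 1$) is the signed length of the projection of $\nabla f(x^*)$ on $w$.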
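The identity between the directional derivative and the inner product of the gradient with $w$ can be checked numerically. The sketch below is only an illustration: the test function, the point `x_star`, and the direction `w` are assumptions chosen for convenience, not taken from the slides.

```python
def f(x):
    # Illustrative test function (an assumption, not from the slides).
    return x[0] ** 2 + 3.0 * x[0] * x[1] + 2.0 * x[1] ** 2

def grad_f(x):
    # Analytic gradient of the test function above.
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]

def directional_derivative(func, x, w, r=1e-6):
    # Central-difference estimate of d/dr func(x + r w) at r = 0.
    x_plus = [xi + r * wi for xi, wi in zip(x, w)]
    x_minus = [xi - r * wi for xi, wi in zip(x, w)]
    return (func(x_plus) - func(x_minus)) / (2.0 * r)

x_star = [1.0, -2.0]
w = [3.0 / 5.0, 4.0 / 5.0]  # unit-norm direction
numeric = directional_derivative(f, x_star, w)
inner = sum(gi * wi for gi, wi in zip(grad_f(x_star), w))
print(numeric, inner)
```

Both printed numbers should agree up to finite-difference rounding.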

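  6. Unconstrained Minimization [Q]: What direction $w$ yields the largest directional derivative? Ans: $w = \nabla f(x^*) / \|\nabla f(x^*)\|$. Recall that the 1st variation of $f(x^* + r w)$ is $r \langle \nabla f(x^*), w \rangle$, and by the Cauchy-Schwarz inequality $\langle \nabla f(x^*), w \rangle \le \|\nabla f(x^*)\|$ for $\|w\| = 1$, with equality when $w$ points along the gradient. Conclusion 1: The direction of the gradient is the direction that yields the largest change (1st variation) in the function. This suggests the steepest descent method, which will be described later.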
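A brute-force sweep over unit directions gives a quick sanity check of Conclusion 1. This is a minimal sketch in two dimensions; the gradient function and the evaluation point are illustrative assumptions.

```python
import math

def grad_f(x):
    # Gradient of the assumed test function f(x) = x1^2 + 3 x1 x2 + 2 x2^2.
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]

g = grad_f([1.0, -2.0])
norm_g = math.hypot(g[0], g[1])

# Sweep unit vectors w = (cos t, sin t) and keep the largest <grad, w>.
best_val, best_w = -math.inf, None
for i in range(3600):
    theta = 2.0 * math.pi * i / 3600
    w = (math.cos(theta), math.sin(theta))
    val = g[0] * w[0] + g[1] * w[1]
    if val > best_val:
        best_val, best_w = val, w

print(best_val, norm_g)                        # nearly equal
print(best_w, (g[0] / norm_g, g[1] / norm_g))  # nearly the same direction
```

The best direction found by the sweep approaches the normalized gradient, and the best value approaches the gradient norm, as the grid of angles is refined.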

  7. Directional derivatives Example: Sol : Let , w with unit norm =

  8. Directional derivatives The directional derivative in the direction of the gradient is $\langle \nabla f(x^*), \nabla f(x^*) / \|\nabla f(x^*)\| \rangle = \|\nabla f(x^*)\|$. Notes :

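  9. Directional derivatives Def: $f(x)$ is said to have a local (or relative) minimum at $x^*$ if $f(x^*) \le f(x)$ for all $x$ in a nbd $N$ of $x^*$. Theorem: Let $f(x)$ be differentiable. If $f(x)$ has a local minimum at $x^*$, then $\nabla f(x^*) = 0$. pf: For small $h$, $f(x^* + h) - f(x^*) = \langle \nabla f(x^*), h \rangle + o(h) \ge 0$. Choosing $h = -\epsilon \nabla f(x^*)$ with small $\epsilon > 0$ gives $-\epsilon \|\nabla f(x^*)\|^2 + o(\epsilon) \ge 0$, which forces $\nabla f(x^*) = 0$. Note: $\nabla f(x^*) = 0$ is a necessary condition, not a sufficient condition.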

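  10. Directional derivatives Theorem: If $f(x)$ is twice differentiable, $\nabla f(x^*) = 0$, and $F(x^*)$ is positive definite (p.d.), then $f(x)$ has a local minimum at $x^*$. pf: $f(x^* + h) - f(x^*) = \frac{1}{2} \langle h, F(x^*) h \rangle + o(\|h\|^2) > 0$ for all sufficiently small $h \ne 0$. Conclusion 2: $\nabla f(x^*) = 0$ is necessary, and $\nabla f(x^*) = 0$ together with $F(x^*)$ p.d. is sufficient, for a local minimum of $f(x)$.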
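The difference between the first-order and second-order conditions shows up on a saddle point, where the gradient vanishes but the Hessian is not positive definite. The sketch below uses two assumed example functions and Sylvester's criterion for a symmetric 2x2 matrix; none of the names come from the slides.

```python
def bowl(x):
    # Gradient and Hessian of the assumed f(x) = x1^2 + x1*x2 + x2^2.
    grad = [2.0 * x[0] + x[1], x[0] + 2.0 * x[1]]
    hess = [[2.0, 1.0], [1.0, 2.0]]
    return grad, hess

def saddle(x):
    # Gradient and Hessian of the assumed f(x) = x1^2 - x2^2.
    grad = [2.0 * x[0], -2.0 * x[1]]
    hess = [[2.0, 0.0], [0.0, -2.0]]
    return grad, hess

def is_positive_definite_2x2(F):
    # Sylvester's criterion: both leading principal minors are positive.
    return F[0][0] > 0.0 and F[0][0] * F[1][1] - F[0][1] * F[1][0] > 0.0

def is_stationary(grad, tol=1e-12):
    return all(abs(gi) <= tol for gi in grad)

def satisfies_sufficient_conditions(grad_hess, x):
    grad, hess = grad_hess(x)
    return is_stationary(grad) and is_positive_definite_2x2(hess)

origin = [0.0, 0.0]
print(satisfies_sufficient_conditions(bowl, origin))    # minimum
print(satisfies_sufficient_conditions(saddle, origin))  # stationary, not a minimum
```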

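  11. Minimization of Unconstrained function Prob.: Let $y = f(x)$, $x \in \mathbb{R}^n$. We want to generate a sequence $x^{(0)}, x^{(1)}, x^{(2)}, \ldots$ with $f(x^{(0)}) > f(x^{(1)}) > f(x^{(2)}) > \cdots$ such that it converges to the minimum of $f(x)$. Consider the $k$-th guess, $x^{(k)}$. We can generate $x^{(k+1)}$ provided that we have two pieces of information: (1) the direction to go, $d^{(k)}$; (2) a scalar step size, $\alpha^{(k)}$. Then $x^{(k+1)} = x^{(k)} + \alpha^{(k)} d^{(k)}$. Basic descent methods: (1) Steepest descent (2) Newton-Raphson method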

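  12. Steepest Descent Steepest descent: $d^{(k)} = -\nabla f(x^{(k)})$, so $x^{(k+1)} = x^{(k)} - \alpha^{(k)} \nabla f(x^{(k)})$. Note 1.a. Optimum $\alpha^{(k)}$: it minimizes $g(\alpha) = f(x^{(k)} - \alpha \nabla f(x^{(k)}))$ over $\alpha \ge 0$.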

  13. Steepest Descent Example :

  14. Steepest Descent Example :

  15. Steepest Descent Optimum iteration Remark: The optimal steepest descent step size can be determined analytically for a quadratic function.
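As an illustration of the remark, assume the quadratic has the form $f(x) = \frac{1}{2} x^T Q x - b^T x$ with $Q$ symmetric positive definite (this form, and the numbers below, are assumptions). With $g = \nabla f(x^{(k)}) = Q x^{(k)} - b$, setting the derivative of $f(x^{(k)} - \alpha g)$ with respect to $\alpha$ to zero gives the standard closed form $\alpha^{(k)} = g^T g / (g^T Q g)$. A minimal sketch:

```python
import math

# Assumed example data: f(x) = 0.5 x^T Q x - b^T x, minimizer Q^{-1} b = (1, 1).
Q = [[2.0, 0.0], [0.0, 10.0]]
b = [2.0, 10.0]

def mat_vec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def steepest_descent_quadratic(x0, eps=1e-10, max_iter=1000):
    x = list(x0)
    for k in range(max_iter):
        Qx = mat_vec(Q, x)
        g = [Qx[0] - b[0], Qx[1] - b[1]]          # gradient at x
        if math.sqrt(dot(g, g)) < eps:
            return x, k
        alpha = dot(g, g) / dot(g, mat_vec(Q, g))  # exact line-search step
        x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]
    return x, max_iter

x_min, iterations = steepest_descent_quadratic([0.0, 0.0])
print(x_min, iterations)
```

Even with the exact step, the method needs many iterations here because the level sets of this quadratic are elongated.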

  16. Steepest Descent 1.b. Other possibilities for choosing $\alpha^{(k)}$ • Constant step size, i.e. $\alpha^{(k)} = \alpha$ for all $k$ • adv: simple • disadv: no idea of which value of $\alpha$ to choose • If $\alpha$ is too large → diverge • If $\alpha$ is too small → very slow • Variable step size
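The trade-off for a constant step size is easy to see on a one-dimensional quadratic. The sketch below assumes $f(x) = 5 x^2$, so $f'(x) = 10 x$ and each step multiplies $x$ by $(1 - 10\alpha)$; the function and the step values are illustrative choices only.

```python
def constant_step_descent(alpha, x0=1.0, steps=50):
    # Gradient descent on the assumed f(x) = 5 x^2, whose derivative is 10 x.
    x = x0
    for _ in range(steps):
        x = x - alpha * 10.0 * x
    return x

too_large = constant_step_descent(0.25)   # factor -1.5 per step: diverges
too_small = constant_step_descent(0.001)  # factor 0.99 per step: very slow
moderate = constant_step_descent(0.05)    # factor 0.5 per step: converges
print(too_large, too_small, moderate)
```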

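  17. Steepest Descent 1.b. Other possibilities for choosing $\alpha^{(k)}$ • Polynomial fit methods • (i) Quadratic fit • Guess three values for $\alpha$, say $\alpha_1$, $\alpha_2$, $\alpha_3$. • Let $g(\alpha_i) = a \alpha_i^2 + b \alpha_i + c$ for $i = 1, 2, 3$. • Solve for $a$, $b$, $c$; the fitted quadratic is minimized by $\alpha = -b / (2a)$. • Check that $a > 0$, so that the stationary point is a minimum.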
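The quadratic fit can be sketched with divided differences instead of solving the 3x3 linear system explicitly. The function name `quadratic_fit_step` and the sample `g` are hypothetical, chosen only for illustration.

```python
def quadratic_fit_step(g, a1, a2, a3):
    # Fit a*alpha^2 + b*alpha + c through three samples of g and return
    # the minimizer -b / (2a) of the fitted parabola.
    g1, g2, g3 = g(a1), g(a2), g(a3)
    slope12 = (g2 - g1) / (a2 - a1)
    slope13 = (g3 - g1) / (a3 - a1)
    a = (slope13 - slope12) / (a3 - a2)
    b = slope12 - a * (a1 + a2)
    if a <= 0.0:
        raise ValueError("fitted parabola has no minimum (a <= 0)")
    return -b / (2.0 * a)

# Assumed example: g is itself quadratic, so the fit is exact.
alpha_star = quadratic_fit_step(lambda t: (t - 0.3) ** 2 + 1.0, 0.0, 0.5, 1.0)
print(alpha_star)
```

When `g` is not exactly quadratic, the returned value is only an estimate, and the three sample points are usually refreshed and the fit repeated.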

  18. Steepest Descent 1.b. Other possibilities for choosing $\alpha^{(k)}$ • Polynomial fit methods • (ii) Cubic fit

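  19. Steepest Descent 1.b. Other possibilities for choosing $\alpha^{(k)}$ • Region elimination methods • Assume $g(\alpha)$ is convex over $[a, b]$, i.e. it has one minimum there. Evaluate $g_1 = g(\alpha_1)$ and $g_2 = g(\alpha_2)$ at two interior points $a < \alpha_1 < \alpha_2 < b$. (a) $g_1 > g_2$: $[a, \alpha_1)$ is eliminated. (b) $g_1 < g_2$: $(\alpha_2, b]$ is eliminated. (c) $g_1 = g_2$: both $[a, \alpha_1)$ and $(\alpha_2, b]$ are eliminated. The initial interval of uncertainty is $[a, b]$; the next interval of uncertainty is $[\alpha_1, b]$ for (a), $[a, \alpha_2]$ for (b), and $[\alpha_1, \alpha_2]$ for (c).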

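  20. Steepest Descent [Q]: How do we choose $\alpha_1$ and $\alpha_2$? (i) Two-point equal-interval search, i.e. $\alpha_1 - a = \alpha_2 - \alpha_1 = b - \alpha_2$. Each iteration keeps at most 2/3 of the current interval, so with initial length $L_0 = b - a$ the interval of uncertainty after the 1st, 2nd, 3rd, ..., $k$-th iteration has length at most $(2/3) L_0$, $(2/3)^2 L_0$, $(2/3)^3 L_0$, ..., $(2/3)^k L_0$.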
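The two-point equal-interval rule can be sketched as a short loop. It assumes `g` has a single minimum on `[a, b]`; the function name and the sample `g` are hypothetical.

```python
def equal_interval_search(g, a, b, tol=1e-6):
    # Region elimination with two interior points at 1/3 and 2/3 of [a, b].
    while b - a > tol:
        a1 = a + (b - a) / 3.0
        a2 = a + 2.0 * (b - a) / 3.0
        g1, g2 = g(a1), g(a2)
        if g1 > g2:
            a = a1            # minimum cannot lie in [a, a1)
        elif g1 < g2:
            b = a2            # minimum cannot lie in (a2, b]
        else:
            a, b = a1, a2     # minimum lies between the two points
    return a, b

lo, hi = equal_interval_search(lambda t: (t - 0.3) ** 2, 0.0, 1.0)
print(lo, hi)
```

Each pass costs two new evaluations of `g`, which is what the Fibonacci and golden section variants improve on by reusing one of the two points.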

  21. Steepest Descent [Q]: How do we choose $\alpha_1$ and $\alpha_2$? (ii) Fibonacci Search method For N-search iteration Example: Let N=5, initial a = 0 , b = 1

  22. Steepest Descent [Q]: How do we choose $\alpha_1$ and $\alpha_2$? (iii) Golden Section Method then use until Example: then then etc…
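The formulas on this slide were not preserved, so the following is the textbook golden section search rather than a reconstruction of the slide. It places the two interior points at the fractions $1 - \tau$ and $\tau$ of the interval, with $\tau = (\sqrt{5} - 1)/2 \approx 0.618$, so that one point and its function value can be reused after each elimination. The function name and the sample `g` are assumptions.

```python
import math

def golden_section_search(g, a, b, tol=1e-6):
    # Assumes g has a single minimum on [a, b].
    tau = (math.sqrt(5.0) - 1.0) / 2.0
    a1 = b - tau * (b - a)
    a2 = a + tau * (b - a)
    g1, g2 = g(a1), g(a2)
    while b - a > tol:
        if g1 > g2:
            # Drop [a, a1); the old a2 becomes the new a1.
            a, a1, g1 = a1, a2, g2
            a2 = a + tau * (b - a)
            g2 = g(a2)
        else:
            # Drop (a2, b]; the old a1 becomes the new a2.
            b, a2, g2 = a2, a1, g1
            a1 = b - tau * (b - a)
            g1 = g(a1)
    return 0.5 * (a + b)

alpha_star = golden_section_search(lambda t: (t - 0.3) ** 2, 0.0, 1.0)
print(alpha_star)
```

The reuse works because $\tau^2 = 1 - \tau$, so only one new evaluation of `g` is needed per iteration.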

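  23. Steepest Descent Flow chart of steepest descent: (1) Initial guess $x^{(0)}$, $k = 0$. (2) Compute $\nabla f(x^{(k)})$. (3) If $\|\nabla f(x^{(k)})\| < \epsilon$: stop, $x^{(k)}$ is the minimum. (4) Otherwise determine $\alpha^{(k)}$ by one of: a trial set $\alpha \in \{\alpha_1, \ldots, \alpha_n\}$; polynomial fit (cubic, …); region elimination (…). (5) Set $x^{(k+1)} = x^{(k)} - \alpha^{(k)} \nabla f(x^{(k)})$, $k = k + 1$, and return to (2).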
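The flow chart can be sketched as one loop. This version picks golden section search as the step-size rule and searches $\alpha$ over the assumed interval `[0, alpha_max]`; the objective, starting point, and all names are illustrative assumptions rather than the slide's example.

```python
import math

def f(x):
    # Illustrative convex quadratic (an assumption); minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

def grad_f(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]

def golden_section(g, a, b, tol=1e-10):
    # Region-elimination line search; assumes one minimum of g on [a, b].
    tau = (math.sqrt(5.0) - 1.0) / 2.0
    a1, a2 = b - tau * (b - a), a + tau * (b - a)
    g1, g2 = g(a1), g(a2)
    while b - a > tol:
        if g1 > g2:
            a, a1, g1 = a1, a2, g2
            a2 = a + tau * (b - a)
            g2 = g(a2)
        else:
            b, a2, g2 = a2, a1, g1
            a1 = b - tau * (b - a)
            g1 = g(a1)
    return 0.5 * (a + b)

def steepest_descent(x0, eps=1e-6, alpha_max=1.0, max_iter=1000):
    x = list(x0)
    for k in range(max_iter):
        g = grad_f(x)                                  # compute gradient
        if math.hypot(g[0], g[1]) < eps:               # stopping test
            return x, k
        line = lambda alpha: f([xi - alpha * gi for xi, gi in zip(x, g)])
        alpha = golden_section(line, 0.0, alpha_max)   # determine step size
        x = [xi - alpha * gi for xi, gi in zip(x, g)]  # update the guess
    return x, max_iter

x_min, iterations = steepest_descent([5.0, 3.0])
print(x_min, iterations)
```

The interval `[0, 1]` is wide enough for this objective; for other functions `alpha_max` would have to be chosen or bracketed separately.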

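  24. Steepest Descent [Q]: Is the direction of $-\nabla f(x)$ the "best" direction to go? Suppose $f(x)$ is quadratic with Hessian $Q$ and the initial guess is $x^{(0)}$. Consider the next guess $x^{(1)} = x^{(0)} - M \nabla f(x^{(0)})$. What should $M$ be such that $x^{(1)}$ is the minimum, i.e. $\nabla f(x^{(1)}) = 0$? Since $\nabla f(x^{(1)}) = \nabla f(x^{(0)}) + Q (x^{(1)} - x^{(0)}) = (I - QM) \nabla f(x^{(0)})$, we want $(I - QM) \nabla f(x^{(0)}) = 0$. This holds if $QM = I$ (equivalently $MQ = I$), or $M = Q^{-1}$. Thus, for a quadratic function, $x^{(k+1)} = x^{(k)} - Q^{-1} \nabla f(x^{(k)})$ will take us to the minimum in one iteration no matter what $x^{(0)}$ is.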
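The one-iteration claim can be checked directly on a small example. The sketch assumes the quadratic is written as $f(x) = \frac{1}{2} x^T Q x - b^T x$, so that $\nabla f(x) = Q x - b$; the matrix, vector, and starting points are made up for illustration.

```python
# Assumed example data: Q symmetric positive definite, minimizer Q^{-1} b.
Q = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

def grad_f(x):
    # Gradient of f(x) = 0.5 x^T Q x - b^T x.
    return [Q[0][0] * x[0] + Q[0][1] * x[1] - b[0],
            Q[1][0] * x[0] + Q[1][1] * x[1] - b[1]]

def newton_step(x):
    # x_next = x - Q^{-1} grad f(x), using the explicit 2x2 inverse.
    g = grad_f(x)
    det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    step = [(Q[1][1] * g[0] - Q[0][1] * g[1]) / det,
            (-Q[1][0] * g[0] + Q[0][0] * g[1]) / det]
    return [x[0] - step[0], x[1] - step[1]]

starts = [[0.0, 0.0], [10.0, -7.0], [-3.5, 2.0]]
results = [newton_step(x0) for x0 in starts]
print(results)  # every start lands on the same point, (1/11, 7/11)
```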

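  25. Newton-Raphson Method Minimize $f(x)$. The necessary condition is $\nabla f(x) = 0$. The N-R algorithm finds the roots of $\nabla f(x) = 0$. Guess $x^{(k)}$; then $x^{(k+1)}$ must satisfy $\nabla f(x^{(k)}) + F(x^{(k)}) (x^{(k+1)} - x^{(k)}) = 0$, i.e. $x^{(k+1)} = x^{(k)} - [F(x^{(k)})]^{-1} \nabla f(x^{(k)})$. Note: the iteration does not always converge.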

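  26. Newton-Raphson Method A more formal derivation: Min $f(x^{(k)} + h)$ w.r.t. $h$. Using the second-order expansion $f(x^{(k)} + h) \approx f(x^{(k)}) + \langle \nabla f(x^{(k)}), h \rangle + \frac{1}{2} \langle h, F(x^{(k)}) h \rangle$ and setting its gradient w.r.t. $h$ to zero gives $\nabla f(x^{(k)}) + F(x^{(k)}) h = 0$, so $h = -[F(x^{(k)})]^{-1} \nabla f(x^{(k)})$ and $x^{(k+1)} = x^{(k)} + h$.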
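The Newton-Raphson update can be sketched for two variables by solving $F(x^{(k)}) h = -\nabla f(x^{(k)})$ at each step. The test function $f(x) = e^{x_1} - x_1 + x_2^2$, with minimum at the origin and a Hessian that is p.d. everywhere, is an assumption chosen for illustration, as are all function names.

```python
import math

def grad_f(x):
    # Gradient of the assumed f(x) = exp(x1) - x1 + x2^2.
    return [math.exp(x[0]) - 1.0, 2.0 * x[1]]

def hessian_f(x):
    # Hessian of the same function; p.d. for every x.
    return [[math.exp(x[0]), 0.0], [0.0, 2.0]]

def solve_2x2(A, rhs):
    # Solve A h = rhs by Cramer's rule.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(rhs[0] * A[1][1] - A[0][1] * rhs[1]) / det,
            (A[0][0] * rhs[1] - A[1][0] * rhs[0]) / det]

def newton_raphson(x0, eps=1e-10, max_iter=50):
    x = list(x0)
    for k in range(max_iter):
        g = grad_f(x)
        if math.hypot(g[0], g[1]) < eps:
            return x, k
        h = solve_2x2(hessian_f(x), [-g[0], -g[1]])  # F h = -grad f
        x = [x[0] + h[0], x[1] + h[1]]
    return x, max_iter

x_min, iterations = newton_raphson([1.0, 3.0])
print(x_min, iterations)
```

Solving the linear system instead of forming the inverse Hessian is a common design choice; the two are equivalent for this small example.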

  27. Newton-Raphson Method Remarks: (1) Computing $[F(x^{(k)})]^{-1}$ at every iteration is time consuming → modify the N-R algorithm to calculate $[F(x^{(k)})]^{-1}$ only every M-th iteration. (2) We must check that $F(x^{(k)})$ is p.d. at every iteration. If not → Example :

  28. Newton-Raphson Method The minimum of $f(x)$ is at (0,0). In the nbd of (0,0), $F(x)$ is p.d. Now suppose we start from an initial guess Then the iteration diverges. Remark: (3) The N-R algorithm is good (fast) when the initial guess is close to the minimum, but not very good when it is far from the minimum.
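The slide's own example was not preserved, so the sketch below uses a different, one-dimensional stand-in to illustrate Remark (3): $f(x) = \sqrt{1 + x^2}$, which has its minimum at 0 and a positive second derivative everywhere. For this function the N-R update simplifies to $x^{(k+1)} = -(x^{(k)})^3$, so it converges when $|x^{(0)}| < 1$ and diverges when $|x^{(0)}| > 1$.

```python
import math

def newton_1d(x0, steps=5):
    # N-R on the assumed f(x) = sqrt(1 + x^2):
    #   f'(x)  = x / sqrt(1 + x^2)
    #   f''(x) = (1 + x^2) ** (-1.5)
    x = x0
    for _ in range(steps):
        first = x / math.sqrt(1.0 + x * x)
        second = (1.0 + x * x) ** (-1.5)
        x = x - first / second
    return x

near = newton_1d(0.5)  # starts close to the minimum: converges to 0
far = newton_1d(1.5)   # starts too far away: the iterates blow up
print(near, far)
```

The number of steps is kept small because the diverging iterates grow so fast that a few more iterations would overflow floating point.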
