
Optimization Methods



Presentation Transcript


1. Optimization Methods
• Unconstrained optimization of an objective function F
  • Deterministic, gradient-based methods
    • Running a PDE: will cover later in course
    • Gradient-based (ascent/descent) methods
  • Stochastic methods
    • Simulated annealing
      • Theoretically but not practically interesting
    • Evolutionary (genetic) algorithms
  • Multiscale methods
    • Mean field annealing, graduated nonconvexity, etc.
• Constrained optimization
  • Lagrange multipliers

2. Our Assumptions for Optimization Methods
• With objective function F(p)
• Dimension(p) >> 1, and frequently quite large
• Evaluating F at any p is very expensive
• Evaluating D1F at any p is very, very expensive
• Evaluating D2F at any p is extremely expensive
• True in most image analysis and graphics applications

3. Order of Convergence for Iterative Methods
• |e_{i+1}| = k |e_i|^a in the limit
• a is the order of convergence
• The major factor in speed of convergence
• N steps of a method of order a have order of convergence a^N
• Thus the issue is linear convergence (a = 1) vs. superlinear convergence (a > 1)
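As a rough numerical illustration of the recurrence above (the constant k, the starting error, and the step count are made up for illustration, not from the slides), iterating |e_{i+1}| = k |e_i|^a for a = 1 versus a = 2 shows how quickly superlinear convergence pulls ahead:

```python
# Sketch: error decay for linear (a=1) vs. quadratic (a=2) convergence,
# using the recurrence |e_{i+1}| = k * |e_i|^a with illustrative constants.
def error_sequence(e0, k, a, steps=8):
    errors = [e0]
    for _ in range(steps):
        errors.append(k * errors[-1] ** a)
    return errors

linear = error_sequence(e0=0.5, k=0.5, a=1)      # halves the error each step
quadratic = error_sequence(e0=0.5, k=0.5, a=2)   # squares the error each step

for i, (el, eq) in enumerate(zip(linear, quadratic)):
    print(f"step {i}: linear {el:.2e}   quadratic {eq:.2e}")
```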

4. Ascent/Descent Methods
• At a maximum, D1F (i.e., ∇F) = 0
• Pick a direction of ascent/descent
• Find an approximate maximum in that direction: two possibilities
  • Calculate a step size that will approximately reach the maximum
  • In the search direction, find the actual max within some range
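A minimal sketch of this generic loop, assuming a differentiable objective and using a simple backtracking rule for the first step-size option (the quadratic test function and the constants 1e-4 and 0.5 are illustrative assumptions, not from the slides):

```python
import numpy as np

def descend(F, gradF, p, steps=100, tol=1e-8):
    """Generic descent loop: pick a direction, then shrink the step size
    by backtracking until F actually decreases enough (sufficient decrease)."""
    for _ in range(steps):
        g = gradF(p)
        if np.linalg.norm(g) < tol:
            break
        direction = -g                       # any descent direction could go here
        t = 1.0
        while F(p + t * direction) > F(p) - 1e-4 * t * (g @ g):
            t *= 0.5                         # backtrack until sufficient decrease
        p = p + t * direction
    return p

# Illustrative objective: a simple convex quadratic with minimum at the origin.
F = lambda p: 0.5 * p @ p
gradF = lambda p: p
print(descend(F, gradF, np.array([3.0, -2.0])))
```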

5. Gradient Ascent/Descent Methods
• Direction of ascent/descent is D1F
• If you move to the optimum in that direction, the next direction will be orthogonal to this one
  • Guarantees zigzag
  • Bad behavior for narrow ridges (valleys) of F
• Linear convergence
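The zigzag and the linear convergence can be seen numerically. A small sketch on an illustrative badly conditioned quadratic F(p) = 0.5 p^T A p, using the exact line-search step available for quadratics (the matrix A and starting point are assumptions for illustration):

```python
import numpy as np

# Steepest descent with exact line search on F(p) = 0.5 * p^T A p,
# where A is badly conditioned (a narrow valley).
A = np.diag([1.0, 25.0])
p = np.array([10.0, 1.0])

for i in range(10):
    g = A @ p                         # gradient of the quadratic
    alpha = (g @ g) / (g @ (A @ g))   # exact minimizer along -g
    p = p - alpha * g
    print(f"iter {i}: p = {p}, |p| = {np.linalg.norm(p):.3e}")
# Successive steps alternate direction (zigzag), and |p| only decays
# geometrically, i.e., linear convergence.
```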

6. Newton and Secant Ascent/Descent Methods for F(p)
• We are solving D1F = 0
• Use a Newton or secant equation-solving method to solve it
  • Newton's method to solve f(p) = 0 is p_{i+1} = p_i - [D1f(p_i)]^{-1} f(p_i)
• Newton
  • Move from p to p - (D2F)^{-1} D1F (see the sketch below)
  • Is the direction of ascent/descent the gradient direction D1F?
    • Methods that ascend/descend in the D1F (gradient) direction are inferior
    • The real direction of ascent/descent is the direction of (D2F)^{-1} D1F
  • Also gives you the step size in that direction
• Secant
  • Same as Newton, except replace D2F and D1F by discrete approximations to them built from this and the last n iterates
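A minimal sketch of the Newton update p ← p − (D2F)^{-1} D1F, using the Rosenbrock function with its analytic gradient and Hessian as an illustrative objective (the test function and starting point are assumptions, not from the slides):

```python
import numpy as np

# Rosenbrock: F(x, y) = (1 - x)^2 + 100 (y - x^2)^2, minimum at (1, 1).
def grad(p):
    x, y = p
    return np.array([-2 * (1 - x) - 400 * x * (y - x**2),
                     200 * (y - x**2)])

def hess(p):
    x, y = p
    return np.array([[2 - 400 * (y - x**2) + 800 * x**2, -400 * x],
                     [-400 * x, 200.0]])

p = np.array([-1.2, 1.0])
for i in range(10):
    step = np.linalg.solve(hess(p), grad(p))   # solve (D2F) step = D1F
    p = p - step                               # Newton update: direction and length
    print(f"iter {i}: p = {p}")
# Converges to (1, 1); near the optimum the error is roughly squared each
# iteration, the superlinear behavior the gradient method lacks.
```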

7. Conjugate Gradient Method
• Preferable to gradient descent/ascent methods
• Two major aspects
  • Successive directions for descent/ascent are conjugate: <h_{i+1}, D2F h_i> = 0 in the limit for convex F
    • If true at all steps (quadratic F), convergence in n-1 steps, with n = dim(p); see the sketch below
    • Improvements available using more previous directions
  • In the search direction, find the actual max/min within some range
    • Quadratic convergence depends on <D1F(x_i), h_i> = 0, i.e., F is at a local minimum in the h_i direction
• References
  • Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain (http://www-2.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf)
  • Numerical Recipes
  • Polak, Computational Methods in Optimization, Academic Press
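For the quadratic case the convergence claim refers to, the linear conjugate gradient method fits in a few lines. A minimal sketch (the matrix A and vector b are illustrative), showing A-conjugate directions and termination within dim(p) exact line-search steps:

```python
import numpy as np

def conjugate_gradient(A, b, p, tol=1e-10):
    """Minimize F(p) = 0.5 p^T A p - b^T p (A symmetric positive definite),
    i.e. solve A p = b, using A-conjugate search directions."""
    r = b - A @ p            # residual = -D1F = steepest-descent direction
    h = r.copy()             # first search direction
    for _ in range(len(b)):
        alpha = (r @ r) / (h @ (A @ h))     # exact line search along h
        p = p + alpha * h
        r_new = r - alpha * (A @ h)
        if np.linalg.norm(r_new) < tol:
            break
        beta = (r_new @ r_new) / (r @ r)    # makes the next direction A-conjugate
        h = r_new + beta * h
        r = r_new
    return p

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b, np.zeros(2)))   # matches np.linalg.solve(A, b)
```

For a general F, the nonlinear variants (e.g., Fletcher-Reeves or Polak-Ribière) replace the residual by -D1F and the exact quadratic step by a line search in the current direction, which is where the issues on the next slide come in.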

8. Conjugate Gradient Method Issues
• Preferable to gradient descent/ascent methods
• Must find a local minimum in the search direction
• Will have trouble with
  • Bumpy objective functions
  • Extremely elongated minimum/maximum regions

9. Multiscale Gradient-Based Optimization: To Avoid Local Optima
• Smooth the objective function to put the initial estimate on the hillside of its global optimum
  • E.g., by using larger-scale measurements
• Find its optimum
• Iterate
  • Decrease the scale of the objective function
  • Use the previous optimum as the starting point for the new optimization
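A coarse-to-fine sketch of this loop on an illustrative bumpy 1-D objective (the test function, the list of scales, and the closed-form Gaussian smoothing are assumptions for illustration; scipy's general-purpose minimizer stands in for whatever gradient-based optimizer is used at each scale):

```python
import numpy as np
from scipy.optimize import minimize

def F(x):
    """Illustrative bumpy objective: a shallow bowl plus an oscillation."""
    return 0.02 * x**2 + np.sin(3.0 * x)

def F_smoothed(x, scale):
    """Gaussian-smoothed F at the given scale, in closed form:
    x^2 -> x^2 + scale^2 and sin(3x) -> exp(-(3*scale)^2 / 2) * sin(3x)."""
    return 0.02 * (x**2 + scale**2) + np.exp(-4.5 * scale**2) * np.sin(3.0 * x)

p = 6.0                                   # initial estimate, far from the global optimum
for scale in [3.0, 1.5, 0.5, 0.0]:        # decreasing scale; scale 0 is the original F
    result = minimize(lambda x: F_smoothed(x[0], scale), x0=[p])
    p = result.x[0]                       # previous optimum seeds the next, finer scale
    print(f"scale {scale}: optimum near x = {p:.3f}, F = {F(p):.3f}")
# The heavily smoothed objective has a single basin; each refinement keeps the
# iterate on the hillside of the global optimum of the original F.
```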

10. Multiscale Gradient-Based Optimization: Example Methods
• General methods
  • Graduated non-convexity [Blake & Zisserman, 1987]
  • Mean field annealing [Bilbro, Snyder, et al., 1992]
• In image analysis
  • Vary the degree of globality of the geometric representation

11. Optimization under Constraints by Lagrange Multiplier(s)
• To optimize F(p) over p subject to g_i(p) = 0, i = 1, 2, …, N, with p having n parameters
• Create the function F(p) + Σ_i λ_i g_i(p)
• Find a critical point for it over p and λ
  • Solve D1_{p,λ}[F(p) + Σ_i λ_i g_i(p)] = 0 (see the worked sketch below)
  • n + N equations in n + N unknowns
  • N of the equations are just g_i(p) = 0, i = 1, 2, …, N
• The critical point will need to be an optimum w.r.t. p
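A small worked sketch of this n + N system on an illustrative problem (optimize F(p) = p1 + p2 on the unit circle, so n = 2 and N = 1), handing the stationarity equations to a numerical root finder; the specific problem, starting guess, and the use of scipy.optimize.fsolve are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import fsolve

# Illustrative problem: optimize F(p) = p1 + p2 subject to g(p) = p1^2 + p2^2 - 1 = 0.
# Lagrangian L(p, lam) = F(p) + lam * g(p); we solve D1_{p,lam} L = 0.
def stationarity(v):
    p1, p2, lam = v
    return [1 + 2 * lam * p1,        # dL/dp1 = 0
            1 + 2 * lam * p2,        # dL/dp2 = 0
            p1**2 + p2**2 - 1]       # dL/dlam = 0 is just the constraint g(p) = 0

root = fsolve(stationarity, x0=[1.0, 0.0, -1.0])
print(root)   # roughly [0.707, 0.707, -0.707]: a critical point of the Lagrangian
# As the slide notes, the critical point still has to be checked to be the
# desired optimum w.r.t. p (here it is the constrained maximum of p1 + p2).
```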

12. Stochastic Methods
• Needed when the objective function is bumpy, has many variables, or has a gradient that is hard to compute
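Simulated annealing, listed on slide 1, is one such gradient-free method. A minimal sketch on an illustrative bumpy 1-D objective (the proposal width, cooling schedule, and step count are assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def F(x):
    """Bumpy objective with many local minima; no gradient is needed."""
    return 0.02 * x**2 + np.sin(3.0 * x)

x = 6.0                                      # initial estimate
fx = F(x)
T = 2.0                                      # initial temperature
for step in range(5000):
    candidate = x + rng.normal(scale=0.5)    # random proposal
    fc = F(candidate)
    # Accept downhill moves always; accept uphill moves with probability exp(-dF/T),
    # which lets the search escape local minima while T is still large.
    if fc < fx or rng.random() < np.exp(-(fc - fx) / T):
        x, fx = candidate, fc
    T *= 0.999                               # slow geometric cooling
print(x, fx)                                 # typically ends near the global minimum basin
```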
