Chapter 2-OPTIMIZATION

Chapter 2-OPTIMIZATION. G.Anuradha. Contents. Derivative-based Optimization Descent Methods The Method of Steepest Descent Classical Newton’s Method Step Size Determination Derivative-free Optimization Genetic Algorithms Simulated Annealing Random Search Downhill Simplex Search.

Presentation Transcript


  1. Chapter 2-OPTIMIZATION G.Anuradha

  2. Contents • Derivative-based Optimization • Descent Methods • The Method of Steepest Descent • Classical Newton’s Method • Step Size Determination • Derivative-free Optimization • Genetic Algorithms • Simulated Annealing • Random Search • Downhill Simplex Search

  3. What is Optimization? • Choosing the best element from some set of available alternatives • Solving problems in which one seeks to minimize or maximize a real function

  4. Notation of Optimization Optimize $y = f(x_1, x_2, \ldots, x_n)$ (1) subject to $g_j(x_1, x_2, \ldots, x_n) \; \{\le, \ge, =\} \; b_j$ (2), where $j = 1, 2, \ldots, m$. Eq. (1) is the objective function. Eq. (2) is a set of constraints imposed on the solution. $x_1, x_2, \ldots, x_n$ are the decision variables. Note: the problem is either to maximize or to minimize the value of the objective function.

  5. Complicating factors in optimization • Existence of multiple decision variables • Complex nature of the relationships between the decision variables and the associated income • Existence of one or more complex constraints on the decision variables

  6. Types of optimization • Constrained: the solution is arrived at by maximizing or minimizing the objective function subject to constraints on the decision variables • Unconstrained: no constraints are imposed on the decision variables, and differential calculus can be used to analyze the problem

  7. Least Squares Methods for System Identification • System identification: determining a mathematical model for an unknown system by observing its input-output data pairs • System identification is required: • To predict a system's behavior • To explain the interactions and relationships between inputs and outputs • To design a controller • System identification involves two steps: • Structure identification • Parameter identification

  8. Structure identification • Apply a priori knowledge about the target system to determine a class of models within which the search for the most suitable model is conducted • y=f(u;θ) y – model’s output u – Input Vector θ – parameter vector

  9. Parameter Identification • The structure of the model is known, and optimization techniques are applied to determine the parameter vector $\theta = \hat{\theta}$

  10. Block diagram of parameter identification

  11. Parameter identification • An input $u_i$ is applied to both the system and the model • The difference between the target system's output $y_i$ and the model's output $\hat{y}_i$ is used to update the parameter vector $\theta$ so as to minimize that difference • System identification is not a one-pass process; structure identification and parameter identification must be carried out repeatedly

  12. Classification of Optimization algorithms • Derivative-based algorithms • Derivative-free algorithms

  13. Characteristics of derivative-free algorithms • Derivative freeness: they rely only on repeated evaluation of the objective function • Intuitive guidelines: concepts are based on nature's wisdom, such as evolution and thermodynamics • Slowness: they are generally slower than derivative-based methods • Flexibility • Randomness: global optimizers • Analytic opacity: knowledge about them is based on empirical studies • Iterative nature

  14. Characteristics of derivative-free algorithms • Stopping condition of iteration: let $k$ denote the iteration count and $f_k$ the best objective function value obtained at count $k$. The stopping condition depends on • Computation time • Optimization goal • Minimal improvement • Minimal relative improvement
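The four stopping rules listed above can be combined into one check. The following is a minimal sketch, not part of the original slides: it assumes minimization, a non-increasing history of best values, and hypothetical threshold names and defaults (`max_seconds`, `goal`, `min_improvement`, `min_relative_improvement`, `window`).

```python
import time


def should_stop(k, f_history, start_time, *, max_seconds=1.0, goal=1e-6,
                min_improvement=1e-9, min_relative_improvement=1e-9,
                window=10):
    """Return the name of the first stopping rule that fires, or None.

    f_history[k] is f_k, the best objective value found up to iteration k.
    All thresholds are illustrative assumptions, not values from the slides.
    """
    f_k = f_history[k]
    # Rule 1: computation time budget exceeded.
    if time.monotonic() - start_time > max_seconds:
        return "computation time"
    # Rule 2: the optimization goal has been reached.
    if f_k <= goal:
        return "optimization goal"
    if k >= window:
        f_old = f_history[k - window]
        improvement = f_old - f_k
        # Rule 3: absolute improvement over the last `window` iterations.
        if improvement < min_improvement:
            return "minimal improvement"
        # Rule 4: improvement relative to the older value.
        if improvement / max(abs(f_old), 1e-300) < min_relative_improvement:
            return "minimal relative improvement"
    return None
```

An optimizer loop would call `should_stop` once per iteration and exit when the result is not `None`.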

  15. Basics of Matrix Manipulation and Calculus

  16. Basics of Matrix Manipulation and Calculus

  17. Gradient of a Scalar Function

  18. Jacobian of a Vector Function

  19. Least Squares Estimator • The method of least squares is a standard approach to the approximate solution of overdetermined systems. • Least squares: the overall solution minimizes the sum of the squares of the errors made in solving every single equation • Application: data fitting
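As a small data-fitting illustration of the least-squares idea, the sketch below fits a straight line $y = \theta_0 + \theta_1 u$ by solving the 2x2 normal equations $(A^T A)\theta = A^T y$ in closed form, where each row of $A$ is $[1, u_i]$. The function name `fit_line` and the sample data are assumptions made for this example.

```python
def fit_line(us, ys):
    """Least-squares fit of y = theta0 + theta1 * u.

    Solves the normal equations (A^T A) theta = A^T y for the design
    matrix A whose i-th row is [1, u_i].
    """
    m = len(us)
    s_u = sum(us)
    s_uu = sum(u * u for u in us)
    s_y = sum(ys)
    s_uy = sum(u * y for u, y in zip(us, ys))
    # A^T A = [[m, s_u], [s_u, s_uu]] and A^T y = [s_y, s_uy].
    det = m * s_uu - s_u * s_u
    if det == 0:
        raise ValueError("need at least two distinct input values")
    theta0 = (s_uu * s_y - s_u * s_uy) / det
    theta1 = (m * s_uy - s_u * s_y) / det
    return theta0, theta1


# Four noise-free samples of y = 1 + 2u (an overdetermined system).
theta0, theta1 = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

With noisy data the same call returns the line that minimizes the sum of squared residuals rather than an exact fit.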

  20. Types of Least Squares • Least Squares • Linear: the model is a linear combination of the parameters. • The model may represent a straight line, a parabola or any other linear combination of functions • Non-linear: the parameters appear inside functions, such as $\beta^2$ or $e^{\beta x}$. If the derivatives of the model with respect to the parameters are either constant or depend only on the values of the independent variable, the model is linear; otherwise it is non-linear.

  21. Differences between Linear and Non-Linear Least Squares

  22. Linear model Regression Function

  23. Linear model contd… Using matrix notation, where A is an $m \times n$ matrix

  24. Due to noise a small amount of error is added

  25. Least Square Estimator

  26. Problem on Least Square Estimator

  27. Derivative-Based Optimization • Deals with gradient-based optimization techniques, which determine search directions from the objective function's derivative information • Used in optimizing non-linear neuro-fuzzy models • Steepest descent • Conjugate gradient

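  28. First-Order Optimality Condition $F(x) = F(x^* + \Delta x) = F(x^*) + \nabla F(x)^T\big|_{x=x^*}\,\Delta x + \tfrac{1}{2}\,\Delta x^T\,\nabla^2 F(x)\big|_{x=x^*}\,\Delta x + \cdots$ For small $\Delta x$: $F(x^* + \Delta x) \approx F(x^*) + \nabla F(x)^T\big|_{x=x^*}\,\Delta x$. If $x^*$ is a minimum, this implies: $\nabla F(x)^T\big|_{x=x^*}\,\Delta x \ge 0$. If $\nabla F(x)^T\big|_{x=x^*}\,\Delta x > 0$, then $F(x^* - \Delta x) \approx F(x^*) - \nabla F(x)^T\big|_{x=x^*}\,\Delta x < F(x^*)$. But this would imply that $x^*$ is not a minimum. Therefore $\nabla F(x)^T\big|_{x=x^*}\,\Delta x = 0$. Since this must be true for every $\Delta x$, $\nabla F(x)\big|_{x=x^*} = 0$.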

  29. Second-Order Condition If the first-order condition is satisfied (zero gradient), then $F(x^* + \Delta x) = F(x^*) + \tfrac{1}{2}\,\Delta x^T\,\nabla^2 F(x)\big|_{x=x^*}\,\Delta x + \cdots$ A strong minimum will exist at $x^*$ if $\Delta x^T\,\nabla^2 F(x)\big|_{x=x^*}\,\Delta x > 0$ for any $\Delta x \ne 0$. Therefore the Hessian matrix must be positive definite. A matrix A is positive definite if $z^T A z > 0$ for any $z \ne 0$. This is a sufficient condition for optimality. A necessary condition is that the Hessian matrix be positive semidefinite. A matrix A is positive semidefinite if $z^T A z \ge 0$ for any $z$.
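For a symmetric 2x2 Hessian, definiteness can be checked from the principal minors instead of testing every vector $z$. The sketch below is an illustration added here, not taken from the slides; the function name `classify_symmetric_2x2` is hypothetical. It uses the standard facts that a symmetric matrix is positive definite when all leading principal minors are positive, and positive semidefinite when all principal minors are non-negative.

```python
def classify_symmetric_2x2(a11, a12, a22):
    """Classify the symmetric matrix [[a11, a12], [a12, a22]].

    Positive definite: a11 > 0 and det > 0 (leading principal minors).
    Positive semidefinite: a11 >= 0, a22 >= 0 and det >= 0
    (all principal minors).
    """
    det = a11 * a22 - a12 * a12
    if a11 > 0 and det > 0:
        return "positive definite"
    if a11 >= 0 and a22 >= 0 and det >= 0:
        return "positive semidefinite"
    return "neither"


# Hessian of the illustrative function F(x) = x1^2 + x1*x2 + x2^2.
kind = classify_symmetric_2x2(2.0, 1.0, 2.0)
```

A stationary point whose Hessian is classified as positive definite satisfies the sufficient condition for a strong minimum.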

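  30. Basic Optimization Algorithm $x_{k+1} = x_k + \alpha_k p_k$ or $\Delta x_k = (x_{k+1} - x_k) = \alpha_k p_k$, where $p_k$ is the search direction and $\alpha_k$ is the learning rate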

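  31. Steepest Descent Choose the next step so that the function decreases: $F(x_{k+1}) < F(x_k)$. For small changes in x we can approximate F(x): $F(x_{k+1}) = F(x_k + \Delta x_k) \approx F(x_k) + g_k^T\,\Delta x_k$, where $g_k \equiv \nabla F(x)\big|_{x=x_k}$. If we want the function to decrease: $g_k^T\,\Delta x_k = \alpha_k\,g_k^T p_k < 0$. We can maximize the decrease by choosing $p_k = -g_k$, which gives $x_{k+1} = x_k - \alpha_k g_k$.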
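A minimal sketch of steepest descent with a fixed learning rate follows: each step moves against the gradient, x_new = x - alpha * grad(x). The test function $F(x) = x_1^2 + 25x_2^2$, the starting point, the learning rate and the iteration count are assumptions chosen for illustration and are not claimed to match the example slides.

```python
def steepest_descent(grad, x0, alpha, iterations):
    """Iterate x <- x - alpha * grad(x) with a constant learning rate."""
    x = list(x0)
    for _ in range(iterations):
        g = grad(x)
        # Search direction is the negative gradient.
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x


def grad_f(x):
    """Gradient of the illustrative function F(x) = x1^2 + 25 * x2^2."""
    return [2.0 * x[0], 50.0 * x[1]]


# Starts at (0.5, 0.5) and approaches the minimum at the origin.
x_min = steepest_descent(grad_f, [0.5, 0.5], alpha=0.01, iterations=1000)
```

The learning rate here is deliberately small; a fixed rate that is too large makes the iteration oscillate or diverge.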

  32. Example

  33. Plot

  34. Effect of learning rate • As the learning rate increases, the trajectory becomes oscillatory • Too large a learning rate makes the algorithm unstable • An upper limit on the learning rate can be derived for quadratic functions

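  35. Stable Learning Rates (Quadratic) For a quadratic function $F(x) = \tfrac{1}{2}x^T A x + d^T x + c$ with gradient $\nabla F(x) = Ax + d$, steepest descent with a constant learning rate gives $x_{k+1} = x_k - \alpha(Ax_k + d) = [I - \alpha A]\,x_k - \alpha d$. Stability is determined by the eigenvalues of the matrix $[I - \alpha A]$. Eigenvalues of $[I - \alpha A]$: $(1 - \alpha\lambda_i)$, where $\lambda_i$ is an eigenvalue of A. Stability requirement: $|1 - \alpha\lambda_i| < 1$, which for positive eigenvalues gives $\alpha < 2/\lambda_i$ for every $i$, i.e. $\alpha < 2/\lambda_{\max}$.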
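The bound on the learning rate can be checked numerically. The sketch below assumes a symmetric positive definite 2x2 matrix A and a quadratic with d = 0, so the gradient is simply A x; the helper names and the matrix diag(2, 50) are illustrative choices. It computes 2 / lambda_max and runs the fixed-rate iteration just below and just above that bound.

```python
import math


def eigenvalues_symmetric_2x2(a11, a12, a22):
    """Return (lambda_min, lambda_max) of [[a11, a12], [a12, a22]]."""
    mean = (a11 + a22) / 2.0
    radius = math.hypot((a11 - a22) / 2.0, a12)
    return mean - radius, mean + radius


def max_stable_learning_rate(a11, a12, a22):
    """Upper limit 2 / lambda_max for a positive definite matrix."""
    lam_min, lam_max = eigenvalues_symmetric_2x2(a11, a12, a22)
    if lam_min <= 0:
        raise ValueError("matrix must be positive definite")
    return 2.0 / lam_max


def run_fixed_rate(A, x0, alpha, iterations):
    """Iterate x <- x - alpha * A x (gradient of 0.5 * x^T A x)."""
    x = list(x0)
    for _ in range(iterations):
        g = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x


A = [[2.0, 0.0], [0.0, 50.0]]
bound = max_stable_learning_rate(2.0, 0.0, 50.0)          # 2 / 50
stable = run_fixed_rate(A, [0.5, 0.5], 0.039, 200)        # shrinks
unstable = run_fixed_rate(A, [0.5, 0.5], 0.041, 200)      # grows
```

Just below the bound the iterate along the large-eigenvalue direction alternates in sign while shrinking; just above it the same component grows without limit.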

  36. Example

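  37. Newton’s Method $F(x_{k+1}) = F(x_k + \Delta x_k) \approx F(x_k) + g_k^T\,\Delta x_k + \tfrac{1}{2}\,\Delta x_k^T A_k\,\Delta x_k$, where $A_k$ is the Hessian matrix at $x_k$. Take the gradient of this second-order approximation and set it equal to zero to find the stationary point: $g_k + A_k\,\Delta x_k = 0$, so $\Delta x_k = -A_k^{-1} g_k$ and $x_{k+1} = x_k - A_k^{-1} g_k$.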
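One Newton step in two dimensions can be sketched as follows. The step solves the 2x2 linear system (Hessian times step equals gradient) by Cramer's rule and subtracts the result from the current point. The quadratic $F(x) = x_1^2 + 2x_1x_2 + 2x_2^2$ and the starting point are assumptions for illustration; on any quadratic with an invertible Hessian a single step lands on the stationary point.

```python
def newton_step_2d(x, grad, hess):
    """Return x - H^{-1} g for a 2-D problem, using Cramer's rule."""
    g1, g2 = grad(x)
    (h11, h12), (h21, h22) = hess(x)
    det = h11 * h22 - h12 * h21
    if det == 0:
        raise ValueError("Hessian is singular")
    # Solve H s = g for the step s.
    s1 = (g1 * h22 - h12 * g2) / det
    s2 = (h11 * g2 - h21 * g1) / det
    return [x[0] - s1, x[1] - s2]


def grad_f(x):
    """Gradient of the illustrative F(x) = x1^2 + 2*x1*x2 + 2*x2^2."""
    return [2.0 * x[0] + 2.0 * x[1], 2.0 * x[0] + 4.0 * x[1]]


def hess_f(x):
    """Constant Hessian of the same function."""
    return [[2.0, 2.0], [2.0, 4.0]]


x1 = newton_step_2d([0.5, -0.75], grad_f, hess_f)  # reaches the origin
```

For non-quadratic functions the step would be repeated, and it is only guaranteed to head toward a minimum when the Hessian is positive definite.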

  38. Example

  39. Plot

  40. Conjugate Vectors A set of vectors $\{p_k\}$ is mutually conjugate with respect to a positive definite Hessian matrix A if $p_k^T A\,p_j = 0$ for $k \ne j$. One set of conjugate vectors consists of the eigenvectors of A: $z_k^T A\,z_j = \lambda_j\,z_k^T z_j = 0$ for $k \ne j$. (The eigenvectors of symmetric matrices are orthogonal.)

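  41. For Quadratic Functions With $\nabla F(x) = Ax + d$, the change in the gradient at iteration k is $\Delta g_k = g_{k+1} - g_k = (Ax_{k+1} + d) - (Ax_k + d) = A\,\Delta x_k$, where $\Delta x_k = (x_{k+1} - x_k) = \alpha_k p_k$. The conjugacy conditions can be rewritten $\alpha_k\,p_k^T A\,p_j = \Delta x_k^T A\,p_j = \Delta g_k^T p_j = 0$ for $k \ne j$. This does not require knowledge of the Hessian matrix.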

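  42. Forming Conjugate Directions Choose the initial search direction as the negative of the gradient: $p_0 = -g_0$. Choose subsequent search directions to be conjugate: $p_k = -g_k + \beta_k\,p_{k-1}$, where $\beta_k = \dfrac{\Delta g_{k-1}^T g_k}{\Delta g_{k-1}^T p_{k-1}}$ or $\beta_k = \dfrac{g_k^T g_k}{g_{k-1}^T g_{k-1}}$ or $\beta_k = \dfrac{\Delta g_{k-1}^T g_k}{g_{k-1}^T g_{k-1}}$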

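  43. Conjugate Gradient algorithm • The first search direction is the negative of the gradient: $p_0 = -g_0$. • Select the learning rate to minimize along the line. For quadratic functions, $\alpha_k = -\dfrac{g_k^T p_k}{p_k^T A\,p_k}$. • Select the next search direction to be conjugate, $p_k = -g_k + \beta_k\,p_{k-1}$, and repeat from the second step until the algorithm converges.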
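The steps above can be sketched for a quadratic function $F(x) = \tfrac{1}{2}x^T A x + d^T x$ with gradient $Ax + d$. This is an illustrative sketch under assumptions: it uses the exact line-minimizing step -g^T p / (p^T A p), the update beta = (g_new^T g_new) / (g^T g), which is one of several common choices, and a hypothetical matrix and starting point. For an n-dimensional quadratic with positive definite A the method converges in at most n iterations in exact arithmetic.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(A, v):
    return [dot(row, v) for row in A]


def conjugate_gradient(A, d, x0, iterations):
    """Minimize 0.5 * x^T A x + d^T x for symmetric positive definite A."""
    x = list(x0)
    g = [ax + di for ax, di in zip(mat_vec(A, x), d)]  # gradient A x + d
    p = [-gi for gi in g]                              # first direction: -g
    for _ in range(iterations):
        if dot(g, g) == 0.0:
            break  # already at the stationary point
        # Exact minimization along p for a quadratic function.
        alpha = -dot(g, p) / dot(p, mat_vec(A, p))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        g_new = [ax + di for ax, di in zip(mat_vec(A, x), d)]
        # Next direction is conjugate to the previous one.
        beta = dot(g_new, g_new) / dot(g, g)
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]
        g = g_new
    return x


A = [[2.0, 1.0], [1.0, 2.0]]
x_min = conjugate_gradient(A, [0.0, 0.0], [0.8, -0.25], iterations=2)
```

Two iterations suffice here because the problem is two-dimensional and quadratic; steepest descent on the same function would need many more steps.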

  44. Example

  45. Example

  46. Plots: Conjugate Gradient vs. Steepest Descent

  47. Step size determination: line minimization methods and their stopping criteria • Initial bracketing • Line searches • Newton’s method • Secant method • Sectioning method
