An overview of the SQP and rSQP methodologies
Kedar Kulkarni
Advisor: Prof. Andreas A. Linninger
Laboratory for Product and Process Design, Department of Bioengineering, University of Illinois, Chicago, IL 60607, U.S.A.
Outline of the talk
• Introduction to the Quadratic Programming (QP) problem
• Properties of the QP problem
• Solution methods for the QP problem
• Successive Quadratic Programming (SQP) as a method to solve a general Nonlinear Programming (NLP) problem
  - Introduction to SQP (derivation)
  - Possible modification
  - Case study/visualization
• rSQP as an improvement over SQP
  - Introduction to rSQP (derivation)
  - Case study
• Recap
Introduction to the QP problem:
General constrained optimization problem:
  min f(x)  s.t.  h(x) = 0,  g(x) ≤ 0
Sufficient condition for a local optimum to be global: f and g convex, h linear.
• Quadratic Programming (QP) problem:
  - quadratic objective function f
  - linear g and h
Standard form:
  min  c^T x + (1/2) x^T Q x   s.t.  Ax = b,  x ≥ 0
where x is a vector of n variables that includes the slack variables.
Properties of the QP problem:
(The slide shows three contour plots in the x1-x2 plane, one per case below.)
• Q is positive/negative semidefinite (λi = 0 for some i): a ridge of stationary points (minima/maxima)
• Q is positive/negative definite (λi > 0 for all i, or λi < 0 for all i): the stationary point is a minimum/maximum
• Q is indefinite (λi > 0 for some i and λi < 0 for others): the stationary point is a saddle point
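As a quick illustration of these cases, the eigenvalue test can be coded directly. A minimal sketch using numpy; the 2 x 2 matrices are made-up examples, not from the slides:

```python
import numpy as np

def classify_stationary_point(Q, tol=1e-10):
    """Classify the stationary point of f(x) = c^T x + 0.5 x^T Q x
    from the eigenvalues of the (symmetric) Hessian Q."""
    eig = np.linalg.eigvalsh(Q)  # eigenvalues in ascending order
    if np.all(eig > tol):
        return "minimum (Q positive definite)"
    if np.all(eig < -tol):
        return "maximum (Q negative definite)"
    if np.all(eig >= -tol) or np.all(eig <= tol):
        return "ridge of stationary points (Q semidefinite)"
    return "saddle point (Q indefinite)"

# Hypothetical 2x2 examples
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, 1.0]])))   # minimum
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -1.0]])))  # saddle
```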
Solution methods for the QP problem:
Construct the Lagrangian for the QP problem:
  L(x, λ, μ) = c^T x + (1/2) x^T Q x + λ^T (Ax − b) − μ^T x
Write the Karush-Kuhn-Tucker (KKT) conditions for the QP problem:
  c + Qx + A^T λ − μ = 0,  Ax = b,  x ≥ 0,  μ ≥ 0,  μ^T x = 0
• This is a linear set of equations except for the last (complementarity) condition
• It can be solved using "LP machinery" by a modified Simplex method (Wolfe, 1959), which works only if Q is positive definite
• Other methods: (1) complementary pivoting (Lemke), which is faster and works for positive semidefinite Q; (2) range- and null-space methods (Gill and Murray)
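For the equality-constrained case (or once the active set is fixed), the KKT system above is purely linear and can be solved directly. A minimal numpy sketch, with made-up problem data:

```python
import numpy as np

# Equality-constrained QP: min c^T x + 0.5 x^T Q x  s.t.  A x = b
# Stationarity: Q x + A^T lam = -c ; feasibility: A x = b.
# Hypothetical data (n = 2 variables, m = 1 constraint):
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, -5.0])
A = np.array([[1.0, 1.0]])
b = np.array([3.0])

n, m = Q.shape[0], A.shape[0]
# Assemble and solve the linear KKT system [[Q, A^T], [A, 0]] [x; lam] = [-c; b]
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([-c, b])
sol = np.linalg.solve(K, rhs)
x, lam = sol[:n], sol[n:]
print("x* =", x, "lambda* =", lam)
```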
Introduction to SQP:
• Solves a sequence of QP approximations to an NLP problem
• The objective is a quadratic approximation to the Lagrangian function
• The algorithm is simply Newton's method applied to the set of equations obtained from the KKT conditions!
Consider the general constrained optimization problem again:
  min f(x)  s.t.  h(x) = 0,  g(x) ≤ 0
KKT conditions:
  ∇f(x*) + ∇h(x*) λ* + ∇g(x*) μ* = 0
  h(x*) = 0,  g(x*) ≤ 0,  μ* ≥ 0,  μ*^T g(x*) = 0
Introduction to SQP:
• Treating this as a system of equations in x*, λ*, and μ*, we write the following Newton step (shown for the equality-constrained case, with W^i = ∇²L(x^i, λ^i)):

  [ W^i         ∇h(x^i) ] [ d  ]      [ ∇L(x^i, λ^i) ]
  [ ∇h(x^i)^T      0    ] [ Δλ ]  = − [ h(x^i)       ]

• These equations are exactly the KKT conditions of the following optimization problem!

  min  ∇f(x^i)^T d + (1/2) d^T W^i d
  s.t. h(x^i) + ∇h(x^i)^T d = 0

• This is a QP problem; its solution is determined by the properties of the Hessian of the Lagrangian, W^i
Introduction to SQP:
• Equivalent and more general form (inequalities linearized as well):

  min  ∇f(x^i)^T d + (1/2) d^T B^i d
  s.t. h(x^i) + ∇h(x^i)^T d = 0
       g(x^i) + ∇g(x^i)^T d ≤ 0

• The Hessian is not always positive definite => a non-convex QP, which is difficult to solve
• Remedy: at each iteration, approximate the Hessian with a matrix B^i that is symmetric and positive definite
• This is a quasi-Newton secant approximation
• B^{i+1} as a function of B^i is given by the BFGS update
BFGS update:
• With s = x^{i+1} − x^i and y = ∇L(x^{i+1}, λ^{i+1}) − ∇L(x^i, λ^{i+1}), s and y are known; we must determine B^{i+1} satisfying the secant condition B^{i+1} s = y
• Too many solutions are possible, so B^{i+1} is obtained as the result of an optimization problem: the closest symmetric, positive definite matrix to B^i that satisfies the secant condition
• This yields, via the Broyden family, the BFGS update:

  B^{i+1} = B^i − (B^i s s^T B^i) / (s^T B^i s) + (y y^T) / (s^T y)

• If B^i is positive definite and s^T y > 0, then B^{i+1} is also positive definite
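A sketch of this update in code; skipping the update when s^T y ≤ 0 is one common safeguard (Powell damping is another), and the test vectors are made up:

```python
import numpy as np

def bfgs_update(B, s, y, tol=1e-12):
    """BFGS update of the Lagrangian-Hessian approximation:
        B+ = B - (B s s^T B) / (s^T B s) + (y y^T) / (s^T y)
    with s = x_{i+1} - x_i and y = grad L_{i+1} - grad L_i.
    If s^T y <= 0 the update could destroy positive definiteness,
    so it is simply skipped here."""
    sBs = s @ B @ s
    sy = s @ y
    if sy <= tol or sBs <= tol:
        return B  # skip update to preserve positive definiteness
    Bs = B @ s
    return B - np.outer(Bs, Bs) / sBs + np.outer(y, y) / sy

# Hypothetical step and gradient-difference vectors
B = np.eye(2)
s = np.array([1.0, 0.5])
y = np.array([0.8, 0.6])
B_new = bfgs_update(B, s, y)
print(np.linalg.eigvalsh(B_new))  # all eigenvalues stay positive
```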
Possible modification:
• Choose a step length α to ensure progress towards the optimum: x^{i+1} = x^i + α d
• α is chosen by making sure that a merit function is decreased at each iteration, e.g. an exact penalty function or an augmented Lagrangian
Exact penalty function:
  P(x; ρ) = f(x) + ρ [ Σ_j |h_j(x)| + Σ_j max(0, g_j(x)) ]
• Newton-like convergence properties of SQP:
  - fast local convergence
  - trust-region adaptations provide a stronger guarantee of global convergence
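A minimal sketch of such a merit-function line search, using the exact penalty above with a simple decrease test; the penalty weight ρ, the backtracking factor, and the example problem are all assumptions for illustration (production SQP codes use an Armijo-type condition instead):

```python
import numpy as np

def l1_merit(f, h, x, rho):
    """Exact l1 penalty merit function: P(x) = f(x) + rho * sum |h_j(x)|
    (equality constraints only, for brevity)."""
    return f(x) + rho * np.sum(np.abs(h(x)))

def backtracking_step(f, h, x, d, rho=10.0, beta=0.5, max_iter=30):
    """Shrink alpha until the merit function decreases."""
    P0 = l1_merit(f, h, x, rho)
    alpha = 1.0
    for _ in range(max_iter):
        if l1_merit(f, h, x + alpha * d, rho) < P0:
            return alpha
        alpha *= beta
    return alpha

# Hypothetical problem: min (x1-1)^2 + (x2-2)^2  s.t.  x1 + x2 - 1 = 0
f = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2
h = lambda x: np.array([x[0] + x[1] - 1.0])
x = np.array([0.0, 0.0])
d = np.array([0.5, 0.5])   # a made-up search direction
print("alpha =", backtracking_step(f, h, x, d))
```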
Case study:
Choose x0 = [0, 0]^T as the initial guess.
• k = 0: solve the QP subproblem for the search direction d, then perform a line search (iteration shown graphically on the slide)
Visualization:
• k = 1: the iterate moves toward the optimum within the feasible region (contour plot with the feasible region shown on the slide)
• k = 2: the QP subproblem yields d = 0, so no search direction remains; the current iterate is the solution
SQP: A few comments
• State-of-the-art among NLP solvers; requires the fewest function evaluations
• Does not require feasible points at intermediate iterations
• Sensitive to the scaling of functions and variables; performs poorly on ill-conditioned QP subproblems
• Not efficient for problems with a large number of variables (n > 100): the computational time per iteration grows due to the presence of dense matrices
• Reduced-space methods (rSQP, MINOS) are large-scale adaptations of SQP
Introduction to rSQP:
Consider the general constrained optimization problem again. At SQP iteration i, the KKT conditions of the QP subproblem, with z = [x^T s^T]^T collecting the variables and slacks, give the linear system

  [ B^i   A^T ] [ d ]      [ ∇f(z^i) ]
  [ A      0  ] [ λ ]  = − [ c(z^i)  ]

The second row is A d = −c, with n > m: (n − m) free variables and m dependent variables.
• To solve this system we can exploit the properties of the null space of the matrix A. Partition A as
  A = [ N | C ],  where N is m × (n − m) and C is m × m (nonsingular)
• Z is a basis for the null space of A; it can be written in terms of N and C as
  Z = [ I ; −C^{-1} N ]
  Check AZ = N − C C^{-1} N = 0 !!
Introduction to rSQP:
• Now choose Y such that [Y | Z] is non-singular and well-conditioned; the coordinate basis Y = [0 ; I] is the simplest choice
• It remains to find dY and dZ. Let d = Y dY + Z dZ in the optimality conditions of the QP, and project the stationarity row onto Z and Y:

  (Z^T B Z) dZ + (Z^T B Y) dY = −Z^T ∇f
  (Y^T B Y) dY + (Y^T B Z) dZ + (AY)^T λ = −Y^T ∇f
  (AY) dY = −c

• The last row can be used to solve for dY:  C dY = −c,  i.e.  dY = −C^{-1} c
• This value can then be used to solve for dZ using the first (null-space) row:

  (Z^T B Z) dZ = −Z^T (∇f + B Y dY)

• This is okay if there are no bounds on z. If there are bounds too, the reduced QP in dZ is solved subject to  z_L ≤ z^i + Y dY + Z dZ ≤ z_U
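Putting the pieces together, a sketch of one reduced-space step in numpy; it takes the trailing m columns of A as the nonsingular basis C and ignores bounds:

```python
import numpy as np

def rsqp_step(B, A, g, c, m):
    """One reduced-space step for the QP  min g^T d + 0.5 d^T B d  s.t.  A d = -c.
    A (m x n) is partitioned A = [N | C] with C the nonsingular m x m block,
    Z = [I; -C^{-1}N] spans null(A), and Y = [0; I] is the coordinate basis."""
    n = A.shape[1]
    N, C = A[:, :n - m], A[:, n - m:]
    Cinv_N = np.linalg.solve(C, N)
    Z = np.vstack([np.eye(n - m), -Cinv_N])
    Y = np.vstack([np.zeros((n - m, m)), np.eye(m)])
    dY = np.linalg.solve(C, -c)              # constraint row: C dY = -c
    rhs = -Z.T @ (g + B @ (Y @ dY))
    dZ = np.linalg.solve(Z.T @ B @ Z, rhs)   # reduced ((n-m) x (n-m)) Hessian system
    return Y @ dY + Z @ dZ
```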
Case study:
• At iteration i, consider a QP subproblem with n = 3 variables and m = 2 constraints (the specific B, A, ∇f, and c were given numerically on the slides)
• Comparing with the standard form, partition A = [N | C] and choose C as the nonsingular trailing 2 × 2 block
• Z can then be evaluated as Z = [I ; −C^{-1}N] (check AZ = 0 !!)
• Choose the coordinate basis Y = [0 ; I]
• Rewrite the last row, (AY) dY = −c, and solve for dY
• Now we have Y, Z, and dY; calculate dZ by solving the reduced system (Z^T B Z) dZ = −Z^T (∇f + B Y dY)
• Finally, assemble the components: d = Y dY + Z dZ
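To exercise these steps numerically, here is a made-up n = 3, m = 2 instance standing in for the slide's lost numbers, reusing the rsqp_step sketch shown earlier; the printed residual confirms A d = −c:

```python
import numpy as np

# Made-up n = 3, m = 2 data (not the slide's values)
B = np.diag([2.0, 3.0, 4.0])                 # positive definite Hessian approx.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])              # C = A[:, 1:] is nonsingular
g = np.array([1.0, -1.0, 2.0])
c = np.array([0.5, -0.5])

d = rsqp_step(B, A, g, c, m=2)               # from the sketch above
print("d =", d)
print("A d + c =", A @ d + c)                # ~0: the step satisfies A d = -c
```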
rSQP: A few comments
• Basically, solve for dY and dZ separately instead of directly solving for d
• More iterations, but less time per iteration
• The full Hessian does not need to be evaluated; we deal only with the reduced (projected) Hessian Z^T B Z, which is (n − m) × (n − m)
• Local convergence properties are similar for SQP and rSQP
Recap (flowchart on the slide): Newton's method solves f(x) = 0; applied to f'(x) = 0 it finds stationary points of an unconstrained problem; applying the KKT optimality conditions to an NLP and taking a quadratic approximation to the Lagrangian yields the QP subproblem; a range- and null-space decomposition of the QP subproblem yields the rSQP subproblem.