480 likes | 706 Views
Nonlinear Programming Models. In LP ... the objective function & constraints are linear and the problems are “ easy ” to solve. Most real-world problems have nonlinear elements and are hard to solve. General NLP. Minimize f ( x ). s.t. g i ( x ) ( , , =) b i , i = 1,…, m.
E N D
Nonlinear Programming Models In LP ... the objective function & constraints are linear and the problems are “easy”to solve. Most real-world problems have nonlinear elements and are hard to solve.
General NLP Minimize f(x) s.t. gi(x) (, , =) bi, i = 1,…,m x is the n-dimensional vector of decision variables f(x) is the objective function gi(x) are the constraint functions bi are fixed known constants
4 Example 1 Max 3x1 + 2x2 2 s.t. x1 + x2£ 1, x1³ 0, x2 unrestricted … c x c x c x Example 2 Max e e e 1 1 2 2 n n s.t. Ax = b, x³0 n å fj(xj) Example 3 Min Problems with “decreasing efficiencies” j=1 s.t. Ax = b, x³0 fj(xj) where each fj(xj)is of the form xj Examples 2 and 3 can be reformulated as LPs
NLP Graphical Solution Method Max f(x1, x2) = x1x2 s.t. 4x1 + x2£ 8 x1, x2 ³ 0 x2 8 f(x) = 2 f(x) = 1 x1 2 Optimal solution will lie on the line g(x) = 4x1 + x2 – 8 = 0.
Solution Characteristics Gradient of f(x) = f(x1, x2) (f/x1, f/x2)T This gives f/x1 = x2, f/x2 = x1 and g/x1 = 4, g/x2 = 1 At optimality we have f(x1, x2) = g(x1, x2) or x2* = 4 and x1* = 1 • Solution is not a vertex of feasible region. • For this particular problem the solution is on the boundary of the feasible region. • This is not always the case.
Nonconvex Function global max stationary point f(x) local max local min local min x Let S Rn be the set of feasible solutions to an NLP. Definition: A global minimum is any x0S such that f(x0) f(x) for all feasible x not equal to x0.
Function with Unique Global Minimum at x = (1, –3) What is the optimal solution if x1³ 0 and x2³ 0 ?
Function with Multiple Maxima and Minima Min {f(x)= sin(x) : 0 x 5p}
Constrained Function with Unique Global Maximum and Unique Global Minimum
Convex for Univariate f : 2 d ( ) f x ≥ 0 for all x. 2 d x Convexity Convex function: If you draw a straight line between any two points on f(x) the line will be above or on the line of f(x). Concave function: If f(x) is convex than - f(x) is concave. Linear functions are both convex and concave.
1-dimensional example Definition of Convexity Let x1 and x2 be two points in S Rn. A function f(x) is convex if and only if f(lx1 + (1–l)x2) ≤ lf(x1) + (1–l)f(x2) for all 0 < l < 1. It is strictly convex if the inequality sign ≤ is replaced with the sign <.
A positively weighted sum of convex functions is convex: if fk(x) k =1,…,m are convex and 1,…,m³ 0 then f(x) = å akfk(x) is convex. m k=1 … … … Theoretical Result for Convex Functions Hessian of f at x: Example: f(x) = 2x13 + 3x22 – 4x12x2 + 5x1-8
f(x) x1 x2 Determining Convexity Single Dimensional Functions: A function f(x) ÎC1 is convex if and only if it is underestimated by linear extrapolation; i.e., f(x2) ≥ f(x1) + (df(x1)/dx)(x2 – x1) for all x1 and x2. A function f(x) ÎC2 is convex if and only if its second derivative is nonnegative. d2f(x)/dx2 ≥ 0 for all x If the inequality is strict (>), the function is strictly convex.
Example: f(x) = 3(x1)2 + 4(x2)3 – 5x1x2 + 4x1 Multiple Dimensional Functions Definition: The Hessian matrix H(x) associated with f(x) is the nn symmetric matrix of second partial derivatives of f(x) with respect to the components of x. When f(x) is quadratic, H(x) has only constant terms; when f(x) is linear, H(x) does not exist.
Properties of the Hessian How can we use Hessian to determine whether or not f(x) is convex? • H(x) is positive semi-definite (PSD) if and only if xTHx≥ 0 for all x and there exists an x 0 such that xTHx≥ 0. • H(x) is positive definite (PD) if and only if xTHx> 0 for all x0. • H(x) is indefinite if and only if xTHx> 0 for some x, and xTHx< 0 for some other x.
Multiple Dimensional Functions and Convexity • f(x) is convex if only if f(x2) ≥ f(x1) + ÑTf(x1)(x2 – x1) for all x1 and x2. • f(x) is convex (strictly convex) if its associated Hessian matrix H(x) is positive semi-definite (definite) for all x. • f(x) is concave if only if f(x2) ≤ f(x1) + ▽Tf(x1)(x2 – x1) for all x1 and x2. • f(x) is concave (strictly concave) if its associated Hessian matrix H(x) is negative semi-definite (definite) for all x. • f(x) is neither convex nor concave if its associated Hessian matrix H(x) is indefinite
Testing for Definiteness Let Hessian, H = Definition: The ith leading principal submatrix of H is the matrix formed taking the intersection of its first i rows and i columns. Let Hi be the value of the corresponding determinant:
Definition • The kth order principalsubmatrices of an nn symmetric matrix A are the kk matrices obtained by deleting n - k rows and the correspondingn - k columns of A (where k = 1, ... , n). • Example
Rules for Definiteness • H is positive definite if and only if the determinants of all the leading principal submatrices are positive; i.e., Hi> 0 for i = 1,…,n. • His negative definite if and only if H1 < 0 and the remaining leading principal determinants alternate in sign: • H2 > 0, H3 < 0, H4 > 0, . . . • H is positive-semidefinite if and only if all principal • submatrices ( Hi ) have nonnegative determinants. • H is negative semi-definiteness if and only if • Hi 0 for i odd and Hi 0 for i even.
Quadratic Functions Example 1: f(x) = 3x1x2 + x12 + 3x22 so H1 = 2 and H2 = 12 – 9 = 3 Conclusion f(x) is strictly convex because H(x) is positive definite.
Quadratic Functions Example 2: f(x) = 24x1x2 + 9x12 + 6x22 • H1 = 18 and H2 = 576 – 576 = 0 → f is not PD • H is positive semi-definite (determinants of all principal submatrices are nonnegative) →f(x) is convex . • Note, xTHx = 18(x1 + (4/3)x2)2≥ 0.
Nonquadratic Functions Example 3: f(x) = (x2 – x12)2 + (1 – x1)2 Thus the Hessian depends on the point under consideration: At x = (1, 1), which is positive definite. At x = (0, 1), which is indefinite. Thus f(x) is not convex although it is strictly convex near (1, 1).
Example Is matrix A PD or PSD or ND or NSD or Indefinite ?
Convex Sets Definition: A set Sn is convex if any point on the line segment connecting any two points x1, x2ÎS is also in S. Mathematically, this is equivalent to x0 = lx1 + (1–l)x2ÎS for all l such that 0 ≤ l ≤ 1. x1 x2 x1 x1 x2 x2
(Nonconvex) Feasible Region S = {(x1, x2) : (0.5x1 – 0.6)x2 ≤ 1 2(x1)2 + 3(x2)2 ≥ 27; x1, x2 ≥ 0}
Convex Sets and Optimization Let S = { xÎn : gi(x) £ bi, i = 1,…,m } Fact:If gi(x) is a convex function for each i = 1,…,m then S is a convex set. Convex Programming Theorem: Let xn and let f(x) be a convex function defined over a convex constraint set S. If a finite solution exists to the problem Minimize{f(x) : xÎS} then all local optima are global optima. If f(x) is strictly convex, the optimum is unique.
Note • Let s = { xn : g(x) b}. Fact:If g (x) is a convex function, then s is a convex set. • Let S = { xn : gi(x) bi, i = 1,…,m } Fact:If gi(x) is a convex function for each i = 1,…,m then S is a convex set. • Let t = { xn : g(x) b}. Fact:If g (x) is a concave function, then t is a convex set. • Let T = { xn : gi(x) bi, i = 1,…,m } Fact:If gi(x) is a concave function for each i = 1,…,m then T is a convex set.
Convex Programming Min f(x1,…,xn) s.t. gi(x1,…,xn) £ bi i = 1,…,m x1 ³ 0,…,xn ³ 0 is a convex program if fis convex and each gi is convex. Max f(x1,…,xn) s.t. gi(x1,…,xn) £ bi i = 1,…,m x1 ³ 0,…,xn ³ 0 is a convex program if f is concave and each gi is convex.
Linearly Constrained Convex Function with Unique Global Maximum Maximize f(x) = (x1 – 2)2 + (x2 – 2)2 subject to –3x1 – 2x2 ≤ –6 –x1 + x2 ≤ 3 x1 + x2 ≤ 7 2x1 – 3x2 ≤ 4
Importance of Convex Programs Commercial optimization software cannot guarantee that a solution is globally optimal to a nonconvex program. NLP algorithms try to find a point where the gradient of the Lagrangian function is zero – a stationary point – and complementary slackness holds. Given L(x,m) = f(x) + m(g(x) – b) we want L(x,m) = 0, g(x) – b ≤0, m[g(x)-b] = 0, x³ 0, m³ 0 However, for a convex program, all local solutions are globally optima.
Max V(r,h) = pr2h s.t. 2pr2 + 2prh = S r³ 0, h³ 0 r h Example: Cylinder Design We want to build a cylinder (with a top and a bottom) of maximum volume such that its surface area is no more than S units. There are a number of ways to approach this problem. One way is to solve the surface area constraint for h and substitute the result into the objective function.
Solution by Substitution S - 2pr2 S - 2pr2 rS Volume = V = pr2 - pr3 [ ] = h = p 2 2 r 2pr 1/2 dV S S S 1/2 = 0 - r = ( ) , h = r = 2( ) 2pr p p dr 6 6 S 3/2 S S 1/2 1/2 ( ) V = pr2h = 2p r = ( ) ) h = 2( p 6 p p 6 6 Is this a global optimal solution?
Test for Convexity dV(r) S d2V(r) rS = -6pr - 3pr2 = - pr3 V(r) = dr 2 2 dr2 2 d V £ 0 for all r ³ 0 2 dr Thus V(r) is concave on r ³ 0 so the solution is a global maximum.
Advertising (with Diminishing Returns) • A company wants to advertise in two regions. • The marketing department says that if $x1 is spent in region 1, sales volume will be 6(x1)1/2. • If $x2 is spent in region 2 the sales volume will be 4(x2)1/2. • The advertising budget is $100. Model: Max f(x) = 6(x1)1/2 + 4(x2)1/2 s.t. x1 + x2£ 100, x1³ 0, x2³ 0 Solution:x1* = 69.2, x2* = 30.8, f(x*) = 72.1 Is this a global optimum?
Portfolio Selection with Risky Assets (Markowitz) • Suppose that we may invest in (up to) n stocks. • Investors worry about (1) expected gain (2) risk. Let mj = expected return sjj = variance of return We are also concerned with the covariance terms: sij= cov (ri, rj) If sij > 0 then returns on i and j are positively correlated. If sij < 0 returns are negatively correlated.
n j=1 R(x) = åmjxj If x1 = x2 = 1, we get Example Decision Variables: xj= # of shares of stock j purchased Expected return of the portfolio: n j=1 n i=1 V(x) = å å sijxixj Variance (measure of risk): V(x) = s11x1x1 + s12x1x2 + s21x2x1 + s22x2x1 = 2 + (-2) + (-2) + 2 = 0 Thus we can construct a “risk-free” portfolio (from variance point of view) if we can find stocks “fully” negatively correlated.
If , then purchasing stock 2 is just like purchasing additional shares of stock 1.
Nonlinear optimization models … 1) Max f(x) = R(x) – bV(x) s.t. å pjxj £ b, xj³ 0, j = 1,…,n where b ³ 0 determined by the decision maker n j=1 Let pj = price of stock j, b = our total budget b = risk-aversion factor (b = 0 risk is not a factor) Consider 3 different models:
Max f(x) = R(x) • s.t. V(x) £ a, å pjxj £ b, xj³ 0, j = 1,…,n • where a ³ 0 is determined by the investor. Smaller values of arepresent greater risk aversion. n j=1 3) Min f(x) = V(x) s.t. R(x) ³ g, å pjxj £ b, xj³ 0, j = 1,…,n where g ³ 0 is the desired rate of return (minimum expectation) is selected by the investor. n j=1