Nonlinear least squares

Given m data points $(t_i, y_i)$, $i = 1, 2, \ldots, m$, we wish to find a vector x of n parameters that gives a best fit in the least squares sense to a model m. For example, consider the exponentially decaying model

$m(x,t) = x_1 e^{-x_2 t}$

where $x_1$ and $x_2$ are unknowns; here n is 2. If $x_2$ were known, the model would be linear. Define the residual r of m components by $r_i(x) = y_i - m(x, t_i)$; we wish to minimize

$\tfrac{1}{2}\,\|r(x)\|_2^2 = \tfrac{1}{2} \sum_{i=1}^{m} r_i(x)^2$
Why consider this special case?
• It is a common problem
• The derivatives have special structure

Let J be the Jacobian of r (column j of J holds the partial derivatives of the components of r with respect to $x_j$). The gradient is $g = J^T r$, and the matrix of second partials is $H = J^T J + S$, where $S = \sum_{i=1}^{m} r_i(x)\, \nabla^2 r_i(x)$ is zero for an exact fit. For the model $x_1 e^{-x_2 t}$, row i of J has the form

$\left[\; -e^{-x_2 t_i} \quad x_1 t_i\, e^{-x_2 t_i} \;\right]$
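A minimal sketch of these quantities in Python (NumPy assumed; the helper names residual, jacobian, and gradient are illustrative, not from the original notes):

```python
import numpy as np

def residual(x, t, y):
    # r_i(x) = y_i - m(x, t_i) for the model m(x, t) = x1 * exp(-x2 * t)
    return y - x[0] * np.exp(-x[1] * t)

def jacobian(x, t):
    # Row i of J is [dr_i/dx1, dr_i/dx2]
    e = np.exp(-x[1] * t)
    return np.column_stack((-e, x[0] * t * e))

def gradient(x, t, y):
    # g = J^T r, the gradient of 0.5 * ||r(x)||_2^2
    return jacobian(x, t).T @ residual(x, t, y)
```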
Gauss-Newton

Let $H = J^T J$ (i.e., ignore the second term S in the Hessian). Perform a Newton iteration:

Until convergence:
    Let s be the solution of $(J(x)^T J(x))\, s = -J(x)^T r(x)$
    Set $x = x + s$

But $(J(x)^T J(x))\, s = -J(x)^T r(x)$ is just the normal-equation form of the linear least squares problem of finding the s that minimizes $\|J(x)\, s + r(x)\|_2$.
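A sketch of the iteration using the helpers above (solving the normal equations directly is for illustration; in practice a QR-based least squares solve of $J s \approx -r$ is numerically preferable):

```python
def gauss_newton(x, t, y, tol=1e-8, max_iter=50):
    # Repeatedly solve J^T J s = -J^T r and update x until the step is small.
    for _ in range(max_iter):
        J = jacobian(x, t)
        r = residual(x, t, y)
        s = np.linalg.solve(J.T @ J, -J.T @ r)   # normal equations
        x = x + s
        if np.linalg.norm(s) <= tol * (1.0 + np.linalg.norm(x)):
            break
    return x
```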
Gauss-Newton on the exponential example with model $x_1 e^{-x_2 t}$

If the data were

  t     y
  0.0   2.0
  1.0   0.7
  2.0   0.3
  3.0   0.1

and initially $x = [1\ \ 0]^T$, then initially (row i being $[-e^{-x_2 t_i}\ \ x_1 t_i e^{-x_2 t_i}]$ evaluated at x)

$J = \begin{bmatrix} -1 & 0 \\ -1 & 1 \\ -1 & 2 \\ -1 & 3 \end{bmatrix}$

Course of the iterations:

  x1      x2      ||r||_2^2
  1.000   0.000   2.390
  1.690   0.610   0.212
  1.975   0.930   0.007
  1.994   1.004   0.002
  1.995   1.009   0.002
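A short driver for the sketches above; it should reproduce this course of iterations (the print format is an illustrative addition):

```python
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([2.0, 0.7, 0.3, 0.1])

x = np.array([1.0, 0.0])
for _ in range(5):
    r = residual(x, t, y)
    print(f"x = [{x[0]:6.3f} {x[1]:6.3f}]   ||r||_2^2 = {r @ r:.3f}")
    J = jacobian(x, t)
    x = x + np.linalg.solve(J.T @ J, -J.T @ r)
```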
Levenberg-Marquardt

Let $H = J^T J + kI$. Perform a Newton iteration:

Until convergence:
    Let s be the solution of $(J(x)^T J(x) + kI)\, s = -J(x)^T r(x)$
    Set $x = x + s$

Rationale:
• If k is big, we just get a (scaled) gradient step, which is good far from the solution
• The data could be noisy, and the second term makes the step smoother

Project alert: how do you choose k?
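The step differs from Gauss-Newton only in the shifted system. A sketch with a fixed k (practical codes adapt k between iterations, which is exactly the project question above):

```python
def levenberg_marquardt_step(x, t, y, k):
    # Solve (J^T J + k I) s = -J^T r for the step s, then update x.
    J = jacobian(x, t)
    r = residual(x, t, y)
    n = J.shape[1]
    s = np.linalg.solve(J.T @ J + k * np.eye(n), -J.T @ r)
    return x + s
```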
How to get derivatives of a difficult function:
• Automatic differentiation: differentiate the program itself. A hot topic and a good project (a tiny sketch follows this list)
• Numerical differentiation
• Bite the bullet and hope you can analytically differentiate the function accurately
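A minimal forward-mode automatic differentiation sketch using dual numbers (the Dual class and the sin wrapper are illustrative inventions, not a production tool):

```python
import math

class Dual:
    # Carry (value, derivative) pairs through arithmetic.
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def sin(x):
    # Chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

# Derivative of sin at x = 1.0, exact to machine precision:
print(sin(Dual(1.0, 1.0)).dot)   # cos(1.0) = 0.540302...
```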
Numerical differentiation of sin at x = 1.0

Using derivative $\approx (f(x+h) - f(x))/h$. As h gets smaller, the truncation error decreases but the roundoff error increases. Choosing h becomes an art.
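A sketch that exhibits the tradeoff: the error first shrinks with h, then grows again as roundoff takes over (for this forward-difference formula the best h is roughly the square root of machine epsilon):

```python
import math

x, exact = 1.0, math.cos(1.0)   # d/dx sin(x) = cos(x)
for k in range(1, 17):
    h = 10.0 ** (-k)
    approx = (math.sin(x + h) - math.sin(x)) / h
    print(f"h = 1e-{k:02d}   error = {abs(approx - exact):.2e}")
```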
Linear programming

Example: a company, which makes steel bands and steel coils, needs to allocate next week's time (40 hours) on a rolling mill.

                        Bands          Coils
  Rate of production    200 tons/hr    140 tons/hr
  Profit per ton        $25            $30
  Orders                6000 tons      4000 tons

Make x tons of bands and y tons of coils to maximize $25x + 30y$ such that

  $x/200 + y/140 \le 40$
  $0 \le x \le 6000$ and $0 \le y \le 4000$
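A sketch of this problem with SciPy's linprog (SciPy assumed; linprog minimizes, so the objective is negated):

```python
from scipy.optimize import linprog

# Maximize 25x + 30y  <=>  minimize -25x - 30y.
c = [-25.0, -30.0]
A_ub = [[1.0 / 200.0, 1.0 / 140.0]]       # mill hours used per ton
b_ub = [40.0]                             # hours available next week
bounds = [(0.0, 6000.0), (0.0, 4000.0)]   # order limits

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.x, -res.fun)   # expected optimum: x = 6000, y = 1400, profit $192,000
```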
• Linear programming: maximizing a linear function subject to linear constraints
• Quadratic programming: maximizing a quadratic function subject to linear constraints
• Mathematical programming: maximizing general functions subject to general constraints
Approaches to linear programming

1. (Simplex, Dantzig, 1940s) The solution lies on the boundary of the feasible region, so go from vertex to vertex, continuing to increase the objective function. Each iteration involves solving a linear system: $O(n^3)$ multiplications. As one jumps to the next vertex, the linear system loses one row and one column and gains one row and one column, so it can be updated in $O(n^2)$ multiplications (Golub/Bartels, 1970).
2. (Karmarkar, 1983) Scale steepest ascent by the distance to the constraints and go almost to the boundary. Requires fewer iterations, and the structure of the system does not change.