400 likes | 834 Views
Least Square Method for Parameter Estimation. Docent Xiao-Zhi Gao Department of Automation and Systems Technology. Fitting Experimental Data. Scatterplot shows some patterns in experimental data When we see a straight line pattern, we want to model the data with a linear equation
E N D
Least Square Method for Parameter Estimation Docent Xiao-Zhi Gao Department of Automation and Systems Technology
Fitting Experimental Data • Scatterplot shows some patterns in experimental data • When we see a straight line pattern, we want to model the data with a linear equation • Fitting experimental data with models allows us to understand data and make predictions
Fitting Experimental Data Do the curve fit a group of given data samples in the best way?
Gauss and Legendre • The method of least square was first published by Legendre in 1805 and by Gauss in 1809 • Both mathematicians applied it to determine the orbits of bodies about the sun • Gauss went on to publish further development of this method in 1821
Fitting Experimental Data • Given: data points, functional form • Goal: find constants in function • Example: given (xi, yi), find line through them; i.e., find a and b in y = ax+b (x6,y6) y=ax+b (x3,y3) (x5,y5) (x1,y1) (x7,y7) (x4,y4) (x2,y2)
Fitting Experimental Data • Example: measure positions of falling objects,fit parabola time p = –1/2 gt2 position Estimate g from fitting the position
Least Square Line • How can we find the best line to fit the data? • We would like to minimize the total distance away from the line • This distance is measured vertically from the point to the line
An Example of Least Squares Line Consider four data samples (1 ,2.1), (2,2.9), (5,6.1), and (7,8.3) , with the fitting line f(x) = 0.9x + 1.4 The squared errors are: x1=1 f(1)=2.3 y1=2.1 e1= (2.3 – 2.1)² = .04 x2=2 f(2)=3.2 y2=2.9 e2= (3.2 – 2.9)² =. 09 x3=5 f(5)=5.9 y3=6.1 e3= (5.9 – 6.1)² = .04 x4=7 f(7)=7.7 y4=8.3 e4= (7.7 – 8.3)² = .36 The total squared error is 0.04 + 0.09 + 0.04 + 0.36 = 0.53 By finding better coefficients of the fitting line, we can make this error even smaller But how to find the best coefficients so that this error is minimized?
Minimizing Vertical Distances between Points and Line • E = (d1)² + (d2)² + (d3)² +…+(dn)² for n data points • E = [f(x1) – y1]² + [f(x2) – y2]² + … + [f(xn) – yn]² • E = [mx1 + b – y1]² + [mx2 + b – y2]² +…+ [mxn + b – yn]² • E= ∑(mxi+ b – yi )²
Minimization of E • How do we deal with this? E = ∑(mxi+ b – yi )² • Treat x and y as constants, since we are trying to find the optimal m and b • Partials of E wrt. to m and b E/m = 0 and E/b = 0
Minimization of E • E/b = ∑2(mxi + b – yi) = 0 m∑xi + ∑b = ∑yi->mSx + bn = Sy • E/m = ∑2xi (mxi + b – yi) = 2∑(mxi² + bxi – xiyi) = 0 m∑xi² + b∑xi = ∑xiyi->mSxx + bSx = Sxy
A Simple Example • Find the linear least squares fit to the data: (1,1), (2,4), (3,8) Sx = 1+2+3= 6 Sxx = 1²+2²+3² = 14 Sy = 1+4+8 = 13 Sxy = 1(1)+2(4)+3(8) = 33 n = number of points = 3 The line of best fit is y = 3.5x – 2.667
Linear Least Squares • General pattern: • Note that dependence on unknowns should be linear
Solving Linear Least Squares • Take partial derivatives:
Solving Linear Least Square • For convenience, rewrite as matrix: • Factors:
Solving Linear Least Squares • for each xi,yicompute f(xi), g(xi), etc. store in row i of A store yi in b • compute (ATA)-1 ATb • (ATA)-1 AT is known as “pseudoinverse” of A
A Special Case: Zero Order • In case of fitting data with a constant (zero order) • In this case, f(xi) = 1, and we have • Mean is the least square estimation for best constant fit (k-means clustering)
Weighted Least Square • Common case: (xi,yi) may have different uncertainties associated with them • Want to give more weights to measurements of which you are more certain • Weighted least square minimization
Weighted Least Squares • Define weight matrix W as • Solve weighted least squares via
Conclusions • Least square method is the most commonly used parameter estimation method • It seeks to minimize the measures of the differences between the approximating function and given data points • Least square method is not the best fit for data that is not linear
Computer Exercise • Least square method for parameter estimation of Hooke’s law • Hooke’s law of spring • Our task is to estimate k0 and k1, based on the measurement data of l and f
Computer Exercise • Estimate parameters of k0 and k1 • Extend the first order polynomial to higher order polynomials: • Estimate k0, k1,k2, ..., up to the fourth order polynomial