
Data Modeling and Least Squares Fitting 2


Presentation Transcript


  1. Data Modeling and Least Squares Fitting 2 COS 323

  2. Nonlinear Least Squares • Some problems can be rewritten as linear • Example: fit y = a·e^(b·x) by fitting the data points (x_i, log y_i) to a* + b·x, where a = e^(a*) • Big problem: this no longer minimizes squared error!

  3. Nonlinear Least Squares • Can write the error function and minimize it directly: E(a, b) = Σ_i (y_i – a·e^(b·x_i))² • For the exponential, setting ∂E/∂a = ∂E/∂b = 0 gives no analytic solution for a, b
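
A minimal sketch of slides 2–3, assuming the exponential model y ≈ a·e^(b·x) and synthetic data (the variable names, noise level, and solver choice here are illustrative assumptions, not from the course materials): it compares the linearized log-fit with a direct nonlinear least squares fit of the original residuals.

```python
# Sketch: fitting y ~ a*exp(b*x) two ways (illustrative, synthetic data).
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
x = np.linspace(0, 4, 50)
y = 2.5 * np.exp(0.7 * x) + rng.normal(scale=2.0, size=x.size)  # noisy data
y = np.clip(y, 1e-3, None)  # keep log defined for the linearized fit

# 1) Linearized fit: regress log y on x, recover a = exp(a*).
A = np.column_stack([np.ones_like(x), x])
astar, b_lin = np.linalg.lstsq(A, np.log(y), rcond=None)[0]
a_lin = np.exp(astar)

# 2) Direct nonlinear least squares on the original residuals y - a*exp(b*x).
def residuals(p):
    a, b = p
    return y - a * np.exp(b * x)

fit = least_squares(residuals, x0=[a_lin, b_lin])  # linearized fit as starting guess
print("linearized:", a_lin, b_lin)
print("nonlinear :", fit.x)
```

The linearized estimate is typically a good starting guess for the nonlinear solver, but the two answers differ because the log transform reweights the errors, which is exactly the "big problem" noted on slide 2.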

  4. Newton’s Method • Apply Newton’s method for minimization: x_(k+1) = x_k – H⁻¹·G, where H is the Hessian (matrix of all 2nd derivatives) and G is the gradient (vector of all 1st derivatives)
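
As a concrete illustration of the update x_(k+1) = x_k – H⁻¹·G, here is a small sketch on a made-up two-parameter objective (the function, starting point, and iteration count are arbitrary assumptions for demonstration):

```python
# Minimal sketch of Newton's method for minimization on a toy objective
# E(a, b) = (a - 1)^4 + (b + 2)^2 (hypothetical example, not from the slides).
import numpy as np

def grad(p):                       # G: vector of 1st derivatives
    a, b = p
    return np.array([4 * (a - 1) ** 3, 2 * (b + 2)])

def hessian(p):                    # H: matrix of 2nd derivatives
    a, b = p
    return np.array([[12 * (a - 1) ** 2, 0.0],
                     [0.0,               2.0]])

p = np.array([3.0, 3.0])           # starting guess
for _ in range(20):
    step = np.linalg.solve(hessian(p), grad(p))   # solve H * step = G
    p = p - step                                  # x_(k+1) = x_k - H^-1 G
print(p)                           # converges toward the minimizer (1, -2)
```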

  5. Newton’s Method for Least Squares • For E = Σ_i (y_i – f(x_i))², the gradient has 1st derivatives of f: ∂E/∂a_j = –2 Σ_i (y_i – f(x_i)) ∂f/∂a_j • The Hessian has 2nd derivatives of f: ∂²E/∂a_j∂a_k = 2 Σ_i [ –(y_i – f(x_i)) ∂²f/∂a_j∂a_k + (∂f/∂a_j)(∂f/∂a_k) ]

  6. Gauss-Newton Iteration • Consider one entry of the Hessian: 2 Σ_i [ –(y_i – f(x_i)) ∂²f/∂a_j∂a_k + (∂f/∂a_j)(∂f/∂a_k) ] • If close to the answer, the residuals y_i – f(x_i) are small, so the first term is close to 0 • Gauss-Newton method: ignore the first term! • Eliminates the requirement to calculate 2nd derivatives of f • Surprising fact: still superlinear convergence if “close enough” to the answer
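
A sketch of a Gauss-Newton iteration for the exponential model, dropping the second-derivative term of the Hessian as described above; the data, starting guess, and function names are illustrative assumptions, not reference code from the course.

```python
# Minimal Gauss-Newton sketch for the model y ~ a*exp(b*x) (illustrative).
import numpy as np

def gauss_newton(x, y, a, b, iters=20):
    for _ in range(iters):
        f = a * np.exp(b * x)                      # model predictions
        r = y - f                                  # residuals
        # Jacobian of f w.r.t. (a, b): columns are df/da and df/db
        J = np.column_stack([np.exp(b * x), a * x * np.exp(b * x)])
        # Gauss-Newton: keep only (df/da_j)(df/da_k), solve (J^T J) delta = J^T r
        delta = np.linalg.solve(J.T @ J, J.T @ r)
        a, b = a + delta[0], b + delta[1]
    return a, b

rng = np.random.default_rng(1)
x = np.linspace(0, 3, 40)
y = 1.8 * np.exp(0.9 * x) + rng.normal(scale=0.5, size=x.size)
print(gauss_newton(x, y, a=1.5, b=0.8))            # needs a reasonable starting guess
```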

  7. Levenberg-Marquardt • Newton (and Gauss-Newton) work well when close to the answer, terribly when far away • Steepest descent is safe when far away • Levenberg-Marquardt idea: do both, combining steepest descent and Gauss-Newton

  8. Levenberg-Marquardt • Trade off between the two depending on how far away you are… • Clever way of doing this: multiply the diagonal entries of the Gauss-Newton matrix JᵀJ by (1 + λ) • If λ is small, mostly like Gauss-Newton • If λ is big, matrix becomes mostly diagonal, behaves like steepest descent

  9. Levenberg-Marquardt • Final bit of cleverness: adjust λ depending on how well we’re doing • Start with some λ, e.g. 0.001 • If the last iteration decreased the error, accept the step and decrease λ to λ/10 • If the last iteration increased the error, reject the step and increase λ to 10λ • Result: fairly stable algorithm, not too painful (no 2nd derivatives), used a lot
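
A sketch of Levenberg-Marquardt with the λ schedule from this slide (accept the step and shrink λ on improvement, reject and grow λ otherwise), again on an assumed exponential model with synthetic data; it augments the diagonal of JᵀJ and is not any particular reference implementation.

```python
# Levenberg-Marquardt sketch for y ~ a*exp(b*x) (illustrative assumptions).
import numpy as np

def lm_fit(x, y, p, lam=1e-3, iters=50):
    def resid(p):
        return y - p[0] * np.exp(p[1] * x)
    def jac(p):
        return np.column_stack([np.exp(p[1] * x), p[0] * x * np.exp(p[1] * x)])

    err = np.sum(resid(p) ** 2)
    for _ in range(iters):
        J, r = jac(p), resid(p)
        A = J.T @ J
        # Augment the diagonal: small lambda ~ Gauss-Newton, big lambda ~ steepest descent
        A_aug = A + lam * np.diag(np.diag(A))
        delta = np.linalg.solve(A_aug, J.T @ r)
        p_new = p + delta
        err_new = np.sum(resid(p_new) ** 2)
        if err_new < err:                 # error decreased: accept step, trust Gauss-Newton more
            p, err, lam = p_new, err_new, lam / 10
        else:                             # error increased: reject step, be more conservative
            lam *= 10
    return p

rng = np.random.default_rng(2)
x = np.linspace(0, 3, 40)
y = 1.8 * np.exp(0.9 * x) + rng.normal(scale=0.5, size=x.size)
print(lm_fit(x, y, p=np.array([1.0, 0.5])))
```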

  10. Outliers • A lot of derivations assume a Gaussian distribution for the errors • Unfortunately, nature (and experimenters) sometimes don’t cooperate • Outliers: points with extremely low probability of occurrence (according to Gaussian statistics) • Can have a strong influence on least squares • [Figure: probability density of errors, Gaussian vs. non-Gaussian]

  11. Robust Estimation • Goal: develop parameter estimation methods insensitive to small numbers of large errors • General approach: try to give large deviations less weight • M-estimators: minimize some function other than square of y – f(x,a,b,…)

  12. Least Absolute Value Fitting • Minimize Σ_i |y_i – f(x_i)| instead of Σ_i (y_i – f(x_i))² • Points far away from the trend get comparatively less influence

  13. Example: Constant • For the constant function y = a, minimizing Σ(y_i – a)² gave a = mean • Minimizing Σ|y_i – a| gives a = median
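
A quick numeric check of this slide’s claim, using a small made-up data set with one outlier and a brute-force grid search over a:

```python
# For a constant model y ~ a: squared error is minimized at the mean,
# absolute error at the median (illustrative, synthetic data).
import numpy as np

y = np.array([1.0, 1.2, 0.9, 1.1, 25.0])           # one large outlier

a_grid = np.linspace(0, 30, 3001)                   # step 0.01
sq_err = [np.sum((y - a) ** 2) for a in a_grid]
abs_err = [np.sum(np.abs(y - a)) for a in a_grid]

print("argmin of squared error:", a_grid[np.argmin(sq_err)], " mean:", y.mean())
print("argmin of absolute error:", a_grid[np.argmin(abs_err)], " median:", np.median(y))
```

The squared-error minimizer gets dragged toward the outlier (the mean, 5.84 here), while the absolute-error minimizer stays at the median of the cluster.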

  14. Doing Robust Fitting • In the general case, a nasty objective function: discontinuous derivative • Simplex method (Nelder-Mead) often a good choice
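
A sketch of least-absolute-value line fitting via the derivative-free simplex method, here using scipy.optimize.minimize with method="Nelder-Mead"; the synthetic data, outlier pattern, and starting guess are assumptions for illustration.

```python
# Robust (L1) line fit via the downhill simplex method (illustrative).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
x = np.linspace(0, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)
y[::7] += 15.0                                      # inject a few outliers

def abs_error(p):                                   # sum of |y - (a + b*x)|, non-smooth
    a, b = p
    return np.sum(np.abs(y - (a + b * x)))

res = minimize(abs_error, x0=[1.0, 1.0], method="Nelder-Mead")  # rough initial guess
print("robust (L1) fit:", res.x)                    # close to a=1, b=2 despite the outliers
```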

  15. Iteratively Reweighted Least Squares • Sometimes-used approximation: convert to iterated weighted least squares, minimizing Σ_i w_i (y_i – f(x_i))² with the weights w_i based on the residuals from the previous iteration

  16. Iteratively Reweighted Least Squares • Different options for weights, e.g. w_i = 1/|y_i – f(x_i)| to approximate least absolute value fitting • Avoid problems with infinities, e.g. add a small ε to the denominator • Give even less weight to outliers, e.g. weights that fall off faster than 1/|residual|

  17. Iteratively Reweighted Least Squares • Danger! This is not guaranteed to converge to the right answer! • Needs a good starting point, which is available if the initial least squares estimate is reasonable • In general, works OK if there are few outliers and they are not too far off
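
A sketch of iteratively reweighted least squares for a line fit, using the (assumed) weights w_i = 1/(|r_i| + ε) computed from the previous iteration’s residuals; the data, ε, and iteration count are illustrative choices, and the starting point is the ordinary least squares fit as slide 17 suggests.

```python
# IRLS sketch for a line y ~ a + b*x (illustrative assumptions).
import numpy as np

def irls_line(x, y, iters=20, eps=1e-3):
    A = np.column_stack([np.ones_like(x), x])
    w = np.ones_like(y)                             # start from ordinary least squares
    for _ in range(iters):
        sw = np.sqrt(w)
        p, *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)  # weighted LS
        r = y - A @ p
        w = 1.0 / (np.abs(r) + eps)                 # downweight large residuals
    return p

rng = np.random.default_rng(4)
x = np.linspace(0, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)
y[::7] += 15.0                                      # a few outliers
print(irls_line(x, y))                              # roughly a=1, b=2
```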

  18. Outlier Detection and Rejection • Special case of IRWLS: set weight = 0 if outlier, 1 otherwise • Detecting outliers: (y_i – f(x_i))² > threshold • One choice of threshold: a multiple of the mean squared difference • Better choice: a multiple of the median squared difference • Can iterate… • As before, not guaranteed to do anything reasonable; tends to work OK if there are only a few outliers
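
A sketch of the detect-and-reject scheme: fit, flag points whose squared residual exceeds a multiple of the median squared residual, and refit without them; the factor of 10 and the synthetic data are arbitrary assumptions.

```python
# Outlier rejection via a median-based threshold (illustrative).
import numpy as np

def fit_line(x, y):
    A = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(A, y, rcond=None)[0]

rng = np.random.default_rng(5)
x = np.linspace(0, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)
y[::7] += 15.0                                      # a few outliers

p = fit_line(x, y)                                  # initial least squares fit
r2 = (y - (p[0] + p[1] * x)) ** 2                   # squared residuals
inliers = r2 < 10.0 * np.median(r2)                 # threshold: multiple of median sq. diff.
p_clean = fit_line(x[inliers], y[inliers])          # refit with outliers rejected
print("initial fit:", p, " after rejection:", p_clean)
```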

  19. RANSAC • RANdom SAmple Consensus: designed for bad data (in the best case, up to 50% outliers) • Take many random subsets of the data • Compute a least squares fit for each sample • See how many points agree: (y_i – f(x_i))² < threshold • Threshold user-specified or estimated from more trials • At the end, use the fit that agreed with the most points • Can do one final least squares fit with all the inliers
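
A sketch of RANSAC for a line: many random two-point samples, count the points that agree under a user-specified threshold, keep the sample with the largest consensus set, then do one final least squares fit on its inliers; the data, threshold, and trial count are illustrative assumptions.

```python
# RANSAC sketch for a line fit (illustrative, synthetic data with ~40% outliers).
import numpy as np

def fit_line(x, y):
    A = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def ransac_line(x, y, trials=200, thresh=1.0, seed=6):
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(x.size, dtype=bool)
    for _ in range(trials):
        idx = rng.choice(x.size, size=2, replace=False)    # minimal random subset
        p = fit_line(x[idx], y[idx])
        inliers = (y - (p[0] + p[1] * x)) ** 2 < thresh    # points agreeing with this fit
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_line(x[best_inliers], y[best_inliers])      # final fit on all inliers

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 40)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=x.size)
y[:16] += rng.uniform(5, 20, size=16)                      # gross outliers
print(ransac_line(x, y))                                   # close to a=1, b=2
```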
