Learn how to fit arbitrary functions to data using general methods: trial and error and the grid search approach. Explore minimizing chi-square, finding optimal parameters, and reducing residual errors.
Physics 114: Lecture 18 Least Squares Fit to Arbitrary Functions Dale E. Gary NJIT Physics Department
Nonlinear Least Squares Fitting • The general means to fit curves, surfaces, or higher-dimensional functions to data relies on minimizing chi-square, but in general there is no closed-form method to calculate the coefficients. As usual, for a function y(x), say, the chi-square is $\chi^2 = \sum_i \left[ \frac{y_i - y(x_i)}{\sigma_i} \right]^2$. • The techniques we developed in Chapters 6 and 7 only work when the function y(x) is linear in its parameters, i.e. obeys $y(x) = \sum_k a_k f_k(x)$, where the functions $f_k(x)$ do not themselves depend on the parameters. • When this is not the case (i.e. when y(x) depends on products or powers of the parameters), the minimization of chi-square results in coupled equations that in general cannot be solved analytically. We may sometimes be able to linearize the problem, as we saw last time, but generally we need to proceed by trial and error.
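To see where the coupled equations come from, here is a short sketch of the minimization condition. It assumes the standard chi-square sum over data points $(x_i, y_i)$ with uncertainties $\sigma_i$, and writes the parameters generically as $a_k$ (a notation chosen here for illustration):

```latex
\frac{\partial \chi^2}{\partial a_k}
  = -2 \sum_i \frac{y_i - y(x_i)}{\sigma_i^2}\,
    \frac{\partial y(x_i)}{\partial a_k} = 0
  \qquad \text{for each parameter } a_k .
```

If y(x) is linear in the parameters, each derivative $\partial y / \partial a_k$ is a known function of x alone, so these conditions form a set of linear equations in the $a_k$. If y(x) is nonlinear in the parameters, the derivatives themselves contain the unknown parameters, and the equations are coupled and generally have no closed-form solution.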
Searching Parameter Space • The basic approach could not be simpler, philosophically. The idea is simply to calculate a trial value of the function for a given set of parameter values, calculate the chi-square, and repeat for a large number of parameter sets until you find the minimum chi-square. • As a concrete example, consider a Gaussian function with parameters a, b and c, $y(x) = a \exp\{-[(x-b)/c]^2\}$, fit to a set of data at x = -3, -2, -1, 0, 1, 2, 3: yi = 0.0008, 0.0211, 0.2327, 1.0546, 1.9648, 1.5049, 0.4739, with equal errors σi = σ = 0.2. • Try a set of parameters a = 1, b = 1, c = 1. With these parameters, the function y(x) gives y(x) = 0.0000, 0.0001, 0.0183, 0.3679, 1.0000, 0.3679, 0.0183. • The reduced chi-square, with ν = 7 - 3 = 4 degrees of freedom, is then $\chi_\nu^2 = \frac{1}{\nu} \sum_i \left[ \frac{y_i - y(x_i)}{\sigma} \right]^2 \approx 18.43$. • A plot of the curves yi and y(x) is shown at right.
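As a minimal sketch, the single trial above can be evaluated in MATLAB as follows. The Gaussian form and x = -3:3 are taken from the complete program later in the lecture; the variable names ymodel, nu and chi2r are chosen here for illustration only.

```matlab
% Data from the example (equal errors sigma = 0.2)
x     = -3:3;
yi    = [0.0008 0.0211 0.2327 1.0546 1.9648 1.5049 0.4739];
sigma = 0.2;

% Trial parameters a = b = c = 1
a = 1; b = 1; c = 1;
ymodel = a*exp(-((x - b)/c).^2);          % Gaussian evaluated at the data x values

% Reduced chi-square: 7 data points minus 3 fitted parameters
nu    = numel(yi) - 3;
chi2r = sum(((yi - ymodel)/sigma).^2)/nu;
fprintf('reduced chi-square = %.2f\n', chi2r);
```

The result should land close to the a = 1.0, b = 1.0 entry of the c = 1 table of reduced chi-square values.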
Searching Parameter Space (cont’d) • Now we simply try other sets of parameters and continue to calculate chi-square, attempting to find a minimum chi-square such that the reduced chi-square is about 1. • The plot shows the result of trying other values of a from 1 to 3, stepping by 0.5, while keeping b = 1 and c = 1. • The reduced chi-square for this set of five curves is 18.4311, 10.1180, 5.7779, 5.4107, 9.0164. • Looks like a = 2.5, b = 1, c = 1 is the best so far. • Now we change one of the other parameters, say by setting b = 1.2, and vary a again, with c = 1. • The plot is shown at right, and the reduced chi-square for this new set of curves is 17.9506, 9.3389, 4.6612, 3.9175, 7.1078. • Looks like a = 2.5, b = 1.2, c = 1 is the best so far.
Searching Parameter Space (cont’d) • If we repeat this 3 more times, increasing b by 0.2 each time, we end up with the following 25 values of reduced chi-square for c = 1:

c = 1.0    b=1.0     b=1.2     b=1.4     b=1.6     b=1.8
a=1.0     18.4311   17.9506   18.3422   19.6973   21.8997
a=1.5     10.1180    9.3389    9.8318   11.8644   15.2623
a=2.0      5.7779    4.6612    5.1924    7.9024   12.5586
a=2.5      5.4107    3.9175    4.4241    7.8115   13.7888
a=3.0      9.0164    7.1078    7.5267   11.5916   18.9528

• Now let’s try c = 1.2, and repeat the whole thing again:

c = 1.2    b=1.0     b=1.2     b=1.4     b=1.6     b=1.8
a=1.0     15.3064   14.8579   15.3022   16.6286   18.7497
a=1.5      6.5330    5.8522    6.5054    8.4939   11.6843
a=2.0      2.4672    1.5487    2.4021    5.0517    9.3174
a=2.5      3.1090    1.9475    2.9921    6.3022   11.6491
a=3.0      8.4585    7.0485    8.2756   12.2454   18.6794
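The two tables can be regenerated with three nested loops. The following is a sketch under the same assumptions as the example (Gaussian model, x = -3:3, σ = 0.2); the array names avals, bvals and cvals are illustrative.

```matlab
% Data from the example
x     = -3:3;
yi    = [0.0008 0.0211 0.2327 1.0546 1.9648 1.5049 0.4739];
sigma = 0.2;
nu    = numel(yi) - 3;                    % 7 points - 3 parameters

% Trial values for each parameter
avals = 1:0.5:3;
bvals = 1:0.2:1.8;
cvals = [1.0 1.2];

chi2 = zeros(numel(avals), numel(bvals), numel(cvals));
for k = 1:numel(cvals)
    for i = 1:numel(avals)
        for j = 1:numel(bvals)
            ymodel = avals(i)*exp(-((x - bvals(j))/cvals(k)).^2);
            chi2(i,j,k) = sum(((yi - ymodel)/sigma).^2)/nu;
        end
    end
    fprintf('c = %.1f (rows: a, columns: b)\n', cvals(k));
    disp(chi2(:,:,k));
end

% Locate the smallest reduced chi-square over everything tried so far
[chi2min, idx] = min(chi2(:));
[i, j, k] = ind2sub(size(chi2), idx);
fprintf('best so far: a = %.1f, b = %.1f, c = %.1f\n', avals(i), bvals(j), cvals(k));
```

Because the listed data are rounded to four decimals, the computed values may differ from the tabulated ones in the last digit or two.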
Searching Parameter Space (cont’d) • After additional trial c values, we find a best fit of a = 2.0, b = 1.2, c = 1.4. Here is the final fit with these values. • But note that once we have the parameters, we can plot a much smoother Gaussian through the points by evaluating it at more x values. • Note that we could find the parameters even more precisely by using smaller step sizes.
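Complete Program for c=1.4
    yi = [-0.1474 0.0367 0.3146 1.0253 1.9107 1.4740 0.3642];  % data values
    x = -3:3;                    % x values of the data points
    chi2 = zeros(10,10);         % reduced chi-square on a 10 x 10 grid of (a,b)
    c = 1.4;                     % width held fixed
    i = 0;
    for a = 0.75:0.25:3          % 10 trial amplitudes
        i = i+1;
        j = 0;
        for b = 0.9:0.1:1.8      % 10 trial centers
            j = j+1;
            % residuals scaled by sigma = 0.1, summed, divided by nu = 7 - 3 = 4
            chi2(i,j) = sum(((yi-a*exp(-((x-b)/c).^2))/0.1).^2)/4;
        end
    end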
Searching Parameter Space (cont’d) • Here is the reduced chi-square shown for smaller steps, covering the same range of a and b, but now as a 10 x 10 grid, and shown as an image. • The minimum chi-square is in a dip on a surface in “parameter space”. (Image annotation: $\chi_\nu^2 = 1.29$.)
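One way to locate the dip numerically, rather than by eye, is to ask MATLAB for the smallest element of the grid. This sketch recomputes the 10 x 10 grid with the data and σ = 0.1 used in the complete program for c = 1.4; the names avals, bvals and chi2min are illustrative.

```matlab
% Data and settings from the complete program for c = 1.4
yi    = [-0.1474 0.0367 0.3146 1.0253 1.9107 1.4740 0.3642];
x     = -3:3;
sigma = 0.1;
c     = 1.4;

avals = 0.75:0.25:3;                      % 10 trial amplitudes
bvals = 0.9:0.1:1.8;                      % 10 trial centers
chi2  = zeros(numel(avals), numel(bvals));
for i = 1:numel(avals)
    for j = 1:numel(bvals)
        ymodel = avals(i)*exp(-((x - bvals(j))/c).^2);
        chi2(i,j) = sum(((yi - ymodel)/sigma).^2)/4;
    end
end

% Smallest reduced chi-square and the grid point where it occurs
[chi2min, idx] = min(chi2(:));
[i, j] = ind2sub(size(chi2), idx);
fprintf('minimum reduced chi-square %.2f at a = %.2f, b = %.2f\n', ...
        chi2min, avals(i), bvals(j));

% To view the grid as an image, uncomment:
% imagesc(bvals, avals, chi2); colorbar; xlabel('b'); ylabel('a');
```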
Grid Search Method • The example we just looked at involved 3 parameters, and hence the parameter space is 3-dimensional (one dimension each for a, b and c). We could easily try to fit more complex functions that would have many more parameters. • For example, for homework you will fit a triple Gaussian! Each Gaussian has 3 parameters, so there are a total of 9. The search space for a solution (the parameter space) is now 9-dimensional! Obviously, the fitting approach we just used will work, but it will be very tedious. Let’s look at some ways to be more clever and speed up the process. • First, notice that we spent a lot of time calculating chi-square for parameters that got worse and worse, near the edges of the grid. Instead, we could select starting parameters, calculate chi-square, and step one parameter until chi-square starts increasing. If chi-square increases on the very first step, reverse and go the other direction. This is like traveling downhill in one parameter. Keep going until you find the minimum, then start on another parameter.
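The one-parameter-at-a-time downhill search described above might be sketched as follows. It uses the Gaussian example data with σ = 0.2 and the starting guess a = b = c = 1; the fixed step of 0.1, the limit of 20 sweeps, and all variable names are assumptions made for illustration, not part of the lecture.

```matlab
% Gaussian example data (sigma = 0.2), parameters packed as p = [a b c]
x     = -3:3;
yi    = [0.0008 0.0211 0.2327 1.0546 1.9648 1.5049 0.4739];
sigma = 0.2;
model = @(p) p(1)*exp(-((x - p(2))/p(3)).^2);
chi2r = @(p) sum(((yi - model(p))/sigma).^2)/(numel(yi) - numel(p));

p    = [1 1 1];                 % starting parameters
step = [0.1 0.1 0.1];           % assumed step size for each parameter
chi2start = chi2r(p);

for sweep = 1:20                % repeat the cycle over all parameters
    for k = 1:numel(p)
        best = chi2r(p);
        for sgn = [1 -1]        % try stepping up first, then reverse
            moved = false;
            while true
                trial = p;
                trial(k) = trial(k) + sgn*step(k);
                c2 = chi2r(trial);
                if c2 < best    % still going downhill: accept the step
                    p = trial; best = c2; moved = true;
                else            % chi-square stopped decreasing
                    break
                end
            end
            if moved, break; end   % no need to try the other direction
        end
    end
end
fprintf('a = %.1f, b = %.1f, c = %.1f, reduced chi-square = %.3f (started at %.2f)\n', ...
        p(1), p(2), p(3), chi2r(p), chi2start);
```

Only points along the downhill path are evaluated, which is far fewer than a full grid needs. The price is that the search follows whatever valley it starts in.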
Grid Search Method (cont’d) • You can refine this method to find the minimum more precisely. Once you get close, you can make the step size smaller, and essentially zoom in to finer precision. • Finally, you can improve even more by doing a gradient search, essentially going downhill in multiple parameters at once.
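The zoom-in step can be illustrated with a small, fine grid centered on the coarse best fit. This sketch uses the data and σ = 0.1 from the complete program for c = 1.4 and starts from the coarse result a = 2.0, b = 1.2, c = 1.4; the window of ±0.1 and the step of 0.02 are arbitrary choices for illustration.

```matlab
% Data and uncertainty from the complete program
yi    = [-0.1474 0.0367 0.3146 1.0253 1.9107 1.4740 0.3642];
x     = -3:3;
sigma = 0.1;
chi2r = @(a, b, c) sum(((yi - a*exp(-((x - b)/c).^2))/sigma).^2)/(numel(yi) - 3);

p0      = [2.0 1.2 1.4];                  % coarse-grid best fit [a b c]
coarse  = chi2r(p0(1), p0(2), p0(3));
offsets = -0.1:0.02:0.1;                  % finer steps around the coarse minimum

best  = coarse;
pbest = p0;
for da = offsets
    for db = offsets
        for dc = offsets
            c2 = chi2r(p0(1) + da, p0(2) + db, p0(3) + dc);
            if c2 < best                  % keep the lowest chi-square found
                best  = c2;
                pbest = p0 + [da db dc];
            end
        end
    end
end
fprintf('coarse: %.3f   refined: %.3f at a = %.2f, b = %.2f, c = %.2f\n', ...
        coarse, best, pbest(1), pbest(2), pbest(3));
```

The same idea can be repeated with an even smaller window around each new minimum until the parameters stop changing at the precision you need.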
Lack of Convergence: Local Minima • For complex fitting functions, it is very easy to find a minimum that is not the true minimum, but only a local minimum. It is analogous to finding yourself in a dip on the top of a mountain, when there is a deep valley nearby. • If your initial parameter guess is near such a dip, the search will settle into that local minimum, but the reduced chi-square will be nowhere near 1. • One way to guard against that is to be clever about your starting position. Recall our initial curve and initial trial of a = b = c = 1. • Looking at the blue curve to be fitted, our initial guesses for b and c are pretty good, but a clearly should be greater, near a = 2. • It is usually good enough to choose bounds on the parameters.
Bounds on the Parameters • For our example of fitting a Gaussian, clearly the height and width should always be positive. Many autofitting routines, including the MATLAB routine fit(), allow one to set the lower and upper bounds on parameters. • Let’s use the MATLAB routine for a more complicated general fitting of a Gaussian on top of a baseline. You will do something similar for homework. • Create the data:
    x = -5:0.1:5;
    y = 32.1*exp(-((x-0.23)/0.87).^2)+5.4;
    dev = 1.5*randn(1,101);      % random noise, one value per data point
    plot(x,y+dev,'.')
• Create a “fittype” and call fit():
    f = fittype('gauss2');       % library model: sum of two Gaussians
    gfit = fit(x',(y+dev)',f)
• Alternatively, set the lower bounds first:
    options = fitoptions('gauss2')
    options.Lower = [0 -Inf 0 0 -Inf 0];   % coefficient order: a1 b1 c1 a2 b2 c2
• For a list of fit types, use cflibhelp.