Physics 114: Exam 2 Review Lectures 11-16

Dale E. Gary, NJIT Physics Department

Concepts Covered on the Exam
• The concepts covered on the exam are:
  • Propagation of Error
  • Relative error, relative variance
  • Percent error, signal to noise ratio
  • Estimating error, graphical representation of error
  • Mean of means, standard error (standard deviation of the mean)
  • Weighted mean and Error
  • Probability distribution, error probability
  • Chi-square probability statistic
  • Least-squares fitting, minimizing chi-square, linear least-squares fitting
  • Degrees of freedom
• We will review each of these in turn.

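Propagation of Error
• The general expression for propagation of error applied to a function of two variables, $x = f(u,v)$, is
  $$\sigma_x^2 = \sigma_u^2\left(\frac{\partial x}{\partial u}\right)^2 + \sigma_v^2\left(\frac{\partial x}{\partial v}\right)^2 + 2\sigma_{uv}\left(\frac{\partial x}{\partial u}\right)\left(\frac{\partial x}{\partial v}\right),$$
  where $\sigma_{uv}$ is the covariance of $u$ and $v$.
• For the purposes of the exam, we will explicitly assume that the errors are uncorrelated, i.e. $\sigma_{uv} = 0$. Thus, we will consider only
  $$\sigma_x^2 = \sigma_u^2\left(\frac{\partial x}{\partial u}\right)^2 + \sigma_v^2\left(\frac{\partial x}{\partial v}\right)^2.$$
• You should know how to extend this to more than two variables.
• The terms in brackets are partial derivatives, which means take the derivative with respect to one variable while holding all other variables constant.
• Example: $x = uv^2$. Here $\partial x/\partial u = v^2$ and $\partial x/\partial v = 2uv$, so
  $$\sigma_x^2 = v^4\sigma_u^2 + 4u^2v^2\sigma_v^2.$$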
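One way to convince yourself that the uncorrelated propagation formula works is to simulate it. The sketch below is only an illustration: the means and standard deviations of u and v are made-up values, and the variable names are not from the lecture. It draws independent Gaussian samples, forms x = uv^2, and compares the observed scatter with the formula.

```matlab
% Monte Carlo check of error propagation for x = u*v^2 (illustrative values)
u0 = 10;  su = 0.1;     % assumed mean and standard deviation of u
v0 = 5;   sv = 0.05;    % assumed mean and standard deviation of v
Nsamp = 1e5;            % number of simulated measurements

% Draw independent (uncorrelated) Gaussian samples of u and v
u = u0 + su*randn(1, Nsamp);
v = v0 + sv*randn(1, Nsamp);
x = u.*v.^2;

% Propagation formula, partial derivatives evaluated at the means:
% dx/du = v^2, dx/dv = 2*u*v
sx_formula = sqrt(v0^4*su^2 + 4*u0^2*v0^2*sv^2);
sx_mc      = std(x);    % scatter actually seen in the simulated x

fprintf('formula: %.3f   Monte Carlo: %.3f\n', sx_formula, sx_mc);
```

The two numbers should agree to within a percent or so for small relative errors like these; the agreement degrades as the errors grow, because the formula is a first-order approximation.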

Relative Error, Relative Variance
• The relative error is just the ratio of the standard deviation to the mean, i.e. $\sigma_x / x$.
• The relative variance is the square of this quantity. Relating to our formula for propagation of error, we can write the relative variance for the previous example, $x = uv^2$, by dividing through by $x^2$:
  $$\frac{\sigma_x^2}{x^2} = \frac{\sigma_u^2}{u^2} + 4\frac{\sigma_v^2}{v^2}.$$
• We see that products of powers (including negative powers like $x = u/v$) give a simple relationship between relative variances, and this is handy for estimating error (which we’ll discuss shortly), but things are a little more complicated for other forms.
• You should be able to calculate the relative variance and/or relative error for any function $x = f(u,v)$.
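As a worked case of a negative power, here is the same calculation for x = u/v, assuming uncorrelated errors. The last line states the general pattern for a product of powers; the exponents p and q are notation introduced here for illustration.

```latex
\begin{align*}
x &= \frac{u}{v}, \qquad
\frac{\partial x}{\partial u} = \frac{1}{v}, \qquad
\frac{\partial x}{\partial v} = -\frac{u}{v^2} \\
\sigma_x^2 &= \frac{\sigma_u^2}{v^2} + \frac{u^2\,\sigma_v^2}{v^4} \\
\frac{\sigma_x^2}{x^2} &= \frac{\sigma_u^2}{u^2} + \frac{\sigma_v^2}{v^2} \\[6pt]
x = u^p v^q \;&\Rightarrow\;
\frac{\sigma_x^2}{x^2} = p^2\,\frac{\sigma_u^2}{u^2} + q^2\,\frac{\sigma_v^2}{v^2}
\end{align*}
```

The minus sign in the derivative disappears on squaring, so a quotient combines relative variances exactly like a product.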

Percent Error, Signal to Noise Ratio
• The percent error is just the relative error expressed as a percentage. Thus, a relative error of 0.1 would be the same as a 10% error.
• In our previous example, if the relative errors in u and v are both 10%, the resultant relative variance in $x = uv^2$ is
  $$\frac{\sigma_x^2}{x^2} = (0.1)^2 + 4(0.1)^2 = 0.05.$$
• The relative variance is thus 0.05, and the relative error is the square root of this quantity, about 0.22 or 22%. You can see that the error is dominated by errors in the variable v. Errors in variables raised to a power dominate, with larger powers dominating more, assuming roughly equal errors in u and v.
• The signal to noise ratio (SNR) is just the inverse of the relative error, so a 10% error (0.1 relative error) corresponds to a 10:1 signal to noise ratio.
• You should be able to effortlessly convert back and forth from relative error, percent error, or SNR to any of the other quantities.
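The conversions above are one-liners. A minimal sketch for the x = uv^2 example with 10% errors in both variables (the variable names are chosen here for illustration):

```matlab
% Relative error, percent error and SNR for x = u*v^2
rel_u = 0.10;                       % relative error in u (10%)
rel_v = 0.10;                       % relative error in v (10%)

rel_var_x = rel_u^2 + 4*rel_v^2;    % relative variance of x
rel_err_x = sqrt(rel_var_x);        % relative error of x
pct_err_x = 100*rel_err_x;          % percent error of x
snr_x     = 1/rel_err_x;            % signal to noise ratio of x

fprintf('relative variance %.3f, percent error %.1f%%, SNR %.2f\n', ...
        rel_var_x, pct_err_x, snr_x);
```

Note that the SNR of x (about 4.5) is much worse than the 10:1 of either input, mostly because of the squared variable.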

Estimating Error
• You should have a sense of the way errors combine in measurements. If quantities are measured with similar errors, then linear combinations of the quantities combine the errors in quadrature (the variances add), while powers and other non-linear combinations may make one quantity’s error more important than another.
• Thus, measuring the volume of a sphere will be rather inaccurate if you measure the circumference C, since $r = C/2\pi$ and $V = (4\pi/3)r^3$. Because $V \propto C^3$, a 1% error in circumference gives a 3% error in volume.
• You are better off measuring the volume directly, perhaps by submerging the sphere in water and measuring the volume of the water displaced (if you can measure that to similar 1% accuracy).
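The sphere example can be checked numerically. In this sketch the circumference value is arbitrary (the relative result does not depend on it), and the names are illustrative.

```matlab
% Error in the volume of a sphere inferred from its circumference
C  = 50.0;               % assumed circumference (any units)
sC = 0.01*C;             % 1% error in the circumference

r = C/(2*pi);            % radius from circumference
V = (4*pi/3)*r^3;        % volume, equal to C^3/(6*pi^2)

dVdC  = C^2/(2*pi^2);    % derivative of V with respect to C
sV    = abs(dVdC)*sC;    % propagated error in V (single variable)
rel_V = sV/V;            % relative error in V

fprintf('relative error in V: %.1f%%\n', 100*rel_V);
```

With a single measured variable the propagation formula reduces to one term, and the cube shows up directly as the factor of three.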

Graphical Representation of Error
• We also discussed how to think about the errors graphically. If the relative errors of two quantities u and v are independent ($\sigma_{uv} = 0$) and equal ($\sigma_u = \sigma_v = \sigma$), then plotting u vs. v gives the left-hand figure, but if, say, $\sigma_u = 2\sigma_v$, then the scatter is as in the right-hand figure. If $\sigma_{uv} \neq 0$, then the pattern would be tilted at some angle.
[Figure, left: equal scatter forms a circular pattern. Right: unequal scatter forms an elliptical pattern with no slope.]

Mean of Means and Standard Error
• One can join multiple sets of measurements to refine both the estimated value (mean of means) and the standard error (standard deviation of the mean).
• The mean of means is given by
  $$\mu = \frac{\sum_i x_i/\sigma_i^2}{\sum_i 1/\sigma_i^2},$$
  where the $x_i$ are individual measurements of the mean and the $\sigma_i$ are their standard deviations.
• If the standard deviations of each of the measurements are all the same ($\sigma_i = \sigma$), then they cancel and we have the usual
  $$\mu = \frac{1}{N}\sum_i x_i.$$
• Likewise, the rule for combining data sets with different errors is
  $$\sigma_\mu^2 = \frac{1}{\sum_i 1/\sigma_i^2}.$$
• And for equal errors this is
  $$\sigma_\mu = \frac{\sigma}{\sqrt{N}}.$$
• This last is a key result to remember: combining measurements reduces the standard deviation by the square root of the number of measurements.
• Do example: x = [10.7, 7.2, 11.2, 9.9, 11.3], σ = [2, 2, 2, 1.5, 2.5]. Ans: 10.0±1.4
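The example can be worked in a few lines using the standard inverse-variance formulas, μ = Σ(x_i/σ_i²)/Σ(1/σ_i²) and σ_μ² = 1/Σ(1/σ_i²). This is a sketch with variable names chosen here. Evaluated this way, the mean comes out near 10.0 and the standard error near 0.86. The weighted rms scatter of the five values about the mean, computed on the last line, is about 1.4, which may be what the quoted ±1.4 refers to.

```matlab
% Mean of means and standard error for measurements with unequal errors
x = [10.7 7.2 11.2 9.9 11.3];   % individual measurements of the mean
s = [2 2 2 1.5 2.5];            % their standard deviations

w    = 1./s.^2;                 % inverse-variance weights
mu   = sum(w.*x)/sum(w);        % mean of means
s_mu = 1/sqrt(sum(w));          % standard error of the combined mean

% Weighted rms scatter of the measurements about the weighted mean
s_scatter = sqrt(sum(w.*(x - mu).^2)/sum(w));

fprintf('mu = %.2f, standard error = %.2f, weighted scatter = %.2f\n', ...
        mu, s_mu, s_scatter);
```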

Weighted Mean and Error
• Perhaps the errors themselves are not known, but the relative weighting of the measurements is known. For example, say you want to combine means taken with different numbers of measurements (or different integration times). Defining the weights as inversely proportional to the variances, $k w_i = 1/\sigma_i^2$, the proportionality constant cancels and we have
  $$\mu = \frac{\sum_i w_i x_i}{\sum_i w_i}.$$
• We can then define an average standard deviation:
  $$\sigma^2 = \frac{\sum_i w_i (x_i - \mu)^2}{\sum_i w_i} \times \frac{N}{N-1}.$$
• After obtaining that average standard deviation, the standard error (standard deviation of the mean) is, as before, decreased by the square root of the number of measurements:
  $$\sigma_\mu = \frac{\sigma}{\sqrt{N}}.$$
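A sketch of the weighted-mean recipe when only relative weights are known. The three means and the numbers of measurements behind them are invented for illustration, and the weights are simply taken equal to those numbers. The average variance uses the common textbook form with the N/(N-1) factor, which is an assumption about the intended formula.

```matlab
% Weighted mean and its error from relative weights (hypothetical data)
xbar = [5.1 4.9 5.3];     % means from three runs (made-up values)
nobs = [10 40 20];        % number of measurements in each run

w  = nobs;                % relative weights; the overall scale cancels
N  = numel(xbar);         % number of means being combined
mu = sum(w.*xbar)/sum(w); % weighted mean

% Average variance, then standard error of the weighted mean
sigma2   = sum(w.*(xbar - mu).^2)/sum(w) * N/(N - 1);
sigma    = sqrt(sigma2);
sigma_mu = sigma/sqrt(N);

fprintf('mu = %.3f +/- %.3f\n', mu, sigma_mu);
```

Multiplying all the weights by a constant leaves both mu and sigma_mu unchanged, which is why only the relative weighting needs to be known.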

Probability Distribution
• The Gaussian distribution (bell curve) shows the expected distribution of measurements about the mean. This can be interpreted as a probability.
• Thus, ~68% of measurements should fall within 1σ of the mean, i.e. $P(\mu - \sigma \le x \le \mu + \sigma) \approx 0.68$.
• Likewise, ~95% of measurements should fall within 2σ of the mean.
• In science, it is expected that errors are given in terms of ±1σ. Thus, stating a result as 3.4±0.2 means that 68% of values fall between 3.2 and 3.6.
• In some disciplines, it is common instead to state 90% or 95% confidence intervals (1.64σ, or 2σ). In the case of a 90% confidence interval, the same measurement would be stated as 3.4±0.33, since 1.64 × 0.2 ≈ 0.33. To avoid confusion, one should say 3.4±0.33 (90% confidence level).
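For a Gaussian, the probability of falling within ±kσ of the mean is erf(k/√2), so the percentages above can be checked with MATLAB's built-in erf(). A minimal sketch (variable names are illustrative):

```matlab
% Probability that a Gaussian measurement lies within k sigma of the mean
k = [1 1.645 2];              % half-widths in units of sigma
P = erf(k/sqrt(2));           % two-sided probabilities

for i = 1:numel(k)
    fprintf('within %.3f sigma: %.1f%%\n', k(i), 100*P(i));
end

% Converting a 1-sigma error bar into a 90% confidence half-width
s      = 0.2;                 % 1-sigma error from the 3.4 +/- 0.2 example
half90 = 1.645*s;             % about 0.33
```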

Chi-Square Probability
• Chi-square is a criterion for the goodness of fit of a function, e.g. y(x), and is defined as
  $$\chi^2 = \sum_i \left[\frac{y_i - y(x_i)}{\sigma_i}\right]^2.$$
• In other words, it is just the sum of the squared deviations of points from the function, normalized by the variances. When the fit is good, we normally expect the squared deviations to average around $\sigma_i^2$, so each term is about 1 and the total chi-square is about equal to ν, the number of degrees of freedom.
• For the special case of a linear fit to a set of points ($y(x) = a + bx$),
  $$\chi^2 = \sum_i \left[\frac{y_i - a - b x_i}{\sigma_i}\right]^2.$$
• We can find the best fit straight line by minimizing chi-square.
• Generally, we can find the best fit of any function by replacing y(x) with another equation representing that function.

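Linear Least Squares Fitting
• Minimizing chi-square, we found that we could solve for the parameters a and b that minimize the difference between the fitted line and the data (with errors $\sigma_i$) as:
  $$a = \frac{1}{\Delta}\left(\sum_i \frac{x_i^2}{\sigma_i^2}\sum_i \frac{y_i}{\sigma_i^2} - \sum_i \frac{x_i}{\sigma_i^2}\sum_i \frac{x_i y_i}{\sigma_i^2}\right), \qquad b = \frac{1}{\Delta}\left(\sum_i \frac{1}{\sigma_i^2}\sum_i \frac{x_i y_i}{\sigma_i^2} - \sum_i \frac{x_i}{\sigma_i^2}\sum_i \frac{y_i}{\sigma_i^2}\right),$$
  where
  $$\Delta = \sum_i \frac{1}{\sigma_i^2}\sum_i \frac{x_i^2}{\sigma_i^2} - \left(\sum_i \frac{x_i}{\sigma_i^2}\right)^2.$$
• In the case of equal errors, they cancel and we can drop the $\sigma_i$ and replace $\sum_i 1/\sigma_i^2$ with N.
• The uncertainties in the parameters are:
  $$\sigma_a^2 = \frac{1}{\Delta}\sum_i \frac{x_i^2}{\sigma_i^2}, \qquad \sigma_b^2 = \frac{1}{\Delta}\sum_i \frac{1}{\sigma_i^2}.$$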
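The standard closed-form solution for a weighted straight-line fit, y = a + bx with inverse-variance weights, translates directly into sums. The sketch below uses invented data points and errors, and the variable names (S, Sx, Sxx, ...) are shorthand introduced here for the weighted sums.

```matlab
% Weighted linear least-squares fit y = a + b*x (hypothetical data)
x   = [1 2 3 4 5];
y   = [5.1 6.9 9.2 10.8 13.1];
sig = [0.2 0.2 0.3 0.3 0.4];     % error on each y value

w   = 1./sig.^2;                 % inverse-variance weights
S   = sum(w);                    % sum of 1/sigma^2
Sx  = sum(w.*x);                 % sum of x/sigma^2
Sy  = sum(w.*y);                 % sum of y/sigma^2
Sxx = sum(w.*x.^2);              % sum of x^2/sigma^2
Sxy = sum(w.*x.*y);              % sum of x*y/sigma^2

Delta = S*Sxx - Sx^2;
a = (Sxx*Sy - Sx*Sxy)/Delta;     % intercept
b = (S*Sxy - Sx*Sy)/Delta;       % slope
sig_a = sqrt(Sxx/Delta);         % uncertainty in the intercept
sig_b = sqrt(S/Delta);           % uncertainty in the slope

fprintf('a = %.3f +/- %.3f,  b = %.3f +/- %.3f\n', a, sig_a, b, sig_b);
```

Setting every element of sig to the same value reproduces the slope and intercept that polyfit(x,y,1) would return, since equal weights cancel out of a and b.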

MATLAB Commands for Linear Fits
• MATLAB has a “low level” routine called polyfit() that can be used to fit a linear function to a set of points (assumes all errors are equal):
  • x = 0:0.1:2.5;
  • y = randn(1,26)*0.3 + 3 + 2*x;   % Makes a slightly noisy linear set of points y = 3 + 2x
  • p = polyfit(x,y,1)   % Fits straight line and returns values of b and a (slope first) in p
    p = 2.0661 2.8803   % in this case, fit equation is y = 2.8803 + 2.0661x
• Plot the points, and overplot the fit using the polyval() function.
  • plot(x,y,'.');
  • hold on
  • plot(x,polyval(p,x),'r')
• Here is the result. The points have a scatter of σ = 0.3 around the fit, as we specified above.

Reduced Chi-Square
• Recall that the value of $\chi^2$ is
  $$\chi^2 = \sum_i \left[\frac{y_i - y(x_i)}{\sigma_i}\right]^2.$$
• It is often easier to consider the reduced chi-square, $\chi_\nu^2 = \chi^2/\nu$, which is about unity for a good fit.
• If we compare points to a fit of a sine function, changing parameters changes $\chi^2$, and obviously the minimum chi-square is the best fit.
[Figure panels: change amplitude; change frequency.]

Degrees of Freedom ν
• The number of degrees of freedom represents the ways in which things can be varied independently. It is generally the number of independent data points (the number of measurements), reduced by the number of parameters deduced from the measurements.
• Thus, if the data points are used to determine a mean, then the data points can be varied, but are constrained to have the given mean. This constraint must be subtracted from the number of points, so in this case the number of degrees of freedom is $\nu = N - 1$.
• If we use the data points to define a line (i.e. solve for two parameters a and b for the line), then $\nu = N - 2$.
• You should learn to recognize the number of parameters needed to fully describe a function. The sine wave of the previous example can be adjusted in three ways (has three parameters). We showed two (amplitude and frequency). Can you guess the third? For this fit, we have $\nu = N - 3$.
• We need to know this in order to use the reduced chi-square as a measure of when we have an acceptable fit.
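Putting the pieces together for a straight-line fit: the sketch below fits invented data with polyfit(), counts ν = N - 2 degrees of freedom, and forms the reduced chi-square. The data values and the assumed common error of 0.2 are hypothetical. The last lines nudge the slope away from its fitted value to show that chi-square rises.

```matlab
% Reduced chi-square for a straight-line fit (hypothetical data)
x   = [1 2 3 4 5 6];
y   = [5.1 6.9 9.2 10.8 13.1 14.9];
sig = 0.2;                        % assumed equal error on every point

p    = polyfit(x, y, 1);          % p(1) = slope b, p(2) = intercept a
yfit = polyval(p, x);

N  = numel(x);                    % number of data points
nu = N - 2;                       % two fitted parameters (a and b)

chi2     = sum(((y - yfit)/sig).^2);
chi2_red = chi2/nu;               % about 1 for an acceptable fit

% Chi-square with the slope shifted slightly off the best-fit value
chi2_off = sum(((y - polyval(p + [0.05 0], x))/sig).^2);

fprintf('chi2 = %.2f, nu = %d, reduced chi2 = %.2f\n', chi2, nu, chi2_red);
fprintf('chi2 with slope shifted by 0.05: %.2f\n', chi2_off);
```

A reduced chi-square far above 1 suggests a poor model or underestimated errors, while a value far below 1 suggests the errors were overestimated.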
