Global predictors of regression fidelity
• A single number to characterize the overall quality of the surrogate.
• Equivalence measures
  • Coefficient of multiple determination
  • Adjusted coefficient of multiple determination
• Prediction accuracy measures
  • Model independent: Cross validation error
  • Model dependent: Standard error
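Linear Regression
• The surrogate is a linear combination of given shape functions $\xi_i(\mathbf{x})$: $\hat{y}(\mathbf{x}) = \sum_{i=1}^{n_b} b_i \xi_i(\mathbf{x})$
• For a linear approximation in one variable: $\xi_1 = 1$, $\xi_2 = x$
• Difference (error) between data and surrogate at the $n_y$ data points: $\mathbf{e} = \mathbf{y} - X\mathbf{b}$, where $X_{ji} = \xi_i(\mathbf{x}_j)$
• Minimize the square error: $\mathbf{e}^T\mathbf{e} = (\mathbf{y} - X\mathbf{b})^T(\mathbf{y} - X\mathbf{b})$
• Differentiate with respect to $\mathbf{b}$ to obtain $X^TX\mathbf{b} = X^T\mathbf{y}$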
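The fit described above can be sketched in plain Python by forming the normal equations and solving them. This is only an illustration: the data values and the helper names `solve` and `fit` are made up here and do not come from the slides.

```python
def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fit(xs, ys, shape_functions):
    """Least-squares coefficients b from the normal equations X^T X b = X^T y."""
    X = [[xi(x) for xi in shape_functions] for x in xs]
    nb = len(shape_functions)
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(nb)] for i in range(nb)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(nb)]
    return solve(XtX, Xty)


# Linear approximation: shape functions 1 and x (hypothetical data).
shape_functions = [lambda x: 1.0, lambda x: x]
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.9, 3.2, 4.8, 7.1]
b = fit(xs, ys, shape_functions)
print("b1 = %.4f, b2 = %.4f" % (b[0], b[1]))
```

Passing a longer list of shape functions (for example adding `lambda x: x * x`) reuses the same code for a quadratic surrogate.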
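Coefficient of multiple determination
• Equivalence of the surrogate with the data is often measured by how much of the variance in the data is captured by the surrogate.
• Coefficient of multiple determination and adjusted version: $R^2 = \frac{\sum_{i=1}^{n_y}(\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n_y}(y_i - \bar{y})^2}$, $R_a^2 = 1 - (1 - R^2)\frac{n_y - 1}{n_y - n_b}$, where $\bar{y}$ is the mean of the data, $n_y$ is the number of data points and $n_b$ is the number of coefficients.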
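As a small worked example of the two measures, the sketch below evaluates them for a straight-line fit. The data are hypothetical, the line 0.97 + 2.02x is the least-squares fit to that data, and the function name `r_squared` is an illustrative choice.

```python
def r_squared(ys, y_hats, n_coeff):
    """Return (R^2, adjusted R^2) for data ys and surrogate values y_hats."""
    n = len(ys)
    y_bar = sum(ys) / n
    ss_total = sum((y - y_bar) ** 2 for y in ys)         # variation in the data
    ss_captured = sum((yh - y_bar) ** 2 for yh in y_hats)  # variation captured by the surrogate
    r2 = ss_captured / ss_total
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_coeff)
    return r2, r2_adj


# Hypothetical data and its least-squares line.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.9, 3.2, 4.8, 7.1]
y_hats = [0.97 + 2.02 * x for x in xs]

r2, r2_adj = r_squared(ys, y_hats, n_coeff=2)
print("R2 = %.5f, adjusted R2 = %.5f" % (r2, r2_adj))
```

The adjusted value is always the smaller of the two when there is more than one coefficient, because it penalizes coefficients that are spent to fit few data points.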
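$R^2$ does not reflect accuracy
• Compare $y_1 = x$ to $y_2 = 0.1x$ plus the same noise (normally distributed with zero mean and standard deviation of 1).
• Estimate the average errors between the function (red) and surrogate (blue).
• [Figure: two panels showing data, function (red) and surrogate (blue); $R^2 = 0.9785$ for $y_1$ and $R^2 = 0.3016$ for $y_2$]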
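The comparison above can be reproduced in spirit with the sketch below. It assumes data at x = 1, 2, ..., 30 and draws its own seeded noise, so the numbers will not match the slide's figure. Because the fit is linear in the data and both cases share the same noise, the error between the true function and the surrogate comes out the same for both, while $R^2$ differs sharply.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for a straight line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def r_squared(ys, y_hats):
    y_bar = sum(ys) / len(ys)
    return sum((yh - y_bar) ** 2 for yh in y_hats) / sum((y - y_bar) ** 2 for y in ys)


rng = random.Random(0)                      # seeded, so the run is repeatable
xs = [float(i) for i in range(1, 31)]       # assumed sampling points
noise = [rng.gauss(0.0, 1.0) for _ in xs]   # same noise for both cases

results = {}
for name, true_slope in (("y1", 1.0), ("y2", 0.1)):
    ys = [true_slope * x + e for x, e in zip(xs, noise)]
    b1, b2 = fit_line(xs, ys)
    y_hats = [b1 + b2 * x for x in xs]
    # r.m.s. difference between the surrogate and the noise-free function
    actual = math.sqrt(sum((yh - true_slope * x) ** 2
                           for yh, x in zip(y_hats, xs)) / len(xs))
    results[name] = (r_squared(ys, y_hats), actual)
    print("%s: R2 = %.4f, actual r.m.s. error = %.4f" % (name, *results[name]))
```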
Cross validation
• Validation consists of checking the surrogate at a set of validation points.
• This may be considered wasteful because we do not use all the points for fitting the best possible surrogate.
• Cross validation divides the data into $n_g$ groups.
• Fit the approximation to $n_g - 1$ groups, and use the last group to estimate the error. Repeat for each group.
• When each group consists of one point, the error is often called PRESS (prediction error sum of squares).
• Calculate the error at each point and then present the r.m.s. error.
• For linear regression it can be shown that $e_{XV,i} = \frac{e_i}{1 - E_{ii}}$, where $e_i$ is the fit error at point $i$, $E = X(X^TX)^{-1}X^T$, and $X$ is the matrix of shape-function values at the data points.
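A minimal sketch of leave-one-out cross validation for a straight-line fit, computed both from the definition (refit without each point) and from the shortcut $e_i/(1 - E_{ii})$. The data are hypothetical. For a line with an intercept the diagonal term has the closed form $E_{ii} = 1/n_y + (x_i - \bar{x})^2 / \sum_j (x_j - \bar{x})^2$, which the code uses instead of building the full matrix.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for a straight line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


xs = [0.0, 1.0, 2.0, 3.0]   # hypothetical data
ys = [0.9, 3.2, 4.8, 7.1]
n = len(xs)

# From the definition: leave point i out, refit, predict at the left-out point.
e_xv_definition = []
for i in range(n):
    b1, b2 = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    e_xv_definition.append(ys[i] - (b1 + b2 * xs[i]))

# From the formula: one fit to all the data, then scale each fit error.
b1, b2 = fit_line(xs, ys)
x_bar = sum(xs) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
e_xv_formula = []
for x, y in zip(xs, ys):
    e_i = y - (b1 + b2 * x)
    E_ii = 1.0 / n + (x - x_bar) ** 2 / sxx
    e_xv_formula.append(e_i / (1.0 - E_ii))

press_rms = math.sqrt(sum(e * e for e in e_xv_definition) / n)
print("definition:", [round(e, 4) for e in e_xv_definition])
print("formula:   ", [round(e, 4) for e in e_xv_formula])
print("PRESS r.m.s. error: %.4f" % press_rms)
```

The shortcut needs a single fit, which matters when there are many points or the fit is expensive.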
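Model based error for linear regression
• The common assumptions for linear regression:
  • The surrogate is in the functional form of the true function.
  • The data is contaminated with normally distributed error with the same standard deviation at every point.
  • The errors at different points are not correlated.
• Under these assumptions, the noise standard deviation (called standard error) is estimated as $\hat{\sigma} = \sqrt{\frac{\mathbf{e}^T\mathbf{e}}{n_y - n_b}}$, where $\mathbf{e}$ is the vector of fit errors, $n_y$ is the number of data points and $n_b$ is the number of coefficients.
• Similarly, the standard error in the coefficients is $se(b_i) = \hat{\sigma}\sqrt{[(X^TX)^{-1}]_{ii}}$.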
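The two estimates above can be illustrated for a straight-line fit, where $(X^TX)^{-1}$ has a simple closed form. The data are hypothetical and the variable names are illustrative.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for a straight line."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


xs = [0.0, 1.0, 2.0, 3.0]   # hypothetical data
ys = [0.9, 3.2, 4.8, 7.1]
n_y, n_b = len(xs), 2

b1, b2 = fit_line(xs, ys)
errors = [y - (b1 + b2 * x) for x, y in zip(xs, ys)]

# Standard error: sum of squared fit errors over the degrees of freedom.
sigma_hat = math.sqrt(sum(e * e for e in errors) / (n_y - n_b))

# For shape functions 1 and x, X^T X = [[n, sum x], [sum x, sum x^2]];
# the diagonal of its inverse is (sum x^2 / det, n / det).
sum_x = sum(xs)
sum_x2 = sum(x * x for x in xs)
det = n_y * sum_x2 - sum_x ** 2
se_b1 = sigma_hat * math.sqrt(sum_x2 / det)
se_b2 = sigma_hat * math.sqrt(n_y / det)

print("sigma_hat = %.4f" % sigma_hat)
print("se(b1) = %.4f, se(b2) = %.4f" % (se_b1, se_b2))
```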
Comparison of errors
• For the example in slide 4 of $y = x$ plus Gaussian noise, the fit was $\hat{y} = 0.5981 + 0.9970x$.
• The noise came from randn, which draws from a distribution with zero mean and unit standard deviation. However, the sample had a mean of 0.552 and a standard deviation of 1.3.
• The standard error is calculated as 1.32 and the cross validation (PRESS) error as 1.37.
• With less data, the differences will be larger.
• The actual error was only about 0.6 because the large amount of data filtered the noise.
Top hat question
• We sample the function $y = x$ with noise at $x = 0, 1, 2$ to get 0.5, 0.5, 2.5.
• Assume that the linear regression fit is $\hat{y} = 0.8x$.
• What are the noise (epsilon), the discrepancy ($e$), the cross-validation error, and the actual error at $x = 2$?
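Prediction variance
• Linear regression model: $\hat{y}(\mathbf{x}) = \sum_{i=1}^{n_b} b_i \xi_i(\mathbf{x})$
• Define $x^{(m)}_i = \xi_i(\mathbf{x})$, then $\hat{y} = \mathbf{x}^{(m)T}\mathbf{b}$
• With some algebra: $\mathrm{Var}[\hat{y}(\mathbf{x})] = \mathbf{x}^{(m)T}\Sigma_b\,\mathbf{x}^{(m)} = \sigma^2\,\mathbf{x}^{(m)T}(X^TX)^{-1}\mathbf{x}^{(m)}$, where $\Sigma_b$ is the covariance matrix of the coefficients
• Standard error: $s_y = \hat{\sigma}\sqrt{\mathbf{x}^{(m)T}(X^TX)^{-1}\mathbf{x}^{(m)}}$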
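A short sketch of the standard error of prediction for a straight-line surrogate, where the vector of shape-function values is (1, x). The sampling points, the unit noise estimate and the function name are assumptions made for illustration.

```python
import math


def prediction_std_error(x, xs, sigma_hat):
    """s_y = sigma_hat * sqrt(x_m^T (X^T X)^{-1} x_m) for shape functions 1 and x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(v * v for v in xs)
    det = n * sum_x2 - sum_x ** 2
    # Inverse of X^T X = [[n, sum x], [sum x, sum x^2]]
    inv = [[sum_x2 / det, -sum_x / det],
           [-sum_x / det, n / det]]
    x_m = [1.0, x]
    quad = sum(x_m[i] * inv[i][j] * x_m[j] for i in range(2) for j in range(2))
    return sigma_hat * math.sqrt(quad)


xs = [0.0, 1.0, 2.0, 3.0]   # hypothetical sampling points
sigma_hat = 1.0             # assumed noise estimate

s_center = prediction_std_error(1.5, xs, sigma_hat)  # middle of the data
s_edge = prediction_std_error(3.0, xs, sigma_hat)    # last data point
s_extrap = prediction_std_error(6.0, xs, sigma_hat)  # outside the data
print("center %.4f, edge %.4f, extrapolation %.4f" % (s_center, s_edge, s_extrap))
```

The error is smallest at the mean of the sampling points and grows as the prediction point moves away from the data.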
Example of prediction variance
• For a linear polynomial response surface (RS) $y = b_1 + b_2x_1 + b_3x_2$, find the prediction variance in the region $-1 \le x_1 \le 1$, $-1 \le x_2 \le 1$.
• (a) For data at three vertices (omitting (1,1))
Interpolation vs. Extrapolation
• At the origin $s_y = 0.707\hat{\sigma}$. At the 3 data vertices $s_y = \hat{\sigma}$. At (1,1) $s_y = 1.732\hat{\sigma}$.
Standard error contours
• The minimum error is obtained by setting to zero the derivatives of the prediction variance with respect to $x_1$ and $x_2$.
• What is special about this point?
• Contours of prediction variance provide more detail.
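The three-vertex example can be checked with exact arithmetic. The sketch below assumes the region is the square $-1 \le x_1, x_2 \le 1$, so the data sit at (-1,-1), (1,-1) and (-1,1). It evaluates the factor $\mathbf{x}^{(m)T}(X^TX)^{-1}\mathbf{x}^{(m)}$ that multiplies $\sigma^2$ in the prediction variance. Setting the derivatives of this factor to zero gives $1 + 2x_1 + x_2 = 0$ and $1 + x_1 + 2x_2 = 0$, that is $x_1 = x_2 = -1/3$. The helper names are illustrative.

```python
import math
from fractions import Fraction


def inverse(A):
    """Exact Gauss-Jordan inverse of a small nonsingular matrix."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def variance_factor(point, data_points):
    """x_m^T (X^T X)^{-1} x_m for the surrogate y = b1 + b2*x1 + b3*x2."""
    X = [[1, x1, x2] for x1, x2 in data_points]
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
    inv = inverse(XtX)
    x_m = [Fraction(1), Fraction(point[0]), Fraction(point[1])]
    return sum(x_m[i] * inv[i][j] * x_m[j] for i in range(3) for j in range(3))


three_vertices = [(-1, -1), (1, -1), (-1, 1)]   # assumed region: [-1, 1] x [-1, 1]
third = Fraction(-1, 3)
for label, point in (("origin", (0, 0)), ("data vertex", (-1, -1)),
                     ("(1,1)", (1, 1)), ("(-1/3,-1/3)", (third, third))):
    factor = variance_factor(point, three_vertices)
    print("%-12s variance factor %s, s_y / sigma_hat = %.3f"
          % (label, factor, math.sqrt(factor)))
```

The point (-1/3, -1/3) is the centroid of the three data points, and (1,1) is the only location in the list that requires extrapolation.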
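Data at four vertices
• Now $X^TX = 4I$
• And $s_y = \frac{\hat{\sigma}}{2}\sqrt{1 + x_1^2 + x_2^2}$
• Error at vertices: $s_y = \frac{\sqrt{3}}{2}\hat{\sigma} \approx 0.866\hat{\sigma}$
• At the origin the minimum is $s_y = 0.5\hat{\sigma}$
• How can we reduce error without adding points?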
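For data at all four vertices the matrix $X^TX$ is diagonal, so the standard error can be evaluated without a general matrix inverse. This sketch assumes the vertices are those of the square $-1 \le x_1, x_2 \le 1$ and reports the standard error in units of $\hat{\sigma}$.

```python
import math

vertices = [(-1, -1), (1, -1), (-1, 1), (1, 1)]   # assumed region: [-1, 1] x [-1, 1]
X = [[1, x1, x2] for x1, x2 in vertices]
XtX = [[sum(r[i] * r[j] for r in X) for j in range(3)] for i in range(3)]
print("X^T X =", XtX)


def std_error_factor(x1, x2):
    """s_y / sigma_hat; valid because X^T X is diagonal for this design."""
    x_m = [1.0, x1, x2]
    return math.sqrt(sum(x_m[i] ** 2 / XtX[i][i] for i in range(3)))


at_origin = std_error_factor(0.0, 0.0)
at_vertex = std_error_factor(1.0, 1.0)
print("origin: %.3f, vertex: %.3f" % (at_origin, at_vertex))
```

With the fourth point added, the error is symmetric about the origin and no point of the square requires extrapolation.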
Graphical Comparison of Standard Errors
• [Figure: two panels, "Three points" and "Four points"]
Problems
• The pairs (0,0), (1,1), (2,1) represent strain (millistrains) and stress (ksi) measurements.
  • Estimate Young's modulus using regression.
  • Calculate the error in Young's modulus using cross validation, both from the definition and from the formula on Slide 5.
• Repeat the example of $y = x$, using only data at $x = 3, 6, 9, \ldots, 30$. Use the same noise values as given for these points in the notes for Slide 4.