
Outline - Module 2b



Presentation Transcript


  1. Outline - Module 2b
  • matrix representation for multiple regression
  • least squares parameter estimates
  • diagnostics
    • graphical
    • quantitative
  • further diagnostics
    • testing the need for terms
    • lack of fit test
  • precision of parameter estimates, predicted responses
  • correlation between parameter estimates
  K. McAuley

  2. The Scenario
  We want to describe the systematic relationship between a response variable and a number of explanatory variables: multiple regression. We will consider the case which is linear in the parameters.
  K. McAuley

  3. Assessing Systematic Relationships
  Is there a systematic relationship? Two approaches:
  • graphical
    • scatterplots, casement plots
  • quantitative
    • determine correlations between the response and explanatory variables
    • consider forming a correlation matrix - a table of pairwise correlations between the response and the explanatory variables, and between pairs of explanatory variables (see the sketch below)
    • fit a model and find out which parameters are significant
  K. McAuley
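A minimal sketch of the quantitative check, assuming an invented data set (the variables y, x1, x2 and all numbers below are hypothetical, not the course data):

```python
import numpy as np

# Hypothetical data: 20 runs, response y and two explanatory variables x1, x2
rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 20)
x2 = rng.uniform(-1, 1, 20)
y = 3.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(0, 0.5, 20)

# Pairwise correlation matrix of [y, x1, x2]:
# a large |r| between y and an x suggests a systematic relationship;
# a large |r| between x1 and x2 warns of correlated explanatory variables
R = np.corrcoef(np.vstack([y, x1, x2]))
print(np.round(R, 3))
```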

  4. Matrix Representation for Multiple Regression
  Model equation:
  Y_i = β_1 X_{1i} + β_2 X_{2i} + … + β_p X_{pi} + ε_i
  where Y_i is the i-th observation of the response (the i-th data point), X_{1i} through X_{pi} are the i-th values of explanatory variables X_1 through X_p, and ε_i is the random noise in the i-th observation of the response. The intercept corresponds to an X which always has the value "1".
  K. McAuley

  5. Matrix Representation for Multiple Regression
  We can arrange the observations in "tabular" form - a vector of observations, and a matrix of explanatory values:
  Y = [Y_1, Y_2, …, Y_N]^T, and X is the matrix whose i-th row is [X_{1i}, X_{2i}, …, X_{pi}]
  K. McAuley

  6. Matrix Representation for Multiple Regression
  The model can be written as:
  Y = Xβ + ε
  where Y is an N×1 vector, ε is an N×1 vector, β is a p×1 vector, and X is an N×p matrix. N --> number of data observations; p --> number of parameters.
  K. McAuley
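To make the dimensions concrete, here is a small sketch in Python/NumPy with invented numbers (the x and y values are hypothetical); it assembles the N×p matrix X for a straight-line model, where the column of ones carries the intercept:

```python
import numpy as np

# Hypothetical data set: N = 6 observations, one explanatory variable x,
# model y_i = b1*1 + b2*x_i + e_i, so p = 2 parameters (intercept and slope)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.9, 5.1, 7.2, 8.8, 11.1, 13.0])

# Column of ones carries the intercept; X is the N x p matrix in Y = X*beta + eps
X = np.column_stack([np.ones_like(x), x])
print(X.shape)  # (6, 2): N x p
```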

  7. Least Squares Parameter Estimates We make the same assumptions as in the straight line regression case. What are they? K. McAuley

  8. Residual Vector
  Given a set of parameter values b, the residual vector is formed from the matrix expression:
  e = Y − Xb
  K. McAuley

  9. Sum of Squares of Residuals
  … is the same as before, but can be expressed as the squared length of the residual vector:
  SSE = e^T e = (Y − Xb)^T (Y − Xb) = ‖Y − Xb‖²
  K. McAuley

  10. Least Squares Parameter Estimates
  Find the set of parameter values that minimizes the sum of squares of residuals (SSE)
  • necessary conditions for optimal parameter values: ∂SSE/∂b = 0, giving the normal equations X^T X b = X^T Y
  • a system of p equations in p unknowns
  • solve to find the optimal parameter values
  • fortunately, a general solution to this problem is known and we can use it
  K. McAuley

  11. Least Squares Parameter Estimates
  In matrix form, this optimal solution is:
  b = (X^T X)^{-1} X^T Y
  Let's analyze the data considered for a straight line example. Model: Y_i = β_1 + β_2 X_i + ε_i
  K. McAuley
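A sketch of the general solution, reusing the hypothetical straight-line data from the previous snippet; solving the normal equations directly is numerically preferable to forming (X^T X)^{-1} explicitly:

```python
import numpy as np

# Least squares estimates b = (X^T X)^{-1} X^T Y for the hypothetical data.
# np.linalg.solve(X^T X, X^T y) solves the normal equations without
# computing the matrix inverse.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.9, 5.1, 7.2, 8.8, 11.1, 13.0])
X = np.column_stack([np.ones_like(x), x])

b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)  # [intercept, slope]
```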

  12. Example - Solder Thickness In matrix form: K. McAuley

  13. Example - Solder Thickness To calculate the Least Squares Estimates: K. McAuley

  14. Example - Solder Thickness The least squares parameter estimates are obtained as: J. McLellan

  15. Example - Wave Solder Defects
  Results from a set of designed experiments. How do we know the experiment was "designed"? Why would we scale the input variables between -1 and 1?
  K. McAuley

  16. Example - Wave Solder Defects In matrix form: K. McAuley

  17. Example - Wave Solder Defects To calculate least squares parameter estimates: K. McAuley

  18. Example - Wave Solder Defects Least squares parameter estimates: K. McAuley

  19. Examples – Interesting Things to Notice
  • if there are N runs and p parameters, X^T X is a p×p matrix
  • the j-th element of X^T Y is Σ_i X_{ji} Y_i, for parameters j = 1, …, p
  • in the Wave Solder Defects example, the values of the explanatory variables for the runs followed very specific patterns of -1 and +1, and X^T X was a diagonal matrix (see the sketch below)
  • in the Solder Thickness example, the values of the explanatory variable did not follow a specific pattern, and X^T X was not diagonal
  K. McAuley
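The diagonal X^T X behaviour is easy to reproduce. The sketch below builds a hypothetical 2² factorial design in coded ±1 units (not the actual wave solder design) and prints X^T X:

```python
import numpy as np

# A hypothetical 2^2 factorial design in coded (+1/-1) units, plus an
# intercept column of ones. The +/-1 columns are orthogonal to each other
# and to the intercept, so X^T X comes out diagonal.
X = np.array([
    [1, -1, -1],
    [1,  1, -1],
    [1, -1,  1],
    [1,  1,  1],
])
print(X.T @ X)  # 4 * identity: a diagonal matrix
```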

  20. Graphical Diagnostics
  … are essentially the same as in the straight line case:
  • residuals vs. predicted responses
  • residuals vs. each of the explanatory variables in the model
  • residuals vs. sequence of data collection (run sequence)
  • residuals vs. variables not in the model
  In all of these plots, there should be NO trend
  • if all of the trend has been accounted for by the model, the residuals should look like random noise
  • no pattern
  K. McAuley
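A minimal residuals-vs-predicted plot, assuming matplotlib is available and reusing the hypothetical straight-line data from the earlier sketches:

```python
import numpy as np
import matplotlib.pyplot as plt

# Residuals vs. predicted responses for the hypothetical straight-line fit.
# A fit with no remaining trend should show a structureless band around zero.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.9, 5.1, 7.2, 8.8, 11.1, 13.0])
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.solve(X.T @ X, X.T @ y)

y_hat = X @ b
e = y - y_hat

plt.scatter(y_hat, e)
plt.axhline(0.0, linestyle="--")
plt.xlabel("predicted response")
plt.ylabel("residual")
plt.show()
```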

  21. Quantitative Diagnostics - Ratio Tests
  Residual Variance Test
  • is the variance of the residuals significant compared to the inherent noise variance?
  • same test as for the straight line data
  • number of degrees of freedom for the Mean Squared Error is N−p, where p is the number of parameters in the model
  • compare the ratio statistic to F_{N−p, ν, α}, where ν is the number of degrees of freedom of the inherent variance estimate (from the data points used to estimate it) and α is the significance level
  • What should we conclude if this ratio is significant?
  K. McAuley

  22. Quantitative Diagnostics - Ratio Tests
  Residual Variance Ratio:
  MSE / s²_inherent
  Mean Squared Error of Residuals (variance of residuals):
  MSE = SSE / (N − p)
  K. McAuley
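A sketch of the residual variance ratio test with assumed numbers (SSE, N, p, the inherent variance estimate, and its degrees of freedom are all hypothetical):

```python
from scipy import stats

# Residual variance ratio test, sketched with assumed numbers:
# MSE from a fitted model and an external (inherent) variance estimate.
SSE, N, p = 120.0, 20, 3          # hypothetical residual sum of squares
MSE = SSE / (N - p)               # residual variance estimate, N - p df

s2_inherent, nu = 5.0, 10         # hypothetical inherent variance, nu df

F_ratio = MSE / s2_inherent
F_crit = stats.f.ppf(0.95, N - p, nu)   # alpha = 0.05
print(F_ratio, F_crit, F_ratio > F_crit)
```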

  23. Quantitative Diagnostics - Ratio Tests
  Mean Square Regression Ratio Test
  • compares variability described by the model with the left-over variability that the model cannot account for
  • same as in the straight line case except for degrees of freedom
  Variance described by the model:
  MSR = SSR / (p − 1)
  K. McAuley

  24. Quantitative Diagnostics - Ratio Test
  Test ratio: MSR / MSE is compared against F_{p−1, N−p, α}
  Conclusions?
  • if the ratio is statistically significant, the model is better than no model at all, and explains significant trend
  • if NOT statistically significant --> significant trend has NOT been modeled and the model may have limited value
  This test is a coarse measure of whether significant trend has been modeled - it provides no indication of which X variables are important.
  K. McAuley

  25. Analysis of Variance Tables
  The ratio tests involve dissection of the sum of squares:
  TSS = SSR + SSE, i.e., Σ_i (Y_i − Ȳ)² = Σ_i (Ŷ_i − Ȳ)² + Σ_i (Y_i − Ŷ_i)²
  K. McAuley
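The decomposition and the MSR/MSE test can be checked numerically. This sketch reuses the hypothetical straight-line data from earlier and computes the ANOVA quantities, including the tail area (p-value) noted on the next slide:

```python
import numpy as np
from scipy import stats

# ANOVA decomposition TSS = SSR + SSE for the hypothetical straight-line fit
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.9, 5.1, 7.2, 8.8, 11.1, 13.0])
X = np.column_stack([np.ones_like(x), x])
N, p = X.shape

b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b

TSS = np.sum((y - y.mean()) ** 2)
SSR = np.sum((y_hat - y.mean()) ** 2)
SSE = np.sum((y - y_hat) ** 2)
assert np.isclose(TSS, SSR + SSE)   # the dissection of the sum of squares

MSR, MSE = SSR / (p - 1), SSE / (N - p)
F = MSR / MSE
p_value = stats.f.sf(F, p - 1, N - p)   # area in the tail of the F-distribution
print(F, p_value)
```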

  26. Analysis of Variance (ANOVA) Tables for Regression
  ** Area in the tail of the F-distribution that corresponds to the F-value
  K. McAuley

  27. ANOVA Table for Regression from JMP Tutorial K. McAuley

  28. Quantitative Diagnostics - R²
  Coefficient of Determination ("R² Coefficient")
  • square of the correlation between observed and predicted values: R² = [corr(Y, Ŷ)]²
  • relationship to sums of squares: R² = SSR / TSS = 1 − SSE / TSS
  • values typically reported in "%"
  • ideally R² is near 100%
  K. McAuley

  29. Issues with R²
  • R² is sensitive to extreme data points, resulting in a misleading indication of quality of fit
  • R² can be made artificially large by adding more parameters to the model
    • put a curve through every point - a "connect the dots" model --> simply modeling noise in the data, rather than trend
  • solution - use the "adjusted R²", which penalizes the addition of parameters to the model
  K. McAuley

  30. Adjusted R²
  Adjust for the number of parameters relative to the number of observations:
  R²_adj = 1 − (SSE/(N−p)) / (TSS/(N−1)) = 1 − [(N−1)/(N−p)](1 − R²)
  • account for degrees of freedom of the sums of squares
  • defined in terms of Mean Squared quantities
  • want a value close to 1 (or 100%), as before
  • if N >> p, adjusted R² is close to R²
  K. McAuley
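A small helper, sketched under the definitions above, that computes both R² and adjusted R² (the function name and data are illustrative, not from the course):

```python
import numpy as np

def r_squared(y, y_hat, p):
    """R^2 and adjusted R^2 from observations y, predictions y_hat, p parameters."""
    N = len(y)
    SSE = np.sum((y - y_hat) ** 2)
    TSS = np.sum((y - np.mean(y)) ** 2)
    r2 = 1.0 - SSE / TSS
    # penalize extra parameters through the degrees of freedom N - p
    r2_adj = 1.0 - (SSE / (N - p)) / (TSS / (N - 1))
    return r2, r2_adj

# Example with hypothetical values: a near-perfect 2-parameter fit
y = np.array([2.9, 5.1, 7.2, 8.8, 11.1, 13.0])
y_hat = np.array([3.0, 5.0, 7.0, 9.0, 11.0, 13.0])
print(r_squared(y, y_hat, p=2))
```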

  31. Testing the Need for Groups of Terms
  In words: "Does a specific term or group of terms account for significant trend in the model?"
  Test:
  • compare the difference in residual variance between the full and reduced model
  • benchmark against an estimate of the inherent variation
  • if significant, conclude that the group of terms IS required
  • if not significant, conclude that the group of terms should be dropped from the model because they don't explain significant trend
  • note that the remaining parameters should be re-estimated
  K. McAuley

  32. Testing the Need for Groups of Terms
  Test: A - denotes the full model (with all terms); B - denotes the reduced model (group of terms deleted)
  Ratio to calculate:
  [(SSE_B − SSE_A) / (p_A − p_B)] / s²
  p_A, p_B are the numbers of parameters in models A, B
  s² is an estimate of the inherent noise variance:
  • estimate as SSE_A / (N − p_A)
  What should we do if this ratio is big? What should we do if it's small?
  K. McAuley

  33. Testing the Need for Groups of Terms
  Compare this ratio to F_{p_A−p_B, ν, α}
  • if MSE_A is used as the estimate of the inherent variance, then the number of degrees of freedom of the inherent variance estimate is ν = N − p_A
  K. McAuley
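A sketch of the extra-sum-of-squares ratio with assumed values for the full model A and reduced model B (all numbers hypothetical):

```python
from scipy import stats

# Extra-sum-of-squares (partial F) test, with assumed values:
# full model A and reduced model B fitted to the same N points.
N = 20
SSE_A, p_A = 110.0, 5    # hypothetical full-model results
SSE_B, p_B = 160.0, 3    # hypothetical reduced-model results

s2 = SSE_A / (N - p_A)                      # inherent variance estimate
F = ((SSE_B - SSE_A) / (p_A - p_B)) / s2
F_crit = stats.f.ppf(0.95, p_A - p_B, N - p_A)
print(F, F_crit, F > F_crit)  # big ratio -> the extra terms explain significant trend
```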

  34. Lack of Fit Test
  If we have replicate runs in our regression data set, we can break out the noise variance from the residuals, and assess the component of the residuals due to unmodelled trend.
  Replicates -
  • repeated runs at the SAME experimental conditions
  • indication of inherent variance because no other factors are changing
  • measure of reproducibility of experiments
  K. McAuley

  35. Using Replicates
  We can estimate the sample variance for each set of replicates, and pool the estimates of the variance
  • constancy of variance can be checked using Bartlett's test
  • constant variance is assumed for ordinary least squares estimation
  For each set of replicates, we have:
  s_i² = Σ_j (Y_{ij} − Ȳ_i)² / (n_i − 1)
  K. McAuley

  36. Using a Data Set with Replicates
  The pooled estimate of σ² from m variance estimates is:
  s_pooled² = Σ_{i=1}^m (n_i − 1) s_i² / Σ_{i=1}^m (n_i − 1)
  Does this seem like a reasonable thing to do?
  K. McAuley

  37. The Lack of Fit Test
  Back to the sum of squares "block": the residual sum of squares splits into two pieces,
  SSE = SSE_LOF + SSE_P
  where SSE_LOF is the "lack of fit" sum of squares and SSE_P is the "pure error" sum of squares.
  K. McAuley

  38. The Lack of Fit Test
  We partition the SSE into two components:
  • component due to inherent noise
  • component due to unmodeled trend
  Pure error sum of squares (SSE_P):
  SSE_P = Σ_{i=1}^m Σ_{j=1}^{n_i} (Y_{ij} − Ȳ_i)²
  i.e., add together the sums of squares associated with each replicate group (there are "m" replicate groups in total)
  K. McAuley

  39. The Lack of Fit Test
  The "lack of fit sum of squares" (SSE_LOF) is formed by subtracting:
  SSE_LOF = SSE − SSE_P
  Degrees of freedom:
  - for SSE_P: Σ_{i=1}^m (n_i − 1)
  - for SSE_LOF: (N − p) − Σ_{i=1}^m (n_i − 1)
  K. McAuley

  40. The Lack of Fit Test
  The test ratio:
  MSE_LOF / MSE_P = [SSE_LOF / df_LOF] / [SSE_P / df_P]
  Compare to F_{df_LOF, df_P, α}
  • What does it mean if this ratio is big?
  • What does it mean if this ratio is small?
  • What is the null hypothesis for this test?
  • What can we conclude if the ratio is bigger than the value from the tables with α = 0.05?
  K. McAuley
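Putting the pieces together on an invented replicated data set (all values hypothetical): compute SSE from the fit, pool the within-group sums of squares for pure error, and form the lack-of-fit ratio:

```python
import numpy as np
from scipy import stats

# Lack-of-fit test sketch on a hypothetical data set with replicate groups:
# the x values repeat, so each distinct x gives a replicate group.
x = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0])
y = np.array([2.1, 2.5, 4.0, 4.4, 6.2, 5.8, 8.1, 8.5])

X = np.column_stack([np.ones_like(x), x])
N, p = X.shape
b = np.linalg.solve(X.T @ X, X.T @ y)
SSE = np.sum((y - X @ b) ** 2)

# Pure error: pooled within-group sum of squares over the m replicate groups
SSE_P, df_P = 0.0, 0
for xv in np.unique(x):
    group = y[x == xv]
    SSE_P += np.sum((group - group.mean()) ** 2)
    df_P += len(group) - 1

SSE_LOF = SSE - SSE_P
df_LOF = (N - p) - df_P

F = (SSE_LOF / df_LOF) / (SSE_P / df_P)
print(F, stats.f.ppf(0.95, df_LOF, df_P))
```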

  41. Example - Wave Solder Defects Remember these data from a designed experiment. K. McAuley

  42. Example - Wave Solder Defects From the earlier regression, SSE = 2694.0 and SSR = 25306.5. This was done by hand - Excel has no lack of fit test that uses replicates. K. McAuley

  43. A Comment on the Ratio Tests
  Order of preference (or "value") - from most definitive to least definitive:
  • Lack of Fit Test -- MSE_LOF / MSE_P
  • MSE / s²_inherent
  • MSR / MSE
  How does the type of data that you have influence the test you can do? If at all possible, try to include replicate runs in your experimental program.
  K. McAuley

  44. The Parameter Estimate Covariance Matrix … summarizes the variance-covariance structure of the parameter estimates K. McAuley

  45. Properties of the Covariance Matrix
  • symmetric -- Cov(b_1, b_2) = Cov(b_2, b_1)
  • diagonal entries ≥ 0
  • off-diagonal entries can be +ve or -ve
  • the matrix is positive definite: v^T Cov(b) v > 0 for any nonzero vector v
  K. McAuley

  46. Parameter Estimate Covariance Matrix
  The covariance matrix of the parameter estimates is defined as:
  Cov(b) = E[(b − β)(b − β)^T]
  Compare this expression with the variance for a single parameter: Var(b_1) = E[(b_1 − β_1)²]
  For linear regression, the covariance matrix for the parameter estimates is:
  Cov(b) = σ² (X^T X)^{-1}
  What can we do if we want uncorrelated parameter estimates?
  K. McAuley

  47. Parameter Estimate Covariance Matrix
  Key point - the covariance structure of the parameter estimates is governed by the experimental run conditions used for the explanatory variables - the Experimental Design.
  Example - the Wave Solder Defects data: the parameter estimates are uncorrelated, and the variances of the non-intercept parameters are the same (for the coded variables).
  K. McAuley

  48. Estimating the Parameter Covariance Matrix
  The X matrix is known perfectly (after the run conditions have been selected), so the only estimated quantity is the inherent noise variance σ²
  • get it from replicates in the data set, an external estimate, or the MSE
  • Why might the MSE be less reliable than other values of s²?
  • Why might the MSE be better to use?
  For the wave solder defect data, the MSE is 384.86 with 7 degrees of freedom, so the estimated parameter covariance matrix is 384.86 (X^T X)^{-1} (see the sketch below).
  K. McAuley
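A sketch of the estimate Cov(b) = MSE·(X^T X)^{-1}, reusing the hypothetical replicated data from the lack-of-fit sketch; it also converts the covariance matrix to the correlation form discussed on the next slide:

```python
import numpy as np

# Estimated parameter covariance matrix Cov(b) = MSE * (X^T X)^{-1},
# sketched on the hypothetical replicated data used earlier.
x = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0])
y = np.array([2.1, 2.5, 4.0, 4.4, 6.2, 5.8, 8.1, 8.5])
X = np.column_stack([np.ones_like(x), x])
N, p = X.shape

b = np.linalg.solve(X.T @ X, X.T @ y)
MSE = np.sum((y - X @ b) ** 2) / (N - p)

cov_b = MSE * np.linalg.inv(X.T @ X)
std_err = np.sqrt(np.diag(cov_b))            # standard errors of the estimates
corr_b = cov_b / np.outer(std_err, std_err)  # parameter correlation matrix
print(cov_b, corr_b, sep="\n")
```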

  49. Using the Covariance Matrix
  Variances of the parameter estimates
  • are obtained from the diagonals of the covariance matrix
  • the square root is the standard dev'n, or "standard error", of the parameter estimate
  • use for confidence intervals for the parameters
  • use in hypothesis tests for the parameters
  Correlations between the parameter estimates
  • can be obtained by taking the covariance from the appropriate off-diagonal element, and dividing by the standard errors of the individual parameter estimates
  • Why do people like to see correlations more than they like covariances?
  K. McAuley

  50. Confidence Intervals for Parameters
  … similar procedure to the straight line case:
  • given the standard error for a parameter estimate, use the appropriate t-value, and form the interval as:
  b_i ± t_{ν, α/2} · s.e.(b_i)
  The degrees of freedom ν for the t-statistic come from the estimate of the inherent noise variance
  • the degrees of freedom will be the same for all of the parameter estimates
  If the confidence interval contains zero, the parameter is plausibly zero. Perhaps this term should be deleted from the model.
  K. McAuley
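Finally, a sketch of the confidence intervals b_i ± t·s.e.(b_i), continuing from the same hypothetical data:

```python
import numpy as np
from scipy import stats

# 95% confidence intervals b_i +/- t * s.e.(b_i) for the hypothetical fit
x = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0])
y = np.array([2.1, 2.5, 4.0, 4.4, 6.2, 5.8, 8.1, 8.5])
X = np.column_stack([np.ones_like(x), x])
N, p = X.shape

b = np.linalg.solve(X.T @ X, X.T @ y)
MSE = np.sum((y - X @ b) ** 2) / (N - p)
se = np.sqrt(np.diag(MSE * np.linalg.inv(X.T @ X)))

t = stats.t.ppf(0.975, N - p)   # two-sided, alpha = 0.05, N - p df
for bi, sei in zip(b, se):
    print(f"{bi:.3f} +/- {t * sei:.3f}")
```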
