
ENGR 610 Applied Statistics Fall 2007 - Week 12



  1. ENGR 610 Applied Statistics, Fall 2007 - Week 12 Marshall University CITE Jack Smith

  2. Overview for Today • Review Multiple Linear Regression, Ch 13 (1-5) • Go over problem 13.62 • Multiple Linear Regression, Ch 13 (6-11) • Quadratic model • Dummy-variable model • Using transformations • Collinearity (VIF) • Model building • Stepwise regression • Best sub-set regression with Cp statistic • Homework assignment

  3. Multiple Regression • Linear model - multiple independent variables • Yi = β0 + β1X1i + … + βjXji + … + βkXki + εi • Xji = value of independent variable j for observation i • Yi = observed value of dependent variable • β0 = Y-intercept (Y at X=0) • βj = slope (ΔY/ΔXj) • εi = random error for observation i • Yi’ = b0 + b1X1i + … + bkXki (predicted value) • The bj’s are called the regression coefficients • ei = Yi - Yi’ (residual) • Minimize Σei² for the sample with respect to all bj, j = 1,…,k
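
The least-squares fit described on this slide can be sketched in a few lines of Python with numpy (an illustrative substitute for the PHStat/Excel output used in the course; the data values are hypothetical):

```python
import numpy as np

# Hypothetical sample: n = 6 observations, k = 2 independent variables.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y  = np.array([3.1, 4.9, 9.2, 10.8, 15.1, 16.9])

# Design matrix with a leading column of 1s for the intercept b0.
X = np.column_stack([np.ones_like(X1), X1, X2])

# Least squares chooses the b's to minimize sum(e_i^2) = sum((Y_i - Y_i')^2).
b, *_ = np.linalg.lstsq(X, Y, rcond=None)

Y_pred = X @ b   # predicted values Y_i'
e = Y - Y_pred   # residuals e_i
```

At the minimum of Σei², the residuals are orthogonal to every column of the design matrix (the normal equations), which is a quick sanity check on the fit.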

  4. Partitioning of Variation • Total variation (SST) • Regression variation (SSR) • Random variation about the mean response (SSE) • SST = SSR + SSE • Coefficient of Multiple Determination: R²Y.12…k = SSR/SST • Standard Error of the Estimate: SYX = √(SSE/(n-k-1))
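
A minimal numpy sketch of this partitioning, using hypothetical data and a fitted two-variable model:

```python
import numpy as np

# Hypothetical data (illustrative values) and an ordinary least-squares fit.
X = np.column_stack([np.ones(6),
                     [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                     [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]])
Y = np.array([3.1, 4.9, 9.2, 10.8, 15.1, 16.9])
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
Y_pred = X @ b

SST = np.sum((Y - Y.mean())**2)       # total variation
SSR = np.sum((Y_pred - Y.mean())**2)  # variation explained by the regression
SSE = np.sum((Y - Y_pred)**2)         # random (residual) variation

R2 = SSR / SST                        # coefficient of multiple determination
n, k = 6, 2
s_yx = np.sqrt(SSE / (n - k - 1))     # standard error of the estimate
```

For a least-squares fit that includes an intercept, SST = SSR + SSE holds exactly (up to rounding), which is what makes R² = SSR/SST a fraction between 0 and 1.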

  5. Adjusted R2 • R²adj = 1 - (1 - R²)(n-1)/(n-k-1) • Accounts for sample size (n) and number of independent variables (k) for comparison purposes
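
The adjustment is a one-line formula; a small helper makes the penalty for extra variables explicit:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2)(n-1)/(n-k-1).

    Penalizes R^2 for the number of independent variables k relative to
    the sample size n, so models of different size can be compared.
    """
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

With any k > 0 the adjusted value is strictly below the raw R², and the gap shrinks as n grows.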

  6. Residual Analysis • Plot residuals vs • Yi’ (predicted values) • X1, X2,…,Xk • Time (for autocorrelation) • Check for • Patterns • Outliers • Non-uniform distribution about mean • See Figs 12.18-19, p 597-8

  7. F Test for Multiple Regression • F = MSR / MSE • Reject H0 if F > FU(α, k, n-k-1) [or p < α] • k = number of independent variables • One-Way ANOVA Summary
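
Computing the test statistic from the ANOVA sums of squares is mechanical; the critical value FU(α, k, n-k-1) still comes from an F table or software. A sketch:

```python
def f_statistic(SSR, SSE, n, k):
    """Overall F statistic: F = MSR / MSE, with
    MSR = SSR / k and MSE = SSE / (n - k - 1).

    Reject H0 (all slopes zero) if F exceeds F_U(alpha, k, n-k-1).
    """
    MSR = SSR / k
    MSE = SSE / (n - k - 1)
    return MSR / MSE
```

For example, with SSR = 80, SSE = 20, n = 13, k = 2: MSR = 40, MSE = 2, so F = 20.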

  8. Alternate F Test • F = (R²/k) / [(1-R²)/(n-k-1)] • Compared to FU(α, k, n-k-1)

  9. t Test for Slope • H0: βj = 0 • t = bj / Sbj • See output from PHStat • Critical t value based on chosen level of significance, α, and n-k-1 degrees of freedom

  10. Confidence and Prediction Intervals • Confidence Interval Estimate for the Slope • Confidence Interval Estimate for the Mean and Prediction Interval Estimate for Individual Response • Beyond the scope of this text

  11. Partial F Tests • Significance test for contribution from individual independent variable • Measure of incremental improvement • All others already taken into account • Fj = SSR(Xj|{Xi≠j}) / MSE, where SSR(Xj|{Xi≠j}) = SSR - SSR({Xi≠j}) • Reject H0 if Fj > FU(α, 1, n-k-1) [or p < α] • Note: t²(α, n-k-1) = FU(α, 1, n-k-1)
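
The incremental-SSR arithmetic on this slide can be captured in a small helper (a sketch; the full and reduced SSR values would come from fitting the model with and without Xj):

```python
def partial_f(ssr_full, ssr_reduced, sse_full, n, k):
    """Partial F for one variable Xj:

        F_j = [SSR(full) - SSR(without Xj)] / MSE(full)

    where MSE(full) = SSE(full) / (n - k - 1) and k counts the
    independent variables in the full model.
    """
    mse_full = sse_full / (n - k - 1)
    return (ssr_full - ssr_reduced) / mse_full
```

For example, if adding Xj raises SSR from 90 to 100 while the full model has SSE = 20 with n = 13 and k = 2, then MSE = 2 and Fj = 5, compared against FU(α, 1, 10).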

  12. Coefficients of Partial Determination See PHStat output in Fig 13.10, p 637

  13. Quadratic Curvilinear Regression Model • Yi = β0 + β1X1i + β2X1i² + εi • Treat the X² term just like any other independent variable • Same R2, F tests, t tests, etc. • Generally need linear term as well
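
"Treat the X² term like any other variable" means it is simply another column in the design matrix. A sketch with hypothetical data generated from an exact quadratic, so the fit should recover the known coefficients:

```python
import numpy as np

# Hypothetical data built from Y = 1 + 2*X + 0.5*X^2 (no noise),
# so the fitted coefficients should come back as (1, 2, 0.5).
X1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
Y = 1 + 2 * X1 + 0.5 * X1**2

# The squared term is just one more column alongside the linear term.
X = np.column_stack([np.ones_like(X1), X1, X1**2])
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Everything downstream (R², overall F, t tests on each coefficient) proceeds exactly as in the linear case.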

  14. Dummy-Variable Models • Treatment of categorical variables • Each possible value represented by a dummy variable with value of 0 or 1 • Treat added terms like any other terms • Often confounded with other variables, so model may need interaction terms • Add interaction term and perform partial F test and t test for added term
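
A sketch of a dummy-variable model with an interaction term, using hypothetical data built from known coefficients so the fit is easy to verify:

```python
import numpy as np

# Hypothetical data: X1 is numeric, D is a 0/1 dummy for a
# two-level categorical variable.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0])
D  = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
Y  = 2 + 3 * X1 + 5 * D + 1.0 * X1 * D  # built with a known interaction

# Columns: intercept, X1, the dummy D, and the interaction X1*D.
# The added terms are treated like any other terms in the model.
X = np.column_stack([np.ones_like(X1), X1, D, X1 * D])
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

Whether the interaction column earns its place would then be checked with the partial F test (or equivalently the t test) on its coefficient, as the slide describes.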

  15. Using Transformations • Square-root • Multiplicative - logY-logX model • Exponential - logY model • Others • Higher polynomials • Trigonometric functions • Inverse
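
As one example, the multiplicative (log Y-log X) model Y = β0·X^β1 becomes linear after taking logs: log Y = log β0 + β1 log X. A sketch with hypothetical noise-free data from Y = 2·X^1.5:

```python
import numpy as np

# Hypothetical data generated from the multiplicative model Y = 2 * X^1.5.
X1 = np.array([1.0, 2.0, 4.0, 8.0])
Y = 2.0 * X1**1.5

# After the log transformation the model is linear in log(X1):
#   log Y = log 2 + 1.5 * log X1
A = np.column_stack([np.ones_like(X1), np.log(X1)])
b, *_ = np.linalg.lstsq(A, np.log(Y), rcond=None)
# b[0] should recover log 2 and b[1] the exponent 1.5
```

The same pattern applies to the exponential (log Y) model, where only the dependent variable is logged.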

  16. Collinearity (VIF) • Test for linearly dependent variables • VIF - Variance Inflationary Factor • VIFj = 1/(1-Rj2) • Rj2 = coefficient of multiple determination of variable Xj with all other X variables • VIF > 5 suggests linear dependence (R2 > 0.8) • Full treatment involves analysis of the correlation (covariance) matrix, such as • Principal Component Analysis (PCA) • To determine dimensionality and orthogonal factors • Factor Analysis (FA) • To determine rotated factors
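
The VIF definition translates directly into code: regress each Xj on all the other X's and invert 1 - Rj². A sketch:

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j of X on all of the other columns (plus an intercept)."""
    n, k = X.shape
    out = []
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        y = X[:, j]
        b, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ b
        r2 = 1 - (resid @ resid) / np.sum((y - y.mean())**2)
        out.append(1.0 / (1.0 - r2))
    return out
```

Uncorrelated columns give VIFs near 1; a column that is nearly a linear combination of the others drives Rj² toward 1 and the VIF past the rule-of-thumb threshold of 5.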

  17. Model Building • See flow chart in text, Fig 13.25 (p 663) • Stepwise regression • Add or delete one variable at a time • Use partial F and/or t tests (delete when p > 0.05) • Best-subset regression • Start with model including all variables (< n/10) • Eliminate variables with VIF > 5, highest first • Generate all models with remaining variables (T) • Select best models using R2 and Cp statistic • Cp = (1-Rk2)(n-T)/(1-RT2) - (n-2(k+1)) • Keep models with Cp ≤ k+1 • Evaluate each term using t test • Add interaction terms, transformed variables, and higher order terms based on residual analysis
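
The Cp comparison on this slide is a straight formula; a helper makes the inputs explicit (a sketch, with illustrative numbers in the example below):

```python
def cp_statistic(r2_k, r2_T, n, T, k):
    """C_p statistic as given on the slide:

        C_p = (1 - R_k^2)(n - T) / (1 - R_T^2) - (n - 2(k + 1))

    comparing a candidate model with k variables (R^2 = r2_k) against
    the full model with all T variables (R^2 = r2_T), for sample size n.
    Candidate models with C_p <= k + 1 are retained.
    """
    return (1 - r2_k) * (n - T) / (1 - r2_T) - (n - 2 * (k + 1))
```

For instance, a 3-variable subset with R² = 0.9 from a 5-variable pool with R² = 0.95 and n = 20 gives Cp = 0.1·15/0.05 - 12 = 18, well above k+1 = 4, so that subset would be dropped.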

  18. Homework • Work and hand in Problem 13.63 • Fall break (Thanksgiving) – 11/22 • Review session – 11/29 (“dead” week) • “Linear Regression”, Ch 12-13 • Exam #3 • Linear regression (Ch 12-13) • Take-home • Due by 12/6 • Final grades due by 12/13
