Lecture 21: Review

Lecture 21: Review • Review a few points about regression that I went over quickly: the coefficient of determination, regression diagnostics, and transformations. • Review an ANOVA problem. • Review a regression problem.

Administrative Info for Midterm II • Time and Location: Wednesday, April 2, 6-8 p.m., Steinberg Hall-Dietrich Hall 351. • Closed book; one 8.5 x 11 double-sided note sheet is allowed. • Bring a calculator. • All necessary tables will be provided, but nothing additional (e.g., Tukey’s bulging rule will not be provided). • Office hours: Today after class (12:10-2:30), Wednesday 9-11:30.

Material Covered • The focus is on Chapter 15 and Chapter 18 (we covered everything except 15.6 and 18.8). • Sections 13.5-13.6 are not covered. • Be prepared for questions that draw on material from the first midterm in the context of Chapter 15 and Chapter 18.

Coefficient of Determination (R²) • R² measures the strength of the linear relationship between Y and X. • Formulas for R²: • The square of the correlation between X and Y (thus if Cor(X,Y) = -0.5, then R² = 0.25). • R² = 1 - (SSE/SSTOT) = SSR/SSTOT. JMP output calls SSR the sum of squares due to the model. SSE, SSR, and SSTOT can be read from the Analysis of Variance section of the JMP regression output.
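The two formulas for R² above can be checked against each other numerically. The following is a minimal sketch in plain Python; the data are made up for illustration (they are not from Example 18.2) and the function name `r_squared_two_ways` is hypothetical.

```python
import math

def r_squared_two_ways(x, y):
    """Return (squared correlation, 1 - SSE/SSTOT) for the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sstot = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Formula 1: square of the sample correlation between X and Y
    r = sxy / math.sqrt(sxx * sstot)

    # Formula 2: 1 - SSE/SSTOT, using the least-squares slope and intercept
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return r ** 2, 1 - sse / sstot

# Made-up data with a negative slope, so the correlation itself is negative
x = [1, 2, 3, 4, 5, 6]
y = [9.8, 8.1, 8.4, 6.0, 5.2, 4.9]
r2_corr, r2_anova = r_squared_two_ways(x, y)
print(round(r2_corr, 4), round(r2_anova, 4))
```

Both printed values agree, and the sign of the correlation drops out once it is squared.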

JMP output for Example 18.2

Impact of Large Sample Sizes • On average, R² is about the same regardless of the sample size. • However, if there is a linear relationship between X and Y, the p-value for the test of whether the slope is zero tends to become smaller as the sample size increases. Even if the linear relationship between Y and X is weak (but the slope is not zero), the test will have a small p-value when the sample size is large.
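A small simulation can illustrate this point. The sketch below assumes a weak true slope of 0.2 with standard normal errors (values chosen only for illustration), and it approximates the t distribution with a normal distribution when computing the p-value, which is reasonable only because the sample sizes are large. The function name `slope_test` is hypothetical.

```python
import math
import random
from statistics import NormalDist

def slope_test(n, slope=0.2, seed=0):
    """Simulate y = slope*x + noise; return (R^2, t statistic, approx. p-value)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [slope * xi + rng.gauss(0, 1) for xi in x]
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r2 = sxy ** 2 / (sxx * syy)
    # t statistic for H0: slope = 0, written in terms of R^2
    t = math.copysign(math.sqrt(r2 * (n - 2) / (1 - r2)), sxy)
    # Assumption: normal approximation to the t distribution (large n)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return r2, t, p

results = {n: slope_test(n) for n in (100, 2_000, 100_000)}
for n, (r2, t, p) in results.items():
    print(n, round(r2, 3), round(t, 1), p)
```

R² stays near its population value of about 0.04 at every sample size, while the t statistic grows roughly like the square root of n and the p-value shrinks accordingly.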

Prediction Intervals vs. Confidence Intervals • Prediction interval: Used when we want to predict one particular value of y given a specific value of x, e.g., a used car dealer wants to predict the price of a particular Ford Taurus given that it has 40,000 miles. • Confidence interval for the expected value of y: Used when we want to estimate the mean of y given x, e.g., a used car dealer wants to bid on a lot of 200 Ford Tauruses with 40,000 miles and wants to know the mean price of a Ford Taurus given that it has 40,000 miles.

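Prediction Intervals vs. Confidence Intervals Cont. • The prediction interval and the confidence interval (formulas shown on the slide). • As the sample size becomes large, the width of the confidence interval tends to zero, but the width of the prediction interval tends to a nonzero limit, because an individual y still varies around the regression line no matter how precisely the line is estimated.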
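For reference, the standard textbook forms of the two intervals in simple linear regression are given below. The notation is an assumption and may differ from the slide: $x_g$ is the given value of x, $s_\varepsilon$ is the standard error of estimate, $s_x^2$ is the sample variance of x, and $t_{\alpha/2,\,n-2}$ is the t critical value with $n-2$ degrees of freedom.

```latex
% Prediction interval for one particular value of y at x = x_g
\hat{y} \pm t_{\alpha/2,\,n-2}\; s_\varepsilon
  \sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{x})^2}{(n-1)\,s_x^2}}

% Confidence interval for the expected value of y at x = x_g
\hat{y} \pm t_{\alpha/2,\,n-2}\; s_\varepsilon
  \sqrt{\frac{1}{n} + \frac{(x_g - \bar{x})^2}{(n-1)\,s_x^2}}

% Large-sample behavior: both fractions under the root vanish as n grows,
% s_\varepsilon approaches \sigma_\varepsilon, and t approaches z, so
%   confidence interval half-width -> 0
%   prediction interval half-width -> z_{\alpha/2}\,\sigma_\varepsilon
```

The only difference between the two is the extra 1 under the square root, which accounts for the variation of a single observation around the regression line.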

Regression Assumptions and Diagnostics

Influential Points and Outliers • In addition to the previous diagnostics, check residual plots for influential points and outliers (in y, in x, and in the direction of the scatterplot). • Influential point: An outlier in the direction of x (it has high leverage) that does not follow the same pattern of relationship between y and x as the other points. • Investigate whether outliers and influential points are properly recorded and are representative of the population we are interested in.

Diagnosing Nonlinearity • Check residual plot vs. x to see if there is a pattern.
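As an illustration of what a pattern in the residuals looks like, the sketch below fits a straight line to made-up curved data (y = x², chosen only for illustration) and prints the sign of each residual in order of x. The function name `linear_fit_residuals` is hypothetical.

```python
def linear_fit_residuals(x, y):
    """Fit y = b0 + b1*x by least squares and return the residuals."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Made-up curved data: the true relationship is quadratic, not linear
x = list(range(1, 11))
y = [xi ** 2 for xi in x]

residuals = linear_fit_residuals(x, y)
signs = "".join("+" if e > 0 else "-" for e in residuals)
print(signs)  # ++------++
```

The residuals are positive at both ends and negative in the middle, a systematic U shape rather than random scatter around zero, which is the kind of pattern that signals nonlinearity.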

Transformations • If there is nonlinearity, one possible way to correct for it is to apply a transformation to y or x. • Tukey’s bulging rule (see handout): Match the curvature in the data to the shape of one of the curves drawn in the four quadrants, then apply one of the transformations listed for that quadrant.

Tukey’s Bulging Rule • The curvature appears to match the top left quadrant. Try transforming X to log X.
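To see the effect of such a transformation, the sketch below compares R² before and after replacing X with log X. The data are made up (a logarithmic curve plus a small alternating perturbation) so that the curvature resembles the top left quadrant; they are not the data from the slide, and the function name `r_squared` is hypothetical.

```python
import math

def r_squared(x, y):
    """R^2 of the least-squares line of y on x (squared correlation)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

# Made-up data: y rises quickly and then flattens as x increases
x = list(range(1, 21))
y = [2 + 3 * math.log(xi) + (0.1 if xi % 2 else -0.1) for xi in x]

r2_raw = r_squared(x, y)                          # regress y on X
r2_log = r_squared([math.log(xi) for xi in x], y)  # regress y on log X
print(round(r2_raw, 3), round(r2_log, 3))
```

The fit on log X has the higher R². After transforming, the residual plot should still be checked again to confirm that the pattern is gone.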
