
Chapter 4 – Correlation and Regression

Chapter 4 – Correlation and Regression. Before: examined one variable at a time (test grades, metabolism, trip time to work, etc.). Now: will examine the relationship between two variables (study time and test grades, age and metabolism, trip time to work and distance to work, etc.).

menefer

Presentation Transcript


  1. Chapter 4 – Correlation and Regression • Before: examined one variable at a time (test grades, metabolism, trip time to work, etc.) • Now: will examine the relationship between two variables (study time and test grades, age and metabolism, trip time to work and distance to work, etc.)

  2. The 2 Variables • Response variable – measures an outcome of a study • Explanatory variable – explains or influences changes in a response variable • Ex. y = 2x + 4: explanatory: x; response: y • Ex. The number of hours you study and the grade you earn: explanatory: hours studied; response: grade • Ex. Safety training hours at an industrial plant and the number of work hours lost due to accidents: explanatory: training hours; response: work hours lost

  3. Ways to examine 2 variables • Form – shape (linear, exponential, parabolic, none) • Direction – positive or negative slope • Strength – how tightly the points fit the line of best fit • Terminology: graph “y against x” means: y goes on the vertical axis and x on the horizontal axis

  4. Scatter plot • Shows relationship between two quantitative variables • Each dot represents an individual data point (x, y) • (Figure: three example scatter plots labeled Positive, Negative, None)

  5. Strength & Direction of Linear Relationship • Measured by the correlation coefficient, r • Expanding this formula for 3 data points yields:
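
The formula itself did not carry over into this transcript. As a sketch only, assuming the common textbook definition that standardizes each value and averages the products (the slide may have used a different but equivalent form), r and its expansion for n = 3 data points can be written as follows. Here x̄ and ȳ are the sample means and s_x and s_y are the sample standard deviations.

```latex
r = \frac{1}{n-1}\sum_{i=1}^{n}
    \left(\frac{x_i-\bar{x}}{s_x}\right)\left(\frac{y_i-\bar{y}}{s_y}\right)

% Expanded for n = 3:
r = \frac{1}{2}\left[
      \left(\frac{x_1-\bar{x}}{s_x}\right)\left(\frac{y_1-\bar{y}}{s_y}\right)
    + \left(\frac{x_2-\bar{x}}{s_x}\right)\left(\frac{y_2-\bar{y}}{s_y}\right)
    + \left(\frac{x_3-\bar{x}}{s_x}\right)\left(\frac{y_3-\bar{y}}{s_y}\right)
    \right]
```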

  6. Find the r value

  7. Facts about r • Value is always between -1 and 1 • If r is negative, then there is a negative relationship • If r is positive, then there is a positive relationship • If r = -1 or r = 1, then all points lie on a straight line

  8. Facts about r • Strength of correlation: • Values close to -1 or 1 signify a strong linear relationship • If r = -1 or r = 1, then all points lie on a straight line • Values close to 0 signify a weak linear relationship • For the sake of this class: strong from -1 to -0.9 and from 0.9 to 1; moderate from -0.9 to -0.7 and from 0.7 to 0.9; weak from -0.7 to 0.7
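
A minimal sketch of computing r and labeling its strength with the class cutoffs of 0.7 and 0.9. The function names, the data, and the handling of values exactly at a cutoff are assumptions made for illustration; they are not from the slides.

```python
import math

def correlation(xs, ys):
    """Sample correlation coefficient r of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label r with the class scale (cutoff handling is an assumption)."""
    size = abs(r)
    if size >= 0.9:
        return "strong"
    if size >= 0.7:
        return "moderate"
    return "weak"

# Hypothetical data: hours studied and test grade
hours = [1, 2, 3, 4, 5]
grades = [62, 70, 75, 85, 88]

r = correlation(hours, grades)
print(round(r, 3), strength(r))
```

The sign of r gives the direction and the label gives the strength, so a value such as -0.8 would be a moderate negative relationship on this scale.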

  9. Lurking Variables • Def: neither explanatory nor response, but may be responsible for changes in these variables. • Ex. In the past few years, the population of Lynchburg has increased. It was observed that during this time there was a correlation between the number of people attending church and the number of people in jail. • Hopefully church attendance doesn’t cause people to go to jail. • Lurking Variable – population growth

  10. Facts continued • No distinction between explanatory and response variable (you will get the same r value if you swap the two variables) • r has no unit • Not resistant to outliers • Is not a complete description of two-variable data
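
The first three facts can be checked numerically. This is a sketch on made-up data; the helper function and all values are hypothetical.

```python
import math

def correlation(xs, ys):
    """Sample correlation coefficient r of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical, nearly linear data
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

r_xy = correlation(x, y)
r_yx = correlation(y, x)                       # swap the two variables
r_units = correlation([60 * v for v in x], y)  # change units, e.g. hours to minutes
r_outlier = correlation(x + [7], y + [0.0])    # add a single outlier

print(r_xy, r_yx, r_units, r_outlier)
```

Swapping the variables or rescaling one of them leaves r unchanged, while the single outlier pulls r far away from its original value.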

  11. Find the r value of the reality data

  12. Least squares regression line (LSRL) • Makes the sum of the squared vertical distances from the points to the line as small as possible • Used to predict the value of y.

  13. How to find this line • Recall: any line • Regression line: **** USE CAUTION WITH THE “b” ****
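
The line equations on this slide did not survive in the transcript. The sketch below assumes the common convention ŷ = a + bx, where b is the slope and a is the intercept. That is the reverse of b's role in y = mx + b, which may be what the caution refers to. The slope is computed as b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², which equals r·s_y/s_x, and the intercept as a = ȳ − b·x̄. The function names and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x; b is the slope, a the intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope, equal to r * s_y / s_x
    a = mean_y - b * mean_x  # intercept
    return a, b

def predict(a, b, x):
    """Predicted y for a given x on the line y-hat = a + b*x."""
    return a + b * x

# Hypothetical data that lies exactly on y = 1 + 2x
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]

a, b = least_squares_line(x, y)
print(a, b, predict(a, b, 2.5))
```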

  14. Example • Make a scatter plot on your calculator. • Find the equation for the regression line and then graph it on your scatter plot. • What may be a good list price for a 1,700 sq ft home? 2,500 sq ft home?

  15. Facts about LSRL • The distinction between explanatory and response matters: the regression of y on x is different than the regression of x on y • Even though the graph and the regression line will change, the value of r will not • Close connection between slope and correlation
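
A sketch of these facts on made-up data, assuming the slope formula b = r·s_y/s_x for the connection between slope and correlation. Regressing y on x and x on y gives two different lines, yet the product of the two slopes equals r², so r itself is unaffected by the swap. All names and values are hypothetical.

```python
import math

def slope(xs, ys):
    """Slope of the least squares line that predicts ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def correlation(xs, ys):
    """Sample correlation coefficient r of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b_y_on_x = slope(x, y)  # x explanatory, y response
b_x_on_y = slope(y, x)  # roles swapped
r = correlation(x, y)

print(b_y_on_x, b_x_on_y, r)
```

If the two regressions described the same line, the slope of one would be the reciprocal of the other; here it is not.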

  16. LSRL Facts continued • LSRL always passes through the point (x̄, ȳ), the mean of x and the mean of y • r² is a measure of the proportion of the variation in y that is explained by the regression line. • “how much of the variation in y is explained by x” • if r = -0.74 then r² ≈ 0.55, which means that about 55% of the variation is accounted for by the LSRL.
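
A sketch that checks two of these facts on made-up data: the fitted line passes through the point of means, and the proportion of variation explained, computed as 1 − SSE/SST, matches r squared. The data and variable names are hypothetical.

```python
import math

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

b = sxy / sxx                 # slope of the LSRL
a = mean_y - b * mean_x       # intercept of the LSRL
r = sxy / math.sqrt(sxx * syy)

# Variation left unexplained by the line (SSE) versus total variation in y (SST)
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
sst = syy
explained = 1 - sse / sst

print(a + b * mean_x, mean_y)  # the line evaluated at the mean of x
print(explained, r ** 2)
```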

  17. Residuals • Residual = observed value – predicted value • If residual is a positive (+) number, point is above line • If residual is a negative (-) number, point is below line • The mean of the residuals from the LSRL is always zero
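
A sketch of computing residuals from a least squares line fitted to made-up data. The data and names are hypothetical; the point is that the signs show which points sit above or below the line and that the residuals sum to zero up to rounding.

```python
# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

# residual = observed value - predicted value
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
mean_residual = sum(residuals) / n

for xi, res in zip(x, residuals):
    side = "above" if res > 0 else "below" if res < 0 else "on"
    print(xi, round(res, 2), side)
print(mean_residual)
```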

  18. Extrapolation • Def: Use of LSRL to predict results outside the range of values used to calculate the LSRL • Such predictions are often not accurate • Ex. Predict the value of y when x = 10 • Since you used x-values from 1 to 4 to find the LSRL, you cannot trust its prediction of y at an x-value of 10.
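
A sketch of why extrapolation can fail, using made-up data. The assumption here is that the true relationship is y = x², which looks roughly linear over x = 1 to 4; the slide's own data is not in the transcript.

```python
# Hypothetical data generated from y = x**2 on x = 1..4
x = [1, 2, 3, 4]
y = [xi ** 2 for xi in x]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x

def predict(value):
    """Prediction from the fitted line y-hat = a + b*x."""
    return a + b * value

inside = predict(3)    # within the range used to fit the line
outside = predict(10)  # far outside that range

print(inside, 3 ** 2)
print(outside, 10 ** 2)
```

Inside the fitted range the line stays close to the true values, but at x = 10 the prediction misses the assumed true value by a wide margin.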

  19. Association does not imply causation • An association alone does not establish cause and effect. Changes in the explanatory variable (x) will not always cause changes in the response variable (y) • Ex. The more TVs a country has, the longer people live. So, by this faulty reasoning, to improve life expectancy in other countries, ship more TVs to them.

  20. HW • Pg 144 #’s: 1, 2, 5, 6, 13b, 14b, 16b, 17b, 20c • Pg 160 #’s: 3, 9-12 parts ce, 18cf • Excel: create a scatter plot with trend line and r² of data in guided exercise 4 on page 157. Directions are on page 159.
