
MSP 5410 Statistics

Presentation Transcript


  1. MSP 5410 Statistics, Lecture 11, Dr. Chappell

  2. About ANOVA: When a scenario calls for comparing 3 or more groups, ANOVA is the statistical test that you should select. The previous note slides gave more detailed information about ANOVA, for example, what ANOVA stands for.

  3. Introduction to regression analysis – Interpreting printouts (Chapter 18)

  4. How to determine the relationship between two interval-level variables (p. 323)
     Simple linear regression
     • Simple regression
     • Ordinary least squares (p. 333)
     • Linear regression (p. 333)
     • Involves two variables that are measured at the interval level

  5. Simple Regression (involving the statistical/prediction relationship, not the functional relationship)
     The first variable, the one you will use to predict the other variable, can be referred to as the:
     • Independent variable (IV)
     • Predictor variable
     • Regressor variable
     • X-variable
     The second variable is the one you are predicting. It is sometimes referred to as the:
     • Dependent variable (DV)
     • Response variable
     • Predicted variable
     • Y-variable

  6. Independent vs dependent variable exercise (page 324): you might review this exercise.

  7. When to Use Regression Analysis
     For prediction. (*Important: when a scenario calls for prediction, regression is the appropriate statistical test to indicate for hypothesis testing.)
     • Our discussion focuses on simple regression, involving only 2 variables
     • Variable X
     • Variable Y
     • Want to predict Y, given X

  8. Eyeballing relationship in a scatterplot/scattergram
     Examining the co-relation of two variables (there is something called correlational analysis that is related to regression).
     • Shape
     • Direct/positive (data points move lower left to upper right): low values on the x-axis correspond to low values on the y-axis, and high values on the x-axis correspond to high values on the y-axis
     • Inverse/negative (data points move upper left to lower right): low values on the x-axis correspond to high values on the y-axis, and high values on the x-axis correspond to low values on the y-axis
     *See example scatterplots/scattergrams on the next two slides.

  9. Scatterplot – Motorist speed and police car presence (Refer to plot at top of page 327) – Shows inverse relationship

  10. Scatterplot – IQ and pulse rate – Shows positive relationship
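The direction of a relationship can also be checked numerically rather than by eye. The sketch below is a minimal illustration and not part of the lecture: the function name `direction` and both small data sets are made up for this example. It takes the sign of the summed cross-products of deviations from the means, which is positive for a direct relationship and negative for an inverse one.

```python
def direction(xs, ys):
    """Classify a relationship as direct or inverse from the sign of the
    summed cross-products of deviations (an illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if cross > 0:
        return "direct/positive"
    if cross < 0:
        return "inverse/negative"
    return "no linear trend"

# Hypothetical data, invented for illustration only (not the handout data).
police_cars = [0, 1, 2, 3, 4]
speeds = [72, 70, 67, 65, 62]
print(direction(police_cars, speeds))          # inverse/negative
print(direction([1, 2, 3, 4], [2, 4, 5, 9]))   # direct/positive
```

A sign check like this only reports direction; it says nothing about how closely the points follow a line.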

  11. Properties of regression line
     The relationship between any two variables can be defined (summarized) by a line. (p. 328)
     There are two important values associated with a “line” (p. 330):
     • Slope
     • Intercept

  12. Equation for Simple Regression (Concepts found on p. 469. Again, the assumption is a linear relationship.)
     Ŷ = α + βX (p. 330): the formula that describes the “line”. Ŷ is the statistician’s symbol for the predicted value of Y and is pronounced “Y-hat” (p. 331).
     • α = y-intercept or intercept (the point where the line crosses the y-axis; also the value of Ŷ when X = 0)
     • β = slope of the line (the slant of the line, which represents the average change in Y for each one-unit change in X, the independent variable)
     For this course, you will only need to know the formula for the regression equation, NOT the formula for α or β as shown in the text.
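To make the roles of the intercept and slope concrete, here is a small Python sketch. The function name `predict` and the values 2.0 and 0.5 are invented for illustration; they do not come from the text or the handout.

```python
def predict(x, intercept, slope):
    """Return the predicted value Y-hat = intercept + slope * x."""
    return intercept + slope * x

# Hypothetical coefficients chosen only to illustrate the definitions.
intercept, slope = 2.0, 0.5

# The intercept is the predicted value when X = 0.
print(predict(0, intercept, slope))   # 2.0

# The slope is the change in Y-hat for a one-unit change in X.
print(predict(4, intercept, slope) - predict(3, intercept, slope))   # 0.5
```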

  13. Format for regression equations involving sample data (p. 335)
     Ŷ = a + bX (p. 335): the formula that describes the “line” when using sample data.
     • a = y-intercept or intercept (same definition as previous slide)
     • b = slope of the line (same definition as previous slide)

  14. How to write regression equations using SPSS output
     Ŷ = a + bX (again, this is the general formula when using sample data).
     Use the table titled “Coefficients”.
     • The value for a (y-intercept) is located in the (Constant) row under “B” in the Unstandardized Coefficients section of the table.
     • The value for b (slope of the line) is located in the 2nd (bottom) row under “B” in the Unstandardized Coefficients section of the table.

  15. Practice writing regression equations using SPSS output – based on 4/21/2012 in-class handout
     Ŷ = a + bX is the general formula when using sample data. When writing the equations, substitute the actual values for a and b, which are located in the SPSS output as discussed in the previous slide.
     • The regression equation for predicting motorist speed from number of police cars is: Ŷ = 72.2 + (-2.55) X (used 4th table on the handout)
     • The regression equation for predicting pulse rate from IQ scores is: Ŷ = 7.714 + 0.897 X (used 1st table on the handout)
     *Note that the form of these equations is the same as that of the general formula above.

  16. Practice predicting Ŷ when value of X is given – based on 4/21/2012 in-class handout
     What is the predicted motorist speed when the number of police cars = 6? (X = 6)
       Ŷ = 72.2 + (-2.55) X
         = 72.2 + (-2.55)(6)
         = 72.2 – 15.3
         = 56.9
     What is the predicted pulse rate when IQ = 91? (X = 91)
       Ŷ = 7.714 + 0.897 X
         = 7.714 + 0.897 (91)
         = 7.714 + 81.627
         = 89.341
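The two hand calculations can be double-checked with a few lines of Python. This is only a sketch: the coefficients are the ones quoted in the slides, while the function and variable names are chosen here for illustration.

```python
def predict(x, a, b):
    """Predicted value from a sample regression equation: Y-hat = a + b * x."""
    return a + b * x

# Coefficients quoted in the slides (from the SPSS Coefficients tables).
speed_a, speed_b = 72.2, -2.55    # motorist speed from number of police cars
pulse_a, pulse_b = 7.714, 0.897   # pulse rate from IQ

print(f"{predict(6, speed_a, speed_b):.1f}")    # 56.9
print(f"{predict(91, pulse_a, pulse_b):.3f}")   # 89.341
```

The results are formatted to the same number of decimal places as the hand calculations, because floating-point arithmetic can leave tiny rounding residue in the raw values.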

  17. OTHER IMPORTANT REGRESSION CONCEPTS
     • COEFFICIENT OF DETERMINATION
       • R² (for 2 variables, actually r² for sample data). Also written R Square in SPSS results.
       • Defined as: the amount of variation in Y that is accounted for (EXPLAINED) by X.
     • COEFFICIENT OF NONDETERMINATION
       • Complement of R²
       • Defined as: the amount of variation in Y that is NOT accounted for (UNEXPLAINED) by X.

  18. Practice 1: computing coefficients of determination and nondetermination using SPSS output – based on 4/21/2012 in-class handout
     • To identify the amount of variance in motorist speed that is EXPLAINED by the number of police cars, use the table titled “Model Summary” and locate the value associated with R Square. The appropriate data involving these variables is found in the 2nd table. The value for R Square is 0.942, or 94.2% when converted to a percent. This value is the coefficient of determination.
     • To identify the amount of variance in motorist speed that is NOT EXPLAINED by the number of police cars, first you need to know that total variance (EXPLAINED + UNEXPLAINED) is 1.000 or 100%. Since the UNEXPLAINED variance is the complement of R Square, you subtract the coefficient of determination from 1.000 (if using the decimal numbers) or 100% (if using percent). Thus, the coefficient of nondetermination (UNEXPLAINED variance / variance NOT EXPLAINED) is calculated as follows:

                            Decimal     Percent
         Total variance       1.000      100.0%
         R Square           - 0.942     - 94.2%
         Unexplained          0.058        5.8%

     (Compare results in the Model Summary table (2nd table for these variables) to the hand calculation in Step 6 on p. 343.)

  19. Practice 2: computing coefficients of determination and nondetermination using SPSS output – based on 4/21/2012 in-class handout
     • To identify the amount of variance in pulse rate that is EXPLAINED by IQ, use the table titled “Model Summary” and locate the value associated with R Square. The appropriate data involving these variables is found in the 5th table. The value for R Square is 0.614, or 61.4% when converted to a percent. This value is the coefficient of determination.
     • To identify the amount of variance in pulse rate that is NOT EXPLAINED by IQ, again you need to know that total variance (EXPLAINED + UNEXPLAINED) is 1.000 or 100%. Since the UNEXPLAINED variance is the complement of R Square, you subtract the coefficient of determination from 1.000 (if using the decimal numbers) or 100% (if using percent). Thus, the coefficient of nondetermination (UNEXPLAINED variance / variance NOT EXPLAINED) is calculated as follows:

                            Decimal     Percent
         Total variance       1.000      100.0%
         R Square           - 0.614     - 61.4%
         Unexplained          0.386       38.6%
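Both practice calculations follow the same rule, unexplained = 1 minus R Square, which can be sketched in Python as follows. The R Square values are the ones quoted in the slides; the function name `nondetermination` and the labels are chosen here for illustration.

```python
def nondetermination(r_square):
    """Coefficient of nondetermination: the complement of R Square."""
    return 1.0 - r_square

# R Square values quoted in the slides (from the SPSS Model Summary tables).
examples = [("speed from police cars", 0.942),
            ("pulse rate from IQ", 0.614)]

for label, r_square in examples:
    unexplained = nondetermination(r_square)
    # The .1% format multiplies by 100 and rounds to one decimal place.
    print(f"{label}: explained {r_square:.1%}, unexplained {unexplained:.1%}")
```

Formatted to one decimal place, the unexplained shares come out as 5.8% and 38.6%, matching the hand calculations in the two practice slides.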
