
Statistics and Research methods

Presentation Transcript


  1. Statistics and Research Methods Wiskunde voor HMI (Mathematics for HMI), Meeting 2

  2. Correlation • Association between scores on two variables • e.g., age and coordination skills in children, price and quality

  3. Scatter Diagram • A Scatter Diagram (or scatterplot) is a visual display of the relationship between two variables • Example: A company is interested in whether there is a relationship between the number of employees supervised by a manager and the amount of stress reported by that manager

  4. Stress and Employees Supervised

  5. Cause and Effect • An important type of relationship between two variables: cause and effect • Independent variable = cause • Dependent variable = effect

  6. Correlation and Causality • Three possible directions of causality: 1. X → Y 2. Y → X 3. X ← Z → Y

  7. Correlation and Causality • In situations where variables cannot be manipulated experimentally, it is difficult to know whether one is actually causing the other • Example in newspaper: “drinking coffee causes cancer” • However, a third variable may cause both high coffee consumption and cancer • Such third variables are called ‘confounds’

  8. However, we can still try to predict one variable on the basis of a second variable, even if the causal relationship has not been determined • Predictor variable • Criterion variable

  9. Scatter Diagrams • The independent (or predictor) variable goes on the horizontal (x) axis; the dependent (or criterion) variable on the vertical (y) axis.

  10. Hours of Overtime Worked and Spouse’s Marital Satisfaction

  11. Patterns of Correlation • Linear correlation • Curvilinear correlation • No correlation • Positive correlation • Negative correlation

  12. Degree of Linear Correlation: The Correlation Coefficient • Computing correlation using Z scores • Cross-product of Z scores • Multiply the Z score on one variable by the Z score on the other variable • Correlation coefficient • Average of the cross-products of Z scores

  13. Degree of Linear Correlation: The Correlation Coefficient • Formula for the correlation coefficient: r = Σ(Z_X · Z_Y) / N • Perfect positive correlation: r = +1 • No correlation: r = 0 • Perfect negative correlation: r = –1
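The formula above can be sketched in Python. The data are invented for illustration (loosely echoing the employees-supervised and stress example from slide 3); the function computes r as the average of the cross-products of Z scores:

```python
import math

def correlation(xs, ys):
    """Pearson r as the average of the cross-products of Z scores: r = sum(Zx * Zy) / N."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Population standard deviations (divide by N, matching the Z-score definition used here)
    sdx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sdy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    zx = [(x - mx) / sdx for x in xs]
    zy = [(y - my) / sdy for y in ys]
    return sum(a * b for a, b in zip(zx, zy)) / n

# Hypothetical data: employees supervised (X) and reported stress (Y)
employees = [6, 8, 3, 10, 8, 7]
stress    = [7, 8, 1, 8, 6, 5]
print(correlation(employees, stress))  # a strong positive correlation for these made-up data
```

Note that perfectly linear data give r = +1 or r = –1, matching the bounds on the slide.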

  14. Correlation and Causality • Correlational research design • Correlation as a statistical procedure • Correlation as a kind of research design

  15. Issues in Interpreting the Correlation Coefficient • Statistical significance, e.g. p < .05 • Proportionate reduction in error = proportion of variance accounted for • r² • Used to compare correlations

  16. Issues in Interpreting the Correlation Coefficient (continued) • Restriction in range • Unreliability of measurement

  17. Correlation in Research Articles • Scatter diagrams occasionally shown • Correlation matrix

  18. Regression • Making predictions • does knowing a person’s score on one variable allow us to say what their score on a second variable is likely to be? • The method we use to make predictions is called regression • When scores on one variable are used to predict scores on another variable, it is called bivariate regression (two variables) • When scores on two or more variables are used to predict scores on another variable, it is called multiple regression

  19. Naming (two variables)

  20. These two variables correlate positively • People who drink a lot of coffee tend to be happy, and people who do not tend to be unhappy • Preview: The line is called a regression line, and represents the estimated linear relationship between the two variables. Notice that the slope of the line is positive in this example.

  21. The Regression Line • Relation between the predictor variable and predicted values of the criterion variable • Formula: Ŷ = a + (b)(X) • Slope of the regression line • Equals b, the raw-score regression coefficient • Intercept of the regression line • Equals a, the regression constant • Method of least squares used to derive a and b

  22. Method of Least Squares • a and b derived by: • the least-squares method (minimizing squared prediction errors; illustrated by drawing) • the resulting line passes through the point (M_X, M_Y)

  23. The Regression Line Ŷ = a + (b)(X)
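A minimal sketch of deriving a and b by least squares, using invented data: b is the sum of cross-products of deviations divided by the sum of squared X deviations, and a is chosen so the line passes through the point of means (M_X, M_Y):

```python
def regression_line(xs, ys):
    """Least-squares fit of Y-hat = a + b*X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx      # raw-score regression coefficient (slope)
    a = my - b * mx    # regression constant (intercept)
    return a, b

# Hypothetical predictor/criterion data, for illustration only
xs = [6, 8, 3, 10, 8, 7]
ys = [7, 8, 1, 8, 6, 5]
a, b = regression_line(xs, ys)
print(a, b)  # the predicted raw score for any X is then a + b*X
```

One can check that the fitted line passes through (M_X, M_Y), as slide 22 notes.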

  24. Bivariate Raw Score Prediction • Direct raw-score prediction model • Predicted raw score (on criterion variable) = regression constant plus the result of multiplying a raw-score regression coefficient by the raw score on the predictor variable • Formula: Ŷ = a + (b)(X) • The “hat” over Y means “predicted”

  25. Bivariate Prediction with Z Scores • Given the Z score for X, what is the Z score for Y? • We use the prediction model: Ẑ_Y = (β)(Z_X) • where β (beta) is the “standardized regression coefficient” • It’s also called the “beta weight”, because it tells us how much “weight” to give to Z_X when making a prediction for Z_Y • The “hat” over Z_Y means “predicted”

  26. What is β? • It turns out that the best value to use for β in the prediction model is r, the (Pearson) correlation coefficient • Thus, the bivariate regression model is Ẑ_Y = (r)(Z_X) • When r = 1, Ẑ_Y = Z_X; when r = –1, Ẑ_Y = –Z_X • When r = 0, there is no relation; the “best guess” for Y is the mean score (Ẑ_Y = 0)
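Standardized prediction is a one-liner once β = r is known; the values of r below are arbitrary examples:

```python
def predict_zy(zx, r):
    """Predicted Z score on Y: Zy_hat = beta * Zx, with beta = r in bivariate regression."""
    return r * zx

print(predict_zy(1.0, 0.8))  # with r = .8, someone 1 SD above the mean on X
                             # is predicted to be 0.8 SD above the mean on Y
print(predict_zy(1.0, 0.0))  # with r = 0, the best guess is the mean (Z = 0)
```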

  27. Proportionate Reduction in Error • We want a measure of how accurate our regression model (the raw-score prediction formula) is at predicting the data • We can compare the error we make when predicting with our regression model (SS_Error) to the error we would make if we didn’t have the model (SS_Total)

  28. Proportionate Reduction in Error • Error • Actual score minus the predicted score: Y – Ŷ • SS_Error = sum of squared errors using the prediction model = Σ(Y – Ŷ)² • SS_Total = sum of squared errors when predicting from the mean = Σ(Y – M_Y)²

  29. Error and Proportionate Reduction in Error • Formula for proportionate reduction in error: (SS_Total – SS_Error) / SS_Total • Proportionate reduction in error = r² • Proportion of variance accounted for
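The two sums of squares above can be computed directly; a sketch with invented data, fitting the least-squares line and then comparing SS_Error to SS_Total:

```python
def proportionate_reduction_in_error(xs, ys):
    """(SS_Total - SS_Error) / SS_Total for the least-squares line; equals r squared."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    ss_error = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # error using the model
    ss_total = sum((y - my) ** 2 for y in ys)                      # error predicting from the mean
    return (ss_total - ss_error) / ss_total

# Hypothetical data, for illustration only
xs = [6, 8, 3, 10, 8, 7]
ys = [7, 8, 1, 8, 6, 5]
print(proportionate_reduction_in_error(xs, ys))  # same value as r**2 for these data
```

A perfectly linear data set gives a proportionate reduction in error of 1: the model eliminates all prediction error.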
