PSY 307 – Statistics for the Behavioral Sciences. Chapter 7 – Regression. Regression Line. A way of making a somewhat precise prediction based upon the relationships between two variables. Predictor variable & criterion variable
PSY 307 – Statistics for the Behavioral Sciences Chapter 7 – Regression
Regression Line • A way of making a somewhat precise prediction based upon the relationship between two variables. • Predictor variable & criterion variable • The regression line is placed so that it minimizes the predictive error. • When based upon the squared predictive error, the line is called a least squares regression line.
Demo • This demo from the textbook’s student website shows how different lines result in different MSEs (mean squared errors): • http://www.ruf.rice.edu/~lane/stat_sim/reg_by_eye/index.html
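The idea behind the demo can be sketched in a few lines of Python. The five data points and the three candidate lines below are made up for illustration and do not come from the textbook; the sketch only shows that the least squares line has the smallest mean squared error of the lines tried.

```python
def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between actual Y and predicted Y' = slope*X + intercept."""
    squared_errors = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Candidate lines as (slope b, intercept a); the last is the least squares line.
candidates = {
    "flat line at the mean of Y": (0.0, 4.0),
    "line that is too steep": (1.0, 1.0),
    "least squares line": (0.6, 2.2),
}

mse_by_line = {
    name: mean_squared_error(xs, ys, b, a) for name, (b, a) in candidates.items()
}
for name, mse in mse_by_line.items():
    print(f"{name}: MSE = {mse:.2f}")
```

Any other slope and intercept tried on these data should give an MSE at least as large as that of the least squares line.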
Least Squares Equation • Y’ = bX + a • To obtain Y’: • Solve for b and a using the data from the correlation analysis. • Substitute b and a into the regression equation and solve for Y’. • To find points along the line, substitute X values into the regression equation and calculate Y’.
Formula for Regression Line • Solving for b: b = r √(SSy / SSx), which equals r(sy / sx) • Solving for a: a = (mean of Y) – b(mean of X) • Then insert both into the formula: • Y’ = bX + a • Plug in values of X and solve for Y’.
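A minimal sketch of these steps in Python, assuming the standard least squares results b = r √(SSy / SSx) and a = (mean of Y) – b(mean of X). The function name fit_line and the data are hypothetical, used only for illustration.

```python
import math

def fit_line(xs, ys):
    """Return (r, b, a) for the least squares line Y' = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)    # sum of squares for X
    ss_y = sum((y - mean_y) ** 2 for y in ys)    # sum of squares for Y
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sp_xy / math.sqrt(ss_x * ss_y)           # correlation coefficient
    b = r * math.sqrt(ss_y / ss_x)               # slope
    a = mean_y - b * mean_x                      # intercept
    return r, b, a

# Hypothetical data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, b, a = fit_line(xs, ys)
predicted = [b * x + a for x in xs]  # Y' for each X
print(f"r = {r:.3f}, b = {b:.2f}, a = {a:.2f}")
```

Substituting the mean of X into the fitted equation returns the mean of Y, which is a quick check that b and a were computed consistently.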
Error Bars show the Standard Error of the Estimate (Regression Line)
Predictive Error for a Value of X • [Figure: at X = 50, the predicted value is Y’ = 137; the error of Y’ is marked on the plot.]
Standard Error of the Estimate • A rough measure of the average amount of predictive error. • The average amount by which actual Y values deviate from predicted Y’ values. • No predictive error when r = 1 or r = –1. • Maximum predictive error when r = 0. • Again, formulas vary.
Calculating Predictive Error Definition Formula: Computation Formula:
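One common pair of formulas can be sketched in Python as follows. This assumes the n – 2 denominator; since formulas vary, some texts divide by n instead. The definition form works from the errors themselves, √(Σ(Y – Y’)² / (n – 2)), and the computation form works from r, √(SSy(1 – r²) / (n – 2)). The function name and data are hypothetical.

```python
import math

def standard_error_of_estimate(xs, ys):
    """Return the standard error of estimate computed two equivalent ways.

    Assumes the n - 2 denominator; some texts use n instead.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sp_xy / math.sqrt(ss_x * ss_y)
    b = sp_xy / ss_x                  # least squares slope
    a = mean_y - b * mean_x           # least squares intercept

    # Definition form: based on the squared errors Y - Y'.
    ss_error = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
    by_definition = math.sqrt(ss_error / (n - 2))

    # Computation form: based on SSy and r.
    by_computation = math.sqrt(ss_y * (1 - r ** 2) / (n - 2))
    return by_definition, by_computation

# Hypothetical data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
by_definition, by_computation = standard_error_of_estimate(xs, ys)
print(f"definition: {by_definition:.4f}, computation: {by_computation:.4f}")
```

The two values agree because the sum of squared errors around the least squares line equals SSy(1 – r²).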
Kinds of Errors for ALEKS • Difference between the predictions of the regression line and the mean (used as a predictor). • Difference between the predictions of the regression line and the observed values (the predictive error). • The difference between these two kinds of errors.
Z Score Approach • Prediction using Z scores: • Zy’ = b(Zx), where b = r and Zy’ is the predicted z-score for Y • b is called the standardized regression coefficient because it is applied to standardized (z) scores for prediction. • Prediction using raw scores: • Change the person’s raw score to a z-score using the z-score formula. • Multiply by b, then change the resulting z-score back to a raw score.
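The raw-score procedure above can be sketched as follows. The function name predict_via_z and the data are hypothetical, and sample standard deviations (n – 1) are assumed for both variables; the result matches the raw-score regression line for the same data as long as both standard deviations use the same denominator.

```python
import statistics

def predict_via_z(x_new, xs, ys):
    """Predict Y for x_new by converting to z-scores, multiplying by r, and converting back."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)  # sample SDs (n - 1)
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sp_xy / ((len(xs) - 1) * s_x * s_y)  # correlation coefficient

    z_x = (x_new - mean_x) / s_x     # step 1: raw score to z-score
    z_y_predicted = r * z_x          # step 2: multiply by b = r
    return mean_y + z_y_predicted * s_y  # step 3: z-score back to raw score

# Hypothetical data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
prediction = predict_via_z(5, xs, ys)
print(f"Predicted Y for X = 5: {prediction:.2f}")
```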
Squared Correlation Coefficient • r² – the square of the correlation coefficient • Also called the coefficient of determination • Measures the proportion of variance of one variable predictable from its relationship with the other variable. • It is the variance of the errors from repeatedly predicting the mean, minus the error variance using least squares, expressed as a proportion of the former.
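A short sketch of that last point, using hypothetical data: the error from always predicting the mean (SSy) minus the error left over after least squares prediction, divided by SSy, gives the same number as squaring r.

```python
import math

# Hypothetical data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
ss_x = sum((x - mean_x) ** 2 for x in xs)
sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Error when the mean of Y is used as the prediction for every case.
ss_mean_error = sum((y - mean_y) ** 2 for y in ys)

# Error when the least squares line is used instead.
b = sp_xy / ss_x
a = mean_y - b * mean_x
ss_line_error = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))

proportion_predictable = (ss_mean_error - ss_line_error) / ss_mean_error
r = sp_xy / math.sqrt(ss_x * ss_mean_error)

print(f"proportion predictable = {proportion_predictable:.3f}, r squared = {r ** 2:.3f}")
```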
Interpretation of r² • r² – not r – is the true measure of strength of association and the proportion of a perfect relationship. • Large values of r² are unusual in behavioral research. • Large values of r² do not indicate causation. • “Explained variance” refers to predictability not causality.
Regression Toward the Mean • The mean is a statistical default – use the mean to predict when r is 0 or unknown. • Smaller values of r move the prediction toward the mean. • The smaller r is, the greater the predictive error, so the prediction is hedged by moving it toward the mean. • Chance results in a regression to the mean with repeated measures.
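How r pulls a prediction toward the mean can be sketched with the z-score form of the regression equation. The mean of 100 and standard deviation of 15 are hypothetical values, and both variables are assumed to share them to keep the example simple.

```python
def predict_with_r(score, r, mean=100.0, sd=15.0):
    """Predict a second score from a first score, given correlation r.

    Assumes (hypothetically) that both variables share the same mean and SD.
    """
    z = (score - mean) / sd       # raw score to z-score
    return mean + r * z * sd      # predicted z = r * z, converted back to a raw score

# A score two standard deviations above the mean, predicted under three values of r.
predictions = {r: predict_with_r(130.0, r) for r in (1.0, 0.5, 0.0)}
for r, predicted in predictions.items():
    print(f"r = {r}: predicted score = {predicted}")
```

With r = 1 the prediction stays at 130; with r = 0.5 it moves halfway back to the mean (115); with r = 0 it is the mean itself (100).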
Regression Fallacy • The statistical regression of extreme values toward the mean occurs due to chance. • Israeli pilots praised for landings do worse on next landing. • It is a mistake (fallacy) to interpret this regression as a real effect. • Praise did not cause the change in landings.
Testing for Regression Fallacy • Divide the group showing regression into two groups: (1) manipulation, (2) control without manipulation. • Underachievers could show improvement simply due to regression upward to the mean. • Always include a control group to check for regression to the mean.
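Why the control group matters can be illustrated with a small simulation. Everything here is an assumption made for the sketch: scores are modeled as a stable ability plus chance, the lowest 20% on a first test are treated as the underachievers, and the "manipulation" is given no real effect at all.

```python
import random

random.seed(307)  # arbitrary seed so the run is repeatable

n = 20000
ability = [random.gauss(0, 1) for _ in range(n)]       # stable component
test1 = [a + random.gauss(0, 1) for a in ability]      # ability plus chance
test2 = [a + random.gauss(0, 1) for a in ability]      # same ability, new chance

# Underachievers: the lowest 20% on the first test.
order = sorted(range(n), key=lambda i: test1[i])
low_scorers = order[: n // 5]

# Randomly split them into a manipulation group and a control group.
random.shuffle(low_scorers)
half = len(low_scorers) // 2
treated, control = low_scorers[:half], low_scorers[half:]

def mean_gain(group):
    """Average change from test 1 to test 2 for the given people."""
    return sum(test2[i] - test1[i] for i in group) / len(group)

# The manipulation has no effect in this simulation, so any gain is regression.
treated_gain = mean_gain(treated)
control_gain = mean_gain(control)
print(f"manipulation group gain: {treated_gain:.2f}")
print(f"control group gain:      {control_gain:.2f}")
```

Both groups improve by about the same amount even though nothing was done to either, so an improvement in the manipulation group alone would not show that the manipulation worked. Only a gain beyond the control group's gain would.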