Statistics for the Social Sciences. Prediction with multiple variables. Psychology 340 Spring 2010. Outline. Multiple regression Comparing models, Delta r 2 Using SPSS. Multiple Regression. Typically researchers are interested in predicting with more than one explanatory variable
Statistics for the Social Sciences Prediction with multiple variables Psychology 340 Spring 2010
Outline • Multiple regression • Comparing models, Δr² • Using SPSS
Multiple Regression • Typically researchers are interested in predicting with more than one explanatory variable • In multiple regression, an additional predictor variable (or set of variables) is used to predict the residuals left over from the first predictor.
Multiple Regression • Bi-variate regression prediction models Y = intercept + slope (X) + error
Multiple Regression • Bi-variate regression prediction models: Y = intercept + slope (X) + error, where intercept + slope (X) is the “fit” and the error is the “residual” • Multiple regression prediction models
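A minimal sketch of the bi-variate model, splitting each observed score into "fit" and "residual". The data (`study_hours`, `test_scores`) and the helper `fit_line` are hypothetical, made up only for illustration; the slope and intercept come from the usual least-squares formulas.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for Y = intercept + slope * X + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of X around their means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: hours studied and test score for five students
study_hours = [1, 2, 3, 4, 5]
test_scores = [52, 60, 61, 70, 77]

intercept, slope = fit_line(study_hours, test_scores)

# "fit" is the predicted score; "residual" is what the line does not explain
fit = [intercept + slope * x for x in study_hours]
residual = [y - f for y, f in zip(test_scores, fit)]

print("intercept:", intercept, "slope:", slope)
print("fit:", fit)
print("residual:", residual)
```

With a least-squares line that includes an intercept, the residuals sum to zero, which is a quick check that the fit was computed correctly.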
Multiple Regression • Multiple regression prediction models • Y is predicted from a first, second, third, and fourth explanatory variable, plus whatever variability is left over (the residual)
Multiple Regression • Predict test performance based on: • Study time (first explanatory variable) • Test time (second explanatory variable) • What you eat for breakfast (third explanatory variable) • Hours of sleep (fourth explanatory variable) • Plus whatever variability is left over
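As a rough illustration of the multiple regression prediction model, the sketch below predicts one student's test score from four explanatory variables. The intercept, the slopes, the variable names, and the student's values are all hypothetical numbers chosen for easy arithmetic, not estimates from any real data.

```python
def predict_score(intercept, slopes, values):
    """Predicted Y = intercept + sum of slope * X over all explanatory variables."""
    return intercept + sum(slopes[name] * values[name] for name in slopes)

# Hypothetical coefficients (one slope per explanatory variable)
intercept = 30.0
slopes = {
    "study_hours": 4.0,
    "test_minutes": 0.25,
    "breakfast_calories": 0.0,  # unrelated predictor, so slope of zero
    "sleep_hours": 2.0,
}

# Hypothetical student
student = {
    "study_hours": 5,
    "test_minutes": 40,
    "breakfast_calories": 400,
    "sleep_hours": 7,
}
observed_score = 78.0

predicted_score = predict_score(intercept, slopes, student)
# The residual is whatever variability is left over after all four predictors
residual = observed_score - predicted_score

print("predicted:", predicted_score, "residual:", residual)
```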
Multiple Regression • Predict test performance based on: • Study time • Test time • What you eat for breakfast • Hours of sleep • Typically your analysis consists of testing multiple regression models to see which “fits” best (comparing the r² values of the models) • For example: [figure: candidate models compared, one versus another]
Multiple Regression • Model #1: Some covariance between the two variables • Response variable: total variability in test performance • Total study time: r = .6 • If we know the total study time, we can predict 36% of the variance in test performance • R² for Model = .36 • 64% of the variance unexplained
Multiple Regression • Model #2: Add test time to the model • Little covariance between test performance and test time • We can explain more of the variance in test performance • Response variable: total variability in test performance • Total study time: r = .6 • Test time: r = .1 • R² for Model = .49 • 51% of the variance unexplained
Multiple Regression • Model #3: No covariance between test performance and breakfast food • Not related, so we can NOT explain more of the variance in test performance • Response variable: total variability in test performance • Total study time: r = .6 • Test time: r = .1 • Breakfast: r = .0 • R² for Model = .49 • 51% of the variance unexplained
Multiple Regression • Model #4: Some covariance between test performance and hours of sleep • We can explain more of the variance • But notice what happens with the overlap (covariation between explanatory variables): we can’t just add the r’s or the r²’s • Response variable: total variability in test performance • Total study time: r = .6 • Test time: r = .1 • Breakfast: r = .0 • Hours of sleep: r = .45 • R² for Model = .60 • 40% of the variance unexplained
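To see why the r²’s cannot simply be added, here is a small sketch using the standard formula for R² with exactly two predictors. It takes the study time (r = .6) and hours of sleep (r = .45) correlations from the slides; the correlation between the two predictors (`r_12 = 0.5`) is an assumption made up for illustration, since the slides do not report it.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R² for a model with two predictors, from the three pairwise correlations.

    r_y1, r_y2: correlation of each predictor with the response
    r_12: correlation between the two predictors (their overlap)
    """
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)

r_study = 0.6   # from the slides
r_sleep = 0.45  # from the slides

# No overlap between predictors: R² equals the sum of the separate r² values
r2_no_overlap = r_squared_two_predictors(r_study, r_sleep, 0.0)

# Assumed overlap of r = .5 between study time and sleep (hypothetical)
r2_with_overlap = r_squared_two_predictors(r_study, r_sleep, 0.5)

print("sum of separate r² values:", r_study**2 + r_sleep**2)
print("R² with no overlap:", r2_no_overlap)
print("R² with overlap:", r2_with_overlap)
```

Under this assumed overlap, the two predictors together explain less than the sum of what each explains alone, because part of what they explain is shared.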
Multiple Regression in SPSS • Setup as before: variables (explanatory and response) are entered into columns • A couple of different ways to use SPSS to compare different models
Regression in SPSS • Analyze: Regression, Linear
Multiple Regression in SPSS • Method 1: enter all the explanatory variables together • Enter: • Predicted (criterion) variable into Dependent Variable field • All of the predictor variables into the Independent Variable field
Multiple Regression in SPSS • The variables in the model • r for the entire model • r² for the entire model • Unstandardized coefficients • Coefficient for var1 (var name) • Coefficient for var2 (var name)
Multiple Regression in SPSS • The variables in the model • r for the entire model • r² for the entire model • Standardized coefficients • Coefficient for var1 (var name) • Coefficient for var2 (var name)
Multiple Regression • Which β to use, standardized or unstandardized? • Unstandardized β’s are easier to use if you want to predict a raw score based on raw scores (no z-scores needed). • Standardized β’s are nice to directly compare which variable is most “important” in the equation
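The two kinds of coefficients are related by a simple rescaling: standardized β = unstandardized coefficient × (SD of the predictor / SD of the response). The sketch below shows both uses. All numbers (coefficients, standard deviations, intercept, and the variable names) are hypothetical and only loosely modeled on an SAT/GPA example.

```python
def standardized_beta(b, sd_x, sd_y):
    """Convert an unstandardized coefficient to a standardized one."""
    return b * sd_x / sd_y

def predict_raw(intercept, coefficients, raw_scores):
    """Predict a raw score from raw scores using unstandardized coefficients."""
    return intercept + sum(b * x for b, x in zip(coefficients, raw_scores))

# Hypothetical unstandardized results: GPA points per SAT point
intercept = 0.2
b_math, b_verbal = 0.004, 0.001
sd_math, sd_verbal, sd_gpa = 80.0, 100.0, 0.5

# Unstandardized: predict a raw GPA directly from raw SAT scores
predicted_gpa = predict_raw(intercept, [b_math, b_verbal], [600, 550])

# Standardized: put both predictors on the same scale to compare them
beta_math = standardized_beta(b_math, sd_math, sd_gpa)
beta_verbal = standardized_beta(b_verbal, sd_verbal, sd_gpa)

print("predicted GPA:", round(predicted_gpa, 2))
print("beta math:", round(beta_math, 2), "beta verbal:", round(beta_verbal, 2))
```

In this made-up case the raw coefficients look tiny because SAT points are small units, while the standardized values make it easier to see which predictor carries more weight.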
Multiple Regression in SPSS • Method 2: enter first model, then add another variable for second model, etc. • Enter: • Predicted (criterion) variable into Dependent Variable field • First Predictor variable into the Independent Variable field • Click the Next button
Multiple Regression in SPSS • Method 2 cont: • Enter: • Second Predictor variable into the Independent Variable field • Click Statistics
Multiple Regression in SPSS • Click the ‘R squared change’ box
Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT)
Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 1 • r² for the first model • Coefficients for var1 (var name)
Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 2 • r² for the second model • Coefficients for var1 (var name) • Coefficients for var2 (var name)
Multiple Regression in SPSS • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Change statistics: is the change in r² from Model 1 to Model 2 statistically significant?
Hypothesis testing with Regression • Multiple Regression • We can test hypotheses about the overall model (the “fit” relative to the “residual”)
Multiple Regression in SPSS • Null Hypotheses • H0: University GPA is not predicted by SAT verbal or SAT Math scores • p < 0.05, so reject H0, SAT math and verbal predict University GPA
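The overall-model test compares explained variance to unexplained variance with an F ratio: F = (R² / k) / ((1 − R²) / (n − k − 1)), where k is the number of predictors and n the number of participants. The sketch below computes it. The R², n, and k values are hypothetical; the slides do not report them for the SAT/GPA example.

```python
def overall_f(r_squared, n, k):
    """F statistic and degrees of freedom for H0: no predictor explains any variance.

    r_squared: R² for the whole model
    n: number of participants
    k: number of predictors in the model
    """
    df_model = k
    df_residual = n - k - 1
    f = (r_squared / df_model) / ((1 - r_squared) / df_residual)
    return f, df_model, df_residual

# Hypothetical values: two predictors (SAT math, SAT verbal), 103 students
f_value, df1, df2 = overall_f(r_squared=0.49, n=103, k=2)

print(f"F({df1}, {df2}) = {f_value:.2f}")
```

The p-value that SPSS prints next to this F comes from the F distribution with those two degrees of freedom; the sketch stops at the F ratio itself.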
Hypothesis testing with Regression • Multiple Regression • We can test hypotheses about the overall model • We can test hypotheses about each of the explanatory variables (first, second, third, fourth) within a regression model • So it’ll tell us whether that variable is explaining a “significant” amount of the variance in the response variable
Multiple Regression in SPSS • Null Hypotheses • H0: Coefficient for var1 = 0 • p < 0.05, so reject H0, var1 is a significant predictor • H0: Coefficient for var2 = 0 • p > 0.05, so fail to reject H0, var2 is not a significant predictor
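Each coefficient test is a t test: t = coefficient / standard error, with n − k − 1 degrees of freedom. A minimal sketch follows. The coefficients, standard errors, and sample size are hypothetical, and the critical value of about 1.98 is the usual two-tailed table value for α = .05 at 100 degrees of freedom, used here in place of an exact p-value.

```python
def t_statistic(coefficient, standard_error):
    """t for H0: coefficient = 0."""
    return coefficient / standard_error

# Hypothetical model: n = 103 participants, k = 2 predictors
n, k = 103, 2
df = n - k - 1

# Hypothetical coefficients and standard errors for var1 and var2
t_var1 = t_statistic(0.004, 0.001)
t_var2 = t_statistic(0.0005, 0.001)

# Approximate two-tailed critical t for alpha = .05 with 100 df (table value)
T_CRITICAL = 1.98

var1_significant = abs(t_var1) > T_CRITICAL  # reject H0 for var1
var2_significant = abs(t_var2) > T_CRITICAL  # fail to reject H0 for var2

print(f"df = {df}, t(var1) = {t_var1:.2f}, t(var2) = {t_var2:.2f}")
print("var1 significant:", var1_significant, "| var2 significant:", var2_significant)
```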
Hypothesis testing with Regression • Multiple Regression • We can test hypotheses about the overall model • We can test hypotheses about each of the explanatory variables within a regression model • So it’ll tell us whether that variable is explaining a “significant” amount of the variance in the response variable • We can also use hypothesis testing to examine if the change in r² is statistically significant
Hypothesis testing with Regression • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 1 • r² for the first model • Coefficients for var1 (var name)
Hypothesis testing with Regression • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Model 2 • r² for the second model • Coefficients for var1 (var name) • Coefficients for var2 (var name)
Hypothesis testing with Regression • Shows the results of two models • The variables in the first model (math SAT) • The variables in the second model (math and verbal SAT) • Change statistics: is the change in r² from Model 1 to Model 2 statistically significant? • The 0.002 change in r² is not statistically significant (p = 0.46)
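The change statistic is an F ratio built from the two R² values: F change = ((R²₂ − R²₁) / number of added predictors) / ((1 − R²₂) / (n − k₂ − 1)), where k₂ is the number of predictors in the larger model. The sketch below uses a change of 0.002 like the one on the slide, but the two R² values and the sample size are assumptions, because the slides do not give them; the resulting F is therefore illustrative and is not expected to reproduce the reported p = 0.46.

```python
def f_change(r2_model1, r2_model2, n, k_model2, k_added):
    """F statistic for the change in R² when predictors are added.

    r2_model1: R² for the smaller model
    r2_model2: R² for the larger model
    n: number of participants
    k_model2: number of predictors in the larger model
    k_added: number of predictors added going from Model 1 to Model 2
    """
    df_change = k_added
    df_residual = n - k_model2 - 1
    f = ((r2_model2 - r2_model1) / df_change) / ((1 - r2_model2) / df_residual)
    return f, df_change, df_residual

# Hypothetical: Model 1 = math SAT only, Model 2 = math and verbal SAT
f_value, df1, df2 = f_change(
    r2_model1=0.400,  # assumed
    r2_model2=0.402,  # assumed, giving a change of 0.002
    n=105,            # assumed sample size
    k_model2=2,
    k_added=1,
)

print(f"F change({df1}, {df2}) = {f_value:.3f}")
```

An F change well below 1, as in this made-up case, means the added predictor explains less additional variance than would be expected by chance, which is consistent with a non-significant change.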
Regression in Research Articles • Bivariate prediction models rarely reported • Multiple regression results commonly reported
Cautions in Multiple Regression • We can use as many predictors as we wish but we should be careful not to use more predictors than is warranted. • Simpler models are more likely to generalize to other samples. • If you use as many predictors as you have participants in your study, you can predict 100% of the variance. Although this may seem like a good thing, it is unlikely that your results would generalize to any other sample and thus they are not valid. • You probably should have at least 10 participants per predictor variable (and probably should aim for about 30).
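The "100% of the variance" caution can be demonstrated with pure noise. In the sketch below, six hypothetical participants get random test scores and five random, meaningless predictors; together with the intercept that gives as many fitted coefficients as participants, and the model reproduces every score exactly. The data, the seed, and the helper `solve` (a plain Gaussian elimination) are all illustrative assumptions.

```python
import random

def solve(a, b):
    """Solve the square system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef

random.seed(0)
n_participants = 6

# Random "test scores" and random predictors with no real relationship
scores = [random.gauss(70, 10) for _ in range(n_participants)]
design = [
    [1.0] + [random.gauss(0, 1) for _ in range(n_participants - 1)]  # 1.0 = intercept
    for _ in range(n_participants)
]

coef = solve(design, scores)
predicted = [sum(c * x for c, x in zip(coef, row)) for row in design]

mean_score = sum(scores) / n_participants
ss_total = sum((y - mean_score) ** 2 for y in scores)
ss_residual = sum((y - p) ** 2 for y, p in zip(scores, predicted))
r_squared = 1 - ss_residual / ss_total

print("R² with as many coefficients as participants:", round(r_squared, 6))
```

The perfect fit here says nothing about test performance, since the predictors are random numbers; the same coefficients would predict poorly in any new sample.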