Multiple Regression • Regression • Attempts to predict one criterion variable using one predictor variable • Addresses the question: Does the predictor significantly predict the criterion?
Multiple Regression • Multiple Regression • Attempts to predict one criterion variable using 2+ predictor variables • Addresses the questions: Do the predictors significantly predict the criterion? If so, which predictor is best? • Allows for variance to be removed from one predictor prior to evaluating the rest (like ANCOVA)
Multiple Regression • How to compare the predictive value of 2+ predictors • When comparing multiple predictors within an experiment • Use standardized b (β) • β = b × (s_X / s_Y), where s_X is the SD of the predictor and s_Y is the SD of the criterion • Like a z-score, β lets you compare variables measured on different metrics by expressing each relative to its sample mean & SD
Multiple Regression • How to compare the predictive value of 2+ predictors • When comparing multiple predictors between experiments • Use b • SDs are highly variable between experiments → the SDs used to standardize b in Exp. 1 ≠ those in Exp. 2 → β’s from the two experiments are not comparable • Can’t compare the z-score of your Stats grade from this semester with your z-score if you take the class again next semester • If next semester’s class is especially dumb, you appear to have gotten much smarter
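As a rough illustration of the standardization step, the sketch below fits a two-predictor regression with NumPy and converts each b to β using β = b × (s_X / s_Y). The variable names (`hours`, `iq`, `gpa`) and the simulated data are hypothetical and not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Hypothetical predictors measured on very different scales
hours = rng.normal(10, 3, n)     # study hours per week
iq = rng.normal(100, 15, n)      # IQ points
gpa = 0.05 * hours + 0.02 * iq + rng.normal(0, 0.4, n)

# Unstandardized coefficients: [intercept, b_hours, b_iq]
X = np.column_stack([np.ones(n), hours, iq])
b = np.linalg.lstsq(X, gpa, rcond=None)[0]

# beta = b * s_X / s_Y
s_x = np.array([hours.std(ddof=1), iq.std(ddof=1)])
beta = b[1:] * s_x / gpa.std(ddof=1)


def zscore(v):
    """Express v relative to its own sample mean and SD."""
    return (v - v.mean()) / v.std(ddof=1)


# Same betas obtained by regressing z-scores on z-scores
Z = np.column_stack([zscore(hours), zscore(iq)])
beta_z = np.linalg.lstsq(Z, zscore(gpa), rcond=None)[0]

print("b:   ", b[1:])
print("beta:", beta)
```

The raw b's are not comparable because one is per hour and the other per IQ point; the β's are both in SD units.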
Multiple Regression • Magnitude of the relationship between one predictor and a criterion (b/β) in a model dependent upon the other predictors in that model • Relationship between IQ and SES (with College GPA and Parents’ SES in the model) will be different if more, less, or different predictors included in the model
Multiple Regression • When comparing the results of 2 experiments using regression, coefficients (b/β) will not be the same • Will be similar to the extent that the regression models are similar • Why not?
Multiple Regression • Coefficients (b/β) represent partial and semipartial (part) correlations, not traditional Pearson’s r • Partial Correlation – the correlation between 2 variables with the variance from one or more variables removed • I.e. correlation between the residuals of both variables, once variance from one or more covariates has been removed
Multiple Regression • Squared Partial Correlation = the proportion of the variance in the criterion left unexplained by the other covariate(s) that is associated with the predictor
Multiple Regression • Semipartial/Part Correlation – the correlation between 2 variables with the variance from one or more variables removed from the predictor only (i.e. not the criterion) • I.e. correlation between the residuals of the predictor, once variance from one or more covariates has been removed, and the criterion
Multiple Regression • Squared Part Correlation = the proportion of the total variance in the criterion that the predictor explains once variance from the covariates has been removed from the predictor • I.e. the predictor’s unique contribution to Model R² • Since the variance that is removed depends on the other predictors in the model, different models yield different regression coefficients
Squared Part Correlation = B • Squared Partial Correlation = B / (A + B), where B is the criterion variance unique to the predictor (as a proportion of total criterion variance) and A + B is the criterion variance left unexplained by the covariate(s)
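The residual-based definitions of partial and part correlation can be sketched directly. This is a minimal example on simulated data; the helper `residualize` and the variables `x`, `y`, `cov` are illustrative names, not anything from the slides.

```python
import numpy as np


def residualize(v, covariates):
    """Residuals of v after regressing it on the covariates (with intercept)."""
    C = np.column_stack([np.ones(len(v)), covariates])
    coef = np.linalg.lstsq(C, v, rcond=None)[0]
    return v - C @ coef


rng = np.random.default_rng(1)
n = 500
cov = rng.normal(size=n)                       # covariate
x = 0.6 * cov + rng.normal(size=n)             # predictor
y = 0.5 * x + 0.5 * cov + rng.normal(size=n)   # criterion

x_res = residualize(x, cov)
y_res = residualize(y, cov)

# Partial: covariate variance removed from BOTH predictor and criterion
partial = np.corrcoef(x_res, y_res)[0, 1]
# Part (semipartial): covariate variance removed from the predictor only
part = np.corrcoef(x_res, y)[0, 1]

print(f"partial r = {partial:.3f}, part r = {part:.3f}")
```

The part correlation can never be larger in absolute value than the partial correlation, because its denominator is the full criterion variance rather than only what the covariate left unexplained.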
Multiple Regression • How to compare the predictive value of 2+ predictors • Remember: Regression coefficients are very unstable from sample to sample, so interpret large differences in coefficients only (> ~.2)
Multiple Regression • Like regression, tests: • Ability of each predictor to predict the criterion variable (tests b’s/β’s) • Overall ability of the model (all predictors combined) to predict the criterion variable (Model R²) • Model R² = total % variance in criterion accounted for by predictors • Model R = correlation between the criterion and the values predicted from all predictors combined • Also can test: • If one or more predictors can predict the criterion if variance from one or more other predictors is removed • If each predictor significantly increases the Model R²
Multiple Regression • Predictors are evaluated with variance from other predictors removed • More than one way to remove this variance • Examine all predictors en masse with variance from all other predictors removed • Remove variance from one or more predictors first, then look at second set • Like in factorial ANCOVA
Multiple Regression • This is done by specifying different selection methods • Selection method = method of inputting predictors into a regression equation • Four most commonly used methods • Commonly-used = Only 4 methods offered by SPSS
Multiple Regression • Selection Methods • Simultaneous – Adds all predictors at once & is therefore the lack of a selection method • Good if there is no theory to guide which predictors should be entered first • But when does this ever happen?
Multiple Regression • Selection Methods • All Subsets – Computer finds the subset of predictors that maximizes overall Model R² • But SPSS doesn’t do it, and it finds the best subset in your particular dataset – since data, not theory, guide the selection, there is no guarantee that the model will generalize to other datasets, particularly in smaller samples
Multiple Regression • Selection Methods • Backward Elimination – Starts with all predictors in the model and iteratively eliminates the predictor with the least unique variance related to the criterion until all remaining predictors are significant • Iterative = process involving several steps • It begins with all predictors, so predictors with the least variance not overlapping with other predictors (i.e. the least left once the others are partialled out) are removed • But, also atheoretical/based on data only
Multiple Regression • Selection Methods • Forward Selection – the opposite of backward elimination – starts with the predictor most strongly related to the criterion and iteratively adds the predictor next most strongly related to the criterion until a nonsignificant predictor is found • Step 1: predictor most correlated with the criterion (P1) • Step 2: add the strongest predictor when P1 is partialled out • But also atheoretical
Multiple Regression • Selection Methods • Stepwise • Technically, any selection method that proceeds iteratively (in steps) is stepwise (i.e. both backward elimination and forward selection) • However, here it refers to the method where the order of predictors is determined in advance by the researcher based upon theory (more commonly called hierarchical or sequential regression)
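A minimal sketch of forward selection follows. To stay dependency-free it uses an F-to-enter cutoff of 4.0 as the stopping rule instead of an exact p-value; that cutoff, the function names, and the simulated data are all assumptions made for illustration, and SPSS's own entry criteria may differ.

```python
import numpy as np


def r_squared(y, X):
    """Model R^2 for y regressed on the columns of X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    resid = y - X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]
    dev = y - y.mean()
    return 1 - (resid @ resid) / (dev @ dev)


def forward_select(y, X, f_to_enter=4.0):
    """Add predictors one at a time until the best candidate's F-change is too small."""
    n, k = X.shape
    chosen, r2 = [], 0.0
    while len(chosen) < k:
        remaining = [j for j in range(k) if j not in chosen]
        # R^2 of the current model plus each candidate predictor
        trial = {j: r_squared(y, X[:, chosen + [j]]) for j in remaining}
        best = max(trial, key=trial.get)
        df_resid = n - (len(chosen) + 1) - 1
        f_change = (trial[best] - r2) / ((1 - trial[best]) / df_resid)
        if f_change < f_to_enter:
            break                      # best remaining predictor adds too little
        chosen.append(best)
        r2 = trial[best]
    return chosen, r2


rng = np.random.default_rng(2)
n = 300
X = rng.normal(size=(n, 3))            # column 2 is pure noise
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

chosen, r2 = forward_select(y, X)
print("entry order:", chosen, "model R^2:", round(r2, 3))
```

Because the entry order is driven entirely by this sample, a different sample could produce a different order, which is the generalization concern raised for all data-driven methods.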
Multiple Regression • Selection Method • Stepwise • Why would you use it? • Same reason as covariates in ANCOVA • Want to know if Measure A of treatment adherence is better than Measure B? Run stepwise regression and enter Measure B first, then Measure A with treatment outcome as the criterion. • Addresses the question: Does Measure A predict treatment outcome even when variance from Measure B has already been removed (i.e. above and beyond Measure B)?
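The Measure A vs. Measure B question can be sketched as a two-step regression that tests the change in Model R². The data are simulated and the names (`measure_a`, `measure_b`, `outcome`) are hypothetical; the F-change formula is the standard one for nested models.

```python
import numpy as np


def r_squared(y, *predictors):
    """Model R^2 for y regressed on the given predictors (intercept added)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    dev = y - y.mean()
    return 1 - (resid @ resid) / (dev @ dev)


rng = np.random.default_rng(4)
n = 300
measure_b = rng.normal(size=n)
measure_a = 0.6 * measure_b + 0.8 * rng.normal(size=n)   # overlaps with B
outcome = 0.3 * measure_b + 0.5 * measure_a + rng.normal(size=n)

r2_step1 = r_squared(outcome, measure_b)                 # Step 1: B only
r2_step2 = r_squared(outcome, measure_b, measure_a)      # Step 2: B, then A
delta_r2 = r2_step2 - r2_step1                           # A above and beyond B

# F-change for adding 1 predictor to a model that then has 2 predictors
df_resid = n - 2 - 1
f_change = (delta_r2 / 1) / ((1 - r2_step2) / df_resid)

print(f"R2 step 1 = {r2_step1:.3f}, R2 step 2 = {r2_step2:.3f}")
print(f"delta R2 = {delta_r2:.3f}, F-change(1, {df_resid}) = {f_change:.1f}")
```

The ΔR² here equals the squared part correlation of Measure A with the outcome, with Measure B removed from Measure A.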
Multiple Regression • Selection Method • Stepwise • Why would you use it? • Running a repeated-measures design and want to make sure your groups are equal on pre-test scores? Enter the pre-test into the first step of your regression.
Multiple Regression • Assumptions • Linearity of Regression • Variables linearly related to one another • Normality in Arrays • Actual values of DV normally distributed around predicted values (i.e. regression line) – AKA regression line is good approximation of population parameter • Homogeneity of Variance in Arrays • Assumes that variance of criterion is equal for all levels of predictor(s)
Multiple Regression • Issues to be aware of: • Range Restriction • Heterogeneous Subsamples • Outliers • With multiple predictors, must be aware of both univariate outliers (unusual values on one variable) as well as multivariate outliers (unusual combinations of values on two or more variables)
Multiple Regression • Outliers • Univariate outlier – a man weighing 500 lbs. • Multivariate outlier – a man who is 6’ tall and weighs 120 lbs. – Note neither value is a univariate outlier, but both together are quite odd • Three measures characterize an outlier in multiple regression: • Distance – distance from the regression line (size of the residual) • Leverage – distance from the predictor mean(s) • Influence – the combination of distance and leverage (how much the regression line changes when the case is removed)
Distance – distance from the regression line • See A • Leverage – distance from predictor mean • See B • Influence – the combination of distance and leverage
Multiple Regression • Degree of Overlap in Predictors • Adding predictors is like adding covariates in ANCOVA: In adding one that correlates too highly with others, Model R² remains nearly unchanged but df decreases, making the regression less powerful • Tolerance = 1 – R² from regressing a predictor on all the other predictors – want it to be high (i.e. want that R² to be low) • Examine bivariate correlations between predictors; if a correlation exceeds internal consistency (α), get rid of one of them
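One common way to put numbers on these three ideas is residuals for distance, hat-matrix diagonals for leverage, and Cook's D for influence. The slides do not name specific statistics, so treating Cook's D as the influence measure is an assumption here, as is the planted outlier in the simulated data.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
x = rng.normal(0, 1, n)
y = 2 * x + rng.normal(0, 1, n)
# Planted case: far from the predictor mean AND far from the line
x[-1], y[-1] = 6.0, -5.0

X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T     # hat matrix

leverage = np.diag(H)                    # leverage: distance from predictor mean
resid = y - H @ y                        # distance: residual from the fitted line
p = X.shape[1]                           # number of estimated coefficients
mse = resid @ resid / (n - p)

# Cook's D combines distance (residual) and leverage into one influence value
cooks_d = resid**2 / (p * mse) * leverage / (1 - leverage) ** 2

worst = int(np.argmax(cooks_d))
print("most influential case:", worst)
print("its leverage:", round(leverage[worst], 3))
```

A case needs both a sizeable residual and sizeable leverage to get a large Cook's D; a high-leverage point that sits on the line has little influence.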
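Tolerance can be computed by regressing each predictor on all the others and taking 1 – R². The sketch below does this with NumPy on simulated predictors, one of which (`x3`) is deliberately built to be nearly redundant with `x1`; all names are illustrative.

```python
import numpy as np


def tolerance(X):
    """Tolerance (1 - R^2) of each column of X regressed on the other columns."""
    n, k = X.shape
    tol = np.empty(k)
    for j in range(k):
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        target = X[:, j]
        resid = target - design @ np.linalg.lstsq(design, target, rcond=None)[0]
        dev = target - target.mean()
        tol[j] = (resid @ resid) / (dev @ dev)   # SS_resid / SS_total = 1 - R^2
    return tol


rng = np.random.default_rng(5)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                  # unrelated to the others
x3 = x1 + 0.1 * rng.normal(size=n)       # nearly a copy of x1

tol = tolerance(np.column_stack([x1, x2, x3]))
print("tolerance:", np.round(tol, 3))
print("VIF:      ", np.round(1 / tol, 1))
```

Low tolerance for `x1` and `x3` flags that one of them adds almost nothing the other does not already provide.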
Multiple Regression • Multiple regression can also test for more complex relationships, such as mediation and moderation • Mediation – when one variable (predictor) operates on another variable (criterion) via a third variable (mediator)
Math self-efficacy mediates the relationship between math ability and interest in a math major • Must establish paths A & B, and that path C is smaller when paths A & B are included in the model (i.e. math self-efficacy accounts for variance in interest in a math major above and beyond math ability)
Find significant correlations between the predictor and mediator (path A) and mediator and criterion (path B) • Run a stepwise regression with the predictor entered first, then the predictor and mediator entered together in step 2
Multiple Regression • The mediator should be a significant predictor of the criterion in step 2 • The predictor-criterion relationship (b/β) should decrease from step 1 to step 2 • Full mediation: If this relationship is significant in step 1, but nonsignificant in step 2 • Partial mediation: This relationship is significant in step 1, and smaller, but still significant, in step 2
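The step 1 vs. step 2 comparison can be sketched on simulated data that follow the math self-efficacy example. The effect sizes below are made up for illustration, and significance tests are omitted to keep the example short.

```python
import numpy as np


def ols(y, *predictors):
    """OLS coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    return np.linalg.lstsq(X, y, rcond=None)[0]


rng = np.random.default_rng(6)
n = 400
ability = rng.normal(size=n)                               # predictor
efficacy = 0.7 * ability + rng.normal(size=n)              # mediator
interest = 0.6 * efficacy + 0.1 * ability + rng.normal(size=n)  # criterion

a = ols(efficacy, ability)[1]               # path A: predictor -> mediator
c_total = ols(interest, ability)[1]         # step 1: predictor alone
step2 = ols(interest, ability, efficacy)    # step 2: predictor + mediator
c_direct, b_path = step2[1], step2[2]       # path C in step 2, path B

print(f"path A = {a:.3f}, path B = {b_path:.3f}")
print(f"predictor b: step 1 = {c_total:.3f}, step 2 = {c_direct:.3f}")
print(f"indirect effect A*B = {a * b_path:.3f}")
```

In ordinary least squares the drop in the predictor's coefficient from step 1 to step 2 equals the product of paths A and B, which is the quantity a mediation test evaluates.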
Multiple Regression • Partial mediation • Sobel’s test (1982): tests the statistical significance of this mediation relationship • Regress the mediator on the predictor (path A) and the criterion on the mediator (path B) in 2 separate regressions • Calculate sβ for paths A & B, where sβ = β/t • Calculate a t-statistic, where df = n – 3 and
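The test statistic itself is not preserved on this slide. A widely used first-order form of Sobel's statistic divides the product of the two path coefficients by its approximate standard error; the sketch below implements that form as an assumption, and the original slide may have used a variant with an additional term. The input values are hypothetical.

```python
import math


def sobel_statistic(a, s_a, b, s_b):
    """First-order Sobel statistic: a*b / sqrt(b^2 * s_a^2 + a^2 * s_b^2).

    a, b     : coefficients for path A and path B
    s_a, s_b : their standard errors (coefficient / t, as on the slide)
    """
    se_ab = math.sqrt(b**2 * s_a**2 + a**2 * s_b**2)
    return (a * b) / se_ab


# Hypothetical coefficients and standard errors, for illustration only
stat = sobel_statistic(a=0.5, s_a=0.1, b=0.4, s_b=0.1)
print(round(stat, 3))
```

The result is compared against the reference distribution named on the slide (t with df = n – 3); many sources instead treat it as a z-statistic.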
Multiple Regression • Multiple regression can also test for more complex relationships, such as mediation and moderation • Moderation (in regression) – when the strength of a predictor-criterion relationship changes as a function of a third variable (moderator) • Interaction (in ANOVA) – when the strength of the relationship between an IV and DV changes as a function of the levels of another IV
Multiple Regression • Moderation • Unlike in ANOVA, you have to create the interaction term yourself by multiplying the predictor and moderator • In SPSS, go to Transform → Compute • Typical to enter the predictor and moderator in the first step of a regression and the interaction term in the second step to determine the contribution of the interaction above and beyond the main effect terms • Just like how variance is partitioned in a factorial ANOVA
Logistic Regression • Logistic Regression = used to predict a dichotomous criterion variable (only 2 levels) with 1+ continuous or discrete predictors • Can’t use linear regression with a dichotomous criterion because: • Dichotomous = the criterion can’t be normally distributed (i.e. assumption of normality in arrays is violated)
Can’t use linear regression with a dichotomous criterion because: • Regression line fits data more poorly when predictor = 0 (i.e. assumption of homogeneity of variance in arrays is violated)
Logistic Regression • Logistic Regression • Interpreting coefficients • In logistic regression, b represents the change in the log odds of the criterion with a one-point increase in the predictor • Compute e^b to convert b to an odds ratio – b = -.0812 → e^(-.0812) = .9220
Logistic Regression • Logistic Regression • Interpreting coefficients • Continuous predictor: A one-pt. increase in the predictor multiplies the odds of the criterion by .922, i.e. decreases the odds (because b is neg.) by about 7.8% • Dichotomous predictor: e^b is the odds of the criterion in one group relative to the other group (sign of b indicates increase or decrease)
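Outside SPSS, the same two-step moderation analysis can be sketched with NumPy. Mean-centering the predictor and moderator before multiplying is a common practice that the slides do not mention, so it is an assumption here, as are the simulated data and variable names.

```python
import numpy as np


def fit(y, *predictors):
    """Return (coefficients, R^2) for y regressed on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    dev = y - y.mean()
    return coef, 1 - (resid @ resid) / (dev @ dev)


rng = np.random.default_rng(7)
n = 400
predictor = rng.normal(size=n)
moderator = rng.normal(size=n)
criterion = (0.4 * predictor + 0.3 * moderator
             + 0.5 * predictor * moderator + rng.normal(size=n))

# Center, then multiply to build the interaction term by hand
pc = predictor - predictor.mean()
mc = moderator - moderator.mean()
interaction = pc * mc

_, r2_step1 = fit(criterion, pc, mc)                  # step 1: main effects
coef2, r2_step2 = fit(criterion, pc, mc, interaction) # step 2: add interaction
b_interaction = coef2[3]

print(f"R2 step 1 = {r2_step1:.3f}, R2 step 2 = {r2_step2:.3f}")
print(f"interaction b = {b_interaction:.3f}")
```

The increase in R² from step 1 to step 2 is the contribution of the interaction above and beyond the main effects.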
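The conversion from b to an odds ratio is a single exponentiation. The snippet below uses the b = -.0812 from the slide; the variable names are illustrative.

```python
import math

b = -0.0812                       # logistic regression coefficient (log odds)

odds_ratio = math.exp(b)          # multiplicative change in odds per 1-point increase
pct_change = (odds_ratio - 1) * 100

print(f"odds ratio = {odds_ratio:.4f}")
print(f"change in odds per point = {pct_change:.1f}%")

# Effects multiply across points: a 10-point increase scales the odds by OR**10
ten_point = odds_ratio ** 10
print(f"odds ratio for a 10-point increase = {ten_point:.3f}")
```

An odds ratio below 1 means the odds fall as the predictor rises, and an odds ratio of exactly 1 (b = 0) means no change.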