15. Multiple Regression

• How do we actually request the regressions in SPSS?
• How do we use regression to explicate a bivariate relationship with a third variable?
• What do we look for once we have run the relevant regressions?

Example of Simple and Multiple Regression

DV (Effect) IV (Cause)

SPSS Output: Part 1: First Part Shown
• R: the Multiple R (0.49 here).
• R Square: the percent of variance explained (0.49 × 0.49 = 0.24, or about 24%).
• Adjusted R Square: corrects for small n.
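The arithmetic behind these figures can be checked in a few lines. The sketch below, in Python, squares the Multiple R of 0.49 from the slide. The adjusted value uses the standard adjusted R-squared formula, but the sample size `n` and the number of predictors `k` are hypothetical, since the slide does not report them.

```python
def r_squared(multiple_r):
    """Share of variance explained: the square of Multiple R."""
    return multiple_r ** 2


def adjusted_r_squared(r2, n, k):
    """Standard adjustment for sample size n and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


r2 = r_squared(0.49)                      # 0.49 is the Multiple R on the slide
adj = adjusted_r_squared(r2, n=40, k=1)   # n=40 and k=1 are assumed values

print(round(r2, 4))
print(round(adj, 4))
```

The adjustment always pulls the estimate down a little, and the pull is stronger when n is small relative to the number of predictors.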

SPSS Output: Part 2: ANOVA. We'll ignore this part.

SPSS Output: Part 3: The Coefficients. Almost all of this is important. Here we show one independent variable.

SPSS Output: Part 3(i): The Coefficients - B
• B is shown for each independent variable and the constant.
• B for books is the increase in grade when you read one more book.
• The constant is the estimated grade when you read no (0) books.

Prediction Equation • Estimating the DV • OR:
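In general form, the prediction equation is: predicted DV = B × IV + constant. The sketch below, in Python, fits B and the constant by ordinary least squares and then uses them to predict. The `books` and `grades` values are made up for illustration; they are not the data behind the SPSS output.

```python
def fit_simple_regression(x, y):
    """Return (b, constant) for the least-squares line y = constant + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    constant = mean_y - b * mean_x
    return b, constant


def predict(b, constant, x):
    """Apply the prediction equation to one value of the IV."""
    return constant + b * x


# Hypothetical data: books read and grade earned for five students.
books = [0, 1, 2, 3, 4]
grades = [45, 50, 62, 68, 75]

b, constant = fit_simple_regression(books, grades)
print(f"grade = {b:.1f} * books + {constant:.1f}")
print(predict(b, constant, 2))
```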

Add a Line
[Figure: scatterplot with the fitted line; vertical axis marked 20 to 80, horizontal axis marked 0 to 4.]
Here we can draw the line for the equation. These are the predicted values, or the best-fit line.

SPSS Output: Part 3: The Coefficients. Sig. tests the null hypothesis that B is equal to 0. This is a two-tailed test. For a directional hypothesis, divide by 2 to get the sig. level. Two-tailed: the B for BOOKS is significant at the .001 level. If there were no relationship between BOOKS and grades, we would observe a B this large (positive or negative) only about 1 time in 1,000.
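The divide-by-2 rule can be written as a small helper. This is a sketch in Python, not anything SPSS provides: the function name and arguments are hypothetical. It also covers the case the slide leaves implicit, namely that halving only applies when the estimated B has the sign the directional hypothesis predicted.

```python
def one_tailed_sig(two_tailed_sig, b, expected_sign):
    """Convert a two-tailed Sig. value to a one-tailed one.

    expected_sign is +1 if the hypothesis predicts a positive B, -1 if negative.
    """
    if b * expected_sign > 0:
        # B points in the hypothesized direction: halve the two-tailed value.
        return two_tailed_sig / 2
    # B points the wrong way: the one-tailed test cannot reject the null.
    return 1 - two_tailed_sig / 2


# Hypothetical B of 5.0 with a two-tailed Sig. of .001, positive effect expected.
print(one_tailed_sig(0.001, b=5.0, expected_sign=+1))
```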

Most of the previous 8 slides were adapted from Jeremy Miles's notes online.
• Now let's look at explicating a bivariate relationship with a third variable.

Explicating a bivariate relationship with a third variable
A misspecified relationship is when the magnitude or direction of the relationship you observe between a and b is not due to a causing b, but to c partly or wholly causing both a and b. When you control for c, the relationship between a and b changes in magnitude or direction.

Suppose we hypothesize that respondents' affect for Clinton (thermometer score) causes their affect for Gore (thermometer score).
• But we wish to consider the alternative explanation that partisanship is a cause of both. By ignoring the effect of partisanship on both, we can overestimate the effect of feelings towards Clinton on feelings towards Gore.

[Diagram: bivariate model, C → G (++). Here we might find, controlling for P: P → C (+), P → G (+), C → G (+).]
Here we would have overestimated the impact of C on G. C does cause G, but controlling for P we realize the effect is less than we initially thought.

So yes, we did overestimate the effect of Clinton on Gore's thermometer score, but the effect of Clinton on Gore is still quite substantial, and statistically significant at the .01 level.
• The coefficient on Clinton is reduced from .689 to .560.
• The first equation, G = .689 C + 17.489, becomes G = .560 C - 8.575 P + 40.952.
• Note: what assumption was I making about party ID to have included it in this equation when I used party3 (R=3, I=2, D=1)?
• What would you predict G to be for a Dem who rated Clinton at 60?

G = .560 C - 8.575 P + 40.952.
• What would you predict G to be for a Dem (P=1) who rated Clinton at 60?
• G = .560 * 60 - 8.575 * 1 + 40.952.
• G = 66
• For an Independent, G = 57
• For a Republican, G = 49
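The three predictions can be reproduced directly from the equation. The sketch below, in Python, uses the coefficients given above; the function name `predict_gore` is only an illustrative choice.

```python
def predict_gore(clinton, party):
    """G = .560 C - 8.575 P + 40.952, with party coded D=1, I=2, R=3."""
    return 0.560 * clinton - 8.575 * party + 40.952


# Predicted Gore thermometer score for a respondent who rated Clinton at 60.
for label, party in [("Democrat", 1), ("Independent", 2), ("Republican", 3)]:
    print(label, round(predict_gore(60, party)))
```

Because P enters as a single variable coded 1, 2, 3, each step from Democrat to Independent to Republican lowers the prediction by the same 8.575 points.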

[Diagram: Party (P) → Clinton (C) → Gore (G).]
Now we might also have started by examining the effect of partisanship on Gore's thermometer score and then asking whether Clinton's score was an intervening variable. P causes G. All or some of the way P causes G is through C.

Most, but not all, of the impact of party on Gore's thermometer score is due to Clinton's score. Perception of Clinton mostly explains the way in which party affects perception of Gore.
• Remember, party is still the cause; we are looking at the mechanism.

Now there is a danger that there is a reciprocal relationship. Perhaps perception of Gore also causes perception of Clinton. We are assuming that perception of Clinton is more important and dominant in this relationship. A simple correlation doesn't give us the answer; we are making an assumption. Thus we don't think this: C ↔ G, but rather this: C → G.

3D Relationship

3D Linear Relationship

Multiple Causes (Enhancement): Two variables may be causes of a third variable, while the two are unrelated to each other.
[Diagram: E → W (+); C → W; E and C unrelated (0).]
Turning to the legislative data set: Suppose we think that states with higher levels of average education are more likely to elect women to the state legislature, either because more women are likely to run or because electorates are more likely to vote for the ones that do. Suppose you also hypothesize that women are more likely to be elected to lower rather than upper chambers.
E = % college educated in state; C = chamber (2 = upper, 1 = lower); W = % women in chamber.

Now let's look at the correlations among these three variables.

[Diagram: bivariate model, P → W (0). Controlling for S: S → P (-), S → W (-), P → W (-).]
Now let's look at a misspecified relationship. Here we would have thought that professionalization (P) had no effect on the percent of women in the chamber (W). But when we control for South (S), we see that there may be an effect of professionalization that was concealed by the relationship between Southern region and both P and W.

First I computed a variable for southern state:
compute south=0.
if (state eq 'AL' or state eq 'AR' or state eq 'FL' or state eq 'GA' or state eq 'KY' or state eq 'LA' or state eq 'MS' or state eq 'NC' or state eq 'OK' or state eq 'SC' or state eq 'TN' or state eq 'TX' or state eq 'VA') south=1.
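For readers working outside SPSS, the same dummy variable can be sketched in Python. This is an illustrative equivalent, not part of the original analysis; the names `SOUTHERN_STATES` and `south_dummy` are hypothetical, and the list of states is copied from the syntax above.

```python
# The 13 state codes treated as Southern in the SPSS syntax above.
SOUTHERN_STATES = {
    "AL", "AR", "FL", "GA", "KY", "LA", "MS",
    "NC", "OK", "SC", "TN", "TX", "VA",
}


def south_dummy(state):
    """Return 1 for a Southern state, 0 otherwise (mirrors compute/if)."""
    return 1 if state in SOUTHERN_STATES else 0


# Example with a few state codes.
for state in ["TX", "NY", "GA", "CA"]:
    print(state, south_dummy(state))
```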
