Psych 5510/6510 Chapter 10. Interactions and Polynomial Regression: Models with Products of Continuous Predictors Spring, 2009
Broadening the Scope So far we have been limiting our models by ignoring the possibility that the predictor variables might interact, and by using only straight lines for our regression (i.e. ‘linear’ regression). This chapter provides an approach that allows us to add both the interaction of variables and nonlinear regression to our models.
Our ‘Running’ Example Throughout this chapter we will be working with the following example: Y is the time (in minutes) taken to run a 5-kilometer race. X1 is the age of the runner X2 is how many miles per week the runner ran when in training for the race.
‘On Your Mark’ We will begin by taking another perspective of what we have been doing so far in the text, and then use that perspective to understand interactions and nonlinear regression.
Time and Age The analysis of the data leads to the following ‘simple’ relationship between Time (Y) and Age (X1). MODEL C: Ŷi=β0 MODEL A: Ŷi=β0+β1X1i Ŷi=15.104 + .213X1i, PRE=.218, F*=21.7, p<.01
Time and Miles The simple relationship between Time (Y) and Miles of Training (X2). MODEL C: Ŷi=β0 MODEL A: Ŷi=β0+β2X2i Ŷi=31.91 - .280X2i, PRE=.535, F*=89.6, p<.01
Both Predictors Now regress Y on both Age (X1) and Miles of Training (X2). MODEL C: Ŷi=β0 MODEL A: Ŷi=β0 +β1X1i+β2X2i Ŷi=24.716 + 1.65X1i - .258X2i, PRE=.662, F*=75.55, p<.01
‘Get Set’ Now we will develop another way to think about multiple regression, one that re-expresses multiple regression in the form of a simple regression. We will start with the Age (X1). The simple regression of Y on X1 has this form: Ŷi=(intercept) + (slope)X1i
The multiple regression model is: Ŷi=24.716 + 1.65X1i - .258X2i We can make the multiple regression model fit the simple regression form: Ŷi= (intercept) + (slope)X1i Ŷi= (24.716 - .258X2i) + (1.65)X1i When X2=10, then Ŷi= (22.136) + (1.65)X1i When X2=30, then Ŷi= (16.976) + (1.65)X1i From this it is clear that the value of X2 can be thought of as changing the intercept of the simple regression of Y on X1, without changing its slope.
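The re-expression above can be sketched numerically. This is a minimal illustration using the coefficient estimates from the slides (b0=24.716, b1=1.65, b2=-.258); the function names are mine, not from the text:

```python
# Additive model from the slides: Yhat = b0 + b1*X1 + b2*X2.
# Re-expressed as a simple regression in X1: the value of X2 shifts the
# intercept but never the slope.

def simple_regression_in_x1(x2, b0=24.716, b1=1.65, b2=-0.258):
    """Return (intercept, slope) of the regression of Y on X1 at a fixed X2."""
    return (b0 + b2 * x2, b1)  # intercept depends on X2; slope is always b1

print(simple_regression_in_x1(10))  # intercept 22.136, slope 1.65
print(simple_regression_in_x1(30))  # intercept 16.976, same slope 1.65
```

Evaluating at X2=10 and X2=30 reproduces the two lines on the slide: the intercept moves, the slope stays at 1.65.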
The simple relationship of Time (Y) and Age (X1) at various levels of Training Miles (X2)
Of course we can also work the other direction, and change the multiple regression formula to examine the simple regression of Time (Y) on Miles of Training (X2)
The multiple regression model is: Ŷi=24.716 + 1.65X1i - .258X2i We can make the multiple regression model fit the simple regression form: Ŷi= (intercept) + (slope)X2i Ŷi= (24.716 +1.65 X1i) + (-.258)X2i When X1=20, then Ŷi= (57.716) + (-.258)X2i When X1=60, then Ŷi= (123.72) + (-.258)X2i From this it is clear that the value of X1 can be thought of as changing the intercept of the simple regression of Y on X2, without changing its slope.
The simple relationship of Time (Y) and Training Miles (X2) at various levels of Age (X1)
Additive Model When we look at these simplified models it is clear that the effect of one variable gets added to the effect of the other, moving the line up or down the Y axis but not changing the slope. This is known as the ‘additive model’.
Interactions Between Predictor Variables Let’s take a look at a non-additive model. In this case, we raise the possibility that the relationship between age (X1) and time (Y) may differ across levels of the other predictor variable miles of training (X2). To say that the relationship between X1 and Y may differ across levels of X2 is to say that the slope of the regression line of Y on X1 may differ across levels of X2.
Non-Additive Relationship Between X1 and X2 The slope of the relationship between age and time is less for runners who trained a lot than for those who trained less.
Interaction=Non-Additive Predictor variables interact when the value of one variable influences the relationship (i.e. slope) between the other predictor variables and Y.
Interaction & Redundancy Whether or not there is an interaction between two variables in predicting a third is an issue that is totally independent of whether or not the two predictor variables are redundant with each other. Expunge from your mind any connection between these two issues (if it was there in the first place).
Adding Interaction to the Model To add an interaction between variables to the model, simply add a new variable that is the product of the other two (i.e. create a new variable whose values are the score on X1 times the score on X2), then do a linear regression on that new model: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) Ŷi=19.20 + .302X1i - .076X2i - .005(X1iX2i)
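Creating the product variable is just elementwise multiplication. A sketch with made-up data points (the ages and miles below are illustrative, not the chapter's data set), using the estimated coefficients from the slide:

```python
# Build the interaction predictor: a new column whose values are X1 * X2.
# The product then enters the regression like any other predictor.

ages = [25, 40, 60]
miles = [30, 20, 10]
interaction = [a * m for a, m in zip(ages, miles)]  # elementwise product

def predict(x1, x2, b0=19.20, b1=0.302, b2=-0.076, b3=-0.005):
    """Interactive model: Yhat = b0 + b1*X1 + b2*X2 + b3*(X1*X2)."""
    return b0 + b1 * x1 + b2 * x2 + b3 * (x1 * x2)

print(interaction)            # the product scores for the three runners
print(predict(40, 20))        # predicted time for a 40-year-old training 20 mi/wk
```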
Testing Significance of the Interaction Test significance as you always do using the model comparison approach. First, to test the overall model that includes the interaction term: Model C: Ŷi=β0 Model A: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) H0: β1 = β2 = β3 =0 HA: at least one of those betas is not zero.
Testing Significance of the Interaction Second, to test whether adding the interaction term is worthwhile compared to a purely additive model: Model C: Ŷi= β0 +β1X1i+β2X2i Model A: Ŷi= β0 +β1X1i+β2X2i +β3(X1iX2i) H0: β3=0 HA: β3 ≠ 0 The test of the partial regression coefficient gives you: PRE=.055, PC=3, PA=4, F*=4.4, p=.039
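The model-comparison F* can be computed directly from PRE and the parameter counts. A sketch; n=80 is an assumption on my part (it is consistent with the reported F* but is not stated on this slide):

```python
# Model-comparison F statistic from PRE:
#   F* = [PRE / (PA - PC)] / [(1 - PRE) / (n - PA)]

def pre_to_F(pre, pc, pa, n):
    """F* for comparing Model C (pc parameters) to Model A (pa parameters)."""
    return (pre / (pa - pc)) / ((1 - pre) / (n - pa))

# Interaction test from the slide: PRE=.055, PC=3, PA=4, assumed n=80.
print(round(pre_to_F(0.055, pc=3, pa=4, n=80), 1))  # about 4.4
```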
Understanding the Interaction of Predictor Variables To develop an understanding of the interaction of predictor variables, we will once again take the full model: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) And translate it into the form of the simple relationship of one predictor variable (X1) and Y: Ŷi=(intercept) + (slope)X1i
‘Go’ Full model: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) = β0 +β2X2i +β1X1i +β3(X1iX2i) = β0 +β2X2i + (β1+β3X2i)X1i Simple relationship of Y (time) and X1 (age): Ŷi= (intercept) + (slope)X1i Ŷi= (β0 +β2X2i) + (β1+β3X2i)X1i
Simple Relationship of Y (Time) and X1 (Age) Ŷi= (intercept) + (slope)X1i Ŷi= (β0 +β2X2i) + (β1+β3X2i )X1i It is clear in examining the relationship between X1 and Y, that the value of X2 influences both the intercept and the slope of that relationship.
Simple Relationship of Time and Age (cont.) Ŷi= (intercept) + (slope)X1i Ŷi= (β0 +β2X2i) + (β1+β3X2i)X1i b0=19.20, b1=.302, b2=-.076, b3=-.005 Ŷi=(19.20 - .076X2i) + (.302 - .005X2i)X1i When X2 (i.e. miles) =10, then Ŷi=18.44 + .252X1i When X2 (i.e. miles) =50, then Ŷi=15.4 + .052X1i
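These "simple slope" computations are easy to automate. A sketch using the coefficient estimates from the slides (the function name is mine):

```python
# Simple regression of Time on Age at a fixed level of training miles.
# In the interactive model both the intercept AND the slope depend on X2:
#   intercept = b0 + b2*X2,  slope = b1 + b3*X2

def simple_in_age(x2, b0=19.20, b1=0.302, b2=-0.076, b3=-0.005):
    """Return (intercept, slope) of Time on Age when Miles = x2."""
    return (b0 + b2 * x2, b1 + b3 * x2)

print(simple_in_age(10))  # roughly (18.44, .252)
print(simple_in_age(50))  # roughly (15.4, .052): flatter line for heavy training
```

Unlike the additive case, the slope now changes with X2: .252 at 10 miles per week versus .052 at 50.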
Simple Relationship of Y (Time) and X2 (Miles) Full model: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) = Ŷi= β0 +β1X1i+(β2+β3X1i )X2i Simple relationship of Y (time) and X2 (miles): Ŷi=(intercept) + (slope)X2i Ŷi=(β0 +β1X1i) + (β2+β3X1i )X2i
Simple Relationship of Time and Miles (cont.) Ŷi= (intercept) + (slope)X2i Ŷi= (β0 +β1X1i) + (β2+β3X1i)X2i b0=19.20, b1=.302, b2=-.076, b3=-.005 Ŷi=(19.20 + .302X1i) + (-.076 - .005X1i)X2i When X1 (i.e. age) =60, then Ŷi=37.32 - .376X2i When X1 (i.e. age) =20, then Ŷi=25.24 - .176X2i
Back to the Analysis We’ve already looked at how you test to see if it is worthwhile to move from the additive model to the interactive model: Model C: Ŷi= β0 +β1X1i+β2X2i Model A: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) H0: β3=0 HA: β3 ≠ 0 The next topic involves the interpretation of the partial regression coefficients.
Interpreting Partial Regression Coefficients Ŷi= β0 +β1X1i+β2X2i Additive model: we’ve covered this in previous chapters. The values of β1 and β2 are the slopes of the regression of Y on that variable when the other variable is held constant (i.e. the slope across values of the other variable). Look back at the scatterplots for the additive model: β1 is the slope of the relationship between Y and X1 across various values of X2; note that the slope doesn’t change.
Interpreting Partial Regression Coefficients Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) Interactive model: when X1 and X2 interact, then the slope of the relationship between Y and X1 changes across values of X2 so what does β1 reflect? Answer: β1 is the slope of the relationship between Y and X1 when X2=0. Note: the slope will be different for other values of X2. Likewise: β2 is the slope of the relationship between Y and X2 when X1=0.
Interpreting β1 and β2 (cont.) So, β1 is the slope of the regression of Y on X1 when X2=0, or in other words, the slope of the regression of Time on Age for runners who trained 0 miles per week (even though none of our runners trained that little). β2 is the slope of the regression of Y on X2 when X1=0, or in other words, the slope of the regression of Time on Miles for runners who are 0 years old! This is not what we are interested in!
Better Alternative A better alternative for when scores of zero in our predictor variables are not of interest is to use mean deviation scores instead (this is called ‘centering’ our data): X’1i=(X1i - mean of X1) and X’2i=(X2i - mean of X2). Then regress Y on X’1 and X’2: Ŷi=β0 +β1X’1i+β2X’2i +β3(X’1iX’2i)
Interpreting β1 and β2 Now So, β1 is still the slope of the regression of Y on X1 when X’2=0, but now X’2=0 when X2=the mean of X2, which is much more relevant: we now have the relationship between Time and Age for runners who trained an average amount. β2 is the slope of the regression of Y on X2 when X’1=0, but now X’1=0 when X1=the mean of X1, i.e., we now have the relationship between Time and Miles for runners who were at the average age of our sample.
Interpreting β0 For the model: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) β0 is the value of Y when all the predictor scores equal zero (rarely of interest) For the model: Ŷi=β0 +β1X’1i+β2X’2i +β3(X’1iX’2i) β0 = μY (due to the use of mean deviation scores) and the confidence interval for β0 is thus the confidence interval for μY
Interpreting β3 Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) β3 represents how much the slope changes in one variable as the other variable changes by 1. It is not influenced by whether you use X1 or X’1, or X2 or X’2. So β3 would be the same in both of the following models: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) Ŷi=β0 +β1X’1i+β2X’2i +β3(X’1iX’2i) But the values of β0, β1 and β2 would be different in the two models.
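The invariance of β3 under centering can be verified numerically. A sketch with a small made-up data set (not the chapter's running data) and a hand-rolled least-squares fit, so nothing beyond the standard library is needed:

```python
# Fit Yhat = b0 + b1*X1 + b2*X2 + b3*(X1*X2) on raw and on centered
# predictors, and check that the interaction coefficient b3 is identical.

def solve(A, b):
    """Gaussian elimination with partial pivoting for small linear systems."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Least squares via the normal equations X'X b = X'y."""
    k, n = len(X[0]), len(X)
    XtX = [[sum(X[i][a] * X[i][j] for i in range(n)) for j in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    return solve(XtX, Xty)

def design(x1, x2):
    return [[1.0, a, m, a * m] for a, m in zip(x1, x2)]

# Made-up runners: age, training miles, 5K time in minutes.
age = [25, 30, 35, 40, 45, 50, 55, 60]
miles = [40, 20, 35, 15, 30, 10, 25, 5]
time = [20.1, 24.5, 21.9, 27.0, 24.2, 29.8, 26.9, 33.0]

b_raw = ols(design(age, miles), time)

m1, m2 = sum(age) / len(age), sum(miles) / len(miles)
b_cent = ols(design([a - m1 for a in age], [m - m2 for m in miles]), time)

print(round(b_raw[3], 8), round(b_cent[3], 8))  # the two b3 values agree
```

Centering is just an affine reparameterization, so the product column spans the same model space and b3 is unchanged, while b0, b1, and b2 all shift.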
Interpreting β3 (cont.) Important note: β3 represents the interaction of X1 and X2 only when both of those variables are included by themselves in the model. For example, in the following model β3 would not represent the interaction of X1 and X2 because β2X2i is not included in the model: Ŷi=β0 +β1X1i+β3(X1iX2i)
Other Transformations As we have seen, using X’=(X-mean of X) allows us to have meaningful β’s, as the partial regression coefficient is the simple relationship of the corresponding variable when the other variable equals its mean. We can use other transformations. X1i”=(X1i-50) allows us to look at the simple relationship between miles (X2) and time (Y) when age (X1)=50.
Regular model: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) Model with transformed X1i: Ŷi=β0 +β1X”1i+β2X2i +β3(X”1iX2i) Transforming the X1i score to X”1i will: • Affect the value of β2 (as it now gives the slope for the relationship between X2 and Y when X1=50). • Not affect β1 (the slope of the relationship between X1 and Y when X2=0). • Not affect β3 (the slope of the interaction term is not affected by transformations of its components as long as all components are included in the model).
Power Considerations The confidence interval formula is the same for all partial regression coefficients, whether of interactive terms or not:
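The formula itself appeared as an image on the original slide and is not reproduced here. Its usual textbook form, reconstructed from the MSE and tolerance terms discussed on the next slide (this reconstruction should be checked against the text), is:

```latex
b_j \pm \sqrt{\dfrac{F_{1,\,n-PA;\,\alpha}\;\cdot\; MSE}
{(1 - R_j^2)\;\sum_i (X_{ij}-\bar{X}_j)^2}}
```

where R²j is the redundancy of predictor j with the other predictors, so (1 - R²j) is its tolerance.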
Power Considerations Smaller confidence intervals mean more power: • Smaller MSE (i.e. error in the model) means more power. • Larger tolerance (1-R²) means more power.
Power, Transformations, and Redundancy If you use transformed scores (e.g. mean deviations), this can affect the redundancy of the interaction term with its component terms (which in turn affects the confidence intervals and thus power), but this change in redundancy is completely counterbalanced by changes in MSE. Thus using transformed scores will not affect the confidence intervals or power. So...
The Point Being... If your stat package won’t let you include an interaction term because it is too redundant with its component terms (i.e. its tolerance is too low), then you can try using mean deviation component terms (which will change the redundancy of the interaction term with its components without altering the confidence interval of the interaction term).
Polynomial (Non-linear) Regressions What we have learned about how to examine the interaction of variables also provides exactly what we need to see if there might be non-linear relationships between the predictor variables and the criterion variable (Y).
Polynomial (Non-linear) Regressions Let’s say we suspect that the relationship between Time and Miles is not the same across all levels of Miles. In other words, adding 5 more miles per week of training when you are currently at 10 miles per week will have a different effect than adding 5 more miles when you are currently training at 50 miles per week. To say that Miles+5 has a different effect when Miles=10 than when Miles=50 is to say that the slope is different at 10 than at 50.
X2 Interacting With Itself In essence we are saying that X2 is interacting with itself. Previous model: Ŷi=β0 +β1X1i+β2X2i +β3(X1iX2i) This model (ignoring X1 and using X2 twice): Ŷi=β0 +β1X2i+β2X2i +β3(X2iX2i)
Interaction Model Ŷi=β0 +β1X2i+β2X2i +β3(X2iX2i), or, Ŷi=β0 +β1X2i+β2X2i +β3(X2i²) However, we cannot calculate the b’s because the variables that go with β1 and β2 are completely redundant (they are the same variable, thus tolerance =0), so we drop one of them (which makes conceptual sense in terms of model building), and get: Ŷi=β0 +β1X2i+β2(X2i²)
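In the quadratic model the slope of Y on X2 is no longer constant: differentiating b0 + b1·X2 + b2·X2² gives b1 + 2·b2·X2. A sketch with made-up coefficients (these are illustrative, not estimates from the running data):

```python
# Instantaneous slope of the quadratic model Yhat = b0 + b1*x + b2*x**2:
#   dYhat/dx = b1 + 2*b2*x
# so the payoff of one more training mile changes with the training level.

def quadratic_slope(x, b1=-0.5, b2=0.004):
    """Slope of the fitted quadratic at training level x (made-up b's)."""
    return b1 + 2 * b2 * x

print(quadratic_slope(10))  # steep negative slope at 10 miles/week
print(quadratic_slope(50))  # much flatter slope at 50 miles/week
```

With these illustrative coefficients, each extra mile buys a lot at 10 miles per week and much less at 50, which is exactly the "X2 interacting with itself" idea above.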