Curvilinear 2: Modeling Departures from the Straight Line (Curves and Interactions)
Skill Set • How does polynomial regression test for quadratic and cubic trends? • What are orthogonal polynomials? When can they be used? • Describe an advantage of using orthogonal polynomials over simple polynomial regression. • Suppose we have one IV and we analyze this IV twice, once through linear regression and once as a categorical variable. What does the test for the difference in R-square between the two tell us? What doesn't it tell us; that is, if the result is significant, what is left to do?
More Skills • Why is collinearity likely to be a problem in using polynomial regression? • Describe the sequence of tests used to model curves in polynomial regression. • How do you model interactions of continuous variables with regression? • What is the difference between a moderator and a mediator? How do you test for the presence of each?
Nonlinear Trends in Experimental Research • Suppose we go to Busch Gardens and measure reactions to a roller coaster as a function of time. • We ask for excitement ratings (1 to 10 scale) either immediately after the ride or 5, 10, or 15 minutes afterward.
Roller Coaster Ratings Excitement as a function of time. Note that the IV is represented in two ways: (a) as a continuous IV, and (b) as a dummy-coded (3-vector) categorical IV.
SAS boxplots of roller coaster data.
Roller coaster data analysis with time as a continuous IV (SAS regression output showing R-square).
Testing for curves Compare the R-square values. Linear: .51987. Categorical: .5892. Critical value (alpha = .05) of F(2,16) = 3.63, so the difference here is n.s. If significant, the F test indicates departure from linearity, but not where or how. With M levels of the IV, there can be up to M − 2 bends (the categorical model fits a polynomial of degree M − 1).
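The comparison above can be verified by hand. A minimal stand-alone sketch in Python (the course examples use SAS; the R-square values, predictor counts, and sample size below are taken from the slide):

```python
# F test for departure from linearity: compare the categorical (cell-means)
# model, coded with 3 dummy vectors, against the 1-predictor linear model.
r2_linear, k_linear = 0.51987, 1
r2_categ,  k_categ  = 0.5892, 3
n = 20

df_num = k_categ - k_linear            # 2
df_den = n - k_categ - 1               # 16
f = ((r2_categ - r2_linear) / df_num) / ((1 - r2_categ) / df_den)
print(round(f, 2))                     # 1.35, well below the critical F(2,16) = 3.63
```

Because the observed F is below 3.63, the categorical model does not fit significantly better than the straight line.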
Review Suppose we have one IV and we analyze this IV twice, once through linear regression and once as a categorical variable. What does the test for the difference in R-square between the two tell us? What doesn't it tell us; that is, if the result is significant, what is left to do?
Orthogonal Polynomials Sometimes orthogonal polynomials can be used to analyze experimental data to test for curves. Two restrictive assumptions must be met to use orthogonal polynomials: (1) equal spacing of the IV levels, and (2) equal numbers of observations (people) in each cell (e.g., the coaster data). Orthogonal polynomials are special sets of coefficients that test for bends but remain uncorrelated with one another. This gives them an advantage in statistical power and in simplicity of interpretation.
Coaster data with orthogonal polynomial vectors. Note the pattern in the vectors: sign switches indicate bends. Orthogonal polynomial coefficients can be found in tables.
Orthogonal Polynomial Table

3 levels of X:
  Linear      -1   0   1
  Quadratic    1  -2   1

4 levels of X:
  Linear      -3  -1   1   3
  Quadratic    1  -1  -1   1
  Cubic       -1   3  -3   1

Note. Rows in the table become columns (vectors) in the data; columns in the table represent levels of the IV.
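A quick stand-alone check (stdlib Python) that the tabled contrasts really are orthogonal; the vectors are copied from the 4-level rows above:

```python
# Orthogonal polynomial contrasts for 4 equally spaced levels.
linear    = [-3, -1,  1,  3]
quadratic = [ 1, -1, -1,  1]
cubic     = [-1,  3, -3,  1]

def dot(u, v):
    """Sum of elementwise products; zero means the vectors are orthogonal."""
    return sum(a * b for a, b in zip(u, v))

# Every pair of contrasts is orthogonal, which is why the trend tests
# do not compete for the same variance.
print(dot(linear, quadratic), dot(linear, cubic), dot(quadratic, cubic))  # 0 0 0
```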
Correlations Among Vectors

          Time   Excite   Linear   Quad   Cubic
Time      1
Excite    -.72   1
Linear    1.00   -.72     1
Quad       .00    .25      .00     1
Cubic      .00    .08      .00     .00    1

These are correlations among the vectors for the coaster data. Time in minutes since leaving the coaster correlated -.72 with excitement ratings. Time correlates 1.0 with the linear trend vector. Note that the Linear, Quadratic, and Cubic vectors are uncorrelated (orthogonal).
Regression with Orthogonal Polynomials

Source      df   Estimate   Type I & III SS   F       p
Intercept        7.55
Linear       1   -.45       20.25             20.25   .0004
Quad         1    .35        2.45              2.45   .1371
Cubic        1    .05        0.25              0.25   .6239

Note that R-square for the model using orthogonal polynomials is the same as that using the dummy vectors. The F for the linear component is larger using orthogonal polynomials than it was for the linear regression because the error term is smaller: the quadratic and cubic terms absorb variance that would otherwise go to error. Orthogonal polynomials provide a powerful test of effects. They can also be used to graph results to show bends.
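The estimates and trend sums of squares in this table can be reproduced from the cell means alone. A stand-alone Python sketch, using the coaster cell means (9.2, 7.8, 6.6, 6.6, computed from the raw ratings; 5 people per cell) and the standard contrast formulas b = L/Σc² and SS = n·L²/Σc², where L = Σc·mean:

```python
# Reproduce the orthogonal-polynomial b weights and trend SS for the
# coaster data from its cell means (5 ratings per cell).
means = [9.2, 7.8, 6.6, 6.6]
n_per_cell = 5
contrasts = {
    "Linear":    [-3, -1,  1,  3],
    "Quadratic": [ 1, -1, -1,  1],
    "Cubic":     [-1,  3, -3,  1],
}

results = {}
for name, c in contrasts.items():
    L = sum(ci * mi for ci, mi in zip(c, means))  # contrast value
    ss_c = sum(ci * ci for ci in c)               # sum of squared coefficients
    b = L / ss_c                                  # b weight for the vector
    ss = n_per_cell * L**2 / ss_c                 # sum of squares for the trend
    results[name] = (round(b, 2), round(ss, 2))
    print(name, results[name])
```

The printed values match the table: -.45 and 20.25 for linear, .35 and 2.45 for quadratic, .05 and 0.25 for cubic.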
Review • How does polynomial regression test for quadratic and cubic trends? • What are orthogonal polynomials? When can they be used?
Nonlinear Relations in Nonexperimental Research • Create power terms (IV taken to successive powers) • Test for increasing numbers of bends by adding terms • Quit when adding a term does not increase variance accounted for.
Polynomials to model bends in nonexperimental research

Rating (DV)   Time   Time**2   Time**3
10             0       0         0
 9             0       0         0
10             0       0         0
 8             0       0         0
 9             0       0         0
 8             5      25       125
 7             5      25       125
 7             5      25       125
 8             5      25       125
 9             5      25       125
 7            10     100      1000
 6            10     100      1000
 8            10     100      1000
 5            10     100      1000
 7            10     100      1000
 5            15     225      3375
 6            15     225      3375
 7            15     225      3375
 7            15     225      3375
 8            15     225      3375
Correlations among terms

           Excite   Time   Time**2   Time**3
Excite     1
Time       -.72     1
Time**2    -.62     .96    1
Time**3    -.55     .91    .99       1

Note that terms with higher exponents are VERY highly correlated. There WILL be problems with collinearity. Sequence of tests: start with Time, then add Time squared. If significant, add Time cubed. Stop when adding a term doesn't help. Each power adds a bend: quadratic is one bend, cubic is two, and so forth.
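The collinearity is easy to demonstrate. A stdlib-Python sketch that builds the power terms for the coaster delay times and reproduces the correlations among them:

```python
# Build power terms for the coaster data and check their intercorrelations.
# Each of the four delay times (0, 5, 10, 15 minutes) has 5 observations.
times = [t for t in (0, 5, 10, 15) for _ in range(5)]
t2 = [t**2 for t in times]
t3 = [t**3 for t in times]

def corr(x, y):
    """Pearson correlation, stdlib only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# The higher powers are nearly collinear with Time and with each other.
print(round(corr(times, t2), 2), round(corr(times, t3), 2), round(corr(t2, t3), 2))
```

The output (.96, .91, .99) matches the correlation matrix above.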
Results of Polynomial Regression

Model                        Intercept   b1 (Time)   b2 (Time**2)   b3 (Time**3)   R2    R2 Change
1: Time                       8.90        -.18                                      .52   .52
2: Time, Time**2              9.25        -.39         .014                         .58   .06
3: Time, Time**2, Time**3     9.20        -.23        -.02            .001          .59   .01

Note that polynomial regression is a special case of hierarchical regression.
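Because this design has equal spacing and equal cell sizes, the R-square column can be reconstructed directly from the trend sums of squares in the orthogonal-polynomial table (20.25, 2.45, 0.25) and the total sum of squares of the ratings; the trend SS simply accumulate. A stand-alone sketch:

```python
# Reconstruct the hierarchical R-square column. With equal spacing and
# equal cell sizes, the orthogonal trend sums of squares add, so the
# cumulative R2 is a running ratio to the total SS of the ratings.
ratings = [10, 9, 10, 8, 9,   # time 0
           8, 7, 7, 8, 9,     # time 5
           7, 6, 8, 5, 7,     # time 10
           5, 6, 7, 7, 8]     # time 15
mean = sum(ratings) / len(ratings)               # 7.55
ss_total = sum((y - mean) ** 2 for y in ratings) # 38.95

ss_trend = [20.25, 2.45, 0.25]  # linear, quadratic, cubic (from the table)
r2_steps = []
cum = 0.0
for name, ss in zip(("Time", "+Time**2", "+Time**3"), ss_trend):
    cum += ss
    r2 = round(cum / ss_total, 2)
    r2_steps.append(r2)
    print(name, r2)
```

The running R-squares (.52, .58, .59) match the table above.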
Polynomial Results (2) Suppose it had happened that the term for time-squared had been significant. The regression equation would be Y' = 9.25 - .39X + .014X². The results, graphed:
Interpreting Weights in Polynomial Regression • All power terms for an IV work together to define the curve relating Y to X. • Do not interpret the b weights for individual polynomial terms; they change if you subtract the mean from the raw data. • To estimate 'importance,' look to the change in R-square for the block of variables that represents the IV. • Never use polynomials in a variable-selection algorithm (e.g., stepwise regression). • There is a specialized literature on nonlinear terms in path analysis and SEM (hard to do).
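The second bullet can be demonstrated algebraically with the fitted equation from the previous slide: centering Time at its mean (7.5 minutes) changes the linear weight from -.39 to -.18 while leaving every prediction unchanged. The centered coefficients below are derived by substitution, not re-estimated:

```python
# Why individual b weights in polynomial regression are not interpretable:
# re-expressing the fitted equation around the mean of Time changes the
# linear weight, but the predictions are identical.
b0, b1, b2 = 9.25, -0.39, 0.014      # raw-score equation from the slide
m = 7.5                              # mean of Time

# Substitute x = (x - m) + m and collect terms.
b1_c = b1 + 2 * b2 * m               # linear weight in the centered equation
b0_c = b0 + b1 * m + b2 * m**2       # intercept in the centered equation

raw      = lambda x: b0 + b1 * x + b2 * x**2
centered = lambda x: b0_c + b1_c * (x - m) + b2 * (x - m) ** 2

print(round(b1_c, 2))                # -0.18: the "linear" weight changed
print(all(abs(raw(x) - centered(x)) < 1e-9 for x in range(16)))  # True
```

Only the curve as a whole, and the R-square for the block, are invariant.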
Review • Describe an advantage of using orthogonal polynomials over simple polynomial regression. • Why is collinearity likely to be a problem in using polynomial regression? • Describe the sequence of tests used to model curves in polynomial regression.
Interactions • An interaction means that the ‘importance’ of one variable depends upon the value of another. • An interaction is also sometimes called a moderator, as in “Z moderates the relations between X and Y.” • In regression, we look to see if the slope relating the DV to the IV changes depending on the value of a second IV.
Example Interaction For those with low cognitive ability, there is a small correlation between creativity and productivity. As cognitive ability increases, the relation between creativity and productivity becomes stronger. The slope of productivity on creativity depends on cognitive ability.
Interaction Response Surface The slope of X1 depends on the value of X2, and vice versa. Regression fits this response surface and no other when we do the customary analysis for interactions with continuous IVs; this is more restrictive than ANOVA.
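The response-surface idea can be made concrete with simple slopes. The coefficients below are hypothetical, invented only to illustrate the form y = b0 + b1·create + b2·cog + b3·(create·cog); in that model, the slope of productivity on creativity at a given cog value is b1 + b3·cog:

```python
# Simple slopes for a continuous-by-continuous interaction.
# NOTE: these coefficients are hypothetical illustrations, not estimates
# from the slide's data.
b0, b1, b2, b3 = 10.0, 0.05, 0.2, 0.004

def slope_of_create(cog):
    """d(productivity)/d(creativity) at a given cog level: b1 + b3*cog."""
    return b1 + b3 * cog

# The slope of productivity on creativity grows with cognitive ability,
# which is what "cog moderates the creativity-productivity relation" means.
for cog in (80, 110, 140):
    print(cog, round(slope_of_create(cog), 2))
```

With a positive b3, each step up in cognitive ability steepens the creativity slope; with b3 = 0 the surface is a flat plane and there is no interaction.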
Significance Tests for Interactions • Subtract the mean from each IV (optional; this reduces collinearity between the IVs and their product). • Compute the product of the IVs. • Test the significance of the change in R-square when the interaction term(s) are added. • If the R-square change is n.s., no interaction is present. • If the R-square change is significant, find the significant interaction(s). • Graph the interaction(s).
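The first two steps can be sketched directly. Using the 20 cases from the SAS data step below, the uncentered product term correlates far more strongly with each of its components (.85 and .81 in the PROC CORR output) than the components do with each other (.38), which is why centering is often recommended:

```python
# Form the product term from the raw scores (same 20 cases as the SAS
# data step) and check its collinearity with the component IVs.
create = [40, 45, 50, 55, 60, 40, 45, 50, 55, 60,
          40, 45, 50, 55, 60, 40, 45, 50, 60, 65]
cog    = [100, 80, 90, 105, 110, 95, 100, 105, 95, 90,
          110, 115, 120, 125, 105, 110, 95, 115, 120, 140]
inter  = [a * b for a, b in zip(create, cog)]      # product term

def corr(x, y):
    """Pearson correlation, stdlib only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (sum((a - mx) ** 2 for a in x)
                  * sum((b - my) ** 2 for b in y)) ** 0.5

print(round(corr(create, inter), 2), round(corr(cog, inter), 2))
```

These values reproduce the inter row of the correlation matrix below.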
data d1;
  input person product create cog;
  inter = create*cog;
cards;
1 50 40 100
2 35 45 80
3 40 50 90
4 50 55 105
5 55 60 110
6 35 40 95
7 45 45 100
8 55 50 105
9 50 55 95
10 40 60 90
11 45 40 110
12 50 45 115
13 60 50 120
14 65 55 125
15 55 60 105
16 50 40 110
17 55 45 95
18 55 50 115
19 60 60 120
20 65 65 140
;
proc print;
proc corr;
proc glm;
  model product = create cog;
run;
proc glm;
  model product = create cog inter;
run;

Data to test for an interaction between cognitive ability and creativity on performance. The two GLM models give the R-square change when the product term inter is added.
Correlation Matrix

Pearson Correlation Coefficients, N = 20
Prob > |r| under H0: Rho=0 (p values in parentheses)

          person            product           create            cog               inter
person    1.00000
product   0.65629 (.0017)   1.00000
create    0.32531 (.1616)   0.50470 (.0232)   1.00000
cog       0.66705 (.0013)   0.83568 (<.0001)  0.38414 (.0945)   1.00000
inter     0.57538 (.0079)   0.78465 (<.0001)  0.84954 (<.0001)  0.80732 (<.0001)  1.00000
Moderator and Mediator • Moderator means interaction: the slope of the DV on one IV depends on the value of the other. Use moderated regression (test for an interaction) to test for a moderator. • Mediator means there is a causal chain of events: the mediating variable is the proximal cause of the DV, and a more distal cause changes the mediator. Use path analysis to test for mediation. In this graph, X2 is the mediator.
Review • How do you model interactions of continuous variables with regression? • What is the difference between a moderator and a mediator? How do you test for the presence of each?