1. Sociology 602 (Martin), Week 10, April 9, 2002
Adding polynomial variables to a model NKNW 7.7
Adding interaction terms to a model NKNW 7.8
Criteria for building a model
general approaches NKNW 8.1
all-possible-regressions procedures for model building NKNW 8.3
forward stepwise regression NKNW 8.4
2. Polynomial models (NKNW 7.7) Form of a polynomial model:
Yhat = b0 + b1X1 + b2X1^2 + b3X1^3 + …
When to use a polynomial model:
when you have a theoretical presumption that the response function is a polynomial function
(example: the distance an object falls as a function of time).
when the response function is complex or unknown, but it fits pretty well to a polynomial function.
(example: death rates as a function of age)
3. Second-order polynomials Second-order polynomials are the most common kind; they add only a squared term to the linear term:
Yhat = b0 + b1X1 + b2X1^2
Example: girls’ height in inches as a function of age, for ages 2-12
Yhat = 20 + 3*X1 - 0.2*X1^2
Predict the height of a girl at age 2, 5, 8, and 11.
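The prediction exercise above can be worked by substituting each age into the fitted equation. The sketch below does this; the function name `predicted_height` is only an illustrative label, not something from the slides.

```python
def predicted_height(age):
    """Predicted height in inches from the slide's equation:
    Yhat = 20 + 3*X1 - 0.2*X1^2, where X1 is age in years."""
    return 20 + 3 * age - 0.2 * age ** 2


for age in (2, 5, 8, 11):
    # Round to one decimal place to hide floating-point noise.
    print(age, round(predicted_height(age), 1))
# prints:
# 2 25.2
# 5 30.0
# 8 31.2
# 11 28.8
```

Note that the predicted height at age 11 is lower than at age 8: this fitted curve turns downward inside the 2-12 age range.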
4. Graphing second-order polynomials Second-order polynomials always reflect a response function with a single bend (a parabola).
The linear (first order) term describes the general trend as upward or downward, for values of X near 0.
The squared (second-order) term describes the curvature as upward or downward.
Sketch these examples
Yhat = 20 + 3*X1 + 0.2*X1^2
Yhat = 20 + 3*X1 - 0.2*X1^2
Yhat = 20 - 3*X1 + 0.2*X1^2
Yhat = 20 - 3*X1 - 0.2*X1^2
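As an aid to sketching, the signs of the two coefficients and the location of the turning point can be read off directly. The sketch below is one way to do this; it assumes b2 is nonzero, and the helper name `describe_quadratic` is hypothetical. The turning point of b0 + b1*X + b2*X^2 is at X = -b1 / (2*b2).

```python
def describe_quadratic(b0, b1, b2):
    """Summarize the shape of Yhat = b0 + b1*X + b2*X^2 (b2 must be nonzero)."""
    return {
        # Sign of the linear term gives the trend for X near 0.
        "trend_near_zero": "upward" if b1 > 0 else "downward",
        # Sign of the squared term gives the direction of the curvature.
        "curvature": "upward" if b2 > 0 else "downward",
        # The slope b1 + 2*b2*X is zero at the turning point.
        "turning_point_x": -b1 / (2 * b2),
    }


examples = [(20, 3, 0.2), (20, 3, -0.2), (20, -3, 0.2), (20, -3, -0.2)]
for b0, b1, b2 in examples:
    shape = describe_quadratic(b0, b1, b2)
    print(b1, b2, shape["trend_near_zero"], shape["curvature"],
          round(shape["turning_point_x"], 2))
```

For these four examples the turning point falls at X = -7.5 when the two coefficients share a sign and at X = 7.5 when their signs differ.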
5. Graphing higher-order polynomials Third-order polynomials describe response functions where the curvature changes across the range of X.
Yhat = 20 + 3*X1 - 0.2*X1^2 + 0.07*X1^3
Higher-order polynomials describe response functions where the change in the curvature itself changes across the range of X, as in bimodal distributions.
Yhat = 20 + 3*X1 - 0.2*X1^2 + 0.07*X1^3 - 0.01*X1^4
You rarely see high order polynomials in social research.
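To illustrate what "the curvature changes" means for the third-order example, the sketch below evaluates the second derivative of Yhat = 20 + 3*X1 - 0.2*X1^2 + 0.07*X1^3, which is 2*b2 + 6*b3*X1. The function and variable names are illustrative only.

```python
def cubic_curvature(x, b2=-0.2, b3=0.07):
    """Second derivative of b0 + b1*x + b2*x^2 + b3*x^3 with respect to x."""
    return 2 * b2 + 6 * b3 * x


# The curvature is zero where 2*b2 + 6*b3*x = 0, i.e. x = -b2 / (3*b3).
inflection_x = 0.2 / (3 * 0.07)

print(round(inflection_x, 3))   # roughly 0.952
print(cubic_curvature(0) < 0)   # curving downward before the inflection point
print(cubic_curvature(5) > 0)   # curving upward after it
```

A second-order polynomial has a constant second derivative (2*b2), so its curvature can never switch direction in this way.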
6. Warnings for polynomial regression Polynomial regression can be a good way to explain error related to important control variables.
However, polynomial regression creates at least four problems:
1.) It is difficult to interpret any of the coefficients related to the variable with the polynomial specification.
2.) Each “order” in a polynomial regression eats up a degree of freedom.
3.) Polynomial terms tend to be highly collinear.
4.) The model becomes highly unstable at extreme X-values; don’t even think of extrapolating beyond the observed range.
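The collinearity warning can be checked numerically. The sketch below computes the correlation between X and its powers over the integer ages 2 through 12 (the range in the height example); the choice of those eleven values and the helper name `pearson_r` are assumptions made for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


ages = list(range(2, 13))          # 2, 3, ..., 12
squared = [a ** 2 for a in ages]   # second-order term
cubed = [a ** 3 for a in ages]     # third-order term

r_linear_squared = pearson_r(ages, squared)
r_squared_cubed = pearson_r(squared, cubed)
print(round(r_linear_squared, 3), round(r_squared_cubed, 3))
```

Over this range the correlation between age and age squared is about 0.98, so the two terms carry nearly the same information.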
7. Model building with polynomial regression The standard order for adding polynomial terms to a model is to start with the first order term, add the second order term if necessary, and so on.
Never use a model that has a higher order term for X, but is missing a lower-order term for X, even if an F-test indicates otherwise.
8. Interaction regression models (NKNW 7.8) Form of an interaction model:
Yhat = b0 + b1X1 + b2X2 + b3X1X2
When to use an interaction model:
When it appears that the effect of X1 on Y varies with the value of X2.
When you wish to test whether the effect of X1 on Y varies with the value of X2.
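In the interaction model, a one-unit increase in X1 changes Yhat by b1 + b3*X2, so the effect of X1 depends on X2. The sketch below shows this with hypothetical coefficient values (b0=10, b1=2, b2=1, b3=0.5) that are not taken from the slides.

```python
def yhat(x1, x2, b0=10.0, b1=2.0, b2=1.0, b3=0.5):
    """Interaction model Yhat = b0 + b1*X1 + b2*X2 + b3*X1*X2.
    The default coefficients are hypothetical, for illustration only."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def effect_of_x1(x2, b1=2.0, b3=0.5):
    """Change in Yhat for a one-unit increase in X1, at a given X2."""
    return b1 + b3 * x2


for x2 in (0, 2, 4):
    # The difference yhat(1, x2) - yhat(0, x2) equals b1 + b3*x2.
    print(x2, yhat(1, x2) - yhat(0, x2), effect_of_x1(x2))
```

When b3 is zero the effect of X1 is the same at every value of X2, which is why a test of b3 is a test of whether the interaction exists.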
9. Dichotomous interaction terms The easiest type of interaction occurs when both X1 and X2 are coded as dichotomous (0/1) variables.
Example: what is the effect of college attendance and gender on income?
Data coding:
                      X1 (male)   X2 (college)   X1*X2
female, college           0            1         0*1 = 0
female, no college        0            0         0*0 = 0
male, college             1            1         1*1 = 1
male, no college          1            0         1*0 = 0
Note: three coefficients for four categories (which category does the intercept describe?)
10. Dichotomous interaction coefficients In the income example with X1 and X2 both dichotomous, the model is Yhat = b0 + b1X1 + b2X2 + b3X1X2. We can interpret the coefficients using a 2x2 table:
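As a sketch of how the four cells follow from the equation, substituting the 0/1 codes gives each group's predicted income. The coefficient values below (b0=20, b1=5, b2=10, b3=3) are hypothetical, chosen only to make the arithmetic visible; they are not estimates from the slides.

```python
def predicted_income(male, college, b0=20, b1=5, b2=10, b3=3):
    """Yhat = b0 + b1*X1 + b2*X2 + b3*X1*X2 with X1 = male, X2 = college.
    The default coefficients are hypothetical, for illustration only."""
    return b0 + b1 * male + b2 * college + b3 * male * college


cells = {
    "female, no college": predicted_income(0, 0),  # b0
    "male, no college": predicted_income(1, 0),    # b0 + b1
    "female, college": predicted_income(0, 1),     # b0 + b2
    "male, college": predicted_income(1, 1),       # b0 + b1 + b2 + b3
}
for group, income in cells.items():
    print(group, income)
```

With this coding the intercept b0 is the prediction for the group coded 0 on both variables (female, no college), and b3 is the amount by which the college effect for men differs from the college effect for women.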