Sociology 601 Class 28: December 8, 2009
• Homework 10
• Review
  • polynomials
  • interaction effects
• Logistic regressions
  • log odds as outcome
  • compared to linear model predicting p
  • odds ratios vs. log odds coefficients
  • inferential statistics: Wald test
  • maximum likelihood models and likelihood ratios
  • loglinear models for categorical variables: e.g., 5×5 mobility table
  • logit models for ordinal variables
  • event history models
Sociology 601 Class 28: December 8, 2009
• Homework 10 (Thursday)
• Review: interaction effects
• F-tests
  • review: for full equation
  • for partial model
• Multicollinearity
  • example: state murder rates
  • not an issue for polynomials or multiplicative interactions
• Next class: review
  • email us topics you want reviewed!
Review: Regression with Interaction effects
• Two approaches:
  • separate regressions by group (e.g., one regression for men and one for women)
  • a multiplicative interaction term in one regression
  • both approaches yield the same effect estimates
  • the multiplicative interaction term provides a significance test of the difference between groups
  • the multiplicative interaction term is less easily interpreted
• Multiplicative interaction models
  • types: categorical (e.g., gender, race) or interval (e.g., age)
  • first, main concern: is the interaction coefficient statistically significant?
  • "component" coefficients are estimates only for cases where the other component = zero
  • plotting helps
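The equivalence of the two approaches can be sketched in Python with simulated data (a minimal sketch; the simulated dataset and variable names are illustrative, not from the course):

```python
import numpy as np

# Simulated data: an outcome that depends on x, a group dummy g,
# and their interaction (all coefficients chosen arbitrarily)
rng = np.random.default_rng(0)
n = 200
g = rng.integers(0, 2, size=n)          # group dummy, e.g., 1 = women
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.5 * g + 1.5 * g * x + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients for design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Approach 1: separate regressions for each group
b_g0 = ols(np.column_stack([np.ones((g == 0).sum()), x[g == 0]]), y[g == 0])
b_g1 = ols(np.column_stack([np.ones((g == 1).sum()), x[g == 1]]), y[g == 1])

# Approach 2: one regression with a multiplicative interaction term
X = np.column_stack([np.ones(n), x, g, g * x])
b = ols(X, y)

# Same estimates either way:
# group-0 slope = main-effect slope; group-1 slope = main effect + interaction
print(b_g0[1], b[1])
print(b_g1[1], b[1] + b[3])
```

Because the interaction model with a full set of dummy and product terms is just a reparameterization of the two separate fits, the slopes agree to machine precision; the pooled model additionally gives a standard error for the interaction coefficient, which is the significance test noted above.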
Inferences: F-tests Comparing models

Comparing Regression Models (Agresti & Finlay, p. 409):

F = [ (R²c − R²r) / (k − g) ] / [ (1 − R²c) / (N − k − 1) ]

where R²c = R-square for the complete model, R²r = R-square for the reduced model, k = number of explanatory variables in the complete model, g = number of explanatory variables in the reduced model, and N = number of cases. The test has df1 = k − g and df2 = N − k − 1.
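The formula translates directly into a few lines of Python (a sketch; the function name is my own, and the F statistic is compared against a table rather than computed to a p-value, matching the slides' use of A&F Table D):

```python
def nested_f_test(r2_complete, r2_reduced, k, g, n):
    """F statistic for comparing a complete model (k predictors) with a
    nested reduced model (g predictors), both fit to the same n cases,
    using Agresti & Finlay's R-squared form of the test."""
    df1 = k - g          # number of predictors dropped from the complete model
    df2 = n - k - 1      # residual df of the complete model
    F = ((r2_complete - r2_reduced) / df1) / ((1 - r2_complete) / df2)
    return F, df1, df2   # compare F against a table (e.g., A&F Table D)
```

Note that R²c appears in both the numerator (as the improvement over R²r) and the denominator (as the unexplained share 1 − R²c); when the two models fit equally well, F = 0.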
Example: F-tests Comparing models
• Complete model: men's earnings on
  • age,
  • age squared,
  • age cubed,
  • education, and
  • a currently-married dummy.
• Reduced model: men's earnings on
  • education and
  • a currently-married dummy.
• The F-test comparing the models asks whether the age variables, as a group, have a significant relationship with earnings after controlling for education and marital status.
Example: F-tests Comparing models

Complete model: men's earnings

. regress conrinc age agesq agecu educ married if sex==1

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  5,   719) =   45.08
       Model |  1.1116e+11     5  2.2233e+10           Prob > F      =  0.0000
    Residual |  3.5461e+11   719   493199914           R-squared     =  0.2387
-------------+------------------------------           Adj R-squared =  0.2334
       Total |  4.6577e+11   724   643334846           Root MSE      =   22208

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   5627.049   8127.377     0.69   0.489    -10329.18    21583.27
       agesq |  -75.30909   210.0421    -0.36   0.720    -487.6781    337.0599
       agecu |   .1985975   1.768176     0.11   0.911    -3.272807    3.670003
        educ |   3555.331   317.9738    11.18   0.000     2931.063    4179.599
     married |   8664.627   1690.098     5.13   0.000      5346.51    11982.74
       _cons |  -127148.4   102508.3    -1.24   0.215    -328399.8    74103.01
------------------------------------------------------------------------------

Note: none of the three age coefficients is, by itself, statistically significant.
R²c = .2387; k = 5.
Example: F-tests Comparing models

Reduced model: men's earnings

. regress conrinc educ married if sex==1

      Source |       SS       df       MS              Number of obs =     725
-------------+------------------------------           F(  2,   722) =   80.20
       Model |  8.4666e+10     2  4.2333e+10           Prob > F      =  0.0000
    Residual |  3.8111e+11   722   527850916           R-squared     =  0.1818
-------------+------------------------------           Adj R-squared =  0.1795
       Total |  4.6577e+11   724   643334846           Root MSE      =   22975

------------------------------------------------------------------------------
     conrinc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   3650.611   328.1065    11.13   0.000     3006.454    4294.767
     married |   10721.42   1716.517     6.25   0.000     7351.457    14091.38
       _cons |   -16381.3   4796.807    -3.42   0.001    -25798.65   -6963.944
------------------------------------------------------------------------------

R²r = .1818; g = 2.
Inferences: F-tests Comparing models

F = [(0.2387 − 0.1818) / (5 − 2)] / [(1 − 0.2387) / (725 − 6)]
  = (0.0569 / 3) / (0.7613 / 719)
  = 17.91

df1 = 5 − 2 = 3; df2 = 725 − 6 = 719; p < .001 (Agresti & Finlay, Table D, page 673)
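As a check, the same arithmetic can be reproduced in a few lines of Python (a sketch, plugging in the R² values and counts reported in the two Stata runs above):

```python
# Values from the two Stata runs above
r2_c, k = 0.2387, 5   # complete model: age, agesq, agecu, educ, married
r2_r, g = 0.1818, 2   # reduced model: educ, married
n = 725

df1, df2 = k - g, n - k - 1                       # 3 and 719
F = ((r2_c - r2_r) / df1) / ((1 - r2_c) / df2)
print(round(F, 2))   # F statistic on (df1, df2) degrees of freedom
```

So the age terms, which are individually insignificant in the complete model, are jointly highly significant as a group.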
Multicollinearity (A&F 14.3)
• "Redundant" variables
  • large standard errors
  • loss of statistical significance
    • variable 1 is significant in Model 1
    • variable 2 is significant in Model 2
    • neither 1 nor 2 is significant in Model 3, which includes both variables
  • sometimes: strange sign on a coefficient
  • sometimes: magnitude jumps unrealistically
• The problem: not enough cases that are high on variable 1 and low on variable 2, and vice versa. If every case that is high on 1 is also high on 2, you cannot separate the two effects in this sample.
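The inflated standard errors in the first bullet can be demonstrated with simulated data (a minimal sketch; the variables and the 0.05 noise level making x2 nearly a duplicate of x1 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)   # nearly a duplicate of x1 (r close to 1)
y = 1.0 + x1 + x2 + rng.normal(size=n)

def coef_se(X, y):
    """OLS standard errors for each column of the design matrix X."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])   # residual variance
    return np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))

ones = np.ones(n)
se_alone = coef_se(np.column_stack([ones, x1]), y)[1]      # x1 by itself
se_both = coef_se(np.column_stack([ones, x1, x2]), y)[1]   # x1 alongside x2
print(se_alone, se_both)  # the x1 standard error balloons once x2 enters
```

With almost no cases that are high on x1 but low on x2, the design matrix carries very little information for separating the two effects, and the standard error of x1 is many times larger in the joint model.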
Multicollinearity: Solutions
• Choose one variable (and footnote the other)
• Get a bigger or better sample
• If both variables are alternate measures of the same concept, combine them into a scale
Multicollinearity: Not always a problem
• It matters only if you are trying to separate the effects of variable 1 and variable 2:
  • what is the effect of variable 1 holding variable 2 constant?
• Not an issue for:
  • polynomials
  • multiplicative interaction effects
Next: Review for Final
• Please email us any topics you want reviewed!