PSC 5940: Running Basic Multi-Level Models in R
Session 6, Fall 2009
Running Multilevel Models in R
• Using lmer: “linear mixed-effects in R”
• Identify a grouping variable: “state”
• levels(state) # will show the categories:
    > levels(state)
     [1] "AK" "AL" "AR" "AZ" "CA" "CO" "CT" "DC" "DE" "FL" "GA"
    [12] "HI" "IA" "ID" "IL" "IN" "KS" "KY" "LA" "MA" "MD" "ME"
    [23] "MI" "MN" "MO" "MS" "MT" "NC" "ND" "NE" "NH" "NJ" "NM"
    [34] "NV" "NY" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX"
    [45] "UT" "VA" "VT" "WA" "WI" "WV" "WY"
Texas is element #44; Oklahoma is element #37; etc.
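Counting positions by eye is error-prone, so it may help to let R find them. A minimal sketch, assuming a small made-up factor in place of the course data (the values below are illustrative only):

```r
# Hypothetical stand-in for the course data: a factor of state abbreviations.
state <- factor(c("TX", "OK", "CA", "TX", "AK", "OK"))

levels(state)                                   # categories, sorted alphabetically

tx_pos  <- which(levels(state) == "TX")         # position of Texas among the levels
both    <- match(c("TX", "OK"), levels(state))  # positions of several states at once

tx_pos
both
```

With the full 51-level factor shown above, the same calls would be expected to return 44 for "TX" and 37 for "OK".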
Running Multilevel Models in R
• Re-name some variables for analysis
• income<-e130e_co
• educ<-e2b_edu
• Run a simple linear model for comparison:
• OLS1<-lm(income ~ educ)
    lm(formula = income ~ educ)

    Residuals:
        Min      1Q  Median      3Q     Max
    -9.2963 -2.5845 -0.5845  1.4600 16.5934

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  2.05071    0.27953   7.336 3.58e-13 ***
    educ         1.17794    0.07544  15.613  < 2e-16 ***
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 3.704 on 1506 degrees of freedom
      (190 observations deleted due to missingness)
    Multiple R-squared: 0.1393, Adjusted R-squared: 0.1387
    F-statistic: 243.8 on 1 and 1506 DF, p-value: < 2.2e-16
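The renaming and the comparison model can also be kept inside a data frame, which avoids loose copies of the variables. A sketch under stated assumptions: the data frame `survey` and its simulated values are hypothetical; only the column names e130e_co and e2b_edu come from the slide.

```r
set.seed(1)

# Simulated stand-in for the survey data (all values are made up).
survey <- data.frame(e2b_edu = sample(1:6, 200, replace = TRUE))
survey$e130e_co <- 2 + 1.2 * survey$e2b_edu + rnorm(200, sd = 3.7)
survey$e130e_co[1:10] <- NA    # pretend ten respondents skipped the income item

# Re-name inside the data frame rather than in the global workspace.
survey$income <- survey$e130e_co
survey$educ   <- survey$e2b_edu

OLS1 <- lm(income ~ educ, data = survey)
summary(OLS1)   # the summary notes how many observations were dropped as missing
nobs(OLS1)      # number of rows actually used in the fit
```

Passing `data = survey` keeps the model tied to one object, which makes it easier to confirm that later multilevel fits use the same rows.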
Running Multilevel Models in R
• For a simple-minded intercept-varying model (with no slope coefficients):
• ML1<-lmer(income ~ 1 + (1 | state))
    Formula: income ~ 1 + (1 | state)
      AIC  BIC logLik deviance REMLdev
     8480 8496  -4237     8472    8474
    Random effects:
     Groups   Name        Variance Std.Dev.
     state    (Intercept)  0.19588 0.44258
     Residual             15.68736 3.96073
    Number of obs: 1513, groups: state, 51

    Fixed effects:
                Estimate Std. Error t value
    (Intercept)   6.0937     0.1304   46.75
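One way to read the two variance components above is as an intraclass correlation: the share of total variance that lies between states. A minimal sketch using only the numbers printed in the ML1 output; the variable names are illustrative.

```r
# Variance components copied from the ML1 summary above.
var_state <- 0.19588    # between-state variance of the intercept
var_resid <- 15.68736   # residual (within-state) variance

# Share of the total variance attributable to state-level differences.
icc <- var_state / (var_state + var_resid)
round(icc, 3)

# With a fitted model in hand, as.data.frame(VarCorr(ML1)) lists the same
# variances in its vcov column.
```

On these numbers only about 1% of the variance in income sits at the state level, which suggests the state intercepts differ little once sampling noise is accounted for.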
Running Multilevel Models in R
• To see the fixed effect:
• fixef(ML1)
• Returns the average intercept: 6.093686
• ranef(ML1)
• Returns each state's deviation from the mean intercept:
    $state
       (Intercept)
    AK  0.03582853
    AL -0.34874818
    AR -0.35354326
    AZ -0.09795315
    CA  0.74016962
    CO  0.22587276
    (etc.)
Running Multilevel Models in R
• A somewhat more interesting ML model:
• ML2<-lmer(income ~ educ + (1 | state))
• Returns a model with a fixed slope and varying intercepts. Summary gets you this:
    Formula: income ~ educ + (1 | state)
      AIC  BIC logLik deviance REMLdev
     8238 8259  -4115     8224    8230
    Random effects:
     Groups   Name        Variance Std.Dev.
     state    (Intercept)  0.13219 0.36357
     Residual             13.59123 3.68663
    Number of obs: 1508, groups: state, 51

    Fixed effects:
                Estimate Std. Error t value
    (Intercept)   2.0361     0.2867   7.102
    educ          1.1751     0.0757  15.524
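A quick check on how much the varying intercepts buy is to compare the residual variance here with that of the simple linear model. A rough sketch using only figures printed in the two summaries (residual standard error 3.704 for OLS1, residual variance 13.59123 for ML2); the variable names are illustrative, and the comparison is approximate because the two fits estimate the residual variance differently.

```r
# Residual variance of the OLS fit: square of its residual standard error.
ols_resid_var <- 3.704^2

# Residual variance reported for the varying-intercept model ML2.
ml2_resid_var <- 13.59123

# Proportional reduction in residual variance from letting intercepts vary.
reduction <- 1 - ml2_resid_var / ols_resid_var
round(100 * reduction, 1)   # expressed as a percentage
```

The reduction comes out at roughly 1%, consistent with the small state-level variance in the random-effects table.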
Running Multilevel Models in R
• To observe the model estimates:
• fixef(ML2):
    (Intercept)        educ
       2.036075    1.175145
• ranef(ML2):
    $state
         (Intercept)
    AK  3.310271e-02
    AL -3.366027e-01
    AR -2.271760e-01
    AZ -1.131920e-01
    CA  4.937171e-01
    CO  6.491345e-02
    CT  2.490139e-01
• Calculation of the intercept for Texas (labeled the 46th state here, although levels(state) lists "TX" as element 44 and "VA" as element 46; selecting by name with coef(ML2)$state["TX", 1] avoids miscounting):
• coef(ML2)$state[46,1], returns:
    [1] 2.169662
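The state-specific intercept from coef() is the fixed intercept plus that state's random effect. A minimal sketch of that relationship, assuming the lme4 package is installed and using simulated data in place of the survey (the groups "S01" to "S20" and all values are hypothetical):

```r
library(lme4)

set.seed(42)
# Simulated stand-in: 20 hypothetical groups with 30 respondents each.
n_groups <- 20
d <- data.frame(state = factor(rep(sprintf("S%02d", 1:n_groups), each = 30)))
d$educ <- sample(1:6, nrow(d), replace = TRUE)
state_shift <- rnorm(n_groups, sd = 1.5)
d$income <- 2 + 1.2 * d$educ + state_shift[as.integer(d$state)] +
  rnorm(nrow(d), sd = 3)

fit <- lmer(income ~ educ + (1 | state), data = d)

# Group intercept = average intercept + group deviation.
by_hand   <- fixef(fit)["(Intercept)"] + ranef(fit)$state[, "(Intercept)"]
from_coef <- coef(fit)$state[, "(Intercept)"]
all.equal(unname(by_hand), from_coef)

# Look up one group by name rather than by row number.
coef(fit)$state["S07", ]
```

Indexing by row name is a design choice worth keeping in real analyses, since row numbers depend on the order of the factor levels.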
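Running Multilevel Models in R
• To calculate an approximate 95% confidence interval (estimate ± 2 standard errors) for the row-46 intercept (labeled Texas above, though levels(state) lists "TX" as element 44):
• coef(ML2)$state[46,1]+c(-2,2)*se.ranef(ML2)$state[46]
    [1] 1.527386 2.811937
• The approximate 95% confidence interval for the model slope is:
• fixef(ML2)["educ"]+c(-2,2)*se.fixef(ML2)["educ"]
• which returns:
    [1] 1.023752 1.326537
• Note: se.ranef() and se.fixef() come from the arm package, so load it first with library(arm).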
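If the arm package is not available, similar intervals can be put together from lme4 alone. This is a sketch of an alternative, not the method used on the slide: it assumes lme4 is installed, uses simulated data with hypothetical group names, and takes the standard errors from vcov() and from the conditional standard deviations returned by ranef().

```r
library(lme4)

set.seed(7)
# Simulated stand-in: 20 hypothetical groups with 30 respondents each.
n_groups <- 20
d <- data.frame(state = factor(rep(sprintf("S%02d", 1:n_groups), each = 30)))
d$educ <- sample(1:6, nrow(d), replace = TRUE)
state_shift <- rnorm(n_groups, sd = 1.5)
d$income <- 2 + 1.2 * d$educ + state_shift[as.integer(d$state)] +
  rnorm(nrow(d), sd = 3)

fit <- lmer(income ~ educ + (1 | state), data = d)

# Fixed slope: estimate +/- 2 standard errors.
se_fix   <- sqrt(diag(as.matrix(vcov(fit))))
slope_ci <- fixef(fit)["educ"] + c(-2, 2) * se_fix["educ"]

# One group's intercept: estimate +/- 2 conditional standard deviations.
re     <- as.data.frame(ranef(fit, condVar = TRUE))
re_s07 <- re[re$grp == "S07" & re$term == "(Intercept)", ]
int_ci <- coef(fit)$state["S07", "(Intercept)"] + c(-2, 2) * re_s07$condsd

slope_ci
int_ci
```

As on the slide, the group interval reflects only the uncertainty in the random effect, not in the fixed intercept, so it is best read as a rough guide.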
Running Multilevel Models in R
• A still more interesting ML model:
• ML3<-lmer(income ~ educ + (1 + educ | state))
• Returns a model with both a varying slope and intercept for each state. Summary gets you this:
    Formula: income ~ educ + (1 + educ | state)
      AIC  BIC logLik deviance REMLdev
     8233 8265  -4111     8216    8221
    Random effects:
     Groups   Name        Variance Std.Dev. Corr
     state    (Intercept)  0.65751 0.81087
              educ         0.13761 0.37096  -1.000
     Residual             13.36960 3.65645
    Number of obs: 1508, groups: state, 51

    Fixed effects:
                Estimate Std. Error t value
    (Intercept)   2.1212     0.3172   6.687
    educ          1.1431     0.1017  11.235
Running Multilevel Models in R
• To observe the model estimates:
• fixef(ML3):
    (Intercept)        educ
       2.121166    1.143087
• ranef(ML3):
    $state
        (Intercept)         educ
    AK -0.062791841  0.028726346
    AL  1.064733054 -0.487099757
    AR  0.716358907 -0.327723694
    AZ -0.174025953  0.079614321
    CA -0.970880883  0.444163765
    CO -0.594929356  0.272171455
    CT -0.951004214  0.435070481
• Calculation of the intercept and slope for Texas (row 46 is used here, although levels(state) lists "TX" as element 44; coef(ML3)$state["TX", ] selects by name):
• coef(ML3)$state[46,1], returns:
    [1] 1.662176
• coef(ML3)$state[46,2], returns:
    [1] 1.353068
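When intercepts and slopes both vary, it is worth looking at the estimated intercept-slope correlation; a value at ±1 may indicate that the data cannot separate the two sources of variation. A hedged sketch, assuming lme4 is installed and using simulated data (group names and values are hypothetical), of how one might inspect this:

```r
library(lme4)

set.seed(11)
# Simulated stand-in: 20 hypothetical groups with 40 respondents each.
n_groups <- 20
d <- data.frame(state = factor(rep(sprintf("S%02d", 1:n_groups), each = 40)))
g <- as.integer(d$state)
d$educ <- sample(1:6, nrow(d), replace = TRUE)
a <- rnorm(n_groups, mean = 2,   sd = 1)     # group-specific intercepts
b <- rnorm(n_groups, mean = 1.2, sd = 0.4)   # group-specific slopes
d$income <- a[g] + b[g] * d$educ + rnorm(nrow(d), sd = 3)

fit3 <- lmer(income ~ educ + (1 + educ | state), data = d)

VarCorr(fit3)            # variances plus the intercept-slope correlation
isSingular(fit3)         # TRUE flags a boundary estimate (zero variance or correlation of +/-1)
head(coef(fit3)$state)   # one intercept and one slope per group
```

If isSingular() returns TRUE on real data, a simpler random-effects structure such as (1 | state) may be the safer choice.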
Workshop 1:
• Build an ML model using ideology to predict GHG risk
• Use the state variable as the group level
• How much is the model residual reduced by allowing states to vary?
• Present it to me in 20 min.
Workshop 2:
• Data presentations
• Sources, characteristics
• Preliminary group-level models?
For Next Week
• Read Gelman & Hill Ch. 13
• Build plots:
• Figure out how to replicate Figure 12.4 (p. 257)
• Code is shown on p. 262.
• Present your initial group-level models