
PSC 5940: Estimating the Fit of Multi-Level Models


Presentation Transcript


  1. PSC 5940: Estimating the Fit of Multi-Level Models Session 8 Fall, 2009

  2. Log-likelihood - GLMs Given a linear model Y = Xβ + ε, the generalized framework allows the errors to be correlated: ε ~ N(0, Σ). With possibly correlated errors, the log-likelihood is defined as: log L(β, Σ) = -(n/2)·log(2π) - (1/2)·log|Σ| - (1/2)(Y - Xβ)'Σ⁻¹(Y - Xβ). Differentiating the Generalized Sum of Squares (Y - Xβ)'Σ⁻¹(Y - Xβ) with respect to β, setting the partial derivative to zero, and solving for β produces: β̂ = (X'Σ⁻¹X)⁻¹X'Σ⁻¹Y. Look familiar? With Σ = σ²I this collapses to the OLS solution β̂ = (X'X)⁻¹X'Y.
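These formulas can be checked numerically. Below is a minimal sketch (not from the slides) using simulated data and a known error covariance Σ; with Σ = σ²I the GLS estimator collapses to the OLS solution.

  ## Minimal sketch with simulated data and a known (iid) error covariance
  set.seed(42)
  n <- 100
  X <- cbind(1, rnorm(n))                      # design matrix with an intercept
  beta <- c(2, 0.5)
  sigma2 <- 1.5
  y <- X %*% beta + rnorm(n, sd = sqrt(sigma2))

  Sigma <- sigma2 * diag(n)                    # iid errors; any positive-definite Sigma works
  Sigma_inv <- solve(Sigma)

  ## GLS estimator: (X' Sigma^-1 X)^-1 X' Sigma^-1 Y
  beta_gls <- solve(t(X) %*% Sigma_inv %*% X, t(X) %*% Sigma_inv %*% y)

  ## Log-likelihood evaluated at beta_gls
  resid <- y - X %*% beta_gls
  gss <- as.numeric(t(resid) %*% Sigma_inv %*% resid)   # generalized sum of squares
  loglik <- -0.5 * (n * log(2 * pi) +
                    as.numeric(determinant(Sigma, logarithm = TRUE)$modulus) + gss)

  cbind(gls = as.numeric(beta_gls), ols = coef(lm(y ~ X[, 2])))  # identical in the iid case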

  3. Understanding MLE
  • For OLS, the likelihood is maximized (and the SSE minimized) when β = (X'X)⁻¹X'Y
  • For the log-likelihood, the MLE maximizes the sum of the logged error densities (the log of their product), given the formula for the Generalized Sum of Squares
  • The latter differs from OLS as various assumptions are relaxed (nonlinearity, correlated errors, etc.)
  • The results of the likelihood function can be viewed topographically, as a "hill" showing the effect on the likelihood as you vary the estimated coefficients; the maximum-likelihood estimate sits at the peak of the hill (see the sketch below)
  • The "fit" measures (adj. R², AIC, BIC) tell you how high (or low) the peak is for a given model
  • Comparisons of the fit measures across models can assist in model selection
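The "hill" metaphor can be made concrete. The sketch below (not from the slides, using simulated one-predictor data) evaluates the log-likelihood over a grid of candidate intercepts and slopes and marks the OLS/ML estimates at the peak.

  ## Sketch: the log-likelihood "hill" for a simple one-predictor model
  set.seed(1)
  x <- rnorm(50)
  y <- 1 + 2 * x + rnorm(50)

  loglik <- function(b0, b1, sigma = 1) {
    sum(dnorm(y, mean = b0 + b1 * x, sd = sigma, log = TRUE))
  }

  b0_grid <- seq(0, 2, length.out = 60)
  b1_grid <- seq(1, 3, length.out = 60)
  ll <- outer(b0_grid, b1_grid, Vectorize(loglik))

  ## The peak of the surface sits at (essentially) the OLS/ML estimates
  contour(b0_grid, b1_grid, ll, xlab = "intercept", ylab = "slope")
  est <- coef(lm(y ~ x))
  points(est[1], est[2], pch = 19)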

  4. Measures of Model Fit
  • R² and adjusted R²: R² = 1 - SSE/SST; the adjusted version penalizes for the number of predictors
  • AIC = -2·log L̂ + 2k
  • Where log L̂ is the maximized log-likelihood for the model and k is the number of estimated parameters
  • AIC penalizes for the complexity of the model
  • BIC = -2·log L̂ + k·log(n)
  • Where n is the number of observations
  • BIC penalizes for increased model complexity and sample size (resulting in a preference for simpler models); a worked check of these formulas follows below
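As a check on the formulas, this sketch (not from the slides) computes AIC and BIC by hand from the maximized log-likelihood of a simple lm() fit and compares them to R's built-in extractors.

  ## Sketch: AIC and BIC "by hand" versus R's extractors
  set.seed(2)
  d <- data.frame(x = rnorm(80))
  d$y <- 3 + 0.7 * d$x + rnorm(80)

  fit <- lm(y ~ x, data = d)
  ll  <- logLik(fit)                      # maximized log-likelihood
  k   <- attr(ll, "df")                   # number of estimated parameters (incl. sigma)
  n   <- nobs(fit)

  aic_hand <- -2 * as.numeric(ll) + 2 * k
  bic_hand <- -2 * as.numeric(ll) + k * log(n)

  c(aic_hand = aic_hand, AIC = AIC(fit))  # should agree
  c(bic_hand = bic_hand, BIC = BIC(fit))  # the log(n) term is BIC's sample-size penalty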

  5. BIC Test for “Improvement in Model Fit”

  6. Example in R: LM1
  • Predicting votes in a referendum on an alternative energy tax (erdf100 <- e63_erdf) by price and region
  • Explanatory variables:
  • Price: randomly assigned values ($6 to $2400 per year); price <- random_p
  • Region (already so named)
  • ML1 <- lmer(erdf100 ~ 1 + (1 | price) + (1 | region))  (a runnable sketch follows below)
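The slide's call omits the library load and data argument. The sketch below is runnable, but the data frame ("survey") and its simulated values are hypothetical, so the estimates will not match the results on the later slides.

  ## Runnable sketch of the ML1 specification (data frame and values are hypothetical)
  library(lme4)

  set.seed(8)
  n <- 1546
  survey <- data.frame(
    price  = factor(sample(15, n, replace = TRUE)),  # 15 randomly assigned price points
    region = factor(sample(5, n, replace = TRUE))    # 5 regions
  )
  ## Simulated support measure with price- and region-level noise
  survey$erdf100 <- 59 + rnorm(15, sd = 10)[as.integer(survey$price)] +
                         rnorm(5,  sd = 3)[as.integer(survey$region)] +
                         rnorm(n,  sd = 33)

  ## Intercept-only model with random intercepts for price and region
  ML1 <- lmer(erdf100 ~ 1 + (1 | price) + (1 | region), data = survey)
  summary(ML1)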

  7. Example in R: LM2
  • Predicting votes in a referendum on an alternative energy tax (erdf100 <- e63_erdf) by price, region, and perceived risks posed by GCC
  • Explanatory variables:
  • Price: randomly assigned values ($6 to $2400 per year); price <- random_p
  • Region (already so named)
  • risk_gcc (0-10 scale)
  • ML2 <- lmer(erdf100 ~ 1 + risk_gcc + (1 | price) + (1 | region))  (the sketch continues below)
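Continuing the hypothetical sketch above, a simulated risk_gcc predictor is added and the ML2 specification is fit the same way.

  ## Add a hypothetical risk_gcc predictor (0-10 scale) and fit ML2
  survey$risk_gcc <- sample(0:10, nrow(survey), replace = TRUE)

  ML2 <- lmer(erdf100 ~ 1 + risk_gcc + (1 | price) + (1 | region), data = survey)
  summary(ML2)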

  8. LM1 Result:
  > summary(ML1)
  Linear mixed model fit by REML
  Formula: erdf100 ~ 1 + (1 | price) + (1 | region)
     AIC   BIC logLik deviance REMLdev
   15275 15296  -7634    15271   15267
  Random effects:
   Groups   Name        Variance  Std.Dev.
   price    (Intercept)   97.5853  9.8785
   region   (Intercept)    8.7641  2.9604
   Residual              1113.6147 33.3709
  Number of obs: 1546, groups: price, 15; region, 5
  Fixed effects:
              Estimate Std. Error t value
  (Intercept)   59.026      3.064   19.26

  9. LM2 Result:
  Linear mixed model fit by REML
  Formula: erdf100 ~ 1 + risk_gcc + (1 | price) + (1 | region)
     AIC   BIC logLik deviance REMLdev
   14960 14987  -7475    14954   14950
  Random effects:
   Groups   Name        Variance Std.Dev.
   price    (Intercept)  99.8298  9.9915
   region   (Intercept)   6.6622  2.5811
   Residual             990.3544 31.4699
  Number of obs: 1532, groups: price, 15; region, 5
  Fixed effects:
              Estimate Std. Error t value
  (Intercept)  30.9259     3.6408   8.494
  risk_gcc      4.1735     0.3068  13.604

  10. BIC Test Results: BIC for ML1 - BIC for ML2 = 15296 - 14987 = 309; "Conclusive" Note: You can have R calculate the difference directly: BIC(logLik(ML1)) - BIC(logLik(ML2))
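The difference can also be computed from the fitted model objects, and anova() is a useful companion: it refits both models with ML (rather than REML) and adds a likelihood-ratio test. Note that comparisons like these assume both models were fit to the same observations (the slides' models use 1546 vs. 1532 cases, presumably because of missing values on risk_gcc).

  ## Sketch: model comparison from the fitted objects (assumes ML1 and ML2 exist)
  BIC(ML1) - BIC(ML2)     # positive values favor ML2
  anova(ML1, ML2)         # refits with ML; reports AIC, BIC, and a chi-square test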

  11. Model fit: BIC and Thinking
  • BIC is often (mis)used like a statistical "idiot light"
  • It depends on the sample employed
  • It maximizes the predictive capacity of the model rather than model explanation
  • When you face a decision of whether to add an explanatory variable, use a 2-step process:
  • Does the variable make theoretical sense?
  • Does BIC show improved model fit?
  • If the answers to both are "yes", then add the variable

  12. BREAK

  13. R Coding
  • When modeling with only part of your data, use the subset argument
  • lmer(y ~ x ..., subset = state == "NY")  (see the sketch below)
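A runnable sketch of the subset idiom, with a hypothetical data frame and variable names (dat, y, x, group, state): only rows where state == "NY" enter the estimation.

  ## Sketch: fit a model on a subset of the data (all names hypothetical)
  library(lme4)

  set.seed(3)
  dat <- data.frame(
    x     = rnorm(200),
    group = factor(sample(10, 200, replace = TRUE)),
    state = sample(c("NY", "PA", "OH"), 200, replace = TRUE)
  )
  dat$y <- 0.5 * dat$x + rnorm(200)

  fit_ny <- lmer(y ~ x + (1 | group), data = dat, subset = state == "NY")
  nobs(fit_ny)   # only the NY rows were used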

  14. Workshop
  • New developments on models?
  • Progress on papers
  • Research question motivated by literature reviews

  15. Next Week
  • Focus is on paper progress
  • Build timelines for completion
  • Focus on challenges, and what we need to do to surmount them
  • Hone your research question, motivated by your literature review
  • Need 1-page progress reports, including task assignments
