Introduction Multilevel Analysis
Rens van de Schoot, a.g.j.vandeschoot@uu.nl / rensvandeschoot.wordpress.com
Multilevel Regression Model
Known in the literature under a variety of names:
• Hierarchical linear model (HLM)
• Random coefficient model
• Variance component model
• Multilevel model
• Contextual analysis
• Mixed linear model
Hierarchical Data Structure
• Three-level data structure
• Groups at different levels may have different sizes
• Response (outcome) variable at the lowest level
• Explanatory variables at all levels
Traditional Approaches
• Disaggregate all variables to the lowest level
  • Do standard analyses (ANOVA, multiple regression)
• Aggregate all variables to the highest level
  • Do standard analyses (ANOVA, multiple regression)
• ANCOVA with groups as factor
• Some improvements:
  • express explanatory variables as deviations from their group mean
  • use both the deviation score and the disaggregated group mean as predictors (separates individual and group effects)
• Why not? What is wrong with this?
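The "deviation score plus group mean" improvement can be sketched in a few lines of Python. This is only an illustration: the function name `group_mean_center` and the IQ values are hypothetical and not taken from the slides.

```python
from collections import defaultdict


def group_mean_center(values, groups):
    """Split each value into its group mean and its deviation from that mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    means = {group: sums[group] / counts[group] for group in sums}
    # Disaggregated group mean: every member gets the mean of its own group.
    group_mean = [means[group] for group in groups]
    # Deviation score: individual value minus the group mean.
    deviation = [value - means[group] for value, group in zip(values, groups)]
    return deviation, group_mean


# Hypothetical IQ scores for five pupils in two classes.
iq = [100.0, 110.0, 120.0, 90.0, 100.0]
classes = ["a", "a", "a", "b", "b"]
deviation, class_mean = group_mean_center(iq, classes)
```

Entering both `deviation` and `class_mean` as predictors separates the within-class effect from the between-class effect, but it does not by itself solve the dependence problem discussed on the next slides.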
Problems With Standard Analysis of Hierarchical Data
• Multiple regression assumes
  • independent observations
  • independent error terms
  • equal variances of errors for all observations (assumption of homoscedastic errors)
  • normal distribution for errors
• With hierarchical data
  • observations are not independent
  • errors are not independent
  • different observations may have errors with different variances (heteroscedastic errors)
Problems With Standard Analysis of Hierarchical Data
• Observations in the same group are generally not independent
  • they tend to be more similar than observations from different groups
  • selection, shared history, contextual group effects
• The degree of similarity is indicated by the intraclass correlation rho: $\rho$
• Standard statistical tests are not at all robust against violation of the independence assumption
That is why we need special multilevel techniques!
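For a two-level model without predictors, the intraclass correlation is usually written in terms of the variance components introduced later in these slides. The notation below is an assumption chosen to match those slides: $\sigma^2_{u0}$ is the variance of the group-level errors $u_{0j}$ and $\sigma^2_e$ is the variance of the individual-level errors $e_{ij}$.

```latex
\rho = \frac{\sigma^2_{u0}}{\sigma^2_{u0} + \sigma^2_{e}}
```

Read this way, $\rho$ is the proportion of the total variance that lies at the group level; $\rho = 0$ would mean that group membership carries no information about the outcome.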
Sample size?
• Hox, J., van de Schoot, R., & Matthijsse, S. (2012). How few countries will do? Comparative survey analysis from a Bayesian perspective. Survey Research Methods, Vol. 6, No. 2, pp. 87-93.
Research questions I/III
• Questions with respect to variables at the lowest level
  • Intelligence (IQ) as predictor of school achievement (SA)
Research questions II/III
• Questions with respect to the influence of variables at a higher level on the dependent variable at the lowest level
  • Mean intelligence of a class (MIQ) as predictor of school achievement (SA), controlling for individual IQ
Research questions III/III
• Questions with respect to the interaction of variables at different levels (moderation effect)
  • The relation between intelligence and school achievement is not the same in all classes
Graphical Picture of Simple Two-level Regression Model
• Outcome variable at pupil level
• Explanatory variables at both levels: individual & group
• Residual error at individual level
• Plus residual error at school level
[Figure: diagram with the labels "School level" and "Pupil level"]
Regression analysis
In ordinary regression, with one explanatory variable $X$:
• $Y_i = \beta_0 + \beta_1 X_i + e_i$
• $\beta_0$ intercept
• $\beta_1$ regression slope
• $e_i$ residual error term
Building the Multilevel Regression Model: Random intercept model
In multilevel regression, at the lowest level:
• $Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}$
• $\beta_{0j}$ intercept
• $\beta_{1j}$ regression slope
• $e_{ij}$ residual error term
• subscript $i$ for individuals, $j$ for groups
• each group has its own intercept coefficient $\beta_{0j}$
• and its own slope coefficient $\beta_{1j}$
Building the Multilevel Regression Model: Intercept only model
In multilevel regression, at the lowest level:
• $Y_{ij} = \beta_{0j} + e_{ij}$
• Random intercept model:
  • $\beta_{0j} = \gamma_{00} + u_{0j}$
  • $\gamma_{00}$ is the intercept of $\beta_{0j}$
  • $u_{0j}$ is the residual error term in the equation for $\beta_{0j}$
Building the Multilevel Regression Model: Random intercept model
In multilevel regression, at the lowest level:
• $Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}$
• Random intercept model:
  • $\beta_{0j} = \gamma_{00} + u_{0j}$
  • $\gamma_{00}$ is the intercept of $\beta_{0j}$
  • $u_{0j}$ is the residual error term in the equation for $\beta_{0j}$
Building the Multilevel Regression Model: Random intercept model
Building the Multilevel Regression Model: Random slope model
• $Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}$
• Random intercept model:
  • $\beta_{0j} = \gamma_{00} + u_{0j}$
  • $\gamma_{00}$ is the intercept of $\beta_{0j}$
  • $u_{0j}$ is the residual error term in the equation for $\beta_{0j}$
• Random slope model:
  • $\beta_{1j} = \gamma_{10} + u_{1j}$
  • $\gamma_{10}$ is the intercept of $\beta_{1j}$
  • $u_{1j}$ is the residual error term in the equation for $\beta_{1j}$
Difference with the usual regression model:
• Each class has a different intercept coefficient $\beta_{0j}$ and a different slope coefficient $\beta_{1j}$
• Since the intercept and the slope coefficients vary across the classes, they are called random coefficients
=> Random intercept model & random slope model
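One way to see what "random coefficients" means is to generate data from the model. The sketch below draws a separate intercept and slope for every class and then generates pupils within each class. All parameter values are hypothetical, the predictor is an arbitrary 0/1 variable, and $u_{0j}$ and $u_{1j}$ are drawn independently, which is a simplifying assumption (the general model allows them to covary).

```python
import random


def simulate_two_level(n_groups, n_per_group, g00, g10, sd_u0, sd_u1, sd_e, seed=1):
    """Simulate (group, pupil, x, y) rows from a random intercept and slope model."""
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        # Group-specific coefficients: fixed part plus a group-level residual.
        b0j = g00 + rng.gauss(0.0, sd_u0)
        b1j = g10 + rng.gauss(0.0, sd_u1)
        for i in range(n_per_group):
            x = rng.choice([0, 1])  # hypothetical binary predictor
            y = b0j + b1j * x + rng.gauss(0.0, sd_e)
            rows.append((j, i, x, y))
    return rows


# Hypothetical values: 100 classes of 20 pupils each.
data = simulate_two_level(100, 20, g00=5.0, g10=0.8, sd_u0=0.9, sd_u1=0.5, sd_e=0.7)
```

Setting `sd_u0` and `sd_u1` to zero removes all between-class variation, and the simulated data then follow the ordinary single-level regression model.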
Building the Multilevel Regression Model: Random slope model
Building the Multilevel Regression Model: the Second (Group) Level
• Next step:
  • explain the variation of the regression coefficients $\beta_{0j}$ and $\beta_{1j}$ by introducing explanatory variables at the class level
Building the Multilevel Regression Model: the Second (Group) Level
At the lowest (individual) level we have
• $Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}$
• $\beta_{0j} = \gamma_{00} + \gamma_{01} Z_j + u_{0j}$
  • $\gamma_{00}$ and $\gamma_{01}$ are the intercept and slope to predict $\beta_{0j}$ from $Z_j$
  • $u_{0j}$ is the residual error term in the equation for $\beta_{0j}$
Building the Multilevel Regression Model: Cross level interaction
At the lowest (individual) level we have
• $Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}$
• $\beta_{0j} = \gamma_{00} + \gamma_{01} Z_j + u_{0j}$
  • $\gamma_{00}$ and $\gamma_{01}$ are the intercept and slope to predict $\beta_{0j}$ from $Z_j$
  • $u_{0j}$ is the residual error term in the equation for $\beta_{0j}$
• $\beta_{1j} = \gamma_{10} + \gamma_{11} Z_j + u_{1j}$
  • $\gamma_{10}$ and $\gamma_{11}$ are the intercept and slope to predict $\beta_{1j}$ from $Z_j$
  • $u_{1j}$ is the residual error term in the equation for $\beta_{1j}$
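Building the Multilevel Regression Model: Single Equation Version
At the lowest (individual) level we have
• $Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}$
and at the second (group) level
• $\beta_{0j} = \gamma_{00} + \gamma_{01} Z_j + u_{0j}$
• $\beta_{1j} = \gamma_{10} + \gamma_{11} Z_j + u_{1j}$
Combining (substitution and rearranging terms) gives
• $Y_{ij} = \gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} Z_j + \gamma_{11} Z_j X_{ij} + u_{1j} X_{ij} + u_{0j} + e_{ij}$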
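The substitution step can be written out explicitly. The symbols are those of this slide, with $\beta$ and $\gamma$ standing for the coefficients and $Z_j$ for the group-level predictor; only the intermediate line is added here.

```latex
\begin{aligned}
Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + e_{ij} \\
       &= (\gamma_{00} + \gamma_{01} Z_j + u_{0j})
          + (\gamma_{10} + \gamma_{11} Z_j + u_{1j})\, X_{ij} + e_{ij} \\
       &= \gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} Z_j + \gamma_{11} Z_j X_{ij}
          + u_{1j} X_{ij} + u_{0j} + e_{ij}
\end{aligned}
```

The product term $\gamma_{11} Z_j X_{ij}$ arises only because $Z_j$ predicts the slope $\beta_{1j}$, which is why predicting a slope amounts to a cross-level interaction.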
Building the Multilevel Regression Model: Single Equation Version
$Y_{ij} = [\gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} Z_j + \gamma_{11} Z_j X_{ij}] + [u_{1j} X_{ij} + u_{0j} + e_{ij}]$
• This equation has two distinct parts
• $[\gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} Z_j + \gamma_{11} Z_j X_{ij}]$ contains all the fixed coefficients; it is called the fixed part of the model
• $[u_{1j} X_{ij} + u_{0j} + e_{ij}]$ contains all the random error terms; it is called the random part of the model
Building the Multilevel Regression Model: Interpretation
$Y_{ij} = [\gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} Z_j + \gamma_{11} Z_j X_{ij}] + [u_{1j} X_{ij} + u_{0j} + e_{ij}]$
• Several error variances
  • $\sigma^2_e$ variance of the lowest level errors $e_{ij}$
  • $\sigma^2_{u0}$ variance of the highest level errors $u_{0j}$
  • $\sigma^2_{u1}$ variance of the highest level errors $u_{1j}$
  • $\sigma_{u01}$ covariance of $u_{0j}$ and $u_{1j}$
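These components combine into the variance of the random part. The expression below is a sketch under the usual assumption that the lowest level errors $e_{ij}$ are independent of the group-level errors $u_{0j}$ and $u_{1j}$, and it treats $X_{ij}$ as fixed.

```latex
\operatorname{Var}\!\left(u_{1j} X_{ij} + u_{0j} + e_{ij}\right)
  = \sigma^2_{u0} + 2\,\sigma_{u01}\, X_{ij} + \sigma^2_{u1}\, X_{ij}^2 + \sigma^2_{e}
```

With a random slope, the residual variance therefore depends on $X_{ij}$, which is the heteroscedasticity mentioned earlier as a problem for standard regression.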
Full Multilevel Regression Model
• Explanatory variables at all levels
• Higher level variables predict variation of lowest level intercept and slopes
• Predicting the intercept implies a direct effect
• Predicting slopes implies cross-level interactions
Model Exploration
1. Intercept-only model
  • calculate the intraclass correlation
2. Fixed model, 1st level predictor variables
  • test individual slopes for significance
3. Model the intercept by 2nd level predictor variables
  • test for significance; how much intercept variance is explained?
4. Random coefficient model
  • test whether any 1st level slope has a significant variance component (this is best done one by one)
5. Model random slopes by higher level variables: cross level interactions
  • test for significance; how much slope variance is explained?
Example: Popularity in Schools
• Outcome: popularity rating
• 100 classes, 2000 pupils
• Explanatory variables
  • Pupil level: sex (0 = boy, 1 = girl)
  • Class level: teacher experience (in years)
Popularity Example: Intercept-only Model
• $\text{Popularity}_{ij} = \gamma_{00} + u_{0j} + e_{ij}$
• Estimates (st. err.)
  • $\gamma_{00}$ = 5.31 (.10) (this is just the overall average popularity)
  • $\sigma^2_e$ = 0.64 (.02)
  • $\sigma^2_{u0}$ = 0.88 (.13)
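The intraclass correlation for this model follows directly from the two variance estimates. The short sketch below uses the standard definition, class-level variance divided by total variance; the function name is only illustrative.

```python
def intraclass_correlation(var_group, var_individual):
    """Proportion of the total variance that lies at the group level."""
    return var_group / (var_group + var_individual)


# Estimates from the intercept-only model on this slide.
icc = intraclass_correlation(var_group=0.88, var_individual=0.64)
print(round(icc, 2))  # 0.58
```

Under this definition, roughly 58% of the variance in popularity ratings lies between classes, which is a very high degree of clustering.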
Popularity Example: Fixed Model
• $\text{Popularity}_{ij} = \gamma_{00} + \gamma_{10}\,\text{sex}_{ij} + u_{0j} + e_{ij}$
• Estimates (st. err.)
  • $\gamma_{00}$ = 4.89 (.10)
  • $\gamma_{10}$ = 0.84 (.03)
  • $\sigma^2_e$ = 0.46 (.02)
  • $\sigma^2_{u0}$ = 0.85 (.12)
Popularity Example: Fixed Model + Higher Level Variable
• $\text{Popularity}_{ij} = \gamma_{00} + \gamma_{10}\,\text{sex}_{ij} + \gamma_{01}\,\text{t.exp}_j + u_{0j} + e_{ij}$
• Estimates (st. err.)
  • $\gamma_{00}$ = 3.56 (.17)
  • $\gamma_{10}$ = 0.84 (.03)
  • $\gamma_{01}$ = 0.09 (.01)
  • $\sigma^2_e$ = 0.46 (.02)
  • $\sigma^2_{u0}$ = 0.48 (.07)
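The question "how much intercept variance is explained?" is commonly answered by comparing a variance component before and after adding predictors. The sketch below applies that proportional-reduction approach to the estimates on the popularity slides; it is one common convention, not necessarily the exact calculation used in the course.

```python
def proportion_explained(var_baseline, var_model):
    """Proportional reduction in a variance component relative to a baseline model."""
    return (var_baseline - var_model) / var_baseline


# Class-level variance: fixed model (0.85) vs. model with teacher experience (0.48).
r2_class = proportion_explained(0.85, 0.48)

# Pupil-level variance: intercept-only model (0.64) vs. model with sex (0.46).
r2_pupil = proportion_explained(0.64, 0.46)
```

On these numbers, teacher experience accounts for roughly 44% of the remaining class-level intercept variance, and pupil sex for roughly 28% of the pupil-level variance.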
Popularity Example: Random Coefficient Model
• $\text{Popularity}_{ij} = \gamma_{00} + \gamma_{10}\,\text{sex}_{ij} + \gamma_{01}\,\text{t.exp}_j + u_{0j} + u_{1j}\,\text{sex}_{ij} + e_{ij}$
• Estimates (st. err.)
  • $\gamma_{00}$ = 3.34 (.16), $\gamma_{10}$ = 0.84 (.06), $\gamma_{01}$ = 0.11 (.01)
  • $\sigma^2_e$ = 0.39 (.01)
  • $\sigma^2_{u0}$ = 0.41 (.06)
  • $\sigma_{u01}$ = 0.02 (.04) (covariance between intercept and slope)
  • $\sigma^2_{u1}$ = 0.27 (.05)
• Slope variation for sex
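To get a feel for the size of the slope variation, one can compute the range in which most class-specific slopes for sex are expected to fall. The sketch below assumes the slopes are normally distributed around $\gamma_{10}$ with variance $\sigma^2_{u1}$; the 1.96 multiplier gives an approximate 95% range.

```python
import math


def slope_range(mean_slope, slope_variance, z=1.96):
    """Approximate range containing about 95% of group-specific slopes (normality assumed)."""
    half_width = z * math.sqrt(slope_variance)
    return mean_slope - half_width, mean_slope + half_width


# Estimates from the random coefficient model on this slide.
low, high = slope_range(mean_slope=0.84, slope_variance=0.27)
```

With these estimates the range runs from about -0.18 to about 1.86, so the girl-boy difference in popularity is positive in most classes but close to zero in some.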
Popularity Example: Random Coefficient Model + Interaction
• $\text{Popularity}_{ij} = \gamma_{00} + \gamma_{10}\,\text{sex}_{ij} + \gamma_{01}\,\text{t.exp}_j + \gamma_{11}\,\text{sex}_{ij}\,\text{t.exp}_j + u_{0j} + u_{1j}\,\text{sex}_{ij} + e_{ij}$
• Estimates (st. err.)
  • $\gamma_{00}$ = 3.31 (.16), $\gamma_{10}$ = 1.33 (.13), $\gamma_{01}$ = 0.11 (.01), $\gamma_{11}$ = -0.03 (.01)
  • $\sigma^2_e$ = 0.39 (.01)
  • $\sigma^2_{u0}$ = 0.40 (.06)
  • $\sigma_{u01}$ = 0.02 (.04)
  • $\sigma^2_{u1}$ = 0.22 (.04)
• Smaller, but still significant slope variation for sex
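The cross-level interaction means that the expected effect of sex depends on teacher experience: it equals $\gamma_{10} + \gamma_{11}\,\text{t.exp}_j$. The sketch below evaluates this for a few hypothetical experience values and also computes how much slope variance the interaction explains relative to the random coefficient model (0.27 there versus 0.22 here).

```python
def sex_effect(teacher_experience, g10=1.33, g11=-0.03):
    """Expected girl-boy difference in popularity for a given teacher experience (years)."""
    return g10 + g11 * teacher_experience


# Hypothetical experience values, chosen only for illustration.
effects = {years: sex_effect(years) for years in (0, 10, 20)}

# Proportional reduction in slope variance after adding the interaction.
slope_variance_explained = (0.27 - 0.22) / 0.27
```

On these estimates the sex effect shrinks from 1.33 with a beginning teacher to about 0.73 after 20 years of experience, and the interaction accounts for roughly 19% of the slope variance.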
5-day course Multilevel Analyses in Mplus
• 21-25 Jan. 2013
• http://www.uu.nl/faculty/socialsciences/NL/organisatie/graduateschool/promoveren/onderwijs%20voor%20promovendi/courseoffering/Pages/Multilevel-Analyses-using-Mplus.aspx
• The 9th International Multilevel Conference is on March 27-28, 2013: http://multilevel.fss.uu.nl/
• Prior to the conference (26th of March), a one-day course is taught by prof. Stef van Buuren on Multiple Imputation of Multilevel missing data in MICE.
• The 5th Mplus users meeting will be organized on the 25th of March: http://mplus.fss.uu.nl