Advanced Business Research Method. Instructor: Prof. Feng-Hui Huang. An Introduction to Hierarchical Linear Modeling. Heather Woltman, Andrea Feldstain, J. Christine MacKay, Meredith Rocchi. Tutorials in Quantitative Methods for Psychology 2012, Vol. 8(1), p. 52-69. Agung D. Buchdadi DA21G201.
Contents • Introduction • Methods for Dealing with Nested Data • Equations underlying Hierarchical Linear Models • Estimation of Effects • Hypothesis Testing • Conclusion
Introduction • Hierarchical levels of grouped data are a commonly occurring phenomenon. For example, in the education sector, data are often organized at student, classroom, school, and school district levels. • Hierarchical Linear Modeling (HLM) is a complex form of ordinary least squares (OLS) regression that is used to analyze variance in the outcome variables when the predictor variables are at varying hierarchical levels.
Introduction • Because this statistical method was developed simultaneously across many fields, it has come to be known by several names, including multilevel-, mixed level-, mixed linear-, mixed effects-, random effects-, random coefficient (regression)-, and (complex) covariance components-modeling. • HLM simultaneously investigates relationships within and between hierarchical levels of grouped data, thereby making it more efficient at accounting for variance among variables at different levels than other existing analyses.
Introduction Research Question: Do student breakfast consumption and teaching style influence student GPA?
Methods for Dealing with Nested Data DISAGGREGATION • Disaggregation of data deals with hierarchical data issues by ignoring the presence of group differences. It considers all relationships between variables to be context free and situated at level-1 of the hierarchy. • Disaggregation thereby ignores the presence of possible between-group variation.
Methods for Dealing with Nested Data DISAGGREGATION • By bringing upper level variables down to level-1, shared variance is no longer accounted for and the assumption of independence of errors is violated. If teaching style influences student breakfast consumption, for example, the effects of the level-1 (student) and level-2 (classroom) variables on the outcome of interest (GPA) cannot be disentangled.
Methods for Dealing with Nested Data AGGREGATION • Instead of ignoring higher level group differences, aggregation ignores lower level individual differences. Level-1 variables are raised to higher hierarchical levels (e.g., level-2 or level-3) and information about individual variability is lost. • In aggregated statistical models, within-group variation is ignored and individuals are treated as homogeneous entities.
Methods for Dealing with Nested Data AGGREGATION • Using aggregation, the predictor variable (breakfast consumption) is again negatively related to the outcome variable (GPA). In this method of analysis, all (X, Y) units are situated on the regression line, indicating that unit increases in a classroom's mean breakfast consumption perfectly predict a lowering of that classroom's mean GPA. • Although a negative relationship between breakfast consumption and GPA is found using both disaggregation and aggregation techniques, breakfast consumption is found to impact GPA more unfavourably using aggregation.
Methods for Dealing with Nested Data HLM • Using HLM, each level-1 (X, Y) unit (i.e., each student's GPA and breakfast consumption) is identified by its level-2 cluster (i.e., that student's classroom). Each level-2 cluster's slope (i.e., each classroom's slope) is also identified and analyzed separately. • Using HLM, both the within- and between-group regressions are taken into account to depict the relationship between breakfast consumption and GPA. • The resulting analysis indicates that breakfast consumption is positively related to GPA at level-1 (i.e., at the student level) but that the intercepts for these slope effects are influenced by level-2 factors [i.e., students' breakfast consumption and GPA (X, Y) units are also affected by classroom level factors]. • Although disaggregation and aggregation methods indicated a negative relationship between breakfast consumption and GPA, HLM indicates that unit increases in breakfast consumption actually have a positive impact on GPA.
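The contrast between the three approaches can be reproduced on a toy dataset. The sketch below uses made-up (breakfast, GPA) values for three hypothetical classrooms, chosen so that the pattern described on these slides appears; it fits plain OLS slopes only and is not an HLM fit, but it shows why pooling or averaging can reverse the sign of a relationship that is positive inside every classroom.

```python
# Hypothetical data: (breakfast consumption, GPA) pairs for three classrooms.
# The values are invented for illustration and are not from the tutorial.
classrooms = {
    "A": [(1, 3.2), (2, 3.4), (3, 3.6)],
    "B": [(3, 2.6), (4, 2.8), (5, 3.0)],
    "C": [(5, 2.0), (6, 2.2), (7, 2.4)],
}


def mean(values):
    return sum(values) / len(values)


def ols_slope(points):
    """Slope of the simple least-squares regression of y on x."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Disaggregation: pool all students and ignore classroom membership.
pooled = [point for pts in classrooms.values() for point in pts]
disaggregated_slope = ols_slope(pooled)

# Aggregation: keep one (mean breakfast, mean GPA) point per classroom.
class_means = [
    (mean([x for x, _ in pts]), mean([y for _, y in pts]))
    for pts in classrooms.values()
]
aggregated_slope = ols_slope(class_means)

# Within-group view: a separate slope for each classroom.
within_slopes = {name: ols_slope(pts) for name, pts in classrooms.items()}

print("disaggregated slope:", round(disaggregated_slope, 3))
print("aggregated slope:   ", round(aggregated_slope, 3))
print("within-class slopes:", {k: round(v, 3) for k, v in within_slopes.items()})
```

In this toy data the pooled and aggregated slopes are both negative, with the aggregated slope steeper, while every within-classroom slope is positive.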
Methods for Dealing with Nested Data HLM • HLM is well suited to the analysis of nested data because it identifies the relationship between predictor and outcome variables by taking both level-1 and level-2 regression relationships into account. • In addition to HLM's ability to assess cross-level data relationships and accurately disentangle the effects of between- and within-group variance, it is also a preferred method for nested data because it requires fewer assumptions to be met than other statistical methods. • HLM can accommodate non-independence of observations, a lack of sphericity, missing data, small and/or discrepant group sample sizes, and heterogeneity of variance across repeated measures.
HLM Disadvantage • A disadvantage of HLM is that it requires large sample sizes for adequate power. This is especially true when detecting effects at level-1. • As well, HLM can only handle missing data at level-1 and removes groups with missing data if they are at level-2 or above. • For both of these reasons, it is advantageous to increase the number of groups as opposed to the number of observations per group. A study with thirty groups with thirty observations each (n = 900) can have the same power as one hundred and fifty groups with five observations each (n = 750; Hofmann, 1997).
Equations underlying Hierarchical Linear Models • This tutorial works through a two-level hierarchy using the previous example. In two-level hierarchical models, separate level-1 models (students) are developed for each level-2 unit (classroom). • These models are also called within-unit models. • The level-1 model takes the form of a simple regression of the outcome on the level-1 predictor, with an intercept (β0j) and a slope (β1j) specific to each classroom j and a residual (rij) for each student.
Equations underlying Hierarchical Linear Models • In the level-2 models, the level-1 regression coefficients (β0j and β1j) are used as outcome variables and are related to each of the level-2 predictors.
Equations underlying Hierarchical Linear Models • The model developed depends on the pattern of variance in the level-1 intercepts and slopes. If there were no variation in the slopes, equation 4 would be dropped. • The level-2 models also carry assumptions about their error terms (equation 5).
Equations underlying Hierarchical Linear Models • The combined model (eq. 6) is obtained by substituting equation 3 and equation 4 into equation 1. • Eq. 6 is often termed a mixed model because it combines fixed effects and random effects.
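The equation images did not survive in these slides. The block below is a sketch of the standard two-level form written with the symbols the later slides use (β, γ, U, r, τ, σ²). It assumes one level-1 predictor X (breakfast consumption) and one level-2 predictor G (teaching style); the tags follow the equation numbers the slides refer to, and the error-term assumptions shown for eq. 5 are the usual ones rather than a transcription of the original slide.

```latex
\begin{align}
% Level-1 (within-unit) model
Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \tag{1} \\
% Level-2 models: level-1 coefficients as outcomes
\beta_{0j} &= \gamma_{00} + \gamma_{01} G_{j} + U_{0j} \tag{3} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11} G_{j} + U_{1j} \tag{4} \\
% Usual assumptions on the level-2 error terms
E(U_{0j}) &= E(U_{1j}) = 0, \quad
\operatorname{Var}(U_{0j}) = \tau_{00}, \quad
\operatorname{Var}(U_{1j}) = \tau_{11}, \quad
\operatorname{Cov}(U_{0j}, U_{1j}) = \tau_{01} \tag{5} \\
% Combined (mixed) model: substitute (3) and (4) into (1)
Y_{ij} &= \underbrace{\gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} G_{j}
          + \gamma_{11} G_{j} X_{ij}}_{\text{fixed effects}}
          + \underbrace{U_{0j} + U_{1j} X_{ij} + r_{ij}}_{\text{random effects}} \tag{6}
\end{align}
```

The cross-level term γ11·G·X appears only after the substitution, which is why a level-2 predictor of the slope acts as a moderator of the level-1 relationship.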
Estimation of Effects • Two-level hierarchical models involve the estimation of three types of parameters. • The first type is the fixed effects (γ00, γ01, γ10, γ11) in eq. 3 and eq. 4, which do not vary across groups. • Although the level-2 fixed effects could be estimated with an OLS approach, this is not a good strategy because the homoscedasticity assumption is unlikely to be met. • The technique used instead is a Generalized Least Squares (GLS) estimate, which gives more weight in the level-2 regression to groups whose level-1 estimates are more precise. (Further reading: Raudenbush & Bryk, 2002)
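The weighting idea behind GLS can be illustrated in the simplest case. The sketch below assumes a random-intercept model with no level-2 predictor and treats the variance components τ00 and σ² as already known, so the GLS estimate of γ00 reduces to a precision-weighted average of the classroom means. The function name and all numbers are hypothetical.

```python
def gls_grand_mean(group_means, group_sizes, tau00, sigma2):
    """Precision-weighted estimate of the grand mean (gamma_00).

    Each group mean has sampling variance tau00 + sigma2 / n_j, so groups
    with more students (more precise means) receive a larger weight.
    Assumes tau00 and sigma2 are known; real software estimates them.
    """
    weights = [1.0 / (tau00 + sigma2 / n) for n in group_sizes]
    weighted_sum = sum(w * m for w, m in zip(weights, group_means))
    return weighted_sum / sum(weights)


# Hypothetical classrooms: a large class with mean GPA 3.0, a tiny one with 2.0.
group_means = [3.0, 2.0]
group_sizes = [50, 2]

ols_estimate = sum(group_means) / len(group_means)  # equal weights
gls_estimate = gls_grand_mean(group_means, group_sizes, tau00=0.1, sigma2=0.4)

print("equal-weight estimate:", ols_estimate)
print("GLS estimate:         ", round(gls_estimate, 3))
```

The GLS estimate sits closer to the mean of the large classroom because that mean is measured more precisely; with equal class sizes the two estimates coincide.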
Estimation of Effects • The second type of parameter is the random level-1 coefficients (β0j and β1j), which are permitted to vary across groups. • Hierarchical models provide two estimates of these coefficients for each group: (1) an OLS regression computed on that group's level-1 data, and (2) the value predicted by the level-2 models (eq. 3 and eq. 4). • HLM software combines the two into a weighted estimate with a smaller mean squared error than either one alone. (Further reading: Carlin & Louis, 1996)
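One common way to combine the two estimates is a reliability-weighted (empirical Bayes) average. The sketch below shows it for the intercept only, assuming τ00 and σ² are known; the function name and the numbers are hypothetical, and actual HLM software estimates the variance components as part of the fit.

```python
def shrunken_intercept(ols_mean, level2_prediction, n_j, tau00, sigma2):
    """Weighted combination of a group's OLS mean and its level-2 prediction.

    reliability = tau00 / (tau00 + sigma2 / n_j) is the share of the variance
    in the observed group mean that reflects true between-group differences.
    A reliable OLS mean is kept almost as-is; an unreliable one is pulled
    toward the value predicted by the level-2 model.
    """
    reliability = tau00 / (tau00 + sigma2 / n_j)
    return reliability * ols_mean + (1.0 - reliability) * level2_prediction


# Hypothetical values: level-2 model predicts an intercept of 2.9 for both classes.
large_class = shrunken_intercept(3.0, 2.9, n_j=50, tau00=0.1, sigma2=0.4)
small_class = shrunken_intercept(2.0, 2.9, n_j=2, tau00=0.1, sigma2=0.4)

print("large class (n=50):", round(large_class, 3))
print("small class (n=2): ", round(small_class, 3))
```

The large classroom's estimate stays near its own OLS mean, while the two-student classroom is pulled most of the way toward the level-2 prediction.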
Estimation of Effects • The final type of parameter estimation concerns the variance-covariance components, which include: • The covariance between the level-2 error terms • The variance in the level-1 error term • The variance in the level-2 error terms • If the data are balanced, closed-form formulas can be used to estimate the variance and covariance components. • If the data are not balanced (the most likely case in practice), full maximum likelihood, restricted maximum likelihood, or Bayes estimation is used. • Further reading: Raudenbush & Bryk (2002)
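For the balanced case, the closed-form idea can be sketched with the classical one-way ANOVA estimators for a random-intercept model with no predictors: σ² is the within-group mean square, and τ00 is the between-group mean square minus the within-group mean square, divided by the group size. The GPA values below are hypothetical, and the slope variance and covariance components are not covered by this sketch.

```python
def balanced_variance_components(groups):
    """ANOVA estimates of (sigma2, tau00) for equally sized groups."""
    n = len(groups[0])
    if n < 2 or any(len(g) != n for g in groups):
        raise ValueError("every group needs the same size (at least 2)")
    n_groups = len(groups)
    if n_groups < 2:
        raise ValueError("at least two groups are required")

    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / n_groups

    # Within-group mean square: pooled variance around each group's own mean.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    ms_within = ss_within / (n_groups * (n - 1))

    # Between-group mean square: variation of group means around the grand mean.
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    ms_between = ss_between / (n_groups - 1)

    sigma2 = ms_within
    tau00 = max(0.0, (ms_between - ms_within) / n)  # truncate negative estimates
    return sigma2, tau00


# Hypothetical GPA scores for three classrooms of three students each.
gpa_by_classroom = [[3.2, 3.4, 3.6], [2.6, 2.8, 3.0], [2.0, 2.2, 2.4]]
sigma2, tau00 = balanced_variance_components(gpa_by_classroom)
print("sigma2 (within):", round(sigma2, 4))
print("tau00 (between):", round(tau00, 4))
```

With unequal classroom sizes these formulas no longer apply directly, which is where the likelihood-based and Bayes methods named on the slide come in.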
Hypothesis Testing • The equations are now applied to the example case (breakfast consumption, teaching style, and GPA). • Condition 1: There is systematic within- and between-group variance in GPA. • The first condition provides useful preliminary information and assures that there is appropriate variance to investigate the hypotheses.
Hypothesis Testing • The relevant sub-models for condition 1:
Level-1: GPAij = β0j + rij ............(10)
Level-2: β0j = γ00 + U0j ..............(11)
Where:
β0j = mean GPA for classroom j
γ00 = grand mean GPA
Variance (rij) = σ² = within-group variance in GPA
Variance (U0j) = τ00 = between-group variance in GPA
• HLM tests the significance of the between-group variance (τ00) but not of the within-group variance (σ²).
• Variance (GPAij) = τ00 + σ², so ICC = τ00 / (τ00 + σ²).
• Once this condition is satisfied, HLM can examine the next two conditions to determine whether there are significant differences in intercepts and slopes across classrooms.
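As a small worked example of the ICC formula above, the helper below computes the share of total GPA variance that lies between classrooms. The function name and the variance values are hypothetical.

```python
def intraclass_correlation(tau00, sigma2):
    """ICC = tau00 / (tau00 + sigma2): between-group share of total variance."""
    if tau00 < 0 or sigma2 < 0:
        raise ValueError("variance components cannot be negative")
    total = tau00 + sigma2
    if total == 0:
        raise ValueError("total variance is zero, ICC is undefined")
    return tau00 / total


# Hypothetical estimates from the condition 1 (intercept-only) model.
icc = intraclass_correlation(tau00=0.30, sigma2=0.90)
print(f"ICC = {icc:.2f}")
```

With these made-up numbers, a quarter of the variance in GPA lies between classrooms and the remaining three quarters within them.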
Hypothesis Testing • Conditions 2 and 3: There is significant variance in the level-1 intercept and slope. The relevant sub-models for these conditions:
Level-1: GPAij = β0j + β1j (Breakfast) + rij ....(13)
Level-2: β0j = γ00 + U0j ..............(14)
Level-2: β1j = γ10 + U1j ..............(15)
Where:
γ00 = mean of the intercepts across classrooms
γ10 = mean of the slopes across classrooms (H1)
Variance (rij) = σ² = level-1 residual variance
Variance (U0j) = τ00 = variance in intercepts
Variance (U1j) = τ11 = variance in slopes
Hypothesis Testing • HLM runs t-tests to assess whether γ00 and γ10 differ significantly from zero. • The χ² test is used to assess whether the variance in the intercepts and slopes differs significantly from zero. • Using the results of both models (condition 1 and conditions 2 and 3), HLM calculates the percent of level-1 variance in GPA that is accounted for by breakfast consumption by comparing the level-1 residual variance (σ²) of the two models.
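The formula image is missing from this slide. A commonly used form for this step is the proportional reduction in level-1 residual variance, sketched below; the subscripts naming the two models are labels chosen here for readability and are not taken from the slide.

```latex
\[
R^2_{\text{level-1}}
  = \frac{\sigma^2_{\text{condition 1}} - \sigma^2_{\text{conditions 2 and 3}}}
         {\sigma^2_{\text{condition 1}}}
\]
```

Here the numerator is the drop in within-classroom residual variance once breakfast consumption enters the level-1 model, and the denominator is the within-classroom variance from the intercept-only model.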
Hypothesis Testing • Condition 4: The variance in the level-1 intercept is predicted by teaching style. • It can be tested only if conditions 2 and 3 are fulfilled. Equations:
Level-1: GPAij = β0j + β1j (Breakfast) + rij ....(16)
Level-2: β0j = γ00 + γ01 (Teaching Style) + U0j ....(17)
Level-2: β1j = γ10 + U1j ..............(18)
Hypothesis Testing Where:
γ00 = level-2 intercept
γ01 = level-2 slope (H2)
γ10 = mean (pooled) slope
Variance (rij) = σ² = level-1 residual variance
Variance (U0j) = τ00 = residual variance in intercepts
Variance (U1j) = τ11 = variance in slopes
• The t-tests for the intercept and slope are conducted as in conditions 2 and 3. • The residual variance (τ00) is assessed using another χ² test. • The intercept variance accounted for by teaching style is then expressed relative to the total intercept variance (τ00 from conditions 2 and 3).
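The formula image is missing here as well. Under the same proportional-reduction logic, the comparison is usually written as follows; the model labels in the subscripts are chosen here for readability.

```latex
\[
R^2_{\text{intercepts}}
  = \frac{\tau_{00,\ \text{conditions 2 and 3}} - \tau_{00,\ \text{condition 4}}}
         {\tau_{00,\ \text{conditions 2 and 3}}}
\]
```

The ratio is the share of between-classroom intercept variance that disappears once teaching style is added as a level-2 predictor of the intercept.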
Hypothesis Testing • Condition 5: The variance in the level-1 slope is predicted by teaching style. Equations:
Level-1: GPAij = β0j + β1j (Breakfast) + rij ....(16)
Level-2: β0j = γ00 + γ01 (Teaching Style) + U0j ....(17)
Level-2: β1j = γ10 + γ11 (Teaching Style) + U1j ....(18)
Where:
γ00 = level-2 intercept
γ01 = level-2 slope (H2)
γ10 = level-2 intercept
γ11 = level-2 slope (H3)
Variance (rij) = σ² = level-1 residual variance
Variance (U0j) = τ00 = residual variance in intercepts
Variance (U1j) = τ11 = residual variance in slopes
Hypothesis Testing • With teaching style as a predictor of the level-1 slope, U1j becomes a measure of the residual variance in the level-1 slopes across groups. • If a χ² test on U1j is significant, it indicates that there is systematic variance in the level-1 slopes that is still unaccounted for, so other level-2 predictors can be added to the model. • The percent of slope variance attributable to teaching style as a moderator of the breakfast-GPA relationship can be computed by comparing τ11 from the models with and without teaching style in the slope equation.
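The formula image is missing from this slide. A commonly used form is again a proportional reduction in variance, this time applied to τ11: (τ11 without teaching style − τ11 with teaching style) / τ11 without teaching style. The helper below is a minimal sketch of that ratio; the function name and the variance values are hypothetical.

```python
def proportion_variance_explained(variance_before, variance_after):
    """Proportional reduction in a variance component after adding a predictor."""
    if variance_before <= 0:
        raise ValueError("the baseline variance must be positive")
    return (variance_before - variance_after) / variance_before


# Hypothetical slope variances (tau11) from two fitted models.
tau11_without_teaching_style = 0.050  # condition 4 model: slope has no predictor
tau11_with_teaching_style = 0.020     # condition 5 model: teaching style predicts slope

moderator_share = proportion_variance_explained(
    tau11_without_teaching_style, tau11_with_teaching_style
)
print(f"slope variance explained by teaching style: {moderator_share:.0%}")
```

The same helper applies to the earlier steps by passing the two σ² values (level-1 variance explained by breakfast consumption) or the two τ00 values (intercept variance explained by teaching style).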
Conclusion • HLM has risen in popularity as the method of choice for analyzing nested data. • HLM is a multistep, time-consuming process. Prior to conducting an HLM analysis, background interaction effects between predictor variables should be accounted for, and sufficient within- and between-group variance should be confirmed. • HLM presumes that the data are normally distributed; if this assumption is violated, the output could be biased.