Multilevel modelling: general ideas and uses. 30.5.2017. Kari Nissinen, Finnish Institute for Educational Research.
Hierarchical data
• The data in question are organized in a hierarchical / multilevel manner
• Units at the lower level (1-5) are arranged into higher-level units (A, B)
[Figure: lower-level units 1-5 nested within higher-level units A and B]
Hierarchical data
• Examples
  • Students within classes within schools
  • Employees within workplaces
  • Partners in couples
  • Residents within neighbourhoods
  • Nestlings within broods within populations…
  • Repeated measures within individuals
Hierarchical data
• The key issue is clustering
  • Lower-level units within an upper-level unit tend to be more homogeneous than two arbitrary lower-level units
  • E.g. students within a class: intra-cluster correlation, ICC (positive)
  • Repeated measures: autocorrelation (usually positive)
Hierarchical data
• Clustering => lower-level units are not independent
• In cross-sectional studies this is a problem
  • Two correlated observations provide less information than two independent observations (partial ’overlap’)
  • Effective sample size is smaller than nominal sample size => statistical inference appears falsely powerful
Clustering in cross-sectional studies
• Basic statistical methods do not recognize the dependence of observations
• Standard errors (variances) are underestimated => confidence intervals too narrow, statistical tests too significant
• Special methodology is needed for correct variances…
  • Design-based approaches (variance estimation in a cluster sampling framework)
  • Model-based approaches: multilevel models
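The underestimation can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: equal-sized clusters, normally distributed values, a cluster-specific shift with variance `var_u`, and unit-level noise with variance `var_e`. All function names and parameter values are hypothetical. The cluster-based standard error is computed from the variation between cluster means, which is one simple design-based choice for one-stage cluster samples with equal cluster sizes.

```python
import random
import statistics


def simulate_clustered(n_clusters, cluster_size, var_u, var_e, seed=1):
    """Simulate equal-sized clusters: a shared cluster shift plus unit-level noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        shift = rng.gauss(0.0, var_u ** 0.5)  # common to all units in the cluster
        data.append([shift + rng.gauss(0.0, var_e ** 0.5) for _ in range(cluster_size)])
    return data


def naive_se(data):
    """Standard error of the mean, treating all observations as independent."""
    values = [y for cluster in data for y in cluster]
    return statistics.stdev(values) / len(values) ** 0.5


def cluster_se(data):
    """Standard error of the mean based on the variation between cluster means."""
    means = [statistics.mean(cluster) for cluster in data]
    return statistics.stdev(means) / len(means) ** 0.5


# Hypothetical setting: 200 clusters of 20 units each
data = simulate_clustered(n_clusters=200, cluster_size=20, var_u=1.0, var_e=3.0)
print(f"naive SE   = {naive_se(data):.3f}")
print(f"cluster SE = {cluster_se(data):.3f}")
```

With these settings the cluster-based standard error should come out clearly larger than the naive one, which is the sense in which the naive analysis is too optimistic.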
Clustering in cross-sectional studies
• Measure of ’inference error’ due to clustering: design effect (DEFF) = ratio of the correct variance to the underestimated variance (no clustering assumed)
• DEFF is a function of the ratio of nominal sample size to effective sample size and/or of the homogeneity within clusters (ICC)
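The slides give no formula for DEFF. A commonly used approximation for clusters of equal size m (Kish's formula) is DEFF = 1 + (m - 1) × ICC, and the effective sample size is then the nominal sample size divided by DEFF. The sketch below assumes that approximation; the function names and the numbers in the example are illustrative only.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Approximate design effect for equal-sized clusters (Kish's formula, assumed here)."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_nominal: float, cluster_size: float, icc: float) -> float:
    """Nominal sample size deflated by the design effect."""
    return n_nominal / design_effect(cluster_size, icc)


# Hypothetical example: 100 classes of 20 students each, ICC = 0.2
deff = design_effect(20, 0.2)
n_eff = effective_sample_size(2000, 20, 0.2)
print(f"DEFF = {deff:.2f}, effective n = {n_eff:.0f}")
```

In this hypothetical case DEFF is 4.8, so the 2000 sampled students carry roughly as much information as 417 independent observations. With ICC = 0 the design effect is 1 and nothing is lost.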
Hierarchical data
• Hierarchy is a property of the population, which can carry over into the sample data
• Cluster sampling: hierarchy is explicitly present in data collection => the data possess exactly the same hierarchy (and possible clustering)
• Simple random sampling (etc.): clustering may or may not appear in the data
  • It is present but hidden, and may be difficult to identify
  • Its effect may be negligible
Hierarchical data
• Hierarchy does not always lead to clustering: units within a cluster can be uncorrelated
• The other side of the coin is heterogeneity between upper-level units: if there is no heterogeneity between them, there is no homogeneity among the lower-level units within them
• Zero ICC => no need for special methodology
• Clustering can affect some target variables but not others
Longitudinal data
• Clustering = measurements on an individual are not independent
• When analyzing change this is a benefit
  • Each unit serves as its own ’control unit’ (’block design’) => ’true’ change
  • Autocorrelation ’carries’ this link from one time point to another
• Appropriate methods utilize this correlation => powerful statistical inference
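Why positive autocorrelation helps when analyzing change follows from the standard identity Var(Y2 - Y1) = Var(Y1) + Var(Y2) - 2 Cov(Y1, Y2). The sketch below evaluates it for two time points. The function name and the example values (unit standard deviations, correlation 0.7) are hypothetical.

```python
def variance_of_change(sd1: float, sd2: float, rho: float) -> float:
    """Var(Y2 - Y1) for two measurements with standard deviations sd1, sd2 and correlation rho."""
    return sd1 ** 2 + sd2 ** 2 - 2.0 * rho * sd1 * sd2


# Two separate, independent samples (rho = 0)
independent = variance_of_change(1.0, 1.0, 0.0)
# The same individuals measured twice, with positive autocorrelation
paired = variance_of_change(1.0, 1.0, 0.7)
print(independent, paired)
```

The stronger the positive correlation between the two measurements, the smaller the variance of the observed change, and the more precisely the change is estimated.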
Mixed models
• An approach for handling hierarchical / clustered / correlated data
• Typically regression or ANOVA models containing effects of explanatory variables, which can be (i) fixed, (ii) random or (iii) both
• Linear mixed models: error distribution normal (Gaussian)
• Generalized linear mixed models: error distribution binomial, Poisson, gamma, etc.
Mixed models
• Variance component models
• Random coefficient regression models
• Multilevel models
• Hierarchical (generalized) linear models
• All these are special cases of mixed models
• Similar estimation procedures (maximum likelihood & its variants), etc.
Fixed vs random effects
• 1-way ANOVA fixed effects model
  Y(ij) = μ + α(i) + e(ij)
• μ = fixed intercept, grand mean
• α(i) = fixed effect of group i
• e(ij) = random error (’random effect’) of unit ij
  • random, because it is drawn from a population
  • it has a probability distribution (often N(0,σ²))
Fixed vs random effects
• Fixed effects determine the means of observations
  E(Y(ij)) = μ + α(i), since E(e(ij)) = 0
• Random effects determine the variances (& covariances/correlations) of observations
  Var(Y(ij)) = Var(e(ij)) = σ²
Fixed vs random effects
• 1-way ANOVA random effects model
  Y(ij) = μ + u(i) + e(ij)
• μ = fixed intercept, grand mean
• u(i) = random effect of group i
  • random when the group is drawn from a population of groups
  • has a probability distribution N(0,σ(u)²)
• e(ij) = random error (’random effect’) of unit ij
Fixed vs random effects
• Now the mean of observations is just E(Y(ij)) = μ
• The variance is Var(Y(ij)) = Var(u(i) + e(ij)) = σ(u)² + σ² (u(i) and e(ij) independent)
• Sum of two variance components => variance component model
Random effects and clustering
• Random group => units ij and ik (j ≠ k) within group i are correlated:
  Cov(Y(ij),Y(ik)) = Cov(u(i) + e(ij), u(i) + e(ik)) = Cov(u(i), u(i)) = σ(u)²
• Positive intra-cluster correlation
  ICC = Cov(Y(ij),Y(ik)) / Var(Y(ij)) = σ(u)² / (σ(u)² + σ²)
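The ICC formula can be checked numerically. The sketch below simulates many groups with two units each from the random effects model (taking μ = 0 for simplicity, with normally distributed u and e) and compares the sample correlation between the two units with σ(u)² / (σ(u)² + σ²). The function names, the seed and the variance values are hypothetical choices for illustration.

```python
import random


def icc(var_u: float, var_e: float) -> float:
    """Intra-cluster correlation implied by the two variance components."""
    return var_u / (var_u + var_e)


def simulate_pairs(n_groups, var_u, var_e, seed=0):
    """Draw two units per group: Y(ij) = u(i) + e(ij), with mu set to 0."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_groups):
        u = rng.gauss(0.0, var_u ** 0.5)  # shared random group effect
        y1 = u + rng.gauss(0.0, var_e ** 0.5)
        y2 = u + rng.gauss(0.0, var_e ** 0.5)
        pairs.append((y1, y2))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation between the first and second unit of each group."""
    n = len(pairs)
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / (n - 1)
    v1 = sum((a - m1) ** 2 for a, _ in pairs) / (n - 1)
    v2 = sum((b - m2) ** 2 for _, b in pairs) / (n - 1)
    return cov / (v1 * v2) ** 0.5


pairs = simulate_pairs(20000, var_u=1.0, var_e=3.0)
print("theoretical ICC:", icc(1.0, 3.0))
print("simulated correlation:", round(sample_correlation(pairs), 3))
```

The simulated correlation should land close to the theoretical value of 0.25, up to sampling noise.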
Mixed model
• Contains both fixed and random effects, e.g.
  Y(ij) = μ + βX(ij) + u(i) + e(ij)
• i = school, j = student
• μ = fixed intercept
• β = fixed regression coefficient
• u(i) = random school effect (’school intercept’)
• e(ij) = random error of student j in school i
Mixed model
  Y(ij) = μ + βX(ij) + u(i) + e(ij)
• The mean of Y is modelled as a function of the explanatory variable X through the fixed parameters μ and β
• The variance of Y and the within-cluster covariance (ICC) are modelled through the random effects u (’level 2’) and e (’level 1’)
• This is the general idea; it extends in many directions
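The division of labour between fixed and random parts can be made concrete for a single school. In the sketch below the fixed parameters produce the vector of expected values, while the two variance components produce the covariance matrix: σ(u)² + σ² on the diagonal and σ(u)² off the diagonal. The function names and all numeric values are hypothetical, and u and e are assumed independent.

```python
def cluster_mean(mu, beta, x):
    """Expected values mu + beta * x for the units of one cluster."""
    return [mu + beta * xj for xj in x]


def cluster_covariance(m, var_u, var_e):
    """Covariance matrix of m units sharing one random intercept.

    Diagonal: var_u + var_e (total variance); off-diagonal: var_u.
    """
    return [[var_u + (var_e if j == k else 0.0) for k in range(m)] for j in range(m)]


# Hypothetical values, chosen only for illustration
means = cluster_mean(mu=10.0, beta=2.0, x=[0.0, 1.0, 2.0])
cov = cluster_covariance(m=3, var_u=1.0, var_e=3.0)
icc = cov[0][1] / cov[0][0]  # within-cluster covariance over total variance
print(means)
print(cov)
print(icc)
```

Changing μ or β moves only the means, and changing the variance components moves only the covariance matrix.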
An extension: random coefficient regression
  Y(ij) = μ + βX(ij) + u(i) + v(i)X(ij) + e(ij)
• v(i) = random school slope
• The regression coefficient of X varies between schools: β + v(i)
• A ’side effect’: the variance of Y varies along with X
  • one possible way to model unequal variances (as a function of X)
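The ’side effect’ can be written out explicitly. Treating X as fixed and assuming e is independent of u and v, the variance of Y at X = x is σ(u)² + 2x·Cov(u,v) + x²·σ(v)² + σ². The covariance between the random intercept and the random slope is not mentioned in the slides and is introduced here as an assumption, as are the function name and the example values.

```python
def marginal_variance(x, var_u, var_v, cov_uv, var_e):
    """Var(Y | X = x) under a random intercept u and a random slope v.

    cov_uv is the intercept-slope covariance (an assumed extra parameter).
    """
    return var_u + 2.0 * x * cov_uv + x ** 2 * var_v + var_e


# Hypothetical variance components with uncorrelated intercept and slope
for x in (0.0, 1.0, 2.0):
    print(x, marginal_variance(x, var_u=1.0, var_v=0.25, cov_uv=0.0, var_e=2.0))
```

The variance is a quadratic function of x, so it is constant only when the slope variance and the intercept-slope covariance are both zero.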
Regression for repeated measures data
  Y(it) = μ(t) + βX(it) + e(it)
• t = time, μ(t) = intercept at time t
• i = individual
• The errors e(it) of individual i are correlated: different (auto)correlation structures (e.g. AR(1)) can be fitted, as well as different variance structures (unequal variances)
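As an illustration of one such structure, the sketch below builds the error covariance matrix of a single individual under a stationary AR(1) process, where Cov(e(it), e(is)) = σ² ρ^|t-s|. Equally spaced time points and a common variance at all time points are assumptions made here; the function name and the example values are hypothetical.

```python
def ar1_covariance(n_times, rho, sigma2=1.0):
    """Covariance matrix of AR(1) errors: sigma2 * rho ** |t - s|."""
    return [[sigma2 * rho ** abs(t - s) for s in range(n_times)] for t in range(n_times)]


# Three equally spaced time points, autocorrelation 0.5, unit variance
for row in ar1_covariance(3, rho=0.5):
    print(row)
```

The correlation decays geometrically with the distance between time points, so neighbouring measurements are linked more strongly than distant ones. A structure with unequal variances would replace the single `sigma2` with a time-specific value.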