Estimation techniques for clustered hierarchical data

Presentation Transcript


    1. Estimation techniques for clustered (hierarchical) data
       Cluster-robust linear regression and multilevel modelling

    2. Estimators: a trade-off
       - Simplicity: OLS is the simplest and best understood estimator
       - Restrictive assumptions: OLS makes assumptions about the data that often do not apply, e.g., independence
       - Other estimators
         - More realistic assumptions, e.g., that observations are inter-related in various ways
           - e.g., clustering of pupils in schools (or classes)
           - Statistical impact → intra-group serial correlation
         - More complex
           - Computation is done by software
           - Need an intuitive understanding of what they do and what they don't do

    3. Problems arising from clustering (hierarchical data)
       - OLS assumes observations are independent → maximum information
       - Survey data: observations are often clustered
         - Individuals in families
         - Firms in industries or locations
         - Students in classes
       - → Observations are not fully independent, which is reflected in the residuals
       - → OLS underestimates the SEs of regression coefficients
       - → Spurious precision

    4. The problem of dependent observations
       - Units are clustered (= grouped)
         - e.g., students within a particular school tend to be more like each other than students at other schools
         - → A sample of students from a single school gives less varied data than a random sample of the same size from all students
         - → Loss of information
       - → Cannot use OLS; the dependence between observations has to be modelled
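One common way to put a number on this loss of information is the design effect, 1 + (m - 1)ρ, for clusters of equal size m and intra-cluster correlation ρ. The slides do not give this formula; the sketch below is only an illustration under those assumptions, and the function names and sample figures are hypothetical.

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation from clustering, assuming equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n: int, cluster_size: int, icc: float) -> float:
    """Size of a simple random sample carrying roughly the same information."""
    return n / design_effect(cluster_size, icc)


# Illustrative numbers only: 1000 pupils sampled as 50 classes of 20.
for icc in (0.0, 0.1, 0.3):
    print(icc, round(effective_sample_size(1000, 20, icc), 1))
```

With no intra-cluster correlation the clustered sample is worth its full size; as the correlation rises, the same 1000 pupils behave like a much smaller independent sample.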

    5. Implications for estimation
       - OLS is not appropriate → use different estimators
       - Cluster-robust linear regression (CRLR)
         - Adjusts SEs to account for the loss of independence
         - → Clustering is a nuisance to control for
         - Necessary for "honest" estimates of standard errors
       - Multilevel modelling (MLM) (= hierarchical linear modelling)
         - Benefit: explicitly models effects at each level, e.g., school/classroom/pupil
           - Identifies where and how effects occur
         - Cost: stronger assumptions
           - → Results are more dependent on the assumption of random and normally distributed effects, i.e., sensitive to outliers and skewed error distributions
           - If the assumptions do not hold, MLM underestimates the SEs of higher-level parameters and gives biased parameter estimates
           - cf. CRLR: consistent and robust estimates (Woesman, 2003, p.11, note 10)
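The SE adjustment that cluster-robust regression makes can be sketched with a "sandwich" variance estimator that sums residual scores within each cluster. This is a minimal NumPy illustration on simulated data, not the routine used by any particular package: it omits the finite-sample corrections that software typically applies, and the function name, variable names and simulated values are all assumptions.

```python
import numpy as np


def ols_with_cluster_se(X, y, groups):
    """Return (beta, naive_se, cluster_se) for the model y = X @ beta + e."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta

    # Conventional OLS variance sigma^2 (X'X)^-1: assumes independent errors.
    sigma2 = resid @ resid / (n - k)
    naive_se = np.sqrt(np.diag(sigma2 * xtx_inv))

    # Cluster-robust sandwich: residuals may be correlated within a cluster,
    # so the score X_g' e_g is summed per cluster before taking outer products.
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        mask = groups == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    cluster_se = np.sqrt(np.diag(xtx_inv @ meat @ xtx_inv))
    return beta, naive_se, cluster_se


# Simulated (hypothetical) data: 50 schools of 20 pupils, a school-level
# regressor, and a shared school effect that makes pupils within a school alike.
rng = np.random.default_rng(0)
n_schools, pupils_per_school = 50, 20
school = np.repeat(np.arange(n_schools), pupils_per_school)
x_school = rng.normal(size=n_schools)[school]
u_school = rng.normal(size=n_schools)[school]
e = rng.normal(size=school.size)
y = 1.0 + 0.5 * x_school + u_school + e
X = np.column_stack([np.ones(school.size), x_school])

beta, naive_se, cluster_se = ols_with_cluster_se(X, y, school)
print("slope:", beta[1], "naive SE:", naive_se[1], "cluster SE:", cluster_se[1])
```

The coefficient estimates are the same as plain OLS; only the standard errors change, which is why this approach treats clustering as a nuisance rather than something to be modelled.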

    6. Estimation and data requirements
       - CRLR
         - STATA 8
         - LIMDEP 8
       - Multilevel modelling
         - MLwiN
       - Data requirements
         - Most common: individual data (e.g., pupils)
         - Variables to indicate belonging to higher-level units
           - Class (and/or teacher)
           - School
         - Number of levels? In practice, no more than 3 or 4
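As an illustration of these data requirements, the records below hold one row per pupil, each carrying identifiers for the class and school it belongs to. The field names and values are hypothetical, and the helper functions are only a sketch of checks one might run before estimation.

```python
from collections import defaultdict

# Hypothetical pupil-level records: one row per pupil, each carrying the
# identifiers of the higher-level units (class, school) it belongs to.
pupils = [
    {"pupil_id": 1, "class_id": "A1", "school_id": "A", "score": 52.0},
    {"pupil_id": 2, "class_id": "A1", "school_id": "A", "score": 61.0},
    {"pupil_id": 3, "class_id": "A2", "school_id": "A", "score": 47.0},
    {"pupil_id": 4, "class_id": "B1", "school_id": "B", "score": 70.0},
    {"pupil_id": 5, "class_id": "B1", "school_id": "B", "score": 66.0},
]


def is_nested(rows, lower, upper):
    """True if every `lower` unit belongs to exactly one `upper` unit."""
    parents = defaultdict(set)
    for row in rows:
        parents[row[lower]].add(row[upper])
    return all(len(units) == 1 for units in parents.values())


def units_per_level(rows, levels):
    """Count the distinct units observed at each level."""
    return {level: len({row[level] for row in rows}) for level in levels}


counts = units_per_level(pupils, ["pupil_id", "class_id", "school_id"])
print(counts, is_nested(pupils, "class_id", "school_id"))
```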

    7. Multilevel modelling
       - Continuous response
       - 2-level, 2-variable example
       - Single-level: pupils only
       - Multilevel (1): pupils and schools
         - Different intercepts for each school
       - Multilevel (2): pupils and schools
         - Different intercepts
         - Different slopes
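The transcript does not preserve any equations for these three models. In commonly used multilevel notation (an assumption here, not taken from the slides), with pupil i in school j, response y and a single predictor x, they could be written as:

```latex
\begin{align*}
\text{Single-level:} \quad
  & y_i = \beta_0 + \beta_1 x_i + e_i \\
\text{Multilevel (1):} \quad
  & y_{ij} = \beta_{0j} + \beta_1 x_{ij} + e_{ij},
  \qquad \beta_{0j} = \beta_0 + u_{0j} \\
\text{Multilevel (2):} \quad
  & y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij},
  \qquad \beta_{0j} = \beta_0 + u_{0j},
  \quad \beta_{1j} = \beta_1 + u_{1j}
\end{align*}
```

Here u_{0j} and u_{1j} are school-level random effects: the first lets each school have its own intercept, the second lets each school have its own slope.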

    8. Single-level model
       - Individual pupils
         - y_i = individual test scores
         - x_i = individual ability
         - e_i = individual error terms (difference between actual and predicted scores)
         - i indexes pupils 1…n
         - β0 = overall intercept (fixed: the same for all)
         - β1 = overall slope (fixed: the same for all)
       - Shows how individual test scores are related to individual ability
         - β1 measures the average relationship
       - Shortcoming: no measurement of how this average relationship varies between schools
         - → Model both pupil and school effects together
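The definitions above imply the model y_i = β0 + β1 x_i + e_i. As a minimal sketch, the closed-form least-squares estimates of β0 and β1 can be computed as follows; the function name and the toy data are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def fit_single_level(x, y):
    """Least-squares estimates (b0, b1) for y_i = b0 + b1 * x_i + e_i."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx               # overall slope
    b0 = mean_y - b1 * mean_x    # overall intercept
    return b0, b1


# Toy data (hypothetical): ability x and test score y for four pupils.
ability = [1.0, 2.0, 3.0, 4.0]
score = [3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_single_level(ability, score)

# Residuals e_i: difference between actual and predicted scores.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(ability, score)]
print(b0, b1, residuals)
```

Because a single intercept and slope are fitted to all pupils, nothing in this fit can show whether the relationship differs from one school to another.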
