The Choice Between Fixed and Random Effects Models: Some Considerations For Educational Research • Clarke, Crawford, Steele and Vignoles • Funding from ESRC ALSPAC Large Grant
Motivation • Need evidence from different disciplines to answer the research question: how can we improve pupil achievement? • Contribute to multi-disciplinary understanding by comparing the common alternative models used by different disciplines
Introduction • Pupils clustered within schools → hierarchical models • Two popular choices: fixed and random effects • Choice of model: • Often driven by discipline tradition (for example, economists use fixed effects) • May depend on whether primary interest is pupil or school characteristics
Illustrations • What is the impact of SEN status on pupil achievement? • What is the impact of FSM status on pupil achievement?
Why adjust for school effects? • Want to estimate causal effect of SEN on pupil attainment no matter what school they attend • Need to adjust for school differences in SEN labelling • e.g. children with moderate difficulties more likely to be labelled SEN in a high achieving school than in a low achieving school (Keslair et al, 2008; Ofsted, 2004) • May also be differences due to unobserved factors • Hierarchical models can account for such differences • Fixed or random school effects?
Basic model • FE: u_s is the coefficient on a school dummy variable • RE: u_s is a school-level residual • RE requires an additional assumption: E[u_s | X_is] = 0 • That is, no correlation between unobserved school characteristics and observed pupil characteristics • Both models assume: E[e_is | X_is] = 0 • That is, no correlation between unobserved pupil characteristics and observed pupil characteristics
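The equation on this slide is not preserved in the text. The definitions above are consistent with a standard two-level model of the following form, written here as a sketch: the outcome symbol y_is and the coefficients β_0 and β_1 are assumed notation, with i indexing pupils and s indexing schools.

```latex
% Assumed notation: y_{is} is attainment of pupil i in school s.
\begin{align}
  y_{is} &= \beta_0 + \beta_1 X_{is} + u_s + e_{is} \\
  \text{FE and RE:}\quad & E[e_{is} \mid X_{is}] = 0 \\
  \text{RE only:}\quad   & E[u_s \mid X_{is}] = 0
\end{align}
```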
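Relationship between FE, RE and OLS • FE: [equation missing from source] • RE: [equation missing from source] • Where: [definition missing from source]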
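The slide's formulas are not preserved, so the following is a textbook sketch rather than the authors' own notation. In the standard treatment, all three estimators regress y_is − θ·ȳ_s on X_is − θ·X̄_s: θ = 0 gives pooled OLS, θ = 1 gives FE (full within-school demeaning), and RE uses θ = 1 − sqrt(σ_e² / (σ_e² + n·σ_u²)) for schools of size n. The function name, the simulated data and the single-covariate, balanced-school setup are all assumptions made for illustration.

```python
import numpy as np

def quasi_demeaned_slope(y, x, school, theta):
    """Slope of y on x after subtracting theta times the school mean from each."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # Map school labels to 0..S-1 so school means can be looked up per pupil.
    _, idx = np.unique(school, return_inverse=True)
    counts = np.bincount(idx)
    y_t = y - theta * (np.bincount(idx, weights=y) / counts)[idx]
    x_t = x - theta * (np.bincount(idx, weights=x) / counts)[idx]
    # Centre both variables, which is equivalent to fitting an intercept.
    xc, yc = x_t - x_t.mean(), y_t - y_t.mean()
    return float(xc @ yc / (xc @ xc))

# Hypothetical simulated data: 50 schools of 20 pupils each.
rng = np.random.default_rng(0)
n_schools, n_pupils = 50, 20
school = np.repeat(np.arange(n_schools), n_pupils)
u = rng.normal(size=n_schools)                      # school effects, variance 1
x = rng.normal(size=school.size) + 0.5 * u[school]  # covariate correlated with u
y = 1.0 - 0.4 * x + u[school] + rng.normal(size=school.size)

# Variance components are known here only because the data are simulated;
# in practice they would have to be estimated.
sigma_u2, sigma_e2 = 1.0, 1.0
theta_re = 1.0 - np.sqrt(sigma_e2 / (sigma_e2 + n_pupils * sigma_u2))

b_ols = quasi_demeaned_slope(y, x, school, theta=0.0)
b_fe = quasi_demeaned_slope(y, x, school, theta=1.0)
b_re = quasi_demeaned_slope(y, x, school, theta=theta_re)
print(f"OLS {b_ols:.3f}  RE {b_re:.3f}  FE {b_fe:.3f}")
```

With a single covariate and balanced schools, the RE slope always lies between the OLS and FE slopes, and it moves towards FE as schools get larger or as the school-level variance grows.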
How to choose between FE and RE • Very important to consider sources of bias: • Is the RE assumption (i.e. E[u_s | X_is] = 0) likely to hold? • Other issues: • Number of clusters • Sample size within clusters • Rich vs. sparse covariates • Whether variation is within or between clusters • What is the real-world consequence of choosing the wrong model?
SEN: Sources of selection • Probability of being SEN may depend on: • Observed school characteristics • e.g. ability distribution, FSM distribution • Unobserved school characteristics • e.g. values/motivation of SEN coordinator • Observed pupil characteristics • e.g. prior ability, FSM status • Unobserved pupil characteristics • e.g. education values and/or motivation of parents
Intuition 1 • If probability of being labelled SEN depends ONLY on observed school characteristics: • e.g. schools with high FSM/low achieving intake are more or less likely to label a child SEN • Random effects appropriate as RE assumption holds (i.e. unobserved school effects are not correlated with probability of being SEN)
Intuition 2 • If probability of being labelled SEN also depends on unobserved school characteristics: • e.g. SEN coordinator tries to label as many kids SEN as possible, because they attract additional resources • Random effects inappropriate as RE assumption fails (i.e. unobserved school effects are correlated with probability of being SEN) • FE accounts for these unobserved school characteristics, so is more appropriate • Identifies impact of SEN on attainment within schools rather than between schools
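A small simulation can make this case concrete. The sketch below is illustrative only and does not use the ALSPAC data: the true SEN effect of -0.5, the logistic labelling rule and the school counts are all assumed values, and pooled OLS stands in for any estimator that relies on between-school variation. Labelling depends on the unobserved school effect u_s, so the RE assumption fails by construction.

```python
import numpy as np

def ols_slope(y, x):
    """Slope from a simple regression of y on x with an intercept."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))

def demean_by_school(v, school):
    """Subtract each school's mean from its pupils' values."""
    means = np.bincount(school, weights=v) / np.bincount(school)
    return v - means[school]

rng = np.random.default_rng(42)
n_schools, n_pupils = 200, 30        # hypothetical sample sizes
true_effect = -0.5                   # assumed true effect of SEN on attainment

school = np.repeat(np.arange(n_schools), n_pupils)
u = rng.normal(size=n_schools)       # unobserved school effect

# Schools with higher u label more pupils as SEN (assumed logistic rule).
p_sen = 1.0 / (1.0 + np.exp(-(-1.2 + 1.0 * u)))
sen = (rng.random(school.size) < p_sen[school]).astype(float)

y = true_effect * sen + u[school] + rng.normal(size=school.size)

# Pooled OLS mixes within- and between-school variation, so it picks up u_s.
b_ols = ols_slope(y, sen)
# FE uses only within-school variation, which removes u_s entirely.
b_fe = ols_slope(demean_by_school(y, school), demean_by_school(sen, school))

print(f"true {true_effect:.2f}  pooled OLS {b_ols:.2f}  FE {b_fe:.2f}")
```

In this setup the FE estimate should land close to the assumed true effect, while the pooled estimate is pulled upwards because pupils labelled SEN are concentrated in schools with high u_s.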
Intuition 3 • If probability of being labelled SEN depends on unobserved pupil/parent characteristics: • e.g. some parents may push harder for the label and accompanying additional resources; • alternatively, some parents may not countenance the idea of their kid being labelled SEN • Neither FE nor RE will address the endogeneity problem: • Need to resort to other methods, e.g. IV
Data • Avon Longitudinal Study of Parents and Children (ALSPAC) • Children born in Avon between April 1991 and December 1992 • Rich data • Family background (including education, income, etc) • Medical and genetic information • Clinic testing of cognitive and non-cognitive skills • Linked to National Pupil Database
SEN • One in four pupils in England have SEN at age 10 • Just under 4% have a statement • In 2003-04, the period relevant to our data, approximately £1.3 billion was spent on primary school SEN (excluding special schools) • £1,600 per pupil with SEN
SEN • Substantial variation in %SEN across schools • Quarter of schools have fewer than 15% SEN • Quarter with more than 24% SEN • Key question is whether the factors driving differences in % SEN between schools are correlated with unmeasured school-level influences on academic progress
Estimated effects of SEN status on progress between KS1 and KS2
Results from this analysis • SEN negatively correlated with progress between KS1 and KS2 • Choice of model does not seem to matter here • OLS, FE and RE give qualitatively similar results • Correlation between being SEN and unobserved school characteristics not important • Regression and random effects assumptions are likely to hold in this example - prefer the random effects model
Conclusions • Fixed effects approach is often used because the RE assumption is a strong one • Efficiency advantages to the RE approach • Failure of the regression assumption is a major issue • Approach each problem with an agnostic view on the model; the choice may not make a difference • Model choice should be determined by theory and data, not tradition