The Choice Between Fixed and Random Effects Models: Some Considerations For Educational Research

Clarke, Crawford, Steele and Vignoles, with funding from the ESRC ALSPAC Large Grant

Motivation • Need evidence from different disciplines to answer the research question: how can we improve pupil achievement? • Contribute to multi-disciplinary understanding by comparing the alternative models commonly used in different disciplines

Introduction • Pupils clustered within schools → hierarchical models • Two popular choices: fixed and random effects • Choice of model: • Often driven by discipline tradition, e.g. economists tend to use fixed effects • May depend on whether primary interest is in pupil or school characteristics

Illustrations • What is the impact of special educational needs (SEN) status on pupil achievement? • What is the impact of free school meals (FSM) status on pupil achievement?

Why adjust for school effects? • Want to estimate the causal effect of SEN on pupil attainment, whichever school the pupil attends • Need to adjust for school differences in SEN labelling • e.g. children with moderate difficulties are more likely to be labelled SEN in a high-achieving school than in a low-achieving school (Keslair et al., 2008; Ofsted, 2004) • There may also be differences due to unobserved factors • Hierarchical models can account for such differences • Fixed or random school effects?

Basic model • FE: u_s is the coefficient on a school dummy variable • RE: u_s is a school-level residual • RE requires an additional assumption: E[u_s | X_is] = 0 • That is, no correlation between unobserved school characteristics and observed pupil characteristics • Both models assume: E[e_is | X_is] = 0 • That is, no correlation between unobserved pupil characteristics and observed pupil characteristics
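The model equation on this slide did not survive in the transcript. A standard two-level form consistent with the symbols used above is sketched here; the outcome symbol y_is and the coefficient names are assumptions, not necessarily the authors' notation.

```latex
y_{is} = \beta_0 + \beta_1 X_{is} + u_s + e_{is}
```

Here y_is would be the attainment of pupil i in school s, X_is an observed pupil characteristic such as SEN status, u_s the school effect and e_is the pupil-level residual.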

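Relationship between FE, RE and OLS • FE: (equation not preserved in the transcript) • RE: (equation not preserved in the transcript) • Where: (definitions not preserved in the transcript)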
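The textbook way of relating the three estimators, which may or may not match what the slide displayed, writes each as OLS on partially demeaned data. The notation below is an assumption: y_is is the outcome, bars denote school means, n_s is the number of pupils in school s, and sigma_u^2 and sigma_e^2 are the school-level and pupil-level variances.

```latex
\begin{align*}
\text{FE:}\quad & y_{is} - \bar{y}_s = \beta_1 (X_{is} - \bar{X}_s) + (e_{is} - \bar{e}_s) \\
\text{RE:}\quad & y_{is} - \theta_s \bar{y}_s = \beta_0 (1 - \theta_s) + \beta_1 (X_{is} - \theta_s \bar{X}_s)
                  + \left[ (1 - \theta_s) u_s + (e_{is} - \theta_s \bar{e}_s) \right] \\
\text{where:}\quad & \theta_s = 1 - \sqrt{\frac{\sigma_e^2}{\sigma_e^2 + n_s \sigma_u^2}}
\end{align*}
```

Setting theta_s = 0 gives pooled OLS and theta_s = 1 gives FE, so RE sits between the two. It moves towards FE as schools get larger or as the school-level variance grows relative to the pupil-level variance.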

How to choose between FE and RE • Very important to consider sources of bias: • Is the RE assumption (i.e. E[u_s | X_is] = 0) likely to hold? • Other issues: • Number of clusters • Sample size within clusters • Rich vs. sparse covariates • Whether variation is within or between clusters • What is the real-world consequence of choosing the wrong model?

SEN: Sources of selection • Probability of being SEN may depend on: • Observed school characteristics • e.g. ability distribution, FSM distribution • Unobserved school characteristics • e.g. values/motivation of SEN coordinator • Observed pupil characteristics • e.g. prior ability, FSM status • Unobserved pupil characteristics • e.g. education values and/or motivation of parents

Intuition 1 • If probability of being labelled SEN depends ONLY on observed school characteristics: • e.g. schools with high FSM/low achieving intake are more or less likely to label a child SEN • Random effects appropriate as RE assumption holds (i.e. unobserved school effects are not correlated with probability of being SEN)

Intuition 2 • If probability of being labelled SEN also depends on unobserved school characteristics: • e.g. SEN coordinator tries to label as many kids SEN as possible, because they attract additional resources • Random effects inappropriate as RE assumption fails (i.e. unobserved school effects are correlated with probability of being SEN) • FE accounts for these unobserved school characteristics, so is more appropriate • Identifies impact of SEN on attainment within schools rather than between schools

Intuition 3 • If probability of being labelled SEN depends on unobserved pupil/parent characteristics: • e.g. some parents may push harder for the label and accompanying additional resources • alternatively, some parents may not countenance the idea of their kid being labelled SEN • Neither FE nor RE will address the endogeneity problem: • Need to resort to other methods, e.g. instrumental variables (IV)
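The three intuitions can be checked on simulated data. The sketch below is illustrative only: it uses made-up data rather than ALSPAC, and the names `simulate`, `estimates`, `school_sel` and `pupil_sel` are hypothetical. It assumes a balanced design with a single SEN dummy, and estimates RE by a simple moment-based feasible GLS (quasi-demeaning) rather than by any particular package's routine.

```python
import numpy as np


def simulate(n_schools=2000, n_pupils=10, beta=-0.5,
             school_sel=0.0, pupil_sel=0.0, seed=0):
    """Simulate pupils nested in schools; returns (school ids, SEN dummy, outcome)."""
    rng = np.random.default_rng(seed)
    school = np.repeat(np.arange(n_schools), n_pupils)
    u = rng.normal(size=n_schools)[school]   # unobserved school effect u_s
    v = rng.normal(size=school.size)         # unobserved pupil/parent trait
    # Latent propensity to be labelled SEN; the two *_sel weights control
    # selection on unobserved school and pupil characteristics.
    latent = school_sel * u + pupil_sel * v + rng.normal(size=school.size)
    sen = (latent > 0.8).astype(float)
    y = beta * sen + u + v + rng.normal(size=school.size)
    return school, sen, y


def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def school_means(school, a):
    """School mean of a, broadcast back to pupil level."""
    return (np.bincount(school, weights=a) / np.bincount(school))[school]


def estimates(school, x, y):
    """Pooled OLS, RE (feasible GLS) and FE (within) slopes, balanced data."""
    n = np.bincount(school)[0]               # pupils per school (assumed equal)
    g = school.max() + 1                     # number of schools
    xb, yb = school_means(school, x), school_means(school, y)
    fe = slope(x - xb, y - yb)               # within-school estimator
    # Pupil-level variance from the within-school residuals.
    resid_w = (y - yb) - fe * (x - xb)
    sigma_e2 = resid_w @ resid_w / (x.size - g - 1)
    # School-level variance from a regression on school means.
    xs, ys = xb[::n], yb[::n]                # one row per school
    resid_b = (ys - ys.mean()) - slope(xs, ys) * (xs - xs.mean())
    sigma_u2 = max(resid_b @ resid_b / (g - 2) - sigma_e2 / n, 0.0)
    theta = 1.0 - np.sqrt(sigma_e2 / (sigma_e2 + n * sigma_u2))
    re = slope(x - theta * xb, y - theta * yb)   # quasi-demeaned OLS
    return {"ols": slope(x, y), "re": re, "fe": fe}


scenarios = {
    "1: no selection on unobservables": dict(),
    "2: selection on unobserved school effect": dict(school_sel=1.0),
    "3: selection on unobserved pupil trait": dict(pupil_sel=1.0),
}
for label, kwargs in scenarios.items():
    print(label, estimates(*simulate(**kwargs)))
```

With a true effect of -0.5, all three estimators should land near it in scenario 1. In scenario 2 only FE should, with RE falling between FE and OLS. In scenario 3 none of them should, which is the case where a method such as IV is needed. The signs and sizes of the selection terms are arbitrary choices for illustration.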

Data • Avon Longitudinal Study of Parents and Children (ALSPAC) • Children born in Avon between April 1991 and December 1992 • Rich data • Family background (including education, income, etc) • Medical and genetic information • Clinic testing of cognitive and non-cognitive skills • Linked to National Pupil Database

SEN • One in four pupils in England have SEN at age 10 • Just under 4% have a statement • In 2003-04, the period relevant to our data, approximately £1.3 billion was spent on primary school SEN (excluding special schools) • £1,600 per pupil with SEN

SEN • Substantial variation in % SEN across schools • A quarter of schools have fewer than 15% SEN • A quarter have more than 24% SEN • Key question is whether the factors driving differences in % SEN between schools are correlated with unmeasured school-level influences on academic progress

Estimated effects of SEN status on progress between KS1 and KS2

Results from this analysis • SEN is negatively correlated with progress between KS1 and KS2 • Choice of model does not seem to matter here • OLS, FE and RE give qualitatively similar results • Correlation between being SEN and unobserved school characteristics appears not to be important • Regression and random effects assumptions are likely to hold in this example, so the random effects model is preferred

Conclusions • The fixed effects approach is often used because the RE assumption is a strong one • There are efficiency advantages to the RE approach • Failure of the regression assumption is a major issue • Approach each problem with an agnostic view on the model; the choice may not make a difference • Choice should be determined by theory and data, not tradition
