Enhancing Survey and Administrative Data Analysis

This presentation, from the NCRM Annual Meeting in January 2009, describes ADMIN, whose remit is to develop and disseminate methodologies for making best use of administrative data by exploiting survey data (and vice versa). Topics include measuring socio-economic segregation, assessing ethnic differences in outcomes, and addressing measurement error in ethnicity and English as an Additional Language (EAL) variables.

rhondadavis

Presentation Transcript


  1. NCRM Annual Meeting, January 2009

  2. People

  3. ADMIN • Remit of ADMIN is to develop and disseminate methodologies for making best use of administrative data by exploiting survey data (and vice versa) • Training and capacity building (Mac McDonald)

  4. ADMIN • Strength of administrative data is that they have information on almost everyone. • Weakness is that they are not rich in covariates. • NPD has detailed information on educational outcomes but no information on parental education. • We can link richer survey data to admin data to enhance the admin data.

  5. ADMIN • Weakness of survey data is non-response and attrition. • Administrative data is virtually (but not fully) complete. • Scope for using linked survey-administrative data to enhance survey data by telling us about those who are missing from survey data.

  6. Aim is to develop methods to... • make inferences when covariates or responses are missing in administrative data • use administrative data to overcome measurement error in survey variables (e.g. recalled event histories) and vice versa (e.g. ethnicity) • tackle bias due to attrition in longitudinal surveys • use administrative data to improve small-area estimates of the means and quantiles of survey variables

  7. Programme 1 (Vignoles) • Using survey data to enhance methods for the analysis of administrative data • Measuring the effect of family background and ethnicity on pupil attainment • To what extent does school attendance reflect real school preferences?

  8. Programme 2 (Brown) • Using administrative data to enhance methods for the analysis of survey data • Attrition, non-response and the determinants of school outcomes at 16 • Enhancing event history analysis of social surveys with administrative data

  9. Examples of linked data • Linked administrative data: schools data (NPD/PLASC), FE data (ILR) and higher education data (HESA) • Complete administrative data on entire cohort all the way through the education system • NPD/PLASC linked to survey data • LSYPE • MCS

  10. Contribution to work on segregation • Measuring segregation • Socio-economic segregation • Ethnic segregation • Modelling causes of segregation • Parental school choice • Examples drawn from schools but could be applied more broadly

  11. Measuring socio-economic segregation • Currently measured by FSM binary status • Problematic measure (Hobbs and Vignoles, 2008) • Only picks up bottom 16% of distribution at best • Measurement error in FSM status • E.g. children who do not eat at school not recorded as FSM • Changing FSM status in recession

  12. Measuring socio-economic segregation • Linked data has already provided an assessment of the extent to which FSM really proxies socio-economic disadvantage • Can provide alternative measures of socio-economic background from surveys • Parental income (high/low income) • Parental education (high/low education) • Can assess use of alternative proxies from administrative data, e.g. geographic data

  13. Measuring socio-economic segregation • Linked data can test robustness of segregation work that uses FSM • Need to be aware of issues raised by Becky Allen on using samples to measure segregation

  14. Ethnic Minority Project • Originally conceived of as a study of ethnic differences in outcomes • i.e. focusing on missing covariates in model of ethnic achievement • See work by Wilson et al., 2005 • Data: PLASC/NPD data linked to LSYPE (cohort born 1990/91)

  15. Ethnic Minority Project • Do we get estimates of ethnic differences in outcomes wrong if we just rely on administrative data? • Do ethnic classifications capture what we are interested in, e.g. recent migrants versus long-standing populations? • What are differences by ethnicity once we take account of language (EAL)?

  16. KS3 Results for Pakistani Males • NB: Results show differences in standardized score outcomes • NPD controls include gender, ethnicity, age, EAL, SEN, FSM, KS2 score

  17. Measuring ethnicity and EAL • Is there measurement error in the ethnicity or EAL variables in PLASC? • If so, are there implications for measuring ethnic segregation and ethnic differences in outcomes? • See Aspinall and Jacobson, 2007; Battistin and Sianesi, 2006

  18. Measurement error in ethnicity and EAL • Multiple measures from LSYPE • Ethnicity and ethnic origin self report • Ethnicity and ethnic origin parents • Language spoken at home • Frequency of English spoken at home • Measures from PLASC • Ethnicity • EAL binary indicator

  19. Measurement Error • Misclassification of ethnicity not huge • Sub-sample for whom we have full data and who live with both natural parents: 7,814 • 136 individuals recorded as white British in PLASC but not according to LSYPE • 57 individuals recorded as white British in LSYPE but not in PLASC • Evidence of misclassified EAL • 7.2% of young people labelled EAL in PLASC appear not to be EAL according to LSYPE

  20. Ethnicity of those “wrongly” coded EAL in PLASC

  21. Correlations with error EAL/non white sample

  22. Modelling causes of segregation • Linked MCS data with NPD/PLASC • Details of current school, ranked school choices, reasons for school choice • Can provide missing covariates in models of causes of segregation, e.g. attitudes to school choice • Current project investigating school choice in MCS (Burgess, Greaves, Vignoles and Wilson)

  23. Short Courses • Introduction to Data Linkage: • The Value of Data Linkage for Research • Data Linkage – Methodological and Statistical Issues • Enhancing Longitudinal Surveys by Linking to Administrative Data: • Longitudinal Data Analysis • Event History Analysis • Using Longitudinal Data Linkage to Evaluate Area-Based Interventions • Data Linkage with the NPD
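
Slides 11 to 13 discuss measuring socio-economic segregation with FSM status but do not say which segregation index is used. As an illustration only, the sketch below assumes the Duncan dissimilarity index, a common choice for a binary characteristic. The function name and the school counts are hypothetical and do not come from the presentation.

```python
def dissimilarity_index(schools):
    """Duncan dissimilarity index for a list of (group_a, group_b) counts.

    Assumption: this index stands in for whatever measure the
    presentation's authors actually used, which the slides do not name.
    """
    total_a = sum(a for a, _ in schools)
    total_b = sum(b for _, b in schools)
    if total_a == 0 or total_b == 0:
        raise ValueError("both groups need at least one pupil")
    # Half the summed absolute gap between each school's share of the
    # two groups: 0 means an even spread, 1 means complete separation.
    return 0.5 * sum(abs(a / total_a - b / total_b) for a, b in schools)


# Made-up counts of (FSM pupils, non-FSM pupils) in three schools.
example_schools = [(10, 90), (30, 70), (60, 40)]
d = dissimilarity_index(example_schools)
print(round(d, 3))
```

For these made-up counts the index is 0.4. Recomputing it with an alternative binary measure from linked survey data (for example a low parental income flag) in place of FSM is one way the robustness check on slide 13 could be approached.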

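The counts on slide 19 can be turned into disagreement rates to show what "not huge" means in practice. This is a minimal sketch using only the three figures given on the slide. The variable and function names are hypothetical. The 7.2% EAL figure is left out because the slide does not state its denominator.

```python
# Figures taken from slide 19.
subsample_size = 7814   # full data, living with both natural parents
plasc_only = 136        # white British in PLASC but not in LSYPE
lsype_only = 57         # white British in LSYPE but not in PLASC


def disagreement_rate(n_disagree, n_total):
    """Share of the sub-sample on which the two sources disagree."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_disagree / n_total


rate_plasc_only = disagreement_rate(plasc_only, subsample_size)
rate_lsype_only = disagreement_rate(lsype_only, subsample_size)
rate_total = disagreement_rate(plasc_only + lsype_only, subsample_size)

print(f"PLASC-only white British: {rate_plasc_only:.2%}")
print(f"LSYPE-only white British: {rate_lsype_only:.2%}")
print(f"Any disagreement:         {rate_total:.2%}")
```

Taken together, the two sources disagree on white British status for roughly 2.5% of the sub-sample, which is consistent with the slide's description of ethnicity misclassification as small.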