Presentation on methodological issues when deriving educational attainment from various data sources. Discusses the use of administrative data for official statistics, quality concerns, and micro-integration processes for harmonization, corrections, and deriving data. Concludes with insights on the complexities of combining data sources.
Could that be true? Methodological issues when deriving educational attainment from different administrative data sources and surveys • Presentation for the IAOS Conference on Reshaping Official Statistics, Shanghai, October 14-16, 2008 • Bart F.M. Bakker, Manager Section Socio-Economic State, Statistics Netherlands
The problem • Increasing use of administrative data for official statistics, because of: • lower costs • smaller response burden • coverage of all elements of the population for small-domain statistics • Surveys only supplementary • The problem: • unknown or poor quality of part of the administrative data • unknown or poor quality of statistical outcomes if administrative sources are combined
General idea • Administrative data are collected with one or more traditional survey techniques, so they have the same errors as traditional surveys • The size of the errors depends on the audits the register keeper carries out • Variables that are important to the register keeper are assumed to be of better quality
An example: educational attainment • The goal of the project • Determining the educational attainment of as many persons as possible • that can be used to derive a background variable for all kinds of research • and, if the validity is reasonable, can be used for the estimation of the educational attainment in small areas and small subgroups • No single register is available
Sources • CRIHO: students in higher education from 1986 • ERR: students who took an exam in general secondary education from 1999 • Education Number Registers: students in secondary general education from 2004 • CWI: job-seekers who are registered as such at the employment exchange from 1990 • WSF: students with student grants from 1999 • LFS: 1% samples from the population aged >15 from 1996
Micro-integration: harmonisation • Determine the classification of educational attainment • Harmonise the copied information on the training programmes • Derive the classification • Derive information whether certificates are attained • The date that the certificates are attained
Micro-integration: correction for measurement errors • Is the educational attainment valid at the reference date? • 1. Boundary beyond which the probability is <5% that someone will attain a higher level • 2. Probability <5% that someone has attained a higher level since the latest certificate was attained • Both empirically determined with the use of life tables
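The second criterion above can be sketched as a small life-table calculation. This is only one possible reading of the slide, not the procedure used in the project: the yearly probabilities are made-up numbers, and the function names `prob_higher_level` and `is_valid_at_reference` are hypothetical. Only the 5% threshold comes from the slide.

```python
from math import prod

THRESHOLD = 0.05  # from the slide: probability must be below 5%


def prob_higher_level(yearly_probs, years_elapsed):
    """Probability that a higher level was attained within `years_elapsed`
    years after the latest certificate.

    `yearly_probs[t]` is the (assumed) conditional probability of attaining
    a higher level in year t, given that it was not attained before.
    """
    survived = prod(1.0 - p for p in yearly_probs[:years_elapsed])
    return 1.0 - survived


def is_valid_at_reference(yearly_probs, years_elapsed, threshold=THRESHOLD):
    """A score counts as valid if the probability of a higher level is below the threshold."""
    return prob_higher_level(yearly_probs, years_elapsed) < threshold


# Made-up life-table probabilities, for illustration only.
example_probs = [0.01, 0.02, 0.03, 0.05]
print(is_valid_at_reference(example_probs, 2))
print(is_valid_at_reference(example_probs, 3))
```

With these illustrative numbers a score stays valid two years after the latest certificate but no longer after three, because the cumulative probability then exceeds 5%.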
Micro-integration: correction for measurement errors • For one person on one reference date, more than one valid score on educational attainment may be available • Choose the source with the best quality, in this order: • CRIHO, Education Number Register, ERR • LFS • WSF • CWI only for weighting
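The source choice on this slide amounts to a priority rule, which might be sketched as follows. The ranking mirrors the order of the bullets; the record layout, the label `"ENR"` for the Education Number Register, and the function name `choose_best_score` are assumptions made for illustration.

```python
# Lower rank means better assumed quality, following the order on the slide.
# CWI is left out on purpose: the slide uses it only for weighting.
SOURCE_PRIORITY = {"CRIHO": 0, "ENR": 0, "ERR": 0, "LFS": 1, "WSF": 2}


def choose_best_score(scores):
    """Return the valid score from the highest-priority source, or None.

    `scores` is a list of dicts like {"source": "LFS", "level": 4}
    (a hypothetical layout). Ties keep the first score listed.
    """
    candidates = [s for s in scores if s["source"] in SOURCE_PRIORITY]
    if not candidates:
        return None
    return min(candidates, key=lambda s: SOURCE_PRIORITY[s["source"]])


person_scores = [
    {"source": "WSF", "level": 5},
    {"source": "LFS", "level": 4},
    {"source": "CWI", "level": 3},
]
print(choose_best_score(person_scores))
```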
Derive educational attainment • Derive the highest educational level attained from: • all followed training programmes before reference date • the certificates that are attained before reference date • validity on reference date • choose source with best quality • downgrade the followed training programmes not ended with a certificate • impute with the use of age <15 years
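A minimal sketch of the derivation steps listed above, under several assumptions that the slide does not spell out: levels are coded as integers, a programme without a certificate is downgraded by exactly one level, and persons younger than 15 are imputed with the lowest level. The record layout and the name `derive_highest_level` are hypothetical.

```python
from datetime import date

LOWEST_LEVEL = 1  # assumed code for the lowest educational level


def derive_highest_level(programmes, reference_date, age):
    """Derive the highest level attained before `reference_date`.

    `programmes` is a list of dicts like
    {"level": 4, "certified": True, "date": date(2005, 7, 1)}.
    """
    if age < 15:
        # Assumption: imputation by age assigns the lowest level to under-15s.
        return LOWEST_LEVEL
    levels = []
    for p in programmes:
        if p["date"] >= reference_date:
            continue  # only programmes before the reference date count
        level = p["level"]
        if not p["certified"]:
            # Assumption: no certificate means a downgrade of one level.
            level = max(LOWEST_LEVEL, level - 1)
        levels.append(level)
    return max(levels) if levels else None


history = [
    {"level": 3, "certified": True, "date": date(2000, 7, 1)},
    {"level": 5, "certified": False, "date": date(2005, 7, 1)},
    {"level": 6, "certified": True, "date": date(2009, 7, 1)},
]
print(derive_highest_level(history, date(2008, 1, 1), age=30))
```

In this example the uncertified level-5 programme counts as level 4, and the level-6 certificate is ignored because it falls after the reference date.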
Results: coverage by age • [Figure: coverage (0% to 100%) plotted against age (0 to 105) for three series: register 15+; LFS 15+; PR 0-14 + register 15+ + LFS 15+]
Weighting the data • Coverage shows selectivity • underrepresentation of vocational education at secondary level • overrepresentation of youngsters • Weight to the population, resulting in two vectors • the valid scores on educational attainment on the reference date and • a weight
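The slide does not say which weighting method was used. As a stand-in, the sketch below uses simple post-stratification (population count divided by observed count per stratum) to show how the two vectors could be produced. The strata, the counts, and the name `weight_to_population` are all made up for illustration.

```python
from collections import Counter


def weight_to_population(records, population_counts):
    """Return two aligned vectors: valid scores and post-stratification weights.

    `records` is a list of dicts like {"stratum": "young", "level": 3};
    `population_counts` maps each stratum to its known population size.
    """
    observed = Counter(r["stratum"] for r in records)
    scores = [r["level"] for r in records]
    weights = [population_counts[r["stratum"]] / observed[r["stratum"]] for r in records]
    return scores, weights


# Illustrative data: youngsters are overrepresented among the covered persons.
covered = [
    {"stratum": "young", "level": 3},
    {"stratum": "young", "level": 4},
    {"stratum": "young", "level": 5},
    {"stratum": "old", "level": 2},
]
population = {"young": 30, "old": 70}

scores, weights = weight_to_population(covered, population)
print(scores, weights)
```

Each overrepresented young person receives a smaller weight than the single older person, so the weighted totals add up to the population size.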
Conclusions • Administrative data have the same errors as traditional surveys • And some more… • Combining data from registers and surveys is promising • But complicated • Always do research on the quality of the administrative data