Testing the Importance of Cleansing Procedures for Overlaps in German Administrative Data
Patrycja Scioch (Research Data Centre of the BA at the IAB, Germany)
New Techniques and Technologies for Statistics, 18-20 February 2009
Motivation
• increasing importance of administrative data for research
• in Germany there are two types of such data:
  • data collected for official statistical purposes
  • data arising as a by-product of administration (e.g. federal employment services)
• administrative data:
  • are not collected for research
  • come from different and independent sources
  • may contain contradictory information once merged
The Integrated Employment Biographies (IEB)
• combination of four different sources:
  • Employee History
  • Benefit Recipient History
  • Applicants Pool Data
  • Participants in Measure Dataset
• subsample:
  • 2.2% random sample
  • latest update 2006
• characteristics:
  • daily records
  • split into episodes
  • quality depends on the source of information
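To make the episode structure concrete, here is a minimal sketch of how spells from different sources could be represented and checked for overlaps. The `Spell` class, the source labels and the example dates are hypothetical illustrations, not the actual IEB variable layout.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Spell:
    source: str  # hypothetical label, e.g. "employment", "benefit", "programme"
    start: date  # first day of the episode (inclusive)
    end: date    # last day of the episode (inclusive)


def overlap_days(a: Spell, b: Spell) -> int:
    """Number of calendar days covered by both spells (0 if disjoint)."""
    first = max(a.start, b.start)
    last = min(a.end, b.end)
    return max(0, (last - first).days + 1)


def find_overlaps(spells):
    """Return (spell, spell, days) for each overlapping pair from different sources."""
    return [
        (a, b, overlap_days(a, b))
        for a, b in combinations(spells, 2)
        if a.source != b.source and overlap_days(a, b) > 0
    ]


# Invented example biography: employment and benefit receipt overlap in March.
biography = [
    Spell("employment", date(2003, 1, 1), date(2003, 3, 31)),
    Spell("benefit", date(2003, 3, 15), date(2003, 6, 30)),
    Spell("programme", date(2003, 7, 1), date(2003, 9, 30)),
]
overlaps = find_overlaps(biography)
```

In this invented biography only the employment and benefit spells overlap; the programme spell starts the day after the benefit spell ends.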
Literature
• previous findings:
  • concentrate on the qualitative and quantitative analysis of overlaps (Jaenichen et al. (2005), Bernhard et al. (2006))
  • correction of single variables (Waller (2007), Kruppe et al. (2007))
• evidence:
  • need for data processing in the IEB
  • the appropriate procedure depends heavily on the research question
• open issues:
  • impact on estimates
  • data processing by transforming the structure of the dataset
Identification/Method
• assumptions: dataset → processing → method → result
• case: Wunsch/Lechner (2007)
  • evaluation of labour market programmes in West Germany
  • analyses by comparing matching estimates
  • time-dependent employment opportunities as outcome
• 1. step: replication of the data processing and variations of the analysis sample
• 2. step: replication of the evaluation study
• 3. step: analyses of the effects of the variations on the results
Approach/Framework
[Diagram: the IEB data set is processed into three analysis samples (V0, V1, V2), each leading to its own outcome (V0, V1, V2). The processing is variable, the matching estimator is fixed, and the outcomes are compared.]
Processing rules
• time windows of two weeks
• multiple possible spells per window (different sources, overlaps)
• goal: exactly one state for each period
• sort by duration and priority of source
• choose the two of greatest importance
• select one final state using further priority rules
• result: different analysis samples
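The rules above can be sketched roughly as follows. This is only an illustration under stated assumptions: the exact tie-breaking and the "further priority rules" are not spelled out on the slide, so the function `resolve_window`, the `PRIORITY` list and the sample spells are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical priority order (earlier = preferred); each variant uses its own order.
PRIORITY = ["programme", "benefit", "employment", "applicant"]
WINDOW_DAYS = 14


def days_in_window(start, end, win_start):
    """Days of the inclusive spell [start, end] that fall into the two-week window."""
    win_end = win_start + timedelta(days=WINDOW_DAYS - 1)
    first, last = max(start, win_start), min(end, win_end)
    return max(0, (last - first).days + 1)


def resolve_window(spells, win_start, priority=PRIORITY):
    """Pick exactly one state for the window, or None if no spell covers it."""
    rank = {state: i for i, state in enumerate(priority)}
    covered = {}
    for state, start, end in spells:
        d = days_in_window(start, end, win_start)
        if d > 0:
            # Cap at the window length so repeated spells are not over-counted.
            covered[state] = min(WINDOW_DAYS, covered.get(state, 0) + d)
    if not covered:
        return None
    # Sort by duration (longest first); ties are broken by source priority.
    ordered = sorted(covered, key=lambda s: (-covered[s], rank[s]))
    # Keep the two leading candidates.
    top_two = ordered[:2]
    # Assumed final rule: the higher-priority candidate wins.
    return min(top_two, key=lambda s: rank[s])


# Invented spells as (state, first day, last day).
spells = [
    ("employment", date(2003, 1, 1), date(2003, 1, 10)),
    ("benefit", date(2003, 1, 8), date(2003, 1, 14)),
    ("applicant", date(2003, 1, 1), date(2003, 1, 14)),
]
state = resolve_window(spells, date(2003, 1, 1))
```

Here the applicant and employment spells cover the most days, and the assumed priority order then prefers employment over applicant, so `state` is `"employment"`.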
Rules of Priority • Differences: • Model V1 prefers employment-spells to benefit-spells compared to V0 • Model V2 downgrades participation in programmes and prefers employment
Results before starting the estimation
[Diagram: the IEB data set is processed into three analysis samples, each with its own priority order]
• analysis sample V0: programme – benefit – employment – applicant
• analysis sample V1: programme – employment – benefit – applicant
• analysis sample V2: employment – programme – benefit – applicant
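As a small illustration of how these three priority orders can lead to different analysis samples, the sketch below resolves the same competing states under each variant. The orders are taken from the slide; the function names and the example windows are hypothetical.

```python
# Priority orders as listed on the slide (earlier = preferred).
VARIANTS = {
    "V0": ["programme", "benefit", "employment", "applicant"],
    "V1": ["programme", "employment", "benefit", "applicant"],
    "V2": ["employment", "programme", "benefit", "applicant"],
}


def pick_state(candidates, order):
    """Return the candidate state that comes first in the given priority order."""
    return min(candidates, key=order.index)


def resolve_all(windows):
    """Resolve every window's competing states under each variant."""
    return {
        name: [pick_state(cands, order) for cands in windows]
        for name, order in VARIANTS.items()
    }


# Invented competing states in three consecutive two-week windows.
windows = [
    {"benefit", "employment"},
    {"programme", "employment"},
    {"benefit", "applicant"},
]
resolved = resolve_all(windows)
```

In this toy example V0 and V1 differ only where benefit and employment compete, while V2 additionally replaces a programme state with employment.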
Descriptive results • Participants: • differences between sample V0 and V1, V2 • different magnitudes • insignificant • Group of Non-Participants: • significant differences • not of practical importance
Estimation results - 1: Effects of programme participation compared to non-participation
Estimation results - 2: Variance in the estimation results
Summary/Prospects
• large but insignificant differences during the lock-in effect
• smaller differences at the end of the observation period
• => the effect does not depend on the procedure (only its extent does)!
• => rules are necessary, but time and effort should not exceed the benefit!
• creation of a "naive" model
• comparison with other countries
Thank you for your attention! Patrycja.Scioch@iab.de http://fdz.iab.de/
Back-Up
References
• Bernhard, S., Dressel, C., Fitzenberger, B. and Schnitzlein, D. (2006): Überschneidungen in der IEBS: Deskriptive Auswertung und Interpretation, FDZ Methodenreport 4/2006, Nürnberg.
• Jaenichen, U., Kruppe, T., Stephan, G., Ullrich, B. and Wießner, F. (2005): You can split it if you really want: Korrekturvorschläge für ausgewählte Inkonsistenzen in IEB und MTG, FDZ Datenreport 4/2005, Nürnberg.
• Kruppe, T., Müller, E., Wichert, L. and Wilke, R. (2007): On the Definition of Unemployment and its Implementation in Register Data – The Case of Germany, FDZ Methodenreport 3/2007, Nürnberg.
• Waller, M. (2007): Do Reported End Dates of Treatments Matter for Evaluation Results?, FDZ Methodenreport 1/2007, Nürnberg.
• Wunsch, C. and Lechner, M. (2007): What Did All the Money Do? On the General Ineffectiveness of Recent West German Labour Market Programmes, University of St. Gallen Department of Economics Working Paper Series 2007-19, Department of Economics, University of St. Gallen.