Archiving microdata: Standards and good practices
United Nations Statistics Commission, New York, February 26, 2009
Olivier Dupriez, World Bank, Development Data Group and International Household Survey Network
odupriez@worldbank.org
The value of data
• Surveys and censuses
• High cost! High value?
• Data have value beyond the purpose for which they were originally collected (“repurposing” of data)
• Large under-exploited potential
• Condition: proper archiving
• Documentation, dissemination, preservation
Data archiving: two models
• By a specialized data center (“trusted repository”) (US, Canada, Europe)
  • Often academic
  • High level of expertise
  • Infrastructure
  • Standards and best practices for documentation
  • Formal dissemination and preservation policies and procedures
  • Support to users
• By the data producer (most developing countries)
  • Not seen as a key role
  • Lack of expertise
  • Inappropriate infrastructure
  • Ad hoc practices
  • No compliance with international standards
  • Unclear policies and procedures
Sharing good practices
Objective: transfer data archiving good practices and standards to data producers
International Household Survey Network (IHSN)
• A network of international agencies (coordinated by World Bank / PARIS21)
• Develops tools, guidelines, training materials
• Advocates compliance with good practices and international standards
www.ihsn.org
Microdata documentation
Good documentation is needed to:
• Properly analyze the data
• Increase credibility of derived indicators and analysis
• Allow replication of data collection or analysis
• Build institutional memory
DDI + Dublin Core metadata standards (XML): a checklist of everything you need to know
• Study description
• File description
• Variable description
• Related materials
www.ddialliance.org
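The four-part checklist above can be pictured as a small XML document. The following is a minimal sketch using Python's standard library, not output from the IHSN editor. The element names (`codeBook`, `stdyDscr`, `fileDscr`, `dataDscr`, `var`, `otherMat`) follow DDI Codebook naming as commonly documented, but the result is not validated against the DDI schema, and the survey title, file name and variables are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_codebook(title, file_name, variables, related_material):
    """Build a minimal DDI-style codebook covering the four checklist parts."""
    root = ET.Element("codeBook")

    # Study description: here only the study title.
    study = ET.SubElement(root, "stdyDscr")
    title_stmt = ET.SubElement(ET.SubElement(study, "citation"), "titlStmt")
    ET.SubElement(title_stmt, "titl").text = title

    # File description: here only the data file name.
    file_txt = ET.SubElement(ET.SubElement(root, "fileDscr"), "fileTxt")
    ET.SubElement(file_txt, "fileName").text = file_name

    # Variable description: name, label and literal question for each variable.
    data = ET.SubElement(root, "dataDscr")
    for name, label, question in variables:
        var = ET.SubElement(data, "var", name=name)
        ET.SubElement(var, "labl").text = label
        ET.SubElement(ET.SubElement(var, "qstn"), "qstnLit").text = question

    # Related materials: questionnaires, reports, manuals.
    other = ET.SubElement(root, "otherMat")
    ET.SubElement(other, "labl").text = related_material
    return root


# Hypothetical example values, for illustration only.
codebook = build_codebook(
    title="Household Budget Survey 2008",
    file_name="household.csv",
    variables=[("hhsize", "Household size", "How many people live in this household?")],
    related_material="Household questionnaire (PDF)",
)
xml_text = ET.tostring(codebook, encoding="unicode")
```

A real DDI document carries many more fields (sampling, coverage, data collection dates, and so on); the point here is only that each item on the checklist maps to a named, machine-readable section.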
IHSN DDI Metadata Editor
Documenting the study: sampling, data collection, scope and coverage, etc.
IHSN DDI Metadata Editor
Documenting files and variables: formulation of question, interviewer’s instructions, computation of variables, etc.
IHSN DDI Metadata Editor
Metadata in XML format can be “transformed” into HTML, PDF, or other formats
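To illustrate what "transforming" XML metadata means, the sketch below reads a small DDI-style fragment and renders its variable list as an HTML table. XML-to-HTML transformations are typically written as XSLT stylesheets; plain Python is used here as a stand-in so the example needs only the standard library. The sample XML and the function name `variables_to_html` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical metadata fragment, for illustration only.
SAMPLE_XML = """<codeBook>
  <dataDscr>
    <var name="hhsize"><labl>Household size</labl></var>
    <var name="region"><labl>Region of residence</labl></var>
  </dataDscr>
</codeBook>"""


def variables_to_html(xml_text):
    """Render every <var> element as one row of an HTML table."""
    root = ET.fromstring(xml_text)
    rows = ["<tr><th>Variable</th><th>Label</th></tr>"]
    for var in root.iter("var"):
        # Escape values so labels containing < or & cannot break the page.
        name = escape(var.get("name", ""))
        label = escape(var.findtext("labl", default=""))
        rows.append(f"<tr><td>{name}</td><td>{label}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


html_table = variables_to_html(SAMPLE_XML)
```

Because the metadata is structured, the same source file could feed a PDF codebook or a web catalogue without retyping anything.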
Microdata cataloguing
XML/DDI metadata is web-ready, “browsable and searchable”
Microdata dissemination
• Growing demand for microdata
• Potential to add much value to existing data
• But requires:
  • Enabling legislation
  • Formal policy/procedures (IHSN guidelines)
  • Technical capacity to prepare data for dissemination
  • Documenting, cataloguing
  • Anonymizing (IHSN tools being tested)
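One common first step when preparing microdata for release is to look for records that are unique on a set of identifying variables. The sketch below shows that check (often described as a k-anonymity test). It is a generic technique and not the IHSN anonymization tool mentioned above. The function name `risky_records`, the variable names and the records are all hypothetical.

```python
from collections import Counter


def risky_records(records, key_vars, k=2):
    """Return indices of records whose key-variable combination occurs fewer than k times."""
    def combo(record):
        return tuple(record[var] for var in key_vars)

    counts = Counter(combo(record) for record in records)
    return [i for i, record in enumerate(records) if counts[combo(record)] < k]


# Hypothetical records, for illustration only.
records = [
    {"region": "North", "age_group": "30-39", "occupation": "farmer"},
    {"region": "North", "age_group": "30-39", "occupation": "farmer"},
    {"region": "South", "age_group": "60-69", "occupation": "judge"},
]
flagged = risky_records(records, ["region", "age_group", "occupation"])
```

Here the third record is flagged because no other respondent shares its combination of region, age group and occupation. Flagged records would then be candidates for recoding or suppression before release.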
Data and metadata preservation
Situation in many countries: documents in hard copy only, outdated storage media, multiple versions of datasets, much information lost (or never generated).
Goal: data and documentation remain readable, meaningful, understandable, and accessible. This requires managing hardware, software and storage media (not only backups; also “migration”).
On-going: IHSN-ICPSR guidelines (Open Archival Information System - OAIS; ISO 14721)
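One simple preservation practice, which OAIS refers to as fixity information, is to record a checksum for every archived file and recompute it after each backup or media migration. The sketch below is a minimal illustration under stated assumptions, not a procedure taken from the IHSN-ICPSR guidelines. The file name `survey_2008.csv` and the helper names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file name in the folder to its checksum."""
    return {p.name: sha256_of(p) for p in sorted(Path(folder).iterdir()) if p.is_file()}


def changed_files(folder, manifest):
    """List files that are missing or whose content no longer matches the manifest."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Demonstration in a temporary folder standing in for the archive.
with tempfile.TemporaryDirectory() as archive_dir:
    data_file = Path(archive_dir) / "survey_2008.csv"
    data_file.write_text("hhid,hhsize\n1,4\n")
    manifest = build_manifest(archive_dir)            # recorded at deposit time
    unchanged = changed_files(archive_dir, manifest)  # verified right after deposit

    data_file.write_text("hhid,hhsize\n1,5\n")        # simulate silent corruption
    damaged = changed_files(archive_dir, manifest)
```

In practice the manifest would be stored alongside the documentation and rechecked on a schedule, so that a damaged copy is noticed while an intact copy still exists.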
Conclusions and recommendations
• NSOs do not need to have all the features of advanced data centers, but data archiving is part of their mandate
• Documentation and preservation are a MUST, even if you don’t disseminate
• Good practices and standards are relatively easy to implement
• Good documentation of past surveys helps improve the quality of future surveys