New Techniques and Technologies for Statistics - 2009
Brussels, 18 - 20 February 2009
Special Session on Access to Microdata
"An informational infrastructure for the E-Science Age - On the way to remote data access for business data"
Maurice Brandt, Federal Statistical Office Germany, Research Data Centre
Overview
1. Introduction
2. Current situation at the research data centres
3. Content of the project “InfinitE”
4. Production of data structure files
5. Result-based confidentiality
6. Summary
1. Introduction
• Requests for (business) microdata are shifting towards microdata without data-perturbing methods, ideally original microdata
• More and more researchers ask for remote data execution or access in a safe centre
• This leads to a huge number of tables that have to be checked for confidentiality
• The development at the national level will probably also happen at the EU level
• Researchers require more data, preferably non-anonymised microdata
2. Current situation in the RDCs
• Output checking:
  - at present, each researcher's output is checked by two persons (four-eyes principle)
  - only absolutely anonymous tables may be published
• Combined and integrated datasets for business microdata are difficult to anonymise
2. Current situation in the RDCs
• Why this project:
  - the scientific community still has reservations about data-perturbing methods for economic microdata
  - manual output checking requires a large amount of work
  - requests for original microdata are increasing
3. Content of the project “InfinitE”
• “An informational infrastructure for the E-Science Age - On the way to remote data access for business data”
• Deals with the improvement of remote access at the Federal Statistical Office Germany
• The project aims to find solutions for better remote access in Germany through so-called data structure files and (automatic) output checking procedures
• Data structure files:
  - goal: semantically and syntactically correct data structure files
  - programs developed on them can be applied to the original data without any adaptations
4. Production of data structure files
• Methods to produce data structure files:
  - stochastic noise
  - multidimensional microaggregation
  - synthetic data (multiple imputation)
• Test of confidentiality and measurement of re-identification risk
  - development of new procedures to measure the re-identification risk of synthetic data
  - see Joerg Drechsler: “Disclosure Control in Business Data” at this conference
• Assessment of the utility and applicability of data structure files
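Two of the methods listed above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the `spread` and `k` parameters, and the turnover figures are hypothetical, and the microaggregation shown is a simplified univariate variant rather than the multidimensional one named on the slide.

```python
import random
from statistics import mean


def add_multiplicative_noise(values, spread=0.1, seed=0):
    """Multiply each value by a random factor from [1 - spread, 1 + spread].

    The uniform factor and the default spread are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [v * rng.uniform(1 - spread, 1 + spread) for v in values]


def microaggregate(values, k=3):
    """Univariate fixed-size microaggregation (simplified assumption).

    Sort the values, form groups of k neighbours and replace every value
    by the mean of its group, so each released value is shared by at
    least k records whenever there are at least k records in total.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    # A final group with fewer than k members is merged into its neighbour.
    if len(groups) > 1 and len(groups[-1]) < k:
        groups[-2].extend(groups.pop())
    result = [0.0] * len(values)
    for group in groups:
        group_mean = mean(values[i] for i in group)
        for i in group:
            result[i] = group_mean
    return result


# Hypothetical turnover figures for seven enterprises.
turnover = [120.0, 95.0, 4300.0, 110.0, 870.0, 910.0, 101.0]
noisy = add_multiplicative_noise(turnover)
aggregated = microaggregate(turnover, k=3)
```

Microaggregation keeps the column total unchanged, which is one reason it preserves the structure of the data; how well either method protects dominant enterprises is exactly what the re-identification tests have to establish.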
5. Result-based confidentiality
• Output checking procedures
• Classification of outputs into “safe” and “unsafe” output
• Identification of output for which anonymisation procedures are necessary
• Evaluation and development of practicable anonymisation methods for “unsafe” output
• The project also evaluates the analytical validity of the anonymised output
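The classification into “safe” and “unsafe” output could, for a table cell, look like the following Python sketch. The two rules used here (a minimum number of contributing units and a dominance rule) are common in statistical disclosure control but are an assumption, not the rule set of the project; the thresholds, names and figures are hypothetical.

```python
def classify_cell(contributions, min_count=3, dominance_share=0.85):
    """Label one table cell as 'safe' or 'unsafe'.

    contributions: the values of the individual units in the cell.
    min_count and dominance_share are illustrative thresholds only.
    """
    # Minimum frequency rule: too few units make the cell disclosive.
    if len(contributions) < min_count:
        return "unsafe"
    # Dominance rule: one unit accounts for almost the whole cell total.
    total = sum(contributions)
    if total > 0 and max(contributions) / total > dominance_share:
        return "unsafe"
    return "safe"


# Hypothetical cells of a turnover table, one list of units per cell.
cells = {
    "region A": [120.0, 95.0, 110.0, 101.0],
    "region B": [4300.0, 20.0, 15.0],
    "region C": [870.0, 910.0],
}
labels = {name: classify_cell(values) for name, values in cells.items()}
print(labels)
```

Only cells labelled “unsafe” would then need an anonymisation step or a manual check, which is where the reduction in checking effort would come from.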
5. Result-based confidentiality
• Confidentiality methods for tables and (regression) output
  - rounding, controlled tabular adjustment, stochastic noise
  - evaluation of automatic output checking procedures
• Feasibility study on changing the legal framework under which researchers publish tables
  - more responsibility for the researcher
  - this leads to less anonymisation and suppression in the output
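Of the table methods listed above, rounding is the simplest to illustrate. The Python sketch below shows conventional rounding and unbiased random rounding of frequency counts to a base; the base of 5, the function names and the counts are assumptions for illustration, and controlled tabular adjustment is not covered.

```python
import random


def round_to_base(count, base=5):
    """Round a non-negative count to the nearest multiple of base."""
    return base * ((count + base // 2) // base)


def random_round(count, base=5, rng=None):
    """Round a count up or down to a multiple of base at random.

    The probability of rounding up equals remainder / base, so the
    expected value of the rounded count equals the original count.
    """
    rng = rng or random.Random()
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    return lower + base if rng.random() < remainder / base else lower


# Hypothetical frequency counts from a researcher's table.
counts = [1, 7, 12, 40]
rounded = [round_to_base(c) for c in counts]
rng = random.Random(42)
randomly_rounded = [random_round(c, rng=rng) for c in counts]
```

Rounded cells no longer add up exactly to rounded totals, which is one of the effects on analytical validity that has to be evaluated.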
6. Summary
• User needs and requirements for microdata access are visibly changing
• This national project will improve the data infrastructure in Germany to take these developments into account
• It is time to change the remote data execution procedure
  - otherwise the amount of output will no longer be manageable
• National and ESSnet projects can benefit from each other
Thank you for your attention
Maurice Brandt
Research Data Centre
Federal Statistical Office Germany
Tel. +49 611/75 4349
maurice.brandt@destatis.de
http://www.forschungsdatenzentrum.de
http://www.destatis.de