
Improving imputation methodology in the Hungarian Central Statistical Office (HCSO)




  1. Improving imputation methodology in the Hungarian Central Statistical Office (HCSO) • Zoltán Csereháti, HCSO Methodological Department

  2. 1. Introduction • 2. The „IDPS” (Creating Integrated Data Processing System) project • 3. Documentation scheme for imputation • 4. Training course on imputation • 5. Future work: Handbook on imputation

  3. 1. Introduction (1) • The work of the HCSO Methodological Department: • Our scope of processing phases: • Sampling • Estimation • Imputation • Seasonal adjustment • Data confidentiality (list gradually widening)

  4. 1. Introduction (2) • Tools offered: • quality guidelines for the elements of the value chain • methodological documentation schemes • good practices • quality indicators • methodological support • training course materials • quality assessment tools • handbooks for several phases

  5. General issues related to non-response and imputation (1) • Item / Unit non-response • Non-response bias (selecting larger samples is not a solution) • Alternatives: • Reweighting • Imputation
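
The two alternatives named on this slide can be contrasted on a toy example. The following is a minimal sketch, not a method taken from the presentation: the sample, the function names `reweighted_total` and `mean_imputed_total`, and the choice of a single adjustment class are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical sample: (design weight, reported value or None for a non-respondent).
sample = [
    (10.0, 4.0),
    (10.0, 6.0),
    (10.0, None),
    (10.0, 5.0),
]

def reweighted_total(units):
    """Estimate a total by inflating respondents' weights to cover non-respondents."""
    total_weight = sum(w for w, _ in units)
    resp_weight = sum(w for w, y in units if y is not None)
    factor = total_weight / resp_weight  # non-response adjustment factor
    return sum(w * factor * y for w, y in units if y is not None)

def mean_imputed_total(units):
    """Estimate a total after replacing each missing value with the respondent mean."""
    resp_mean = mean(y for _, y in units if y is not None)
    return sum(w * (y if y is not None else resp_mean) for w, y in units)

print(reweighted_total(sample))
print(mean_imputed_total(sample))
```

With equal weights and a single adjustment class the two estimates coincide (up to floating-point rounding). They start to differ once adjustment classes or auxiliary information are used differently in the two approaches.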

  6. General issues related to non-response and imputation (2) • What is special about imputation: • There is a huge variety of imputation methods. Many of them are quite simple and easy to implement. • Unlike other methodological areas, imputation is a processing phase that is often carried out by subject matter statisticians without the supervision of methodologists. • Presumably, many of these methods could be improved.

  7. 2. The "IDPS" (Creating Integrated Data Processing System) project • Objectives (1): • To develop a user-friendly integrated data processing system based on standard logic and covering the widest possible range of surveys. • Accessible via a standard user interface and providing a clear and efficient tool for the statisticians. • To include data quality requirements and data processing procedures documented in the meta-database • To be integrated with other general purpose systems such as data entry and dissemination

  8. Objectives (2): • To develop applications or frame systems allowing coordination and quality management in the control of processing • Direct access to data for the purpose of verification and analysis • To restructure the division of labour, with the IT staff focusing on innovation and development, and to produce quality data faster through direct data processing • We anticipate having a (partially) working system by the end of 2010.

  9. The organization of the IDPS project • An IT company chosen by a public procurement procedure • On behalf of the HCSO: • IT Department • Methodological Department • Selected subject matter statisticians from all the relevant fields. • Project leadership: • Selected members of the HCSO IT Department • IT company project leaders

  10. 2. IDPS (2) • Benefits: • Common, integrated platform for all the surveys • Less redundancy • More transparent system • Processes documented in a standard way • Better overview of the process plans • System functionality directly in the hands of the user • New data processing flows can be built more easily

  11. 2. IDPS (3) • Main steps already done: • Documentation of the data process flow elements • Designing a general scheme for a universal data processing flow • Identifying process stages such as editing, imputation, outlier filtering, consistency checking, etc. • Identifying basic methods currently in use in the different stages. • Identifying the process steps from which the individual implementations of the methods are built.

  12. 2. IDPS (4) Standard processes • We do not want to set strict methodological standards. • The so-called “standards” of the IDPS system will be optimally designed software components that implement different algorithms and procedures and serve as building blocks for assembling the IT version of different methods. • What does an ideal standard process look like? • Small and special enough to serve as a building block • Flexible and general enough • Having a number of parameters for fine tuning • As a consequence: • We will face difficult trade-off situations
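
The idea of small, parameterised building blocks that are combined into a method can be sketched as follows. This is only an illustration under assumptions: the step names `fill_with_mean` and `fill_with_constant`, the parameter `min_respondents`, and the `compose` helper are hypothetical and do not describe the actual IDPS components.

```python
from typing import Callable, List, Optional, Sequence

Values = Sequence[Optional[float]]
Step = Callable[[Values], List[Optional[float]]]

def fill_with_mean(min_respondents: int = 1) -> Step:
    """Build a step that fills gaps with the mean of the observed values.

    min_respondents is a tuning parameter: with fewer observed values the
    step leaves the gaps untouched so that a later step can handle them.
    """
    def step(values: Values) -> List[Optional[float]]:
        observed = [v for v in values if v is not None]
        if len(observed) < min_respondents:
            return list(values)
        m = sum(observed) / len(observed)
        return [m if v is None else v for v in values]
    return step

def fill_with_constant(constant: float) -> Step:
    """Build a step that fills any remaining gaps with a fixed value."""
    def step(values: Values) -> List[Optional[float]]:
        return [constant if v is None else v for v in values]
    return step

def compose(*steps: Step) -> Step:
    """Chain building blocks into one method, applied from left to right."""
    def pipeline(values: Values) -> List[Optional[float]]:
        result = list(values)
        for s in steps:
            result = s(result)
        return result
    return pipeline

# A composite method: mean imputation if at least 3 values are observed,
# otherwise fall back to a constant.
method = compose(fill_with_mean(min_respondents=3), fill_with_constant(0.0))
print(method([1.0, 2.0, 3.0, None]))
print(method([1.0, None, 3.0]))
```

The trade-off mentioned on the slide shows up even here: a very general step needs more parameters, while a very specific one is easier to use but fits fewer surveys.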

  13. 3.1 Documentation schemes • Affected methodological areas: • Sampling • Imputation • Estimation and standard error calculation • Seasonal adjustment and confidentiality. • Aims: • to build a uniform structure for assessment • to gain a better overview of the methods used by various surveys • to improve process quality.

  14. 3.2 A documentation scheme for imputation • General information • Treatment of item/unit non-response • Imputation method applied • Is there a guideline? • Is the procedure documented? • Place in the processing chain • Software solution used • Auxiliary data sources used • Simple or composite method • Indicate the applied method(s)
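
The items of the scheme map naturally onto a structured record per survey. The sketch below is an assumption about how such a record might be stored and queried; the class `ImputationRecord`, its field names, the helper `needs_follow_up` and the example values are hypothetical and are not part of the HCSO scheme itself.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ImputationRecord:
    """One filled-in documentation form (field names are illustrative)."""
    survey: str
    nonresponse_treated: str        # "item", "unit" or "both"
    methods: List[str]              # the applied method(s)
    guideline_exists: bool
    procedure_documented: bool
    place_in_chain: str             # e.g. "after editing"
    software: str
    auxiliary_sources: List[str] = field(default_factory=list)

    @property
    def is_composite(self) -> bool:
        """A method is treated as composite if more than one method is listed."""
        return len(self.methods) > 1

def needs_follow_up(records: List[ImputationRecord]) -> List[str]:
    """Return the surveys that lack a guideline or a documented procedure."""
    return [
        r.survey
        for r in records
        if not (r.guideline_exists and r.procedure_documented)
    ]

# Hypothetical entries, for illustration only.
records = [
    ImputationRecord("Survey A", "item", ["ratio imputation", "hot deck"],
                     True, True, "after editing", "in-house script",
                     ["previous period data"]),
    ImputationRecord("Survey B", "unit", ["mean imputation"],
                     False, True, "before estimation", "spreadsheet"),
]
print(needs_follow_up(records))
print([r.is_composite for r in records])
```

Collecting the forms in one uniform structure makes it straightforward to get the overview of methods across surveys that the scheme aims for.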

  15. 4. Internal training course on imputation (1) • Concept of imputation • Why impute at all? • Drawbacks and benefits of different methods • How to reduce non-response bias? • Basic weighting techniques • Benefits of complete datasets • How to organize a method-building process? • Use of auxiliary information

  16. 4. Internal training course on imputation (2) • Editing and imputation • Basic imputation methods / examples • Documentation: flow charts, algorithmic descriptions • Flagging the imputed values • The place of imputation in the whole data processing flow • Imputation and outlier-filtering • How to plan and assess an imputation method? • Simulation studies
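
Flagging and simulation studies can be illustrated together. The following is a minimal sketch under assumptions: ratio imputation with a positive auxiliary variable `x` is chosen only as an example of a basic method (the slides do not say which methods the course covers), and the function names `ratio_impute` and `simulate` are hypothetical.

```python
import random

def ratio_impute(y, x):
    """Return (completed values, flags); a missing y_i is replaced by R * x_i.

    R is the ratio of the respondents' y total to their x total.
    Assumes at least one respondent and positive auxiliary values.
    """
    resp = [(yi, xi) for yi, xi in zip(y, x) if yi is not None]
    ratio = sum(yi for yi, _ in resp) / sum(xi for _, xi in resp)
    completed = [yi if yi is not None else ratio * xi for yi, xi in zip(y, x)]
    flags = [yi is None for yi in y]  # True marks an imputed value
    return completed, flags

def simulate(y_true, x, missing_rate, runs, seed=0):
    """Delete values at random from complete data, impute, and return the
    mean relative error of the estimated total (None if no run was usable)."""
    rng = random.Random(seed)
    true_total = sum(y_true)
    errors = []
    for _ in range(runs):
        y_obs = [None if rng.random() < missing_rate else v for v in y_true]
        if all(v is None for v in y_obs):
            continue  # no respondents in this run, nothing to impute from
        completed, _ = ratio_impute(y_obs, x)
        errors.append((sum(completed) - true_total) / true_total)
    return sum(errors) / len(errors) if errors else None

# Hypothetical complete dataset: turnover (y) and number of employees (x).
y_true = [12.0, 30.0, 41.0, 75.0, 18.0, 52.0]
x = [5.0, 14.0, 20.0, 33.0, 9.0, 25.0]
print(simulate(y_true, x, missing_rate=0.3, runs=500))
```

Keeping the flags alongside the completed values lets later steps, such as outlier filtering or variance estimation, distinguish reported from imputed data.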

  17. 4. Internal training course on imputation (3) • Teamwork session: • Select a practical problem and try to solve it together in teams • Share experiences and ideas

  18. 5. Conclusion, future work (1) • Compiling a handbook on imputation (for internal use in the HCSO): • Recommended methods with application areas • Detailed guidelines: how to build an imputation method • Highlighting current best practices • Practical advice, focusing on issues specific to Hungary • Using the experience gained from • The work on the IDPS system • The feedback from the training course • The information collected by the documentation scheme

  19. 5. Conclusion, future work (2) • International background material including: • ONS paper: "Report on the Task Force on Imputation" • Statistics Canada Quality Guidelines • The results of the • EUREDIT project • EDIMBUS project • Adapting these to the special needs of the HCSO • (In the area of seasonal adjustment, similar work has already been completed)

  20. References • The results of the EUREDIT project: http://www.cs.york.ac.uk/euredit/results/results.html • The results of the EDIMBUS project: http://edimbus.istat.it/EDIMBUS1/ • The ONS paper: Report on the Task Force on Imputation (June 1996), GSS Methodology Series • Statistics Canada Quality Guidelines (Fourth Edition, 2003) • Quality Guidelines of the HCSO (Legal Act 2007) • Hungarian Central Statistical Office: Strategy 2005-2008, pages 26-27 • Csereháti, Z. (2006): Multiple Donor Imputation Techniques. Paper for the European Conference on Quality in Survey Statistics, Cardiff, 24-26 April 2006
