Imputation in UNECE Statistical Databases: Principles and Practices

Steven Vale and Heinrich Brüngger, UNECE Statistical Division

Contents • The ECOSOC view of statistical imputation • Current practices • Basic principles • Step-by-step implementation • Conclusions and open questions

ECOSOC views • Resolution 2006/6 on strengthening statistical capacity • Sets limits for the use of imputation • ... but also implicitly endorses it as a statistical technique • Statistical agencies need to review their practices to ensure compliance

Defining imputation • “A procedure for entering a value for a specific data item where the response is missing or unusable” • Boundary issues: • Imputing and editing • Imputing and forecasting

Current practice in UNECE • Very limited ad-hoc imputation • Four cases: • Account identities • Regional aggregates • Poor quality national data with little impact on region totals • Re-classification • Using imputations from others • Sufficient transparency in source metadata?

Basic principles (1) • Imputed national data are not published • Avoids the need for consultation • Only official sources used for imputation • Preference for data from same country • Clear distinction between “real” and imputed data • Transparency – imputed data clearly flagged, and methods documented

Basic principles (2) • Aggregates must contain > 90% “real” data, covering > 50% of countries • Imputed data are re-calculated periodically to adjust for revisions • Method used defined at the level of the variable and stored as an attribute • Decisions on the use of imputation to be taken with regard to the quality framework
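The two coverage thresholds for aggregates can be expressed as a simple check. The sketch below is illustrative only: the slide does not say how the 90% share is measured, so the code assumes it is the share of the aggregate's value contributed by non-imputed figures, and that the 50% applies to the count of countries. The function name, the input layout and the default thresholds as parameters are all hypothetical.

```python
from typing import Dict, Tuple


def aggregate_is_publishable(
    values: Dict[str, Tuple[float, bool]],
    min_real_share: float = 0.90,
    min_country_share: float = 0.50,
) -> bool:
    """Check the two coverage rules for a regional aggregate.

    `values` maps a country code to (value, is_imputed).
    Assumption: the 90% rule is read as a share of the aggregate's value,
    and the 50% rule as a share of the number of countries.
    """
    total = sum(value for value, _ in values.values())
    if not values or total <= 0:
        return False

    # Value contributed by "real" (non-imputed) national figures.
    real_total = sum(value for value, imputed in values.values() if not imputed)
    # Number of countries whose figure is "real".
    real_countries = sum(1 for _, imputed in values.values() if not imputed)

    return (
        real_total / total > min_real_share
        and real_countries / len(values) > min_country_share
    )
```

For example, an aggregate in which one small country out of three is imputed passes both rules, whereas one in which two of three countries are imputed fails the country rule even if the imputed values are small.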

Step-by-step application • Automatic imputation routines to extend imputation towards the boundaries set by the ECOSOC Resolution • One step at a time, with pause and review to consider quality and cost / benefit • “Dashboard” to allow statisticians to choose the most appropriate method • Implemented in the context of re-engineering of statistical database system

First step • Use a linear trend to impute missing values • Requirements: • Sufficient time series observations (at least 3 out of previous 5 periods) • Closeness of fit of linear trend (R² close to 1) • Constraints • Validity of R² for few observations • Forward imputation only
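A minimal sketch of this first step, under stated assumptions: it fits an ordinary least-squares line to the observed values among the previous five periods and extrapolates one period forward. The slide only asks for a fit "close to 1", so the 0.95 cut-off used here is an assumed value, and the function name and signature are hypothetical rather than part of the UNECE system.

```python
from typing import Optional, Sequence


def impute_linear_trend(
    previous: Sequence[Optional[float]],
    min_obs: int = 3,
    min_r2: float = 0.95,  # assumed threshold; the source says only "close to 1"
) -> Optional[float]:
    """Return a forward imputation for the period after `previous`, or None.

    `previous` holds earlier periods in time order, with None for missing
    values. Only the last five periods are used.
    """
    window = previous[-5:]
    offset = len(previous) - len(window)
    points = [(offset + i, y) for i, y in enumerate(window) if y is not None]

    # Requirement: at least 3 observations in the previous 5 periods.
    if len(points) < min_obs:
        return None

    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # For simple linear regression, the coefficient of determination equals
    # the squared correlation. A constant series is fitted exactly.
    r2 = 1.0 if syy == 0 else (sxy ** 2) / (sxx * syy)

    # Requirement: closeness of fit.
    if r2 < min_r2:
        return None

    # Forward imputation only: extrapolate to the next period.
    return intercept + slope * len(previous)
```

With `[10.0, 12.0, None, 16.0, 18.0]` the four observed points lie on a straight line, so the next period is imputed from that line; a series with only two observations, or one with a poor fit, returns `None` and is left missing. As the slide notes, the fit statistic is of limited validity with so few observations, so a threshold check of this kind is a screening device rather than a guarantee of quality.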

[Figure legend: Data available: Y = Yes, N = No. Imputation: Yes / No (the marker symbols were lost in extraction).]

Next steps • More flexibility: • Longer time series • Imputing values at start and in middle of time series • Non-linear trends? • Cross-country imputation in strictly limited cases?

Conclusions • Strong links between imputation and quality • Trade-off between accessibility and accuracy • Step-by-step, pause and review approach seems appropriate • Transparency is essential • Standardization of practices between international organizations would help

Open questions • Are other organizations interested in defining a common policy on the use of imputation, in response to the ECOSOC Resolution? • Could we go further and consider harmonization of methods and tools? • How should this be done? Is a specific forum needed, or can this be dealt with in combination with work on data quality? • Have other organizations modified their policies on imputation in the light of the ECOSOC Resolution, and if so, how?
