Towards the 2011 UK Census Editing Strategy Heather Wagstaff and Steven Rogers Methodology Directorate Office for National Statistics, U.K.
Overview The presentation is structured as follows: • Overview of edit & imputation in UK Census • CANCEIS: fitness for 2011 UK Census • Development of 2011 Census Edit Strategy • Summary
Overview of UK Census Edit & Imputation • EDIS was hard coded, leading to flexibility problems: • UK countries all had slightly differing requirements; • 1999 Rehearsal data arrived too late for system testing; • problems during live running: • late changes to question set not tested; • complex filter questions not followed by a number of respondents.
Statistical Modernisation Programme • Main Focus: to deliver standard statistical infrastructure, methodologies and tools; • Main Aim: to apply recognised standards and practices in a highly efficient way. • CANCEIS endorsed as corporate edit and imputation tool: • implement where data are mainly nominal; • implemented on household surveys and Civil Registration; • now endorsed for 2011 Census.
CANCEIS • CANadian Census Edit and Imputation System • generalised edit and imputation system • specify edits in decision logic tables • nearest neighbour imputation methodology • computationally efficient • simultaneous imputation of numeric and categorical variables
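The idea of nearest neighbour imputation over mixed numeric and categorical variables can be illustrated with a minimal sketch. This is not CANCEIS code: the variable names, the distance function and the scaling ranges are all assumptions made for illustration. The missing numeric and categorical values are filled together from a single donor record.

```python
from typing import Optional

Record = dict[str, Optional[object]]

# Hypothetical numeric variables and the ranges used to scale their distances.
NUMERIC_RANGE = {"age": 100.0, "hours_worked": 80.0}


def distance(recipient: Record, donor: Record) -> float:
    """Sum per-variable distances over the recipient's observed values only."""
    total = 0.0
    for var, value in recipient.items():
        if value is None:
            continue  # a missing value cannot contribute to the match
        if var in NUMERIC_RANGE:
            total += abs(value - donor[var]) / NUMERIC_RANGE[var]
        else:
            total += 0.0 if value == donor[var] else 1.0
    return total


def impute(recipient: Record, donors: list[Record]) -> Record:
    """Fill every missing value from the single closest donor."""
    nearest = min(donors, key=lambda d: distance(recipient, d))
    return {
        var: (nearest[var] if value is None else value)
        for var, value in recipient.items()
    }


donors = [
    {"age": 34, "sex": "F", "marital_status": "married", "hours_worked": 37},
    {"age": 8, "sex": "M", "marital_status": "single", "hours_worked": 0},
]
recipient = {"age": 9, "sex": "M", "marital_status": None, "hours_worked": None}
completed = impute(recipient, donors)
print(completed)
```

Taking both missing values from the same donor keeps the imputed record internally consistent, which is the motivation for imputing numeric and categorical variables simultaneously.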
CANCEIS - ensuring fitness for 2011 Census • 2001 UK Census • processed about 27 million forms in relation to circa 60 million people. • Demonstrate evidence of robustness for 2011 Census: • provide proof of concept; • replicate 2001 Census Editing Strategy; • recover statistical properties of data.
CANCEIS - ensuring fitness for 2011 Census • Stage 1: Provide proof of concept • purpose: to assess whether CANCEIS will produce complete and consistent census data • UK Census data processed by Administration Areas • convert edit rules to decision logic tables (DLTs) • apply CANCEIS to census data • replicate in SAS to QA edit process • outcome: CANCEIS produced complete and consistent dataset in under 2 hours.
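Converting edit rules to decision logic tables can be pictured with a small sketch. The layout below is an assumption for illustration and is not the CANCEIS DLT file format: each rule lists the conditions that must all hold for a record to fail that edit, and the rule and condition names are hypothetical.

```python
from typing import Callable

Record = dict[str, object]

# Hypothetical named conditions (the rows of the table).
CONDITIONS: dict[str, Callable[[Record], bool]] = {
    "age_under_16": lambda r: r["age"] < 16,
    "is_married": lambda r: r["marital_status"] == "married",
    "is_employed": lambda r: r["economic_activity"] == "employed",
}

# Hypothetical edit rules (the columns of the table). A record fails a rule
# when every condition listed for that rule is true.
EDIT_RULES: dict[str, list[str]] = {
    "child_married": ["age_under_16", "is_married"],
    "child_employed": ["age_under_16", "is_employed"],
}


def failed_edits(record: Record) -> list[str]:
    """Return the names of all edit rules that the record fails."""
    return [
        name
        for name, conditions in EDIT_RULES.items()
        if all(CONDITIONS[c](record) for c in conditions)
    ]


record = {"age": 12, "marital_status": "married", "economic_activity": "student"}
print(failed_edits(record))
```

Keeping the rules as data rather than hard-coded logic means they can be changed without touching the checking code, which is the flexibility argument made earlier in the talk.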
CANCEIS - ensuring fitness for 2011 Census • Stage 2: Replicate 2001 Census Editing Strategy • purpose: to assess range of functionality
CANCEIS - ensuring fitness for 2011 Census • Stage 3: Recover statistical properties of the data • purpose: to assess whether CANCEIS contained an imputation process of acceptable quality; • micro-simulation environment; • 170K households and 400K individuals; • stochastic process - apply CANCEIS in multiple runs; • measure distributional and predictive accuracy.
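The slides do not say which accuracy measures were used. As a sketch under that caveat, predictive accuracy is taken here as the share of imputed values that match the known true values, and distributional accuracy as the total variation distance between the true and imputed category proportions. Both choices, and the toy data, are assumptions.

```python
from collections import Counter


def predictive_accuracy(true: list[str], imputed: list[str]) -> float:
    """Share of records whose imputed value equals the true value."""
    return sum(t == i for t, i in zip(true, imputed)) / len(true)


def distributional_distance(true: list[str], imputed: list[str]) -> float:
    """Total variation distance between category proportions (0 = identical)."""
    n = len(true)
    true_counts, imputed_counts = Counter(true), Counter(imputed)
    categories = set(true_counts) | set(imputed_counts)
    return 0.5 * sum(abs(true_counts[c] - imputed_counts[c]) / n for c in categories)


# Toy data: true values that were blanked out, then imputed in two runs.
true_values = ["single", "married", "married", "single"]
runs = [
    ["single", "married", "single", "married"],
    ["single", "married", "married", "single"],
]

# Imputation is stochastic, so the measures are averaged over the runs.
mean_accuracy = sum(predictive_accuracy(true_values, r) for r in runs) / len(runs)
mean_distance = sum(distributional_distance(true_values, r) for r in runs) / len(runs)
print(mean_accuracy, mean_distance)
```

The first run shows why both measures are needed: it reproduces the marginal distribution exactly while getting half of the individual values wrong.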
CANCEIS - ensuring fitness for 2011 Census • Stage 3: Recover statistical properties of the data
Towards the 2011 Census Edit Strategy • Develop in two parts integral to Census Quality Strategy: • Part 1: Specification of comprehensive and cohesive edit rules • Part 2: Research imputation methodology including • partitioning person variables • large household sizes • Communal Establishments (collectives) • differing area types
Towards the 2011 Census Edit Strategy • Example of importance of single cohesive set of edit rules: • dependence of filter rules on 100% accuracy of date of birth at data capture.
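The dependence of filter rules on date of birth can be sketched as follows. The minimum age, the reference date and the function names are illustrative assumptions, not the Census specification. A single misread digit in the captured year of birth is enough to change which questions a person is expected to answer.

```python
from datetime import date


def age_on(dob: date, reference: date) -> int:
    """Age in completed years on the reference date."""
    years = reference.year - dob.year
    if (reference.month, reference.day) < (dob.month, dob.day):
        years -= 1  # birthday not yet reached in the reference year
    return years


def in_scope_for_employment(dob: date, reference: date, min_age: int = 16) -> bool:
    """Hypothetical filter: employment questions apply from min_age upwards."""
    return age_on(dob, reference) >= min_age


reference = date(2001, 4, 29)     # illustrative reference date
true_dob = date(1980, 6, 15)      # respondent is 20
captured_dob = date(1990, 6, 15)  # one digit misread at capture: appears to be 10

print(in_scope_for_employment(true_dob, reference))
print(in_scope_for_employment(captured_dob, reference))
```

With the captured date, the respondent's valid employment answers would look like a filter violation, so any rule set built on age has to be consistent with how date of birth errors are detected and corrected.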
Towards the 2011 Census Edit Strategy • Example of methodology: partitioning person variables: • 2001 Census question set partitioned into 6 main topics • Labour Market contained 4 subsets • 2011 question set currently unendorsed
Concluding Remarks • Benefits of applying a generalised system (CANCEIS) in the 2011 UK Census include: • significant cost savings and efficiency gains (inc. time to process large datasets); • flexibility and transparency; • time and resource freed to address difficult methodological issues.