Explore the development of a data editing and imputation tool set with desired features, basic functionalities, and future work outlined by Statistics Canada in collaboration with the UN/ECE Work Session on Statistical Data Editing.
The development of a data editing and imputation tool set
UN/ECE Work Session on Statistical Data Editing
Topic (ii): Global solutions to editing
Claude Poirier
Oslo, Norway, 24 – 26 September 2012
Outline
• Context
• Desired features of a tool set
• A basic tool set
• Future work
Statistics Canada • Statistique Canada
Context
• HLG-BAS strategy
• From little data to an abundance of data
• Increased need for quick statistics
• Towards an industrialised environment
• Statistical Network
• Using EDIMBUS recommended practices
• E&I standards and guidelines developed by NSIs
Desired features of a tool set
• Functionality
  • Editing process: on-line edits, flow edits, fatal edits, distribution edits, outlying edits, selective editing, deductive edits, minimum change, processing sub-populations, macro editing
  • Imputation process: rule-based imputation, deductive imputation, model-based, donor-based, proration, processing sub-populations
  • Estimation process: variance due to imputation
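Proration, listed above under the imputation process, can be illustrated with a minimal sketch. It assumes the simplest form of the technique: components that should add up to a known total are rescaled by a common factor. The function name `prorate` and the sample figures are hypothetical and are not taken from any of the tools discussed in this presentation.

```python
def prorate(components, total):
    """Rescale components by a common factor so that they sum to total.

    Hypothetical helper for illustration only; production tools typically
    also support weights, rounding rules and fields that must not change.
    """
    current = sum(components)
    if current == 0:
        raise ValueError("cannot prorate components that sum to zero")
    factor = total / current
    return [value * factor for value in components]


# Three reported components sum to 120, but the reported total is 100.
adjusted = prorate([30, 50, 40], 100)
print(adjusted)
print(sum(adjusted))
```

Each component keeps its share of the whole; only the overall level is changed to match the total.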
Desired features of a tool set (cont’d)
• Quality criteria
  • Relevance: Meets the real needs
  • Accessibility: Is easy to use
  • Interpretability: Is easy to understand
  • Coherence: Offers standardization and interoperability
  • Accuracy: Produces expected outcome
  • Timeliness: Meets performance requirements
Desired features of a tool set (cont’d)
• Important software characteristics
  • Adaptability: Isolates specific statistical functions
  • Reliability: Offers robustness and trust
  • Maintainability: Enables enhancements
  • Interoperability: Offers the «plug and play» feature
A basic tool set
• BANFF
  • Linear programming to identify the minimum change
  • Imputation methods: Deductive; Donor; Estimator
• CANCEIS
  • Mixtures of categorical and numerical census data
  • Minimum change while ensuring plausible imputation
• SELEKT
  • Suspicion level; potential impact; pseudo-bias
  • Control on the importance of variables
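The minimum-change principle mentioned above can be sketched on a toy categorical record. This is not how BANFF or CANCEIS implement it: BANFF is described above as using linear programming, whereas the sketch below simply tries every set of fields in order of increasing size. The field names, domains, edit rules and the function `minimum_change` are all hypothetical.

```python
from itertools import combinations, product

# Hypothetical field domains for a toy record.
DOMAINS = {
    "age_group": ["child", "adult"],
    "marital": ["single", "married"],
    "employment": ["none", "employed"],
}

# Hypothetical edits: each returns True when the record FAILS the rule.
EDITS = [
    lambda r: r["age_group"] == "child" and r["marital"] == "married",
    lambda r: r["age_group"] == "child" and r["employment"] == "employed",
]


def passes(record):
    """A record passes when no edit flags it as a failure."""
    return not any(edit(record) for edit in EDITS)


def minimum_change(record):
    """Return the smallest set of fields whose values can be changed so
    that the record passes all edits, with one passing candidate record.

    Brute force: subsets are tried by increasing size, so the first hit
    changes the fewest fields. Only practical for very small problems.
    """
    fields = list(DOMAINS)
    for size in range(len(fields) + 1):
        for subset in combinations(fields, size):
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if passes(candidate):
                    return subset, candidate
    return None


record = {"age_group": "child", "marital": "married", "employment": "employed"}
fields_to_change, repaired = minimum_change(record)
print(fields_to_change)
print(repaired)
```

In this toy case, changing the single field `age_group` satisfies both edits, whereas keeping it would require changing both `marital` and `employment`. The search only identifies which fields to change; choosing a plausible replacement value is the job of the imputation step.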
Functionality of the tool set
(Comparison table with columns Banff, CANCEIS and Selekt; the cell entries are not recoverable from this copy. Rows listed below.)
EDITING
• On-line edits
• Construction of groups
• Editing within groups
• Fatal edits
• Distribution edits
• Outlying edits
• Selective editing (scores)
• Deterministic edits
• Fellegi-Holt (min change)
• Editing of macro data
• Graphical editing
IMPUTATION
• Imputation within groups
• Rule-based imputation
• Deterministic imputation
• Model-based imputation
• Donor-based imputation
• Prorating imputation
ESTIMATION
• Variance due to imputation
Future work
• To consider other tools
  • Macro and Graphical editing
• To investigate survey platforms
  • POSS; BESt;
Thank you • Merci
For more information, please contact: Claude.Poirier@statcan.gc.ca