The development of a data editing and imputation tool set

An overview of a data editing and imputation tool set: its desired features, basic functionality, and future work, as presented by Statistics Canada at the UN/ECE Work Session on Statistical Data Editing.

Presentation Transcript


  1. The development of a data editing and imputation tool set • UN/ECE Work Session on Statistical Data Editing • Topic (ii): Global solutions to editing • Claude Poirier • Oslo, Norway, 24–26 September 2012

  2. Outline • Context • Desired features of a tool set • A basic tool set • Future work Statistics Canada • Statistique Canada

  3. Context • HLG-BAS strategy • From little data to an abundance of data • Increased need for quick statistics • Towards an industrialised environment • Statistical Network • Using EDIMBUS recommended practices • E&I standards and guidelines developed by NSIs

  4. Desired features of a tool set • Functionality • Editing process: on-line edits, flow edits, fatal edits, distribution edits, outlying edits, selective editing, deductive edits, minimum change, processing sub-populations, macro editing • Imputation process: rule-based imputation, deductive imputation, model-based, donor-based, proration, processing sub-populations • Estimation process: Variance due to imputation
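
Of the imputation methods listed on this slide, proration is the simplest to illustrate. The sketch below assumes the usual meaning of the term: reported components are scaled by a common factor so that they add up to a known total. The function name and the sample figures are hypothetical and are not taken from the presentation or from any of the tools it discusses.

```python
def prorate(components, total):
    """Scale components by a common factor so that they sum to `total`.

    Hypothetical helper for illustration only; relative shares are preserved.
    """
    current = sum(components)
    if current == 0:
        raise ValueError("cannot prorate components that sum to zero")
    factor = total / current
    return [value * factor for value in components]


# Three reported components add up to 120, but the reported total is 100.
adjusted = prorate([30.0, 50.0, 40.0], 100.0)
print(adjusted)
```

Each component keeps its share of the total (30/120, 50/120, 40/120); only the overall level changes.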

  5. Desired features of a tool set (cont’d) • Quality criteria • Relevance: Meets the real needs • Accessibility: Is easy to use • Interpretability: Is easy to understand • Coherence: Offers standardization and interoperability • Accuracy: Produces expected outcome • Timeliness: Meets performance requirements

  6. Desired features of a tool set (cont’d) • Important Software Characteristics • Adaptability: Isolates specific statistical functions • Reliability: Offers robustness and trust • Maintainability: Enables enhancements • Interoperability: Offers the «plug and play» feature

  7. A basic tool set • BANFF • Linear programming to identify the minimum change • Imputation methods: Deductive; Donor; Estimator • CANCEIS • Mixtures of categorical and numerical census data • Minimum change while ensuring plausible imputation • SELEKT • Suspicion level; potential impact; pseudo-bias • Control on the importance of variables
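
The "minimum change" idea mentioned on this slide can be sketched as a search for the smallest set of fields that must be altered for a record to pass every edit. The example below is a brute-force illustration over small categorical domains. It is not the linear-programming approach the slide attributes to Banff, nor the CANCEIS method, and the field names, domains, and edit rules are all hypothetical.

```python
from itertools import combinations, product

# Hypothetical field domains, for illustration only.
DOMAINS = {
    "age_group": ["child", "adult"],
    "marital_status": ["single", "married"],
    "employed": ["yes", "no"],
}

# Hypothetical edits: each returns True when the record is acceptable.
EDITS = [
    lambda r: not (r["age_group"] == "child" and r["marital_status"] == "married"),
    lambda r: not (r["age_group"] == "child" and r["employed"] == "yes"),
]


def passes(record):
    """True when the record satisfies every edit."""
    return all(edit(record) for edit in EDITS)


def minimum_change(record):
    """Return (fields_changed, repaired_record) using the fewest fields.

    Tries subsets of fields in order of increasing size and, for each subset,
    every combination of domain values, stopping at the first passing record.
    """
    fields = list(DOMAINS)
    for size in range(len(fields) + 1):
        for subset in combinations(fields, size):
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if passes(candidate):
                    return subset, candidate
    return None


record = {"age_group": "child", "marital_status": "married", "employed": "yes"}
fields_changed, repaired = minimum_change(record)
print(fields_changed, repaired)
```

Here both edits fail, yet changing the single field `age_group` repairs the record, so one change is preferred over changing `marital_status` and `employed` together. Exhaustive search only scales to a handful of fields, which is one reason production tools rely on optimisation methods instead.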

  8. Functionality of the tool set (compared across Banff, CANCEIS and Selekt) • Editing: On-line edits; Construction of groups; Editing within groups; Fatal edits; Distribution edits; Outlying edits; Selective editing (scores); Deterministic edits; Fellegi-Holt (min change); Editing of macro data; Graphical editing • Imputation: Imputation within groups; Rule-based imputation; Deterministic imputation; Model-based imputation; Donor-based imputation; Prorating imputation • Estimation: Variance due to imputation
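
Two of the functions listed on this slide, donor-based imputation and imputation within groups, combine naturally. The sketch below is a minimal nearest-neighbour version: the missing value is copied from the closest complete record in the same group. The distance measure, field names, and data are assumptions made for illustration, and the code does not reproduce the donor selection used by Banff or CANCEIS.

```python
def impute_from_donor(recipient, donors, match_fields, target_field, group_field):
    """Copy `target_field` from the nearest donor in the recipient's group.

    Hypothetical helper: distance is the sum of absolute differences over
    `match_fields`. Returns None when the group has no usable donor.
    """
    candidates = [
        d for d in donors
        if d[group_field] == recipient[group_field] and d[target_field] is not None
    ]
    if not candidates:
        return None

    def distance(donor):
        return sum(abs(donor[f] - recipient[f]) for f in match_fields)

    best = min(candidates, key=distance)
    return {**recipient, target_field: best[target_field]}


donors = [
    {"id": 1, "region": "A", "employees": 10, "revenue": 500.0},
    {"id": 2, "region": "A", "employees": 48, "revenue": 2600.0},
    {"id": 3, "region": "B", "employees": 51, "revenue": 9000.0},
]
recipient = {"id": 4, "region": "A", "employees": 50, "revenue": None}

imputed = impute_from_donor(recipient, donors, ["employees"], "revenue", "region")
print(imputed)
```

Record 3 is closest on `employees` but belongs to another region, so the value comes from record 2. Restricting donors to the recipient's group is what "imputation within groups" adds to plain donor imputation.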

  10. Future work • To consider other tools • Macro and Graphical editing • To investigate survey platforms • POSS; BESt

  11. Thank you • For more information, please contact: Claude.Poirier@statcan.gc.ca
