Topic (iii): Macro Editing Methods
Paula Mason and Maria Garcia (USA)
UNECE Work Session on Statistical Data Editing
Ljubljana, Slovenia, 9-11 May 2011
Topic (iii): Introduction
This topic covers issues concerning macro editing and selective editing.
• Macro editing
  • Key invited paper – Australia
  • Invited papers – Netherlands, New Zealand, Canada (2)
• Selective editing
  • Key invited paper – Spain
  • Invited papers – Sweden, UK
Topic (iii): Introduction
• Macro editing
  • WP.13 – significance editing framework for macro editing
  • WP.14 – development of a macro editing tool
  • WP.15, WP.16, WP.17 – macro editing in an overall editing strategy
• Selective editing
  • WP.18 – theoretical framework for selective editing
  • WP.19, WP.20 – selective editing using software tools developed in Sweden and applied by Sweden and the United Kingdom
Topic (iii): Macro Editing Methods Enjoy the presentations!
Topic (iii): Macro Editing Methods Summary of main developments and points for discussion
Macro editing: Main developments
• WP.13 (Australia)
  • Added macro editing strategies to the existing significance editing framework
  • Scores based on predicting impact on outputs
  • Target macro editing effort at different hierarchical levels
  • Incorporate sensitivity measures to address swamping and masking
Macro editing: Main developments
• WP.14 (Netherlands)
  • Software for developing custom macro editing tools accessed by scripts
  • Functionalities include aggregation techniques, data visualization, dynamic filters, data correction, and recalculation
Macro editing: Main developments
• WP.15 (New Zealand)
  • Incorporate macro editing in an overall editing strategy
  • Increased use of automatic micro edits
  • Prioritize using expected effects on the outputs
  • Developed quality indicators
  • Report efficiency gains
Macro editing: Main developments
Canada – Common survey framework for business surveys (two papers)
• WP.16
  • Iterative process
  • Rolling estimates model and common editing strategy
  • Elimination of manual intervention until after estimates are available
  • Allocation of resources based on macro quality indicators and micro level scores
Macro editing: Main developments (Continued)
• WP.17
  • Shared, generic corporate strategies, methodologies, and common metadata framework
  • Methodology for top-down approach
  • Methodology for measuring quality and measures for quality
  • Score functions to measure impact
Selective editing: Main developments
• WP.18 (Spain)
  • Theoretical framework for selective editing as an optimization problem
  • Minimize expected workload subject to minimal expected error on the aggregates
  • Linear constraints – computationally easier, suitable when timeliness is an issue
  • Quadratic constraints – wider error bounds, more units are marked for review
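The optimization view can be illustrated with a toy case. The sketch below is not the WP.18 method: it assumes a single linear constraint (the summed expected error of the units left unreviewed must not exceed a bound) and equal review cost per unit, in which case picking units in descending order of expected error gives the smallest review set. The function name, unit identifiers, and numbers are hypothetical.

```python
def select_for_review(expected_errors, bound):
    """Pick the fewest units to review so that the expected error left
    in the aggregate (sum over unreviewed units) is at most `bound`.

    expected_errors: dict mapping unit id -> expected absolute error
    contribution to the aggregate if the unit is NOT reviewed (assumed input).
    """
    remaining = sum(expected_errors.values())
    selected = []
    # Largest expected errors first: with one linear constraint and equal
    # cost per unit, this greedy order minimizes the number of reviews.
    ordered = sorted(expected_errors.items(), key=lambda kv: kv[1], reverse=True)
    for unit_id, err in ordered:
        if remaining <= bound:
            break
        selected.append(unit_id)
        remaining -= err
    return selected, remaining


# Hypothetical data: four units and a bound of 10 on residual expected error.
errors = {"u1": 40.0, "u2": 5.0, "u3": 25.0, "u4": 1.0}
selected, remaining = select_for_review(errors, bound=10.0)
print(selected, remaining)
```

With several aggregates or quadratic constraints the greedy shortcut no longer applies and a proper solver would be needed.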
Selective editing: Main developments
SELEKT tools at both Statistics Sweden and ONS
• Scores based on suspicion and potential impact on the outputs
• Need “expected” values, e.g. final data from the previous cycle
• WP.19 (Sweden)
  • Prioritize using expected effects on the outputs
  • “Expected” values using time series or cross-sectional data
  • Different levels of data edited concurrently
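A score that combines suspicion with potential impact might look roughly like the sketch below. This is not the SELEKT formula: the product form, the function names, the field names, and the sample data are all assumptions made for illustration.

```python
def local_score(reported, expected, weight, domain_total, suspicion):
    """Hypothetical score: suspicion (0..1) times the weighted deviation
    from the expected value, relative to the domain total."""
    impact = weight * abs(reported - expected) / domain_total
    return suspicion * impact


def rank_units(units, domain_total):
    """Return unit ids ordered from highest to lowest score."""
    scored = [
        (local_score(u["reported"], u["expected"], u["weight"],
                     domain_total, u["suspicion"]), u["id"])
        for u in units
    ]
    scored.sort(reverse=True)
    return [unit_id for _, unit_id in scored]


# Hypothetical units; "expected" could be final data from the previous cycle.
units = [
    {"id": "A", "reported": 120.0, "expected": 100.0, "weight": 10.0, "suspicion": 0.2},
    {"id": "B", "reported": 5000.0, "expected": 500.0, "weight": 2.0, "suspicion": 0.9},
    {"id": "C", "reported": 80.0, "expected": 82.0, "weight": 50.0, "suspicion": 0.1},
]
domain_total = sum(u["weight"] * u["expected"] for u in units)
ranked = rank_units(units, domain_total)
print(ranked)
```

Units at the top of the ranking would be sent for manual review first; where to cut the list is a separate decision.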
Selective editing: Main developments
• WP.20 (UK)
  • Selective editing as part of an overall efficient editing strategy
  • Assess impact on quality of changes to edit rules prior to using SELEKT
  • Suspicion based on traditional edit rules or test variables
Points for discussion
Using a software tool and/or scores for guiding macro editing operations and/or selective editing has benefits: it standardizes the review process, can be used for several surveys, and provides overall cost benefits.
• How are agencies incorporating cost/resource savings into the survey process?
• How are agencies planning on maintaining these tools/systems given the complexities of the metadata, constraints, variable mappings, expectation models, and hierarchies as surveys and output requirements evolve (particularly business surveys)?
Points for discussion (Continued)
• What is the effect on other survey activities?
• How is the overall macro editing and/or selective editing process contributing to the overall data quality?
• How can the effect on data quality be measured?
Points for discussion
When macro editing and/or selective editing tools are applied to periodic survey data, subject matter experts may acquire further knowledge about the survey from the macro editing and/or selective editing operations:
• How can this knowledge be used to improve the survey process?
• How can we incorporate this knowledge to get insight into how to reduce errors and/or enhance micro editing for the next cycle?
Points for discussion
Both macro editing and selective editing scores need estimates of anticipated values.
• How should the “expected” values needed for computing measures of suspicion and/or impact be modeled?
• How do we choose the appropriate domains for computation of “expected” values in order to achieve relevancy and accuracy?
• What is the minimum number of observations needed to compute these “expected” values within each domain?
Points for discussion (Continued)
• How do we separate model errors for expected values from response errors (for either aggregate expected values or micro expected values) in a production environment?
• Are there concerns about potential bias under certain variable distributions that may result from a collection of non-influential units that will not be addressed by selective editing?
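To make the questions about domains and minimum counts for “expected” values concrete, here is a minimal sketch of one possible approach. It assumes the expected value is the domain median of the previous cycle's final data, with a fallback to the overall median when a domain has too few observations. The median, the fallback rule, the `min_obs` threshold, and the sample data are illustrative assumptions, not recommendations from the papers.

```python
from statistics import median


def expected_by_domain(previous_values, min_obs=5):
    """Compute an 'expected' value per domain from previous-cycle data.

    previous_values: list of (domain, value) pairs (assumed input layout).
    Domains with fewer than `min_obs` observations fall back to the
    overall median, an assumed rule chosen only for illustration.
    """
    by_domain = {}
    for domain, value in previous_values:
        by_domain.setdefault(domain, []).append(value)

    overall = median([value for _, value in previous_values])
    expected = {}
    for domain, values in by_domain.items():
        if len(values) >= min_obs:
            expected[domain] = median(values)
        else:
            expected[domain] = overall  # too few observations in this domain
    return expected


# Hypothetical previous-cycle data: three retail units, one mining unit.
previous = [("retail", 10), ("retail", 12), ("retail", 11), ("mining", 100)]
expected = expected_by_domain(previous, min_obs=3)
print(expected)
```

The example shows the trade-off raised above: the small mining domain inherits a value dominated by retail units, which keeps the estimate stable but may make it less relevant.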
Point for discussion
Most statistics may benefit from the use of macro editing and/or selective editing.
• What are the agencies' specifications for a set of general mandatory guidelines?
Point for discussion
When designing an overall editing strategy:
• To what extent should agencies incorporate selective editing and/or macro editing in their overall editing strategies?
• For what kind of data are these strategies suitable?
• How can we take into account the fact that final data may be used by other users and for different purposes?
Thank you for your attention! Paula and Maria