
Migration of a large survey onto a micro-economic platform

Val Cox, April 2014.


Presentation Transcript


  1. Migration of a large survey onto a micro-economic platform. Val Cox, April 2014

  2. Micro-economic Platform (MEP)
     • Standardises and automates processes
       - Provides more efficient processing and more analysis
     • Enables Statistics NZ to gain more from available data
       - Basic principle: use administrative data wherever possible, with surveys filling the gaps
       - Objective: bring core information about every business in the economy into the Longitudinal Business DB, so that Statistics NZ can respond quickly to changing needs for economic statistics

  3. Aim of paper
     • To discuss the challenges of building a non-response imputation package for a large survey on the MEP
       - Rationalises the use of
         • Banff for outlier detection and imputation
         • SEVANI (System for Estimation of Variance due to Nonresponse and Imputation) to estimate sampling and non-sampling errors

  4. Annual Enterprise Survey (AES)
     • Provides statistics on the financial performance and position of New Zealand businesses
       - Captures about 90% of New Zealand's GDP
     • Uses four different major data sources
       - Three administrative sources (covering 72% of the population)
       - One postal survey

  5. AES before MEP

  6. Editing strategy of AES on MEP
     • Guided by the Methodological Standard for E&I (editing and imputation)
     • Key objective of the standard
       - Editing is fit-for-purpose and enables continuous improvement of processes and data quality
     • Key principles used
       - Automate editing processes where possible
       - Use Statistics NZ standard editing tools wherever possible, to achieve standardisation

  7. Editing system of AES in MEP
     • Uses Banff to automate and standardise editing and imputation processes
     • Uses analytical views to assess the quality of the edited data

  8. Challenges and solutions
     A. Sheer volume of data
       - 28 questionnaires, 113 industries and 180 variables
     • Solution: use of a “thin slice” approach
       - Restrict the dataset to one questionnaire and one industry to show that all stages of E&I are working
       - Once successful, expand the dataset to include more industries until all 28 questionnaires are replicated
       - Successful in determining the optimal level of automation for correcting failed edits

  9. Challenges and solutions
     B. Determining which variable is erroneous when groups of variables must add or subtract to a total
       - The Banff “errorloc” procedure always recommends changing one variable by a large amount
       - The change is made by the “deterministic” procedure
     • Solution: assign weights to variables
       - Assign lower weights to more reliable variables so Banff doesn’t change their values
       - Examples: totals and gross profit, since respondents use these to determine the tax they pay

  10. Challenges and solutions
     C. Outlier detection
       - The old system detects outliers in 3 key variables but unlinks the whole unit (all variables)
       - Banff does univariate outlier detection
     • Solution: compared 2 E&I runs of the data
       - The 1st run had only the 3 key variables set as outliers; the 2nd had all variables included in the outlier steps
     • Decision: choose the variables to be set as outliers based on the effect on the totals
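The comparison of the two E&I runs on slide 10 can be sketched roughly as follows. This is an illustrative Python sketch, not the Banff or MEP implementation: the function names, the sample figures and the 1% tolerance are all assumptions made for the example, and the slide does not state what decision rule was actually applied.

```python
def total_effects(run_key_only, run_all_vars, variables):
    """Relative change in each variable's total between two E&I runs.

    run_key_only: output units when only the key variables were in the outlier steps
    run_all_vars: output units when all variables were in the outlier steps
    (Both inputs are hypothetical lists of dicts, one dict per unit.)
    """
    effects = {}
    for var in variables:
        total_a = sum(unit[var] for unit in run_key_only)
        total_b = sum(unit[var] for unit in run_all_vars)
        # Guard against a zero baseline total.
        effects[var] = (total_b - total_a) / total_a if total_a else float("nan")
    return effects


def variables_with_material_effect(effects, tolerance=0.01):
    """Variables whose total moved by more than `tolerance` (an assumed cut-off)."""
    return sorted(var for var, change in effects.items() if abs(change) > tolerance)


# Made-up data for two units and two variables.
run_key_only = [{"sales": 400, "wages": 100}, {"sales": 600, "wages": 300}]
run_all_vars = [{"sales": 400, "wages": 100}, {"sales": 550, "wages": 299}]

effects = total_effects(run_key_only, run_all_vars, ["sales", "wages"])
material = variables_with_material_effect(effects)
print(effects, material)
```

In this made-up example only `sales` moves by more than the tolerance, so it would be the candidate for a closer look when deciding which variables to include in the outlier steps.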

  11. Challenges and solutions
     D. Running imputation one variable at a time would have been very time-consuming
     • Solution: group variables
       - By imputation method (4 methods)
       - By industry (some industries have different characteristics)
       - By type of variable (e.g. some variables can be negative)
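One way to picture the grouping on slide 11 is to batch variables that share the same imputation method, industry group and sign behaviour, so that each batch can be imputed in one pass. The sketch below is illustrative only: the field names, the method labels `"ratio"` and `"mean"`, and the industry groups are hypothetical, since the slides do not name the four methods or the groupings used.

```python
from collections import defaultdict


def batch_variables(variable_specs):
    """Group variable names by (method, industry group, can be negative)."""
    batches = defaultdict(list)
    for spec in variable_specs:
        key = (spec["method"], spec["industry_group"], spec["can_be_negative"])
        batches[key].append(spec["name"])
    return dict(batches)


# Hypothetical variable metadata; none of these names come from the AES itself.
specs = [
    {"name": "sales", "method": "ratio", "industry_group": "general", "can_be_negative": False},
    {"name": "purchases", "method": "ratio", "industry_group": "general", "can_be_negative": False},
    {"name": "net_profit", "method": "ratio", "industry_group": "general", "can_be_negative": True},
    {"name": "interest", "method": "mean", "industry_group": "finance", "can_be_negative": False},
]

batches = batch_variables(specs)
for key, names in batches.items():
    print(key, names)
```

Four variables collapse into three batches here; with 180 variables the saving from running one imputation step per batch rather than per variable would be correspondingly larger.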

  12. Challenges and solutions
     E. Imputation failed for some variables
       - Some imputation cells were too small
     • Solution: merged small imputation cells
       - Each imputation stage was run twice, first without cell merging and then with cell merging, resulting in 8 imputation stages
       - A “catch-all” stage at the end (the 9th stage) carries out mean imputation by industry
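The staged fallback on slide 12 (unmerged cells first, then merged cells, then a catch-all industry mean) can be sketched as a cascade. This is a simplified Python illustration, not Banff: it uses mean imputation at every stage, and the cell definition (industry by size band), the merging rule and the minimum donor counts are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def cascade_impute(records, var, stages):
    """Fill missing `var` stage by stage.

    stages: list of (stage_name, key_function, min_donors). A missing value is
    filled by the first stage whose cell has at least `min_donors` respondents.
    Only originally reported values act as donors.
    """
    donors = [r for r in records if r[var] is not None]
    result = [dict(r) for r in records]
    for name, key_fn, min_donors in stages:
        cells = defaultdict(list)
        for d in donors:
            cells[key_fn(d)].append(d[var])
        for r in result:
            if r[var] is None:
                values = cells.get(key_fn(r), [])
                if len(values) >= min_donors:
                    r[var] = mean(values)
                    r["imputed_by"] = name
    return result


# Hypothetical merging rule: pool small and medium size bands.
MERGED_BAND = {"small": "small_medium", "medium": "small_medium"}

stages = [
    ("cell", lambda r: (r["industry"], r["size"]), 2),
    ("merged", lambda r: (r["industry"], MERGED_BAND.get(r["size"], r["size"])), 2),
    ("catch_all", lambda r: r["industry"], 1),  # mean imputation by industry
]

records = [
    {"id": 1, "industry": "A", "size": "small", "sales": 100.0},
    {"id": 2, "industry": "A", "size": "small", "sales": 120.0},
    {"id": 3, "industry": "A", "size": "small", "sales": None},
    {"id": 4, "industry": "A", "size": "medium", "sales": 200.0},
    {"id": 5, "industry": "A", "size": "medium", "sales": None},
    {"id": 6, "industry": "B", "size": "large", "sales": 50.0},
    {"id": 7, "industry": "B", "size": "large", "sales": None},
]

imputed = {r["id"]: r for r in cascade_impute(records, "sales", stages)}
for unit_id in (3, 5, 7):
    print(unit_id, imputed[unit_id]["sales"], imputed[unit_id]["imputed_by"])
```

Recording which stage filled each value, as the `imputed_by` field does here, makes it easy to see how many units fall through to the catch-all stage.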

  13. Challenges and solutions
     F. Challenges with no solutions
       - Analysis of improvements in the E&I was slow, as it took several hours to run E&I and write back to the main data storage area to view the data in a cube
       - The attempt to replicate published results as closely as possible created a dilemma: when to stop trying?
       - What was the “right” answer?

  14. SEVANI
     • Provided a standardised and automated method to report on estimates of variance due to sampling as well as non-response and imputation
     • Challenges:
       - Can produce output for only one variable at a time
       - SEVANI requires a lot of parameters to set up
       - MEP is unit-based, so it can’t easily output SEVANI results
     • Solution:
       - Use of a macro to identify variable names
       - Wrote SAS code to set up the parameters
       - Output SEVANI results outside MEP

  15. Next steps
     • Educate the users of the new system on MEP
     • Identify potential areas for improvement in the editing and imputation system
     • Create a new MEP collection for Charities data, with its own editing and imputation system
