Explore Statistics Canada's unified enterprise surveys, integrated business statistics program, and active collection strategies to enhance data quality and reduce costs. Learn about score functions, quality indicators, and response propensity models.
An Active Collection using Intermediate Estimates to Manage Follow-Up of Non-Response and Measurement Errors
Jeannine Claveau, Serge Godbout and Claude Turmelle
Statistics Canada
International Total Survey Error Workshop, Québec, June 20, 2011
Outline
Statistics Canada • Statistique Canada
• Introduction
• Quality Indicators (QI)
• Measure of Impact (MI) Scores
• Future Work
2020-01-04
Unified Enterprise Surveys (UES)
• UES consists of 58 annual business surveys integrated in terms of content, collection and data processing
• Collects information on enterprise financial variables
• Collection period: February to early October
• Telephone pre-contact used for new units in the sample
• Mail questionnaires for initial data gathering
• Telephone follow-up conducted to collect data from non-respondents and to resolve failed edits
Unified Enterprise Surveys (UES)
• A score function is used to prioritize telephone follow-up for non-response
  • Score based on weighted sampling revenue
• For most of the UES surveys: no score function used for failed edit follow-up
• Collection processing system: Blaise
  • Paradata in Blaise Transaction History files
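As an illustration of a non-response score function of this kind, here is a minimal Python sketch. It assumes the score is the design weight times the frame revenue, which is one reading of "weighted sampling revenue"; the `Unit` class, its field names and the sample values are hypothetical and not taken from the UES systems.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    unit_id: str
    design_weight: float   # sampling weight (assumed available from the design)
    frame_revenue: float   # revenue on the sampling frame (assumed size measure)
    responded: bool


def follow_up_priority(units):
    """Return non-responding units, highest weighted revenue first."""
    pending = [u for u in units if not u.responded]
    return sorted(
        pending,
        key=lambda u: u.design_weight * u.frame_revenue,
        reverse=True,
    )


# Hypothetical sample: scores are A=900, B=500, D=1200; C has already responded.
sample = [
    Unit("A", 1.0, 900.0, False),
    Unit("B", 10.0, 50.0, False),
    Unit("C", 4.0, 300.0, True),
    Unit("D", 2.0, 600.0, False),
]
priority_ids = [u.unit_id for u in follow_up_priority(sample)]
print(priority_ids)
```

Sorting on the product rather than on revenue alone means a small unit with a large weight can outrank a large take-all unit, which is the point of weighting the score.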
Integrated Business Statistics Program (IBSP)
• IBSP is under development to redesign and expand UES, integrating other enterprise surveys and sub-annual surveys
• Goals:
  • Reduce operating costs
  • Enhance quality assurance
• IBSP will integrate 120 surveys by 2016 (phase 1: 2014)
• Electronic questionnaires (electronic data collection) will be the principal collection mode offered to enterprises
Current UES – Processing Model
[Diagram: Sampling → Collection → Processing → Analysis → Dissemination]
• Collection, processing and analysis are run sequentially
• Estimates produced at very end only
• Collection ends at set date
IBSP – Estimates Model
[Diagram: Sampling, then Collection, Processing and Analysis running together, then Dissemination]
• Collection, processing and analysis will be run in parallel
• Estimates will be produced and re-run periodically
• Collection could end earlier when a pre-specified quality target has been met
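The early-stopping idea can be sketched as a simple rule applied to each periodic run. The example below assumes the quality target is expressed as a maximum coefficient of variation (CV) for one key estimate; the slides do not say which indicator the target would use, and the function name and CV values are hypothetical.

```python
def first_wave_meeting_target(cv_by_wave, target_cv):
    """Return the 1-based index of the first run whose CV meets the target.

    Returns None when no run reaches the target, in which case collection
    would continue to its scheduled end date.
    """
    for wave, cv in enumerate(cv_by_wave, start=1):
        if cv <= target_cv:
            return wave
    return None


# Hypothetical CVs from four successive monthly runs of the estimates.
cvs = [0.12, 0.08, 0.05, 0.04]
stop_wave = first_wave_meeting_target(cvs, target_cv=0.05)
print(stop_wave)
```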
Active Collection
• Role: Manage follow-up of non-response and measurement errors (failed edits)
• Responsive design (Laflamme and Karaganis, 2010) or dynamic adaptive approach (Schouten, Calinescu and Luiten, 2011) that uses data available during collection to modify the collection strategy
• Estimates and quality indicators will be produced periodically throughout collection, e.g. on a monthly basis
• Scores measuring impact on estimates and on quality indicators are then calculated to allocate and prioritize telephone follow-up
Basic Collection Strategy
[Diagram: the initial sample S passes through successive designs d0, d1, d2, ..., di-2, di-1; intermediate estimates are produced along the way from the observed response (R1, R2, R3, ..., Ri-1, Ri) and non-response (NR1, NR2, NR3, ..., NRi-1, NRi).]
Parameter and Estimator
• Variables of interest: set of I key variables
• Parameters of interest:
• Stratified expansion estimators:
• Sampling variances (under a stratified Bernoulli design):
• Where i, k and h identify respectively the I variables, the N_h units and the H strata
  • N_h = stratum population size
  • p_h = unit sampling probability within the stratum
  • n_h = stratum sample size
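The formulas on this slide did not survive extraction. The block below is a hedged reconstruction from the definitions given, not the authors' exact notation: it assumes the parameters are population totals, that U_h and s_h (symbols introduced here) denote the stratum population and stratum sample, and that the expansion weight is 1/p_h. Since the slide also defines n_h, the original may instead use the weight N_h/n_h; that cannot be confirmed from the text.

```latex
\begin{align}
  % Parameter of interest: population total of variable i (assumed)
  Y_i &= \sum_{h=1}^{H} \sum_{k \in U_h} y_{ik}, \qquad i = 1, \dots, I \\
  % Stratified expansion estimator with weight 1/p_h (assumed)
  \hat{Y}_i &= \sum_{h=1}^{H} \frac{1}{p_h} \sum_{k \in s_h} y_{ik} \\
  % Sampling variance under stratified Bernoulli sampling
  V(\hat{Y}_i) &= \sum_{h=1}^{H} \frac{1 - p_h}{p_h} \sum_{k \in U_h} y_{ik}^{2}
\end{align}
```

The variance follows because, under Bernoulli sampling, the inclusion indicators are independent with variance p_h(1 - p_h), so each unit contributes y_{ik}^2 (1 - p_h)/p_h.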
Non-Response
• Response propensity model:
  • Auxiliary data and paradata would be used to estimate response propensities
• Estimation:
  • In case of non-response, either imputation or reweighting will be used to account for missing data
  • Response propensities could be used to form homogeneous imputation or reweighting classes to reduce the non-response bias (Haziza and Beaumont, 2007)
• Stratified expansion estimators:
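The reweighting option can be sketched as follows. This minimal Python example assumes the reweighting classes have already been formed (for instance by grouping units with similar estimated response propensities) and adjusts respondent weights within each class by the ratio of sampled to responding weight. The dictionary keys, class labels and values are hypothetical.

```python
from collections import defaultdict


def reweighted_total(units):
    """Estimate a total by reweighting respondents within classes.

    Each unit is a dict with keys 'cls' (class label), 'weight' (design
    weight), 'responded' (bool) and 'y' (reported value, None if missing).
    """
    sampled_w = defaultdict(float)
    resp_w = defaultdict(float)
    for u in units:
        sampled_w[u["cls"]] += u["weight"]
        if u["responded"]:
            resp_w[u["cls"]] += u["weight"]

    # A class without respondents cannot be adjusted; it must be collapsed first.
    for cls in sampled_w:
        if resp_w[cls] == 0:
            raise ValueError(f"class {cls!r} has no respondents")

    total = 0.0
    for u in units:
        if u["responded"]:
            adjustment = sampled_w[u["cls"]] / resp_w[u["cls"]]
            total += u["weight"] * adjustment * u["y"]
    return total


# Hypothetical data: two classes, each with half of its weight responding.
units = [
    {"cls": "low", "weight": 2.0, "responded": True, "y": 10.0},
    {"cls": "low", "weight": 2.0, "responded": False, "y": None},
    {"cls": "high", "weight": 1.0, "responded": True, "y": 100.0},
    {"cls": "high", "weight": 1.0, "responded": True, "y": 120.0},
    {"cls": "high", "weight": 1.0, "responded": False, "y": None},
    {"cls": "high", "weight": 1.0, "responded": False, "y": None},
]
total = reweighted_total(units)
print(total)
```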
Quality Indicators (QI)
• Role:
  • Monitor collection progress
  • Help to allocate and prioritize collection efforts
• Can be item-based
  • Specific to a variable of interest
  • Variance, CV
  • Item response rate of a variable of interest
  • Bias:
  • MSE:
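A few of these item-based indicators can be computed directly once an estimate, its variance and a bias estimate are available. The sketch below uses the standard definitions (CV as standard error over the estimate, MSE as variance plus squared bias). The weighted form of the item response rate is an assumption, since the slide does not say whether the rate is weighted; all values are hypothetical.

```python
import math


def coefficient_of_variation(estimate, variance):
    """CV = standard error divided by the absolute value of the estimate."""
    return math.sqrt(variance) / abs(estimate)


def weighted_item_response_rate(weights, reported):
    """Share of the total weight carried by units that reported the item."""
    responding = sum(w for w, r in zip(weights, reported) if r)
    return responding / sum(weights)


def mean_squared_error(variance, bias):
    """MSE = variance + bias squared."""
    return variance + bias ** 2


cv = coefficient_of_variation(estimate=2000.0, variance=10000.0)
rate = weighted_item_response_rate([10.0, 5.0, 5.0], [True, False, True])
mse = mean_squared_error(variance=10000.0, bias=50.0)
print(cv, rate, mse)
```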
Quality Indicators (QI)
• Can be covariate-based
  • Derived from statistics on the estimated response propensities given the covariates X
  • Independent of the variables of interest
• Examples of covariate-based QIs (Schouten, 2011):
  • Mean response propensity:
  • R-indicator:
  • Standardized maximal bias:
  • Standardized maximal variance:
  • Standardized maximal MSE:
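The slide's formulas were lost in extraction. As a sketch, the following Python code uses the commonly published form of the R-indicator, R = 1 - 2 S(rho), where S(rho) is the design-weighted standard deviation of the estimated response propensities, and the associated standardized maximal bias (1 - R) / (2 * mean propensity). Whether the slides use exactly these forms is an assumption; the maximal variance and MSE indicators are not reproduced because their definitions cannot be recovered from the text. Function names and inputs are hypothetical.

```python
import math


def mean_propensity(rho, weights):
    """Design-weighted mean of the estimated response propensities."""
    return sum(w * r for w, r in zip(weights, rho)) / sum(weights)


def r_indicator(rho, weights):
    """R = 1 - 2 * S(rho), with S the weighted standard deviation of rho."""
    n_hat = sum(weights)  # estimated population size, must exceed 1
    mean = mean_propensity(rho, weights)
    s2 = sum(w * (r - mean) ** 2 for w, r in zip(weights, rho)) / (n_hat - 1)
    return 1.0 - 2.0 * math.sqrt(s2)


def standardized_max_bias(rho, weights):
    """Upper bound on |bias| / S(y): (1 - R) / (2 * mean propensity)."""
    return (1.0 - r_indicator(rho, weights)) / (2.0 * mean_propensity(rho, weights))


# Equal propensities give R close to 1; very unequal ones pull R down.
r_equal = r_indicator([0.6] * 5, [1.0] * 5)
r_unequal = r_indicator([0.2, 0.8], [1.0, 1.0])
b_unequal = standardized_max_bias([0.2, 0.8], [1.0, 1.0])
print(r_equal, r_unequal, b_unequal)
```

Follow-up that raises the mean propensity while shrinking its spread increases R and lowers the maximal bias at the same time.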
Measure of Impact (MI) Scores
• Types of scores
  • Common types: edit-related and estimate-related score functions
  • Example: predicted difference in estimates (Hedlin, 2008)
• Proposal: generalize the MI Score to include quality-related score functions
  • For an estimated parameter (estimate or quality indicator)
  • Definition: based on the estimated parameter recomputed after changing the reported values and/or covariates of unit k to alternative values, together with a scaling factor
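The definition itself was an image and is missing from the text. One form consistent with the wording on the slide is sketched below; the symbols and the use of an absolute difference divided by the scaling factor are assumptions, not the authors' confirmed notation.

```latex
% Hedged reconstruction: \hat{\theta} is the current estimated parameter,
% \hat{\theta}(y_k^{*}, x_k^{*}) is the same parameter recomputed after replacing
% the reported values and/or covariates of unit k by y_k^{*} and/or x_k^{*},
% and c is a scaling factor. All symbols are assumed.
\[
  \mathrm{MI}_k(\hat{\theta})
    = \frac{\bigl|\, \hat{\theta} - \hat{\theta}(y_k^{*}, x_k^{*}) \,\bigr|}{c}
\]
```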
Measure of Impact (MI) Scores
• MI Score for an estimated total:
  • Requires predicted values to compare to reported values
  • Proposal: use imputation to obtain predicted values
  • Used to prioritize units for failed edit follow-up
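For an expansion-type total, replacing the reported value of unit k by its predicted value changes the estimate by the unit's weight times the difference, which suggests the following minimal Python sketch. It assumes the score is that weighted absolute difference scaled by the estimated total; the scaling choice, function name and data are hypothetical, and the predicted values stand in for imputed values.

```python
def mi_scores_for_total(weights, reported, predicted):
    """Score each unit by w_k * |y_k - yhat_k| / |estimated total|."""
    total = sum(w * y for w, y in zip(weights, reported))
    if total == 0:
        raise ValueError("estimated total is zero; choose another scaling factor")
    return [
        w * abs(y - p) / abs(total)
        for w, y, p in zip(weights, reported, predicted)
    ]


# Hypothetical units: the third reports twice its predicted (imputed) value.
weights = [2.0, 5.0, 1.0]
reported = [100.0, 40.0, 1000.0]
predicted = [110.0, 40.0, 500.0]
scores = mi_scores_for_total(weights, reported, predicted)

# Units with the largest potential impact on the total are followed up first.
follow_up_order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
print(scores, follow_up_order)
```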
Measure of Impact (MI) Scores
• MI Score for item-based quality indicators
  • MI Score for estimated sampling variance for expansion estimators
  • Specific to a variable of interest
  • Also uses imputation to obtain predicted values
  • Linked directly to quality of output estimates
  • Prioritizes units for failed edit follow-up
Measure of Impact (MI) Scores
• MI Score for item-based quality indicator
• MI Score for covariate-based quality indicator
• Used to prioritize units for both non-response and failed edit follow-up
Active Collection Management
• A large number of variables to monitor
  • Monitoring all of them will be a challenge
  • Not all equally important
• Identify a limited number of key variables
• For each key variable
  • Quality monitored using item-based QIs and MI Scores
• For the non-key variables
  • Quality controlled using covariate-based QIs
Active Collection Management
• MI scores for each estimated parameter and quality indicator are considered local scores
• In order to prioritize units for telephone follow-up, a global score per unit is needed
• Derive a global MI Score (Hedlin, 2008)
  • Sum, maximum or Euclidean distance could be used
• Some QIs are appropriate for evaluating the impact of non-response and others for the impact of edit failures
  • Derive one global score for non-response follow-up and one global score for failed edit follow-up
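The three combination rules named above can be sketched in a few lines of Python. The function names, unit identifiers and local score values are hypothetical; the example only shows that the choice of rule can change the follow-up order.

```python
import math


def global_score(local_scores, method="euclidean"):
    """Combine one unit's local MI scores into a single global score."""
    if method == "sum":
        return sum(local_scores)
    if method == "max":
        return max(local_scores)
    if method == "euclidean":
        return math.sqrt(sum(s * s for s in local_scores))
    raise ValueError(f"unknown method: {method}")


def rank_units(local_by_unit, method):
    """Return unit ids ordered from highest to lowest global score."""
    return sorted(
        local_by_unit,
        key=lambda uid: global_score(local_by_unit[uid], method),
        reverse=True,
    )


# Hypothetical local scores for two units on two key variables.
local_by_unit = {"U1": [0.3, 0.4], "U2": [0.6, 0.0]}
print(rank_units(local_by_unit, "sum"))        # U1 first: moderate on both
print(rank_units(local_by_unit, "max"))        # U2 first: one dominant score
print(rank_units(local_by_unit, "euclidean"))  # U2 first (0.6 versus 0.5)
```

The sum favours units that matter moderately for many variables, while the maximum favours units that matter a lot for a single variable; the Euclidean distance sits between the two.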
Control Quality with Covariate-Based QIs
• Goal: Increase the average of the response propensities while improving their homogeneity.
Summary
Future Work
• Methodology development
  • Response propensity model: development of a model based on data and paradata
  • Item-based and covariate-based QIs
• Validation of the proposed strategy
  • Conduct simulation studies and develop prototypes using the current UES environment
  • Summer 2011 prototype: response rates, imputation rate, CV and MI scores
  • Next prototype: other local and global MI scores and QIs
Discussion
• What quality indicators are appropriate to measure the risks of potential bias in the estimates?
• What is the best way to use a quality indicator (e.g. the R-indicator) to monitor collection of highly skewed business surveys?
• The proposed approach obviously affects the response propensities throughout collection. Although we can adjust the estimator later on to take this into account, is it something we should move away from? Or should we take advantage of it?
• In the proposed approach, are there any additional aspects that should be considered?
Merci / Thank You
For more information, please contact:
• Jeannine Claveau, jeannine.claveau@statcan.gc.ca
• Serge Godbout, serge.godbout@statcan.gc.ca
• Claude Turmelle, claude.turmelle@statcan.gc.ca