Integrating Survey and Administrative Records to Reduce Respondent Burden and Data Collection Costs

This article explores the use of administrative records in official statistics, specifically focusing on the integration of survey and administrative record data to improve the balance of quality, cost, burden, and risk.

Presentation Transcript


  1. Two Approaches to the Use of Administrative Records to Reduce Respondent Burden and Data Collection Costs. John L. Eltinge, Office of Survey Methods Research, U.S. Bureau of Labor Statistics. 12th Meeting of the Group of Experts on Business Registers, September 14-15, 2011

  2. Acknowledgements and Disclaimer The author thanks Tony Barkume, Rick Clayton, Mike Davern, Bob Fay, Jenna Fulton, Gerry Gates, Pat Getz, Bill Iwig, Shelly Martinez, Bill Mockovak, Polly Phipps, John Ruser, and members of the FCSM Subcommittee on Administrative Records for many helpful discussions of the topics considered here. The views expressed here are those of the author and do not necessarily reflect the policies of the U.S. Bureau of Labor Statistics, nor the FCSM Subcommittee on Statistical Uses of Administrative Records.

  3. Overview I. Conceptual Background II. Two Approaches to Integration of Survey and Administrative Record Data A. Survey Core B. Administrative Record Core III. Methodological Issues IV. Empirical Issues V. Management Issues

  4. I. Conceptual Background A. Primary Question For a specified resource base, can we improve the balance of quality/cost/burden/risk in official statistics by integrating survey and administrative record data?

  5. I. Conceptual Background (continued) B. Possible Example: U.S. Consumer Expenditure Survey 1. Goal: Collect data on a wide range of consumer expenditures and related demographics 2. Current approach a. Household sample survey – complex design b. Personal visit and telephone collection

  6. I. Conceptual Background (continued) 3. Issues re cost and perceived burden (60+ minutes average interview time; cognitive complexity) 4. BLS currently exploring a wide range of redesign options 5. Prospective use of administrative-record data, and its long-term impact on the balance of quality, cost, burden and risk?

  7. I. Conceptual Background (continued) C. Possible Cases 1. Sales data from retailers, other sources - Aggregated across customers, by item - Possible basis for imputation of missing items or disaggregation of global reports 2. Collection of some data (with permission) through administrative records (e.g., grocery loyalty cards), linked with sample consumer units
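
The "disaggregation of global reports" case on this slide can be illustrated with a minimal sketch. It assumes a simple proportional allocation, which the presentation does not specify; the function name `disaggregate_global_report`, the item categories, and all dollar amounts are hypothetical.

```python
def disaggregate_global_report(global_amount, item_sales):
    """Split one reported total across items in proportion to aggregate sales.

    global_amount: a respondent's single "global" expenditure report
    item_sales: item -> retailer sales aggregated across customers
    Proportional allocation is an illustrative assumption, not the
    presentation's stated method.
    """
    total_sales = sum(item_sales.values())
    if total_sales <= 0:
        raise ValueError("aggregate sales must sum to a positive value")
    return {
        item: global_amount * sales / total_sales
        for item, sales in item_sales.items()
    }


# Hypothetical aggregated retailer sales, by item.
retailer_sales = {"dairy": 200.0, "produce": 300.0, "bakery": 500.0}

# A respondent reports 120.0 in total grocery spending with no item detail.
allocation = disaggregate_global_report(120.0, retailer_sales)
print(allocation)
```

The allocated amounts always sum back to the reported total, so the respondent's own report is preserved and the administrative data only supply the item shares.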

  8. I. Conceptual Background (continued) D. Framework: Population of Consumer Purchases Defined by Cross-Classification of: - Classification of product/service, time, geography - Characteristics of purchaser (Consumer? Demographics?) - Admin: Outlet, intermediaries (financial, other) E. How to modify estimation methods to incorporate administrative data? - Weighting and imputation for CPI cost weights, commonly produced tables - Construction of public-use datasets

  9. II. Two Approaches to Integration of Survey and Administrative Record Data A. Survey Core 1. Relatively standard sample survey design a. Possible use of administrative record data for frames, selection probabilities, weights b. Primary data collection through standard survey methodology 2. Supplement survey data with administrative records a. Problematic variables (burden, data quality) Current Example: U.S. National Immunization Survey b. Quality checks (microdata or aggregate levels)

  10. II. Two Approaches to Integration of Survey and Administrative Record Data (Continued) B. Administrative Record Core 1. Access administrative record data (at microdata or partially aggregated levels) 2. Per Lessler (2006), supplement as needed for inferential goals a. Fill in for incomplete population unit coverage b. Collect variables not captured in administrative records c. Adjust for data quality issues (e.g., timeliness or aggregation effects)

  11. II. Two Approaches to Integration of Survey and Administrative Record Data (Continued) C. Design Features 1. Generally differ substantially between the survey-core and administrative record core approaches 2. Need to consider both methodological and managerial components of design

  12. III. Methodological Issues Comparison and contrast of the “Survey Core” and “Administrative Record Core” approaches will involve a wide range of methodological issues A. Methods for Evaluation of Properties of Prospective Administrative Record Sources 1. Population aggregates (means, totals) 2. Variable relationships (regression, GLM) 3. Cross-sectional and temporal stability of (1) and (2)

  13. III. Methodological Issues (Continued) B. Methods for Integration for Sample Survey and Administrative Record Data: Adaptation of Methods from: 1. Partitioned designs (“multiple matrix sampling”) in education, health statistics 2. Multiple-frame designs (e.g., Lohr and Rao, 1999, 2003) - Frames may capture subpopulations through fundamentally different classification structures
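
As one way to picture the multiple-frame idea, the sketch below combines estimates from two overlapping frames (say, a survey frame A and an administrative-record frame B) using a Hartley-type composite estimator. This is a standard textbook form chosen for illustration, not necessarily the method the presentation has in mind; the function name, the mixing parameter `theta`, and all numbers are hypothetical.

```python
def hartley_dual_frame_total(y_a_only, y_b_only, y_ab_from_a, y_ab_from_b, theta=0.5):
    """Combine domain totals from two overlapping frames.

    y_a_only:    estimated total for units found only in frame A
    y_b_only:    estimated total for units found only in frame B
    y_ab_from_a: overlap-domain total estimated from the frame-A sample
    y_ab_from_b: overlap-domain total estimated from the frame-B sample
    theta:       weight in [0, 1] given to the frame-A overlap estimate
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    overlap = theta * y_ab_from_a + (1.0 - theta) * y_ab_from_b
    return y_a_only + y_b_only + overlap


# Hypothetical domain estimates.
total = hartley_dual_frame_total(
    y_a_only=100.0, y_b_only=50.0, y_ab_from_a=40.0, y_ab_from_b=44.0, theta=0.5
)
print(total)
```

The overlap domain is counted once, as a weighted average of the two estimates. In practice `theta` would be chosen to reduce variance, which requires variance estimates for each source.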

  14. III. Methodological Issues (Continued) C. Importance of Clarity on Sources of Variability Considered in Evaluation of Bias, Variance and Other Properties 1. Sources: Superpopulation effects Sample design (e.g., subsampling, matching) Unit, wave and item missingness or time lags Aggregation effects (temporal, cross-sectional) Reporting error (definitional, temporal, other) Imputation effects (including model lack of fit) 2. Conditioning and integration

  15. III. Methodological Issues (Continued) D. Working Model for Methodological Properties X = Frame, weight information Y = Sample survey data Z = Additional administrative record data Properties of estimator based on variability from: 1. Population structure 2. Administrative and survey collection processes (“filters”) 3. Homogeneity of (1) and (2) across cases

  16. III. Methodological Issues (Continued) E. Formal Evaluation of Properties Evaluate expected mean squared error with respect to each component in (D.1) and (D.2) Current information available at conceptual, empirical levels? Critical importance of understanding the underlying processes for collection and reporting of administrative data Ex: Propensity of a household or business to provide informed consent to link? Ex: Homogeneity of data quality characteristics over time?
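
One way to write the evaluation described on this slide is as an iterated expectation. The notation is an assumption introduced here for illustration: $\hat{\theta}$ is the estimator, $\theta$ the target quantity, $E_{\xi}$ the expectation over population structure (D.1), and $E_{p}$ the expectation over the administrative and survey collection processes (D.2), conditional on the realized population.

```latex
\mathrm{MSE}(\hat{\theta})
  = E_{\xi}\!\left\{ E_{p}\!\left[ (\hat{\theta} - \theta)^{2} \,\middle|\, \text{population} \right] \right\},
\qquad
E_{p}\!\left[ (\hat{\theta} - \theta)^{2} \,\middle|\, \text{population} \right]
  = \mathrm{Var}_{p}(\hat{\theta}) + \left[ E_{p}(\hat{\theta}) - \theta \right]^{2}.
```

The inner term separates process variance from process bias for a fixed population; the outer expectation then averages that result over the assumed population structure.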

  17. III. Methodological Issues (Continued) F. Prior Literature (Examples) Davern (2007, 2009) Demers (2009) Federal Committee on Statistical Methodology (1980) Fulton et al. (2009) Herzog, Winkler and Scheuren (2007) Jabine and Scheuren (1985) Jeskanen-Sundstrom (2007) Ord and Iglarsh (2007) Penneck (2007) Royce (2007) Winkler (2009)

  18. III. Methodological Issues (Continued) G. Prior literature: Two concepts of data quality 1. Per Davern (2007), extend usual ideas of “total survey error” (TSE) to administrative data: (Estimator) – (True value) = (frame error) + (sampling error) + (incomplete-data effects) + (measurement error) + (processing effects)
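
The decomposition on this slide can be written compactly as follows. The symbols are assumptions introduced here for readability: $\hat{\theta}$ is the estimator, $\theta$ the true value, and each $e$ term stands for one of the five named components.

```latex
\hat{\theta} - \theta
  = e_{\text{frame}} + e_{\text{sampling}} + e_{\text{incomplete}}
  + e_{\text{measurement}} + e_{\text{processing}}
```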

  19. III. Methodological Issues (Continued) 2. Broader definitions of data quality, e.g., Brackstone (1999): Accuracy (all components of TSE) AND: Timeliness, Relevance, Interpretability, Accessibility, Coherence 3. Risk: Degradation in any component of data quality a. Aggregate risk: Historical focus of quantitative work b. Systemic risk: Often important for statistical programs - cf. “complex and tightly coupled systems” (Perrow, 1984, 1999)

  20. III. Methodological Issues (Continued) H. Cost Structures 1. Statistical products (including surveys and administrative records) require substantial investments (often in intangibles) a. Data originators: - Initial administrative purpose - Accommodate statistical agency (data quality, learning curve, systems) b. Statistical agencies - Learning curves - Systems for acquisition, edit/imputation - Disclosure limitation

  21. III. Methodological Issues (Continued) 2. Broad acknowledgement of substantial costs 3. Less empirical information generally available on: a. Relative magnitudes of specific cost components b. Extent of homogeneity of results from (a) with respect to: - Type of administrative/business organization - Type of administrative records - Subpopulation - Other factors

  22. III. Methodological Issues (Continued) 4. Level of precision available on cost information: a. Purely qualitative b. Order of magnitude c. Relatively precise 5. Practical uses of cost information a. Qualitative decisions among options b. Fine-tuning specific procedures 6. Sources of information (F. LaFlamme, 2008) a. Special studies (risks: Hawthorne effects, incomplete accounting) b. Cost-recovery contract accounting

  23. III. Methodological Issues (Continued) I. Burden: 1. Respondent burden a. Elapsed time for collection, related activities b. Cognitive complexity c. Perceived sensitivity d. Informed consent - Direct access and linkage with survey - Obtained during original administrative-record work?

  24. III. Methodological Issues (Continued) 2. Organizational burden a. Informed consent b. Record linkage c. Data management d. Data quality evaluation and adjustment

  25. IV. Empirical Issues A. Properties of Input Data and Final Estimators B. Cost Structures 1. Obtaining Data: Contractual costs with provider Agency personnel (expertise) 2. Modification and maintenance of production systems C. Case studies are important, but may not allow inference to broader populations, variables

  26. V. Managerial Issues A. Central Issue: Management of Costs and Risks - Methodological risks (commonly studied) - Operational risks (“execution risks”) B. Contractual Structure: 1. Performance Criteria and Incentives for Data Provider (Timely Delivery, Quality, Notice on Changes) 2. Stability of Prospective Sources (AOL in 1999, 2011) 3. Changes in Agency Requests (New Products, New Channels) C. Agency Personnel: Skills, Incentives and Institutional Culture

  27. V. Managerial Issues (Continued) D. Contrast Between 1. Incremental risks (per standard statistical methodology) 2. Systemic risks cf. literature from Perrow (1984, 1999) and others on risks in “complex and tightly coupled systems”

  28. VI. Closing Remarks A. Design Issues in Integration of Survey and Administrative Record Data B. Goal: Improve Balance Among Quality, Cost, Burden and Risk C. Distinction Between “Survey Core” and “Administrative Record Core” Approaches D. Impact on Methodological Design and Management Design E. Importance of Development of a Spectrum of Empirical Results

  29. John L. Eltinge, Associate Commissioner, Office of Survey Methods Research, U.S. Bureau of Labor Statistics, www.bls.gov, 202-691-7404, Eltinge.John@bls.gov
