
The Utility of Metadata for Questionnaire Design and Evaluation

Jim Esposito, Bureau of Labor Statistics, Washington, DC


Presentation Transcript


  1. The Utility of Metadata for Questionnaire Design and Evaluation Jim Esposito Bureau of Labor Statistics Washington, DC Disclaimer: The views and opinions expressed in this presentation are those of the presenter/author and not necessarily those of the Bureau of Labor Statistics or the Bureau of the Census. QUEST2007: Statistics Canada, Ottawa, Canada

  2. Objectives of Presentation • To draw attention to the concept of metadata and to its scope and relevance • To describe a case study involving the measurement of work/employment that illustrates the utility of metadata in evaluating and designing questionnaire items

  3. Metadata: An Informal Definition • Metadata can be defined as any information (verbal or numeric or code, qualitative or quantitative) that provides context for understanding survey-generated data: • Domain-specific/ethnographic information • Concepts and question objectives • Questionnaire items and administration modes • Instructional materials • Pre- and post-survey evaluation research • Classification algorithms

  4. The Measurement of Labor-Force Status via the CPS • Current Population Survey [CPS] • Official source of LF statistics in USA (e.g., monthly unemployment rate) • CPS measures work, not jobs • 60,000 households a month • Principal LF categories: Employed [EMP], unemployed [UE], not-in-the-labor-force [NILF] • Employed: Work for pay, one hour or more; unpaid work in family business, 15 hours or more; job (but absent last week) • Data collected monthly via two modes [face-to-face and telephone CAPI; centralized CATI]
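The "Employed" definition on this slide can be read as a simple three-way rule. The following is a minimal sketch of that reading only, not the actual CPS classification algorithm; the class and field names (`WeekActivity`, `paid_hours`, `unpaid_family_business_hours`, `has_job_but_absent`) are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class WeekActivity:
    """Hypothetical record of one person's activity during the reference week."""
    paid_hours: float = 0.0                    # hours of work for pay
    unpaid_family_business_hours: float = 0.0  # unpaid hours in a family business
    has_job_but_absent: bool = False           # has a job but was absent last week


def is_employed(activity: WeekActivity) -> bool:
    """Apply the slide's summary rule for the EMP category (an assumption,
    not the production CPS logic)."""
    return (
        activity.paid_hours >= 1
        or activity.unpaid_family_business_hours >= 15
        or activity.has_job_but_absent
    )


# One paid hour is enough; 14 unpaid family-business hours are not.
print(is_employed(WeekActivity(paid_hours=1)))
print(is_employed(WeekActivity(unpaid_family_business_hours=14)))
```

Writing the rule this way makes the thresholds explicit: a single paid hour qualifies, which is why marginal work activity matters so much for the wording of the work question.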

  5. CPS: Some Relevant Details • The CPS was redesigned in the early 1990s; the redesign used a multiple-method evaluation plan (e.g., behavior coding, interviewer and respondent debriefings, split-ballot design) and generated a substantial amount of metadata • The CPS relies on about 16 questionnaire items to generate estimates for its three major labor force categories: EMP, UE and NILF (and various subcategories) • Again, CPS measures work, not jobs

  6. The Measurement of Employment Status via the ACS • American Community Survey [ACS] • Largest survey conducted in the USA; will replace the Decennial Census “long form” • 250,000 households a month • Collects data on a broad range of demographic topics (e.g., population, housing, disability status, employment status, educational attainment, health insurance) • Adheres to BLS employment concept with the same three major categories: EMP, UE and NILF • Data collected continuously via three modes [SAQ (66%), CATI and face-to-face CAPI]

  7. ACS: Some Relevant Details • The ACS was developed over a series of stages (starting in the early 1990s) and achieved full implementation in 2005; there is a substantial amount of metadata documenting this process • At present, the ACS relies on the content of six CPS items (modified for use in the ACS) to generate its estimates for three employment status categories: EMP, UE and NILF • Because of methodological/procedural differences, the CPS and the ACS cannot be expected to produce identical estimates

  8. CPS: Work Item and DQ Issues [1] • CPS Work Question [No-business-in-household wording.] • LAST WEEK, did you do ANY work for pay? • Data Quality [DQ] Issues, CPS Redesign • Final evaluation phase (1992-93): Interviewers rated this item as one of the more problematic questions on the redesigned CPS (e.g., respondents asking “Just my job?” or “Do you mean my regular job?”) • On the basis of other evaluation data (e.g., behavior-coding and response-distribution analyses), these “reports” by respondents were determined not to represent a serious data-quality issue because of the likelihood of interviewer mediation and “repair work”

  9. CPS: Work Item and DQ Issues [2] • Data Quality Issues (continued) • Respondent debriefing data indicated that this question did miss some marginal/paid work activity (1.6%): “In addition to people who have regular jobs, we are also interested in people who may only work a few hours per week. Last week, did [name] do any work at all, even for as little as one hour?” • The evaluation work conducted during the redesign was documented extensively by Census Bureau and BLS researchers in the 1990s (e.g., conferences; papers; book chapter); however, much of this work is not cited in ACS evaluation documents

  10. ACS: Work Item and DQ Issues [1] • Current ACS • LAST WEEK, did this person do ANY work for either pay or profit? Mark (X) the “Yes” box even if the person worked only 1 hour, or helped without pay in the family business or farm for 15 hours or more, or was on active duty in the Armed Forces. • Data Quality Issues • ACS underestimates employment, which compromises estimates in the other two categories, UE and NILF (see next slide)

  11. CPS vs. C2000/ACS Estimates • CPS/Census-2000 Match Study • “Combined-Month Sample”: February through May, 2000, specific rotations; ~86,000 addresses; wt. N: 207,875,749 • CPS vs. ACS-like employment status items • EMP: 64.1% vs. 62.3% (underestimate) • UE: 2.7% vs. 3.4% (overestimate) • NILF: 32.8% vs. 34.0% (overestimate) • Note: The employment status items from the Census-2000 long form are identical to those used in the current ACS.
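The under- and overestimate labels on this slide follow from simple percentage-point differences, with the CPS figure taken as the reference. A minimal sketch of that arithmetic, using only the six percentages reported on the slide (the variable and function names are illustrative, not from the match study):

```python
# Percentages as reported on the slide (CPS vs. ACS-like items).
cps = {"EMP": 64.1, "UE": 2.7, "NILF": 32.8}
acs_like = {"EMP": 62.3, "UE": 3.4, "NILF": 34.0}


def point_differences(reference, comparison):
    """Percentage-point difference (comparison minus reference) per category."""
    return {cat: round(comparison[cat] - reference[cat], 1) for cat in reference}


diffs = point_differences(cps, acs_like)
for category, diff in diffs.items():
    if diff > 0:
        direction = "overestimate"
    elif diff < 0:
        direction = "underestimate"
    else:
        direction = "no difference"
    print(f"{category}: {diff:+.1f} points ({direction})")
```

Treating the CPS as the benchmark is an assumption carried over from the slide's own labels; the sketch says nothing about the statistical significance of the differences.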

  12. ACS: Work Item and DQ Issues [2] • Data Quality Issues • Small-scale evaluation [2004]: Expert reviews; behavior coding; focus groups with ACS interviewers • Behavior coding [CATI site; 51 HHs; 104 persons]: • INT codes: exact (78%); major changes (10%); data due in part to prior context [disability questions] • RSP codes: adequate answers (98%); other than simple yes or no (21%); examples (e.g., “For pay, yes.”; Just his “regular job.”; “No, currently unemployed.”) • Read-if-Necessary Statement: Never read • Focus groups: “pay or profit” confusing; multiple-job holders and self-employed (e.g., “Did you mean, other than my regular job?”); read-if-necessary statement rarely read; some interviewers ask about job directly

  13. ACS: Revised Work Items • Revisions to ACS Work Question • (1A): LAST WEEK, did this person work for pay at a job (or business)? [If “no” to 1A, ask (1B).] • (1B): LAST WEEK, did this person do ANY work for pay, even for as little as one hour? • Rationale • Current ACS work question confuses some respondents: Why? • Using a two-part question appears to clarify the response task for some respondents and, in so doing, better achieves the objective of gathering accurate data on work activity and employment status
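The routing between (1A) and (1B) is a one-step skip pattern: (1B) is asked only after a "no" to (1A). A minimal sketch of that flow, assuming a hypothetical `ask` callback that presents a question and returns True for "yes"; this is an illustration of the skip logic, not the ACS instrument itself.

```python
from typing import Callable

Q1A = "LAST WEEK, did this person work for pay at a job (or business)?"
Q1B = "LAST WEEK, did this person do ANY work for pay, even for as little as one hour?"


def worked_last_week(ask: Callable[[str], bool]) -> bool:
    """Ask (1A); only if the answer is "no", follow up with (1B)."""
    if ask(Q1A):
        return True
    return ask(Q1B)


# Scripted respondent with occasional side work only: "no" to 1A, "yes" to 1B.
scripted_answers = {Q1A: False, Q1B: True}
print(worked_last_week(lambda question: scripted_answers[question]))
```

Splitting the item this way lets the first question name the common case (a job or business) while the follow-up still catches marginal work of as little as one hour.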

  14. Estimates of Labor-Force/Employment Status • 2006 ACS Content Test • January to March 2006; ~63,000 addresses, equally split between control/current vs. test/revised groups • Current vs. revised ACS items • EMP: 62.8% vs. 65.7% (plus 2.9%)* • UE: 4.1% vs. 3.6% (minus 0.5%) • NILF: 33.1% vs. 30.7% (minus 2.4%)* • Revised items also manifest less bias and variability
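Because the three categories partition the same population, a gain in one must be offset by losses in the others. A short check of that bookkeeping on the content-test figures above (the variable names are illustrative; only the six percentages come from the slide):

```python
# Percentages as reported on the slide (current vs. revised ACS items).
current = {"EMP": 62.8, "UE": 4.1, "NILF": 33.1}
revised = {"EMP": 65.7, "UE": 3.6, "NILF": 30.7}

# Each distribution should account for the whole population.
current_total = round(sum(current.values()), 1)
revised_total = round(sum(revised.values()), 1)

# Percentage-point change per category, revised minus current.
changes = {cat: round(revised[cat] - current[cat], 1) for cat in current}

# The EMP gain is offset by the UE and NILF declines.
net_change = round(sum(changes.values()), 1)

print(current_total, revised_total)
print(changes)
print(net_change)
```

Read this way, the revised items appear to move respondents mainly out of NILF, and to a lesser extent out of UE, into EMP; the sketch only restates the reported figures and does not test whether the shifts are significant.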

  15. The CPS Work Item: Why might it be problematic for some respondents? • Grice (1975): Maxims on Quantity • 1. Make your contribution as informative as is required (for the current purposes of the exchange). • 2. Do not make your contribution more informative than is required. • Fowler (1995): Principles 3 and 3d. • Principle 3: A survey question should be worded so that every respondent is answering the same question. • Principle 3d: If what is to be covered is too complex to be included in a single question, ask multiple questions.

  16. Invoking Grice on Quantity: Hypothetical Example [ACS/SAQ] • LAST WEEK, did you do ANY work for pay? • Respondent [full-time job]: How should I answer this [#!&?@] question? It doesn’t mention a “job” and probably would if that’s what they wanted to know. And it specifically says “work for pay”, so it must mean doing work on the side. OK, just check the “no” box. • Reference to a “job” is missing. [Maxim 1] • “Work for pay” is specified, which would seem superfluous (especially for someone with a full-time job): Who works all those hours for free? [Maxim 2]

  17. Resolution for ACS: Two-Part Work Item • Revisions to ACS Work Question • (1A): LAST WEEK, did this person work for pay at a job (or business)? [If “no” to 1A, ask (1B).] • (1B): LAST WEEK, did this person do ANY work for pay, even for as little as one hour? • Part (1A) specifically mentions “job”, “work for pay” and “business”. • Part (1B) captures work for “as little as one hour”. • Not perfect, but better than current ACS item.

  18. Closing Remarks • Even survey questions that appear simple and straightforward may not be so for some respondents. [Key issues: Why, and how many respondents are affected?] • It is risky to import questions from one survey to another, especially when the surveys differ in terms of mode of administration (and in various other ways, too). • In evaluating and “fixing” questionnaire items, quantitative research alone is not sufficient. • Summary: Our best hope for optimizing data quality (i.e., minimizing measurement error) is a thorough and critical review of relevant metadata, followed by prudent design and evaluation decisions that are informed by such reviews.

  19. Thank You • Questions or comments? • Post-workshop: • Esposito.Jim@bls.gov
