Optimizing Survey Research Quality: Ensuring the Reliability and Validity of Estimates of the Public Health
Presentation to the Ohio Society for Public Health Education, September 13, 2013
Orie V. Kristel, Ph.D., The Strategy Team, Ltd.
The linkage between high-quality survey research and Ohio SOPHE's mission: Healthier People Through Health Education. Surveys that are consistent with best-practice methods provide reliable and valid estimates of the public's health, which health professionals can use to ensure health education resources are applied to the areas of greatest need.
Many well-known public health surveys in the U.S. • National Health and Nutrition Examination Survey • assesses the health and nutritional status of adults and children in the United States, combining a survey interview with a physical examination • mode: in-person • National Health Interview Survey • covers a broad range of health topics (e.g., health status, health care access, and progress toward national health objectives) • one of the best current sources of information regarding prevalence of wireless-only households • mode: in-person • Behavioral Risk Factor Surveillance System • state-specific estimates of adults’ behaviors and preventive health practices that can affect their health status • mode: telephone (>506,000 in 2011!); piloting Web, mail
In Ohio, some BRFSS data are available at the MSA level, and some are available at the county level: • Summit Co. (Akron) • Stark Co. (Canton) • Hamilton Co. (Cincinnati) • Cuyahoga Co. (Cleveland) • Franklin Co. (Columbus) • Montgomery Co. (Dayton) • Lucas Co. (Toledo) • Mahoning Co. (Youngstown)
Health education efforts and health improvement plans are often implemented at the local level. So, what do you do when local-level data aren't available? Collect the data yourself or in collaboration with public, non-profit, and private sector organizations and partners.
“Total Survey Error” – a framework for thinking about survey design and accuracy Refers to “the accumulation of all errors that may arise in the design, collection, processing, and analysis of survey data” (Biemer, 2010; Groves & Lyberg, 2010). Errors reduce the accuracy and precision of the data collected – and therefore, the usefulness of the data (i.e., the inferences and conclusions we draw from it).
Two broad families of errors Our focus today will be on ways to optimize the reliability and validity of the estimates from questions you may wish to ask of the public – which means reducing the likelihood of specification or measurement errors.
Defining the problem SPECIFICATION ERROR When the concept or construct measured by the survey question differs from what the researcher intended • Validity threat MEASUREMENT ERROR When the survey question is answered or measured incorrectly • Reliability and validity threat
Today’s goal and objectives Goal of today’s session: Provide you with (some) theory and (more) practical advice to help you craft health-related questions that will yield valid and reliable data. Such data can be analyzed and interpreted to aid your education and planning efforts. Specific objectives for today’s session: • Review general principles of good questionnaire design and ways to design survey instruments to yield more reliable and valid data • Review common biases and errors in survey questionnaire development (and how to avoid them)
Defining a “good questionnaire” …Is easy to administer …Yields reliable data (i.e., questions are interpreted and answered similarly by different respondents) …Accurately measures the constructs for which the survey was designed (Pasek & Krosnick, 2009)
Recap: Defining the problem SPECIFICATION ERROR • When the concept or construct measured by the survey question differs from what the researcher intended • Threatens an estimate’s construct validity Typically occurs due to a poor operationalization of the construct, concept, or theory of interest
Specification error: An example “Do you have a disability?” Responses to this question are likely dependent on how respondents define (or don’t define) “disability.” People with minor disabilities (whatever “disability” means) may answer “no” – which means the resulting survey statistic may not fully estimate the number of people with disabilities.
Specification error: Other examples “Do you exercise or play sports regularly?” What constitutes “exercise”? If I don’t play sports, does this mean I don’t exercise? Bonus measurement error! How often is “regularly”? ------- “Have you taken a vacation in the last few years?” What constitutes a vacation? Going somewhere out of state? Does staying home and relaxing by the pool count? Bonus measurement error! Does “last few years” mean 2 years, 3 years, 4 years?
Clarity of thought and expression are key • The questionnaire designer must be able to clearly express the concept he/she is trying to measure. • The concept must be represented clearly (and succinctly) in the survey question. An optimized question will focus singularly on the concept being measured. A non-optimized question will likely be ambiguous, which in turn will lead respondents to answer a different question than was intended.
A specific (and common) example of specification error: Double-barreled questions Avoid questions that contain more than one central construct or concept, i.e., that actually ask two or more questions. “Do you agree that acquired immunodeficiency syndrome (AIDS) can be transmitted by shaking hands with a person with AIDS or through other means of physical contact?” A respondent can disagree with the first part of the question (“AIDS can be transmitted by shaking hands”) but agree with the second part (“through other means of physical contact”). How is the respondent supposed to answer this question? Fix this by asking two questions and fixing the measurement error in the phrase “other means of physical contact” (i.e., what does this include?).
How to identify specification errors So, how can one evaluate whether a specification error has been committed? How can one test for construct validity? The most common methods… • By administering the question alongside previously vetted questions that are known to be correlated with the construct being measured, and analyzing the statistical behavior of the new measure with the previously vetted ones (“convergent / discriminant validity”) • By administering the question to different groups of people that one would expect to score either high or low on the measure, by virtue of some demographic or psychographic difference (“known groups”) • Cognitive pretests with sample members of the population Given limited time and resources, we usually go for option three.
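The first two checks can be run on pilot data. Below is a minimal sketch in Python, assuming pandas and SciPy and a hypothetical pilot dataset; the file and column names ("pilot_responses.csv", "new_item", and so on) are illustrative, not from the presentation.

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("pilot_responses.csv")  # hypothetical pilot data

# Convergent validity: the new item should correlate strongly with
# previously vetted items measuring the same construct.
for vetted in ["vetted_item_1", "vetted_item_2"]:
    r, p = stats.pearsonr(df["new_item"], df[vetted])
    print(f"convergent, {vetted}: r = {r:.2f} (p = {p:.3f})")

# Discriminant validity: the new item should correlate weakly with
# items measuring unrelated constructs.
r, p = stats.pearsonr(df["new_item"], df["unrelated_item"])
print(f"discriminant: r = {r:.2f} (p = {p:.3f})")

# Known-groups validity: groups expected to differ on the construct
# (here, a hypothetical smoker flag) should score differently.
t, p = stats.ttest_ind(df.loc[df["smoker"] == 1, "new_item"],
                       df.loc[df["smoker"] == 0, "new_item"])
print(f"known groups: t = {t:.2f} (p = {p:.3f})")
```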
An overview of the cognitive pretest process • Researcher presents the question to the respondent • Respondent reads the question. Researcher then asks whether there was anything confusing about the question or response options. Ideally, the respondent is asked to rephrase the question using his/her own words - a cognitively demanding but useful task • Does the respondent clearly understand the question? • Is the respondent's understanding consistent with the researcher's intent? • Respondent describes the process by which he/she would answer the question • Are respondents able to formulate a meaningful response? • Researcher revises the question based on the feedback obtained and completes more cognitive interviews to test the revised question
Avoiding Measurement Errors: A Quick Detour through the Psychology of Survey Response
An optimal response to a survey question The ideal survey respondent – an optimizer – goes through four cognitive steps (Tourangeau & Rasinski, 1988; Krosnick, 1991) when answering a survey question: • Reads or listens to the question and attempts to discern the question’s intent • Searches his/her memory for useful information • Evaluates and integrates the available information into a summary judgment • Answers the question by translating the summary judgment onto the response options provided
An optimizing example “About how long has it been since you last visited a doctor for a routine checkup?” • Attempts to discern the question’s intent: The researcher wants to know the last time I went to the doctor. • Searches his memory for useful information: I usually go once a year for a checkup. I think I went back in the spring. • Evaluates and integrates the available information into a summary judgment: I last visited Dr. Smith in the spring. • Answers the question by translating the summary judgment onto the response options provided: I visited a doctor within the past 12 months.
Optimizing requires cognitive effort… but not everyone is able and willing to give full effort to each question Individuals sometimes answer questions using only the most readily available information OR by looking for cues in the question that may point them toward an easy-to-select or socially desirable response… …and then choose this response so as to do as little thinking as possible. Survey satisficing: The act of taking cognitive shortcuts instead of expending the effort necessary to provide optimal answers (Krosnick & Presser, 2010) When this happens, respondents provide answers that may be (1) only loosely related to the construct of interest and (2) inaccurate.
When is satisficing most likely to occur? When question difficulty is high When respondent ability is low • less educated / lower educational attainment When respondent motivation is low • low in “need for cognition” • little perceived value in answering the survey questions accurately • “respondent fatigue”
No surefire way to prevent survey respondents from satisficing The challenge for researchers, then, is to design questions that… • Minimize the incentive to satisfice • Maximize the efficiency of the survey for optimizers
Recap: Defining the problem MEASUREMENT ERROR • When the survey question is answered or measured incorrectly • Threatens an estimate’s reliability and validity Typically occurs due to poorly designed questionnaires or to respondent biases/motivations that run counter to your goals “Reducing measurement error through better question design is one of the least costly ways to improve survey estimates.” (Fowler, 2002)
PROBLEMS WITH WORDING Open vs. closed-ended questions? It depends… Although closed-ended questions (response options are provided) are easier to administer, open-ended questions (no response options are provided) discourage satisficing because they require respondents to answer the question in their own words. Many closed-ended questions effectively ask respondents to first answer an open-ended question and then map that answer onto the options provided (e.g., “What is the biggest health problem facing people in Ohio? Would you say it is… obesity; alcohol, tobacco, and other drug use; heart disease; cancer; or something else?”) To minimize measurement error, an open-ended question is preferred when… • The full range of response options is unknown or cannot be presented efficiently • A question is asking for a numeric value
PROBLEMS WITH WORDING Avoid grammatically correct but confusing wordings Jargon, which may not be understood by the general public “What was your age at menarche?” Sentences with complex, double-negative constructions “Has it happened to you that over a long period of time, when you neither practiced abstinence, nor used birth control, you did not conceive?” Uncommon words Help > Assist | Use > Utilize | Start > Initiate | Live > Reside, etc.
PROBLEMS WITH BIASED QUESTIONS Consider how response options are framed A great amount of research has explored the effects of different ways of framing or presenting information, oftentimes resulting in respondents choosing an objectively inaccurate answer. “Which operation would you prefer?” • An operation that has a 5% mortality rate. • An operation in which 90% of the patients will survive. Similarly, research has explored how people react differently to prevention- vs. promotion-framed health messages
PROBLEMS WITH BIASED QUESTIONS Do not use leading questions / response options The question wording or the response options provided may subtly guide respondents toward a specific answer. “How would you rate the quality of the services provided by your local health department?” [Excellent] [Very Good] [Good] [Fair] [Poor] From a March 2013 survey of Ohio registered voters: “Do you agree or disagree with the following statement? (And is that strongly or somewhat agree/disagree?)… Now is not the time for politicians in Washington to raise energy taxes. They should solve the country's budget issues without hurting consumers and taxpayers.” - 81% agreed with this quadruple-barreled, leading question
PROBLEMS WITH ACQUIESCENCE BIAS Minimize the potential for acquiescence response bias Acquiescence response bias refers to the motivation to be agreeable (Brown & Levinson, 1987), a tendency that can influence how people respond to survey questions. All else being equal, people (especially satisficers) are more likely to say yes to “yes/no” questions. So, provide a range or set of response options that allow respondents to easily find their opinion or viewpoint. ‘‘Do you think the health department should make sure every child receives a measles vaccination?” [Yes] [No] Fix: ‘‘Do you think the health department should or should not make sure every child receives a measles vaccination?” [Should] [Should not]
PROBLEMS WITH ACQUIESCENCE BIAS Minimize acquiescence bias: no agree/disagree questions! “Do you strongly agree, somewhat agree, somewhat disagree, or strongly disagree with the following statement… My health is poor.” All else being equal, people are more likely to agree than disagree. Some people (10-20%) would agree with both the above statement (i.e., “my health is poor”) and its opposite (i.e., “my health is not poor”), depending on how the question was worded (Schuman & Presser, 1981). Agree/disagree questions can also be confusing for some people to understand. In the above example, a person who wants to report that his health is good must disagree with the statement that his health is poor. Although easy to write, agree/disagree questions have a host of problems and can encourage satisficing. Nearly any question written in an agree/disagree format can be rewritten to focus directly on the concept being measured!
MOTIVATIONS TO ANSWER INACCURATELY Be careful when asking sensitive questions Responses to sensitive questions (e.g., sexual behaviors, abortion, alcohol/tobacco/other drug use, income, voting) may be censored or intentionally misreported. Can occur due to concerns about threat of disclosure (e.g., what if authorities or others find out?) or social desirability bias, the motivation to be perceived in a way consistent with social norms. The entire survey experience – from interviewer rapport with the respondent to the questions asked – must make it clear that honest answers are needed and that responses will be held confidential. That said, research suggests that self-administered surveys elicit more honest reporting than interviewer-administered surveys.
PROBLEMS WITH RESPONSE OPTIONS Ensure no intervals are missing “How often do you get at least 30 minutes of exercise?” [ ] Less than once per month [ ] Once per month [ ] Once per week [ ] More than once per week Fix: “How often do you get at least 30 minutes of exercise?” [ ] Less than once per month [ ] Once per month to once per week [ ] More than once per week
PROBLEMS WITH RESPONSE OPTIONS Ensure intervals do not overlap “How many cigarettes do you smoke per day?” [ ] None [ ] 5 or less [ ] 5-25 [ ] 25 or more Fix: “How many cigarettes do you smoke per day?” [ ] None [ ] 1-4 [ ] 5-24 [ ] 25 or more
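A quick pre-fielding sanity check can catch both of these problems. Below is a minimal Python sketch (standard library only, assuming whole-number counts such as cigarettes per day) that flags overlapping or non-contiguous intervals; float("inf") stands in for the open-ended "25 or more" category.

```python
def check_intervals(intervals):
    """Flag overlapping or non-contiguous integer-valued response options."""
    ordered = sorted(intervals)
    for (lo1, hi1), (lo2, hi2) in zip(ordered, ordered[1:]):
        if lo2 <= hi1:
            print(f"Overlap: {lo1}-{hi1} and {lo2}-{hi2}")
        elif lo2 > hi1 + 1:
            print(f"Gap: {lo1}-{hi1} and {lo2}-{hi2}")

# Corrected options ("None" treated as its own category): prints nothing.
check_intervals([(1, 4), (5, 24), (25, float("inf"))])

# Original, flawed options: prints overlap warnings at the shared 5 and 25.
check_intervals([(1, 5), (5, 25), (25, float("inf"))])
```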
PROBLEMS WITH RESPONSE OPTIONS Use five or seven fully-labeled response options for attitudinal scales Unless you are measuring a numeric event, avoid numeric response options and partially-labeled scales (e.g., 1 = “not at all likely” and 5 = “extremely likely”). When measuring a unipolar construct (i.e., the absence or presence of a quality, attribute, idea, or state), five fully-labeled points yield responses with less measurement error. When measuring a bipolar construct (i.e., balancing two opposite qualities, attributes, ideas, or states), seven fully-labeled points yield responses with less measurement error.
PROBLEMS WITH RESPONSE OPTIONS Offer alternative response options (when appropriate) Provide response alternatives if it is reasonable to expect some respondents to generate answers to a question that are not in the response option list. “Which of the following types of doctors did you see in the past year?” [ ] family doctor [ ] pediatrician [ ] lung doctor/internist [ ] allergy doctor/immunologist [ ] emergency room doctor [ ] Other, please specify: ______________
PROBLEMS WITH RESPONSE OPTIONS Offer “don’t know” response options sparingly (if at all) “When respondents are being asked questions about their own lives, feelings, or experiences, a ‘don’t know’ response is often a statement that they are unwilling to do the work required to give an answer” (Fowler, 2002) “Don’t know” is an attractive response option to survey satisficers. Studies comparing questions that do and do not offer a DK response have found that people who are less interested in thinking or less educated are more likely than others to select DK responses – giving them an easy way out and you a false measurement. Furthermore, a person’s DK response may mean: 1) no attitude or opinion toward the construct/concept; 2) confusion about how to best translate one’s attitude to the response options provided; or 3) concern about disclosing sensitive information.
PROBLEMS WITH QUESTIONNAIRE FORMAT Avoid matrix/grid style questions These encourage “straightlining,” a satisficing response pattern in which people provide the same answer to every item in the grid, which increases error. Some research indicates that item nonresponse (i.e., not answering particular questions) is more likely with matrix questions than if one were to ask each question individually.
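If you do field a grid, straightliners can at least be flagged during analysis. Below is a minimal pandas sketch; the file name and grid item names ("survey_responses.csv", "q5a" through "q5d") are hypothetical.

```python
import pandas as pd

df = pd.read_csv("survey_responses.csv")   # hypothetical response data
grid_items = ["q5a", "q5b", "q5c", "q5d"]  # hypothetical matrix items

# Exactly one distinct value across the grid = same answer to every item.
df["straightlined"] = df[grid_items].nunique(axis=1) == 1
print(f"{df['straightlined'].mean():.1%} of respondents straightlined the grid")
```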
PROBLEMS WITH QUESTIONNAIRE FORMAT Beware primacy/recency effects For a list presented visually, primacy effects are more likely (i.e., the first option is more likely to be picked than the last). For a list presented verbally, recency effects are more likely. These order effects are also seen in elections: political candidates who are listed first on a ballot receive a 2-3 percentage point advantage over other candidates. “Which of the following types of doctors did you see in the past year?” [ ] family doctor [ ] pediatrician [ ] lung doctor/internist [ ] allergy doctor/immunologist [ ] emergency room doctor Fix: Randomize the order of response options presented to each respondent
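One way to implement the fix, as a minimal Python sketch (standard library only): shuffle a copy of the option list independently for each respondent so that no option systematically benefits from being presented first. Any anchored option such as “Other, please specify” should be appended after shuffling.

```python
import random

OPTIONS = [
    "family doctor", "pediatrician", "lung doctor/internist",
    "allergy doctor/immunologist", "emergency room doctor",
]

def options_for_respondent():
    """Return a freshly shuffled copy of the options for one respondent."""
    shuffled = OPTIONS[:]  # copy, so the master list keeps its order
    random.shuffle(shuffled)
    return shuffled

# Anchored options (e.g., "Other, please specify") stay last for everyone.
print(options_for_respondent() + ["Other, please specify: ______________"])
```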
PROBLEMS WITH QUESTIONNAIRE FORMAT Provide structure for the respondent When designing a self-administered survey (i.e., online or mail), be sure to give exact instructions to the respondent on how to record answers (e.g., check a box, circle a number, etc.) Make sure the answer to one question relates smoothly to the next • If using an online survey, use SKIP logic (see the sketch below). • If using a paper survey, use arrows or other very clear directions to show the desired question flow.
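As an illustration only, here is skip logic in a toy console questionnaire (a minimal Python sketch; the questions reuse examples from earlier slides): the follow-up appears only when the screening answer makes it applicable.

```python
def administer():
    smokes = input("Do you currently smoke cigarettes? (yes/no) ").strip().lower()
    if smokes == "yes":
        # Skip logic: only current smokers see this follow-up question.
        input("On average, how many cigarettes do you smoke per day? ")
    input("How often do you get at least 30 minutes of exercise? ")

if __name__ == "__main__":
    administer()
```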
PROBLEMS WITH QUESTIONNAIRE FORMAT Create an aesthetically pleasing, readable survey Make the survey look easy for a respondent to complete. Plan its format carefully, use subheadings, spaces, etc. Be finicky about fonts. Try to use a Sans Serif font. Try not to use a Serif font. Don’t use Comic Sans or Papyrus fonts. If programming a survey to be administered online, try to limit the number of questions per page to one or two.
An opportunity to put into practice some of the concepts we’ve been discussing For those of you here in the room with me, please arrange yourselves into small groups of four. For those participating via the Internet, please work by yourself. I’ll provide some question concepts or actual examples from public health surveys. I’d like your group to write a better version of the question (i.e., one that is more likely to yield reliable and valid data). Each group will then share its improved question wording with the others in attendance.
Exercise 1 To measure health: “How healthy are you?”
Exercise 2 To measure satisfaction with life: “How would you rate your life – very good, better than average, mixed, could be better, or very bad?”
Exercise 3 To measure perception of food insecurity: “How often in the past 12 months would you say you were worried or stressed about having enough money to buy nutritious meals? Would you say you were worried or stressed – always, usually, sometimes, rarely, or never?”
Exercise 4 To measure perceived harm of second-hand smoke: “What do you think is the effect on one's health of second-hand smoke – harmless, somewhat negative, or very bad?”