Lecture 4 - Survey design • Sampling • Sample size/precision • Data collection issues • Sources of bias • Critical review of survey reports
Why do surveys? • Information on particular population • prevalence of a disease • behaviour, knowledge, attitude • Planning of services • Collect information on data not routinely available: • e.g., mental health status, health behaviours • Repeat surveys to monitor trends (serial cross-sectional studies)
Bias and precision of the survey estimates • Bias: • selection bias relates to sample selection • information bias relates to information collected (measurements) • Precision • relates to sample size
Study bias and precision vs measurement validity and reliability • Bias/validity: • does measurement/study estimate reflect true state of affairs • Precision/reliability • if measurement/study is repeated, will similar result be obtained?
Reasons to sample • Reduce cost • Increase accuracy and quality of data collected
Definitions • Sampling unit • person or group (e.g., household) • Sampling frame • list of sampling units in the population • censuses • electoral lists • telephone lists • are institutional populations excluded (e.g., prisons, nursing homes)?
Target and study population • Target population: • population for generalization of results • Study population: • population for collection of data • may be total target population or a sample
Types of sample • Non-representative • convenience • volunteers • Representative • simple random • systematic • cluster • multistage
Simple random sample • Each sampling unit in the population has equal probability of being included • Sampling with replacement: • each unit placed back in pool • Sampling without replacement (usual method): • each unit selected is kept out of pool
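The two variants of simple random sampling can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `simple_random_sample` and the frame of 100 unit IDs are hypothetical.

```python
import random

def simple_random_sample(units, n, replace=False, seed=None):
    """Draw n sampling units, each with equal probability of selection."""
    rng = random.Random(seed)
    units = list(units)
    if replace:
        # With replacement: each drawn unit goes back in the pool,
        # so the same unit can be selected more than once.
        return [rng.choice(units) for _ in range(n)]
    # Without replacement (usual method): a drawn unit is kept out of the pool.
    return rng.sample(units, n)

# Hypothetical sampling frame: 100 unit IDs.
frame = list(range(1, 101))
without_replacement = simple_random_sample(frame, 10, replace=False, seed=1)
with_replacement = simple_random_sample(frame, 10, replace=True, seed=1)
```

Fixing the seed makes the draw reproducible, which helps when the selection has to be documented or audited.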
Simple random sample (cont’d) • Methods: • manual • tables of random numbers • computer-generated random numbers
Systematic sample • Select every nth individual from a list • can use existing numbers • e.g., patient appointments, medical records • Advantages: • Does not require complete sampling frame • Simple to carry out • Disadvantages: • May be unsuitable for cyclic or ordered data (e.g., every 5th patient when only 5/day)
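A short sketch of systematic sampling, assuming a random starting point within the first interval (a common choice that the slide does not specify). The appointment list and the function name are hypothetical. The last lines illustrate the cyclic-data problem: with 5 appointment slots per day and an interval of 5, every selected patient comes from the same slot.

```python
import random

def systematic_sample(ordered_list, interval, seed=None):
    """Select every `interval`-th item, starting at a random offset."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start in 0 .. interval-1
    return ordered_list[start::interval]

# Hypothetical list of 50 consecutive patient appointments.
appointments = [f"patient_{i:03d}" for i in range(1, 51)]
sample = systematic_sample(appointments, 5, seed=0)

# Cyclic data: slot number (0-4) of each appointment when there are 5 per day.
slots = [i % 5 for i in range(50)]
slots_sampled = systematic_sample(slots, 5, seed=0)
# Every sampled appointment falls in the same daily slot.
```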
Stratified sampling • Separate sample selected from different strata of population • Requires separate sampling frame for each stratum • Useful if there are small but important subgroups of the population (e.g., very old, very young, institutionalized, sick)
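As an illustration, stratified sampling can be sketched as building a separate frame per stratum and then drawing a simple random sample within each one. The age strata, the population of 300 people, and the per-stratum sample sizes below are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Draw a separate simple random sample from each stratum."""
    rng = random.Random(seed)
    frames = defaultdict(list)  # one sampling frame per stratum
    for unit in units:
        frames[stratum_of(unit)].append(unit)
    return {
        stratum: rng.sample(frame, min(n_per_stratum.get(stratum, 0), len(frame)))
        for stratum, frame in frames.items()
    }

def age_stratum(person):
    """Assign a person to a (hypothetical) age stratum."""
    if person["age"] < 75:
        return "65-74"
    if person["age"] < 85:
        return "75-84"
    return "85+"

# Hypothetical population: 300 people aged 65 to 94.
population = [{"id": i, "age": 65 + (i % 30)} for i in range(300)]
sizes = {"65-74": 10, "75-84": 20, "85+": 25}  # oversample the oldest groups
sample = stratified_sample(population, age_stratum, sizes, seed=3)
```

Because the strata are sampled at different rates, estimates for the whole population need to be weighted back to the stratum sizes.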
Cluster sampling • Sampling unit is a group (e.g., household, village, school) • Step 1: Simple random sample of groups • Step 2: All members of group included in sample • Advantages: • enumeration of population not needed • more efficient use of resources
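The two steps above can be sketched as follows. The household data and the function name `cluster_sample` are hypothetical; the point is that only a list of clusters is needed, not a list of every person.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Step 1: simple random sample of clusters.
    Step 2: include every member of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for c in chosen for m in clusters[c]]
    return chosen, members

# Hypothetical frame of 20 households with 1 to 4 members each.
households = {
    f"household_{h:02d}": [f"h{h:02d}_person{p}" for p in range(1, (h % 4) + 2)]
    for h in range(20)
}
chosen, members = cluster_sample(households, 5, seed=7)
```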
Multistage sampling • Larger units sampled in first stage, smaller units later • e.g.: • stage 1 - sample of towns • stage 2 - sample of city blocks or census tracts • stage 3 - sample of households
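A minimal sketch of the three-stage example, assuming a simple random sample at every stage (the slide does not say how each stage is drawn). The nested town/block/household data and the stage sizes are hypothetical.

```python
import random

def multistage_sample(towns, n_towns, n_blocks, n_households, seed=None):
    """Sample towns, then blocks within towns, then households within blocks."""
    rng = random.Random(seed)
    selected = []
    for town in rng.sample(sorted(towns), n_towns):            # stage 1
        blocks = towns[town]
        n_b = min(n_blocks, len(blocks))
        for block in rng.sample(sorted(blocks), n_b):          # stage 2
            households = blocks[block]
            n_h = min(n_households, len(households))
            for household in rng.sample(households, n_h):      # stage 3
                selected.append((town, block, household))
    return selected

# Hypothetical frame: 8 towns, 6 blocks per town, 10 households per block.
towns = {
    f"town_{t}": {
        f"block_{b}": [f"hh_{t}_{b}_{h}" for h in range(10)] for b in range(6)
    }
    for t in range(8)
}
sample = multistage_sample(towns, n_towns=3, n_blocks=2, n_households=4, seed=42)
```

A household list is only needed for the blocks actually selected, which is the main practical saving of this design.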
Sampling for “hidden populations” • Homosexual men: • gay bars, newspapers • Injection drug users: • convenience sample (e.g., treatment facilities) • snowball sampling (through networks) • Capture-recapture methods • identify biases of sampling method
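The slide does not name a specific capture-recapture estimator. As an assumption for illustration, the sketch below uses the simple two-sample Lincoln-Petersen estimator, which estimates total population size from two overlapping lists. The counts are hypothetical, and the estimator assumes the two samples are independent and the population is closed.

```python
def lincoln_petersen(n_first, n_second, n_both):
    """Estimate population size from two samples and their overlap.

    n_first:  number identified in the first sample
    n_second: number identified in the second sample
    n_both:   number identified in both samples
    """
    if n_both == 0:
        raise ValueError("no overlap between samples: estimate is undefined")
    return n_first * n_second / n_both

# Hypothetical counts: 200 people on one list, 150 on another, 30 on both.
estimated_population = lincoln_petersen(200, 150, 30)
```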
Planning a survey • Define target population • Select method of sampling • sampling unit, sampling frame, etc • Calculate sample size • Define survey data collection methods • Non-respondents • number of attempts to reach • different days, times
Sample size estimations • Requirements: • level of precision (width of confidence interval) • expected variability (estimated from previous studies, pilot study, or literature)
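For a prevalence estimate, the two requirements combine in the usual normal-approximation formula n = z² p(1 - p) / d², where p is the expected prevalence and d is the desired half-width of the confidence interval. The sketch below assumes simple random sampling, with no finite population correction and no design effect; the input values are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(expected_p, half_width, confidence=0.95):
    """Sample size so the CI for a proportion has the requested half-width."""
    # z is the standard normal quantile for the chosen confidence level
    # (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * expected_p * (1 - expected_p) / half_width**2)

# Hypothetical inputs: expected prevalence 10%, desired precision +/- 2 points.
n_needed = sample_size_for_proportion(0.10, 0.02)
# Halving the half-width roughly quadruples the required sample size.
n_tighter = sample_size_for_proportion(0.10, 0.01)
```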
Design of questionnaires • List study variables • Collect existing questions and instruments • Adapt and/or develop new questions • Format questionnaire • Pre-testing (timing, responses, clarity, etc.) • Revise, determine priorities, shorten
Question wording: clarity • Use concrete rather than abstract terms, e.g., • During a typical week, how many hours do you spend doing vigorous exercise? • Not: How much exercise do you get? • Avoid jargon, technical terms, slang • Avoid double-negatives (Do you disagree that doctors should not make house calls?) • Use active vs passive voice (Has a doctor ever told you vs Have you ever been told by a doctor?)
Question wording: clarity • Break long sentences into short ones (20 words or fewer) • Use good grammar but use informal style • Avoid hypothetical questions • Evaluate reading level (normally not more than 8th grade)
Question wording: neutrality • Do not suggest desirable response, e.g.: • Not: do you ever drink alcohol? • Better: how often do you drink alcohol? • Give permission to give undesirable response e.g.: • Sometimes people forget to take medications their doctor prescribes. Do you ever forget (or how often do you forget) to take your medications?
Question wording • Introduce attitude questions, e.g.: • People have different opinions about their medical care. We are interested in your opinion. • Avoid double-barreled questions • How much coffee or tea do you drink each day? • Avoid assumptions • How much help do you get from your family?
Response wording • Make them short • Use as few options as possible • Consider different types of non-response: • refuse • don’t know • no opinion • not applicable • omission by subject or interviewer
Response wording • Make sure responses are mutually exclusive (or give instructions to “check all that apply”) • Consider use of response card for multiple questions with same set of responses
Organization of questionnaire • Group questions by subject matter • Introduce each group with short descriptive statement (e.g., now I am going to ask you some questions about your use of health services) • Begin with more emotionally neutral questions • More sensitive questions (e.g., income, sexual function) near end of questionnaire
Organization of questionnaire • interviewer-administered: repeat time frame fairly frequently • self-administered: repeat time frame at top of each page or each set of questions, e.g.: During the past year, how many times have you: • Visited a doctor? • Been a patient in an emergency department? • Been admitted to hospital?
Organization of questionnaires • Group questions with similar response scale • Format skip patterns • screener questions • branching questions • Time frame • group questions that ask about same time frame • “usual” behavior vs specified time period • assist respondent with milestones to help define reference time frame
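For computer-assisted or scripted interviews, a skip pattern can be written as data: each branching question names the screener answer that must be given before it is asked. The sketch below is one possible encoding, not a standard format; the questions, the `ask_if` key, and the `administer` function are all hypothetical.

```python
# Each question may carry an "ask_if" condition: (screener question id, answer).
QUESTIONS = [
    {"id": "smoke_ever", "text": "Have you ever smoked cigarettes?"},
    {"id": "smoke_now", "text": "Do you smoke cigarettes now?",
     "ask_if": ("smoke_ever", "yes")},
    {"id": "cigs_per_day", "text": "How many cigarettes do you smoke each day?",
     "ask_if": ("smoke_now", "yes")},
    {"id": "doctor_visits",
     "text": "During the past year, how many times have you visited a doctor?"},
]

def administer(questions, respond):
    """Ask questions in order, skipping those whose condition is not met."""
    answers = {}
    for question in questions:
        condition = question.get("ask_if")
        if condition is not None and answers.get(condition[0]) != condition[1]:
            # Record skipped items explicitly so they are not confused
            # with refusals or omissions at analysis time.
            answers[question["id"]] = "not applicable"
            continue
        answers[question["id"]] = respond(question)
    return answers

# Scripted respondent who has never smoked.
scripted = {"smoke_ever": "no", "doctor_visits": "3"}
result = administer(QUESTIONS, lambda q: scripted[q["id"]])
```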
Questionnaire mode • Face-to-face • Telephone • Mail • Other: • diaries • Mixed mode
Face-to-face interviews: advantages • reduce items with no response • easier for respondents who are older, less educated, or not fluent in the language • some formats easier to administer: • skip patterns to avoid irrelevant questions • open-ended questions - can probe for more complete response
Face-to-face interviews: disadvantages • cost • time • effort (interviewer training, evaluation of inter-rater reliability) • interviewer biases • differences in sociodemographic characteristics of interviewer and subject
Telephone interviews: advantages • less expensive than face-to-face • reduce items with non-response • some formats easier to administer: • skip patterns to avoid irrelevant questions • open-ended questions - can probe for more complete response • large, representative samples can be organized from one office • avoids bias associated with appearance of interviewer
Telephone interviews: disadvantages • misses households without telephone • misses those with unlisted phone numbers • bias when calls made during day • multiple calls may be needed • perceived as intrusive by some • difficult to administer items with multiple response options
Mailed questionnaires: advantages • least expensive • can be coordinated from one office • social desirability minimized • inconsistent results on completeness of reporting (e.g., for number of MD visits)
Mailed questionnaires: disadvantages • relatively low response rates • measures to improve response: multiple mailings, cover letter, letterhead, advance warning, token of appreciation, stamped self-addressed envelope (SSAE) • difficult to get information on non-respondents • differences between early and late responders • items may be omitted: 5-10% may be unusable • cannot control order of questions • postal strikes
Analysis of surveys • Missing data • exclude • imputation: e.g., based on characteristics of respondents • sensitivity of estimate to method of imputation • Weighting of estimates • for stratified samples
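To illustrate why weighting matters for stratified samples, the sketch below compares a crude (unweighted) prevalence with one weighted by stratum population size. All counts are hypothetical. The oldest stratum is oversampled and has the highest prevalence, so the crude estimate overstates prevalence in the population.

```python
# Hypothetical stratified sample: N = stratum population size,
# n = number sampled, cases = number in the sample with the condition.
strata = [
    {"name": "65-74", "N": 6000, "n": 200, "cases": 10},
    {"name": "75-84", "N": 3000, "n": 200, "cases": 30},
    {"name": "85+",   "N": 1000, "n": 200, "cases": 70},
]

def crude_prevalence(strata):
    """Pool all sampled people, ignoring the sampling design."""
    return sum(s["cases"] for s in strata) / sum(s["n"] for s in strata)

def weighted_prevalence(strata):
    """Weight each stratum's prevalence by its share of the population."""
    total = sum(s["N"] for s in strata)
    return sum(s["N"] * (s["cases"] / s["n"]) for s in strata) / total

crude = crude_prevalence(strata)        # 110 / 600
weighted = weighted_prevalence(strata)  # (300 + 450 + 350) / 10000
```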
Analysis of surveys (cont’d) • Crude estimates, confidence intervals • Continuous data: Mean, median, quartile • Categorical data: proportion • Confidence intervals to describe precision
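The summary measures listed above can be computed with the Python standard library. This is a sketch with hypothetical data; the confidence interval uses the simple normal-approximation (Wald) formula, which assumes a simple random sample and is unreliable when the proportion is near 0 or 1 or the sample is small.

```python
import math
import statistics

def proportion_ci(cases, n, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% confidence interval."""
    p = cases / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

# Continuous data: hypothetical number of doctor visits for 8 respondents.
visits = [2, 4, 4, 5, 7, 9, 10, 12]
mean_visits = statistics.mean(visits)
median_visits = statistics.median(visits)
q1, q2, q3 = statistics.quantiles(visits, n=4)  # quartile cut points

# Categorical data: hypothetical 45 cases among 300 respondents.
prevalence, lower, upper = proportion_ci(45, 300)
```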
Bias and precision of the survey estimates • Bias: • selection bias relates to sample selection • information bias relates to information collected • Precision • relates to sample size
Selection bias in surveys • Does the final analysis sample represent the original target population? • Sources of bias: • sampling method • non-response • missing data
Information bias in surveys • Bias in measurement of outcomes • Sources of information bias: • non-validated measurement instrument • unblinded or poorly trained data collectors • response set • etc.
Critical review of an article describing prevalence or incidence (Loney et al., 1998) • Are the study methods valid? • What is the interpretation of the results? • What is the applicability of the results?
Are the study methods valid? • Appropriate study design and sampling methods • Appropriate sampling frame • Adequate sample size • Suitable outcome • Unbiased measurement of outcome • Adequate response rate
What is the interpretation of the results? • Are the estimates of prevalence or incidence given with confidence intervals and in detail by subgroup, if appropriate?
What is the applicability of the results? • Are the study subjects and the setting described in detail and similar to those of interest to you?
CSHA: Are the study methods valid? • Appropriate study design and sampling methods • Appropriate sampling frame • Adequate sample size • Suitable outcome • Unbiased measurement of outcome • Adequate response rate
CSHA: study design and sampling methods • Prevalence survey with 2 analytic studies appended • Target population: Canadian population aged 65 and over • Exclusions: • Yukon and NW territories • Indian reserves, military units • persons with life-threatening illnesses • not fluent in French or English
CSHA: Appropriate study design and sampling methods (cont’d) • 18 study centres across Canada • 36 cities and surrounding rural areas • selected for accessibility to study centres • included 60% of population aged 65+
Sampling frame: community sample • Sampling frame for community sample: • Medicare (provincial health insurance plans) • In Ontario: used Enumeration Composite Record (aggregate based on election records and municipal records) • Stratified random sampling by age: • 65-74 • 75-84 (twice sampling fraction of 65-74) • 85+ (2.5x sampling fraction of 75-84)
Sampling frame: institutional sample • Nursing homes, chronic care facilities, collective dwellings (e.g., convents) • 3 centres sampled from insurance lists • Other centres used multistage sampling: • stratified sample of institutions: • small (up to 25 beds) • medium (26 - 100 beds) • large (more than 100 beds) • random sampling within selected institutions
Sampling (cont’d) • Person who could not be contacted or who refused was replaced with another from same age group, same sex, same geographic region. • Target for each region: • 1800 from community sample • 250 in institutional sample
Adequate sample size? • Target sample in each region: • 1800 in community • 250 in institutions • Assuming institutional prevalence of 50% • 95% CI of ±6% • Assuming community prevalence of 5% • 95% CI of ±1%
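These precision figures can be checked with the normal-approximation formula for a proportion, reading the stated CI as a half-width (an assumption about how the slide reports it) and treating each regional sample as a simple random sample. The function name is hypothetical.

```python
import math

def ci_half_width(prevalence, n, z=1.96):
    """Half-width of a normal-approximation 95% CI for a proportion."""
    return z * math.sqrt(prevalence * (1 - prevalence) / n)

# Values taken from the slide: regional targets and assumed prevalences.
institutional = ci_half_width(0.50, 250)   # institutional sample
community = ci_half_width(0.05, 1800)      # community sample
```

Under those assumptions the half-widths come out at roughly 6 and 1 percentage points, matching the figures on the slide. Stratification by age and replacement of non-respondents would change the true precision, so this is only a first check.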