Quick and Painless Introduction to Survey Methodology R. Michael Alvarez PS 120
Testing Theories or Models • Experimental data: expensive, and has validity problems • Quasi-experimental data: aggregate election statistics, other data. Suffers from various problems. • Survey data: data about individual voters
Fundamentals of Surveying • Population: all elements of interest, usually in a geographic area • Sample: subset of population • Sample frame: list of population elements from which the sample is drawn (addresses, phone numbers, email addresses, etc.)
Basic Typology of Surveys • Probability designs: population elements have a known (at least in theory) probability of selection into the sample. • Nonprobability designs: population elements have an unknown probability of selection into the sample • All of the statistical tools we use to study survey data are based on probability designs!
Literary Digest Methodology • Sent out 10 million straw ballots, using a list drawn from auto registration lists and telephone books. • 2.3 million were returned, a response rate of about 23%. • Flawed sample (overrepresented the rich and Republicans) • Low response rate
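To make the point about a flawed sample concrete, here is a minimal simulation sketch. Every number in it (population size, share of wealthy voters, support rates, chance of appearing on the list) is invented for illustration and is not Literary Digest data. It models only frame bias, not nonresponse: a very large sample drawn from a list that overrepresents wealthy voters is compared with a small random sample of the whole population.

```python
import random

def share_for_a(people):
    """Share of (wealthy, votes_a) records that support candidate A."""
    return sum(votes_a for _, votes_a in people) / len(people)

def simulate(seed=1936, pop_size=100_000):
    """Compare a large sample from a biased frame with a small SRS.

    All parameters are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Synthetic population: 30% wealthy; wealthy voters favor candidate A.
    population = []
    for _ in range(pop_size):
        wealthy = rng.random() < 0.30
        votes_a = rng.random() < (0.65 if wealthy else 0.30)
        population.append((wealthy, votes_a))

    # Biased frame: wealthy voters are far more likely to be on the list
    # (think car registrations and telephone books).
    frame = [p for p in population if rng.random() < (0.90 if p[0] else 0.20)]

    return {
        "true": share_for_a(population),
        "big_biased": share_for_a(rng.sample(frame, 20_000)),
        "small_random": share_for_a(rng.sample(population, 1_000)),
    }

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>12}: {value:.3f}")
```

Under these assumptions the 20,000-person sample from the biased list overstates support for candidate A, while the 1,000-person random sample lands much closer to the true share. Sample size does not repair a bad frame.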
Literary Digest Fiasco Reforms Polling • The underlying flaw of the Literary Digest straw polls was revealed: they did not use a scientific sampling procedure • Others, especially Gallup, Roper and Crossley, began working to find better ways of generating samples … • The Literary Digest soon folded!
New Sampling Techniques Were Flawed! • Before 1948, pollsters used “quota sampling” • Each interviewer was assigned a fixed quota of subjects to interview from certain demographic categories … gender, age, education, residential location. • Within those quotas, the interviewer could select anyone they wished until all the required interviews were completed
Quota Sampling • It’s not necessarily a stupid idea, as long as the underlying data (Census data?) used to construct the parameters of the sample are okay. • But what can happen is that interviewers end up talking with the people who are easiest to contact. In 1948 that tended to be people in nice neighborhoods, with fixed addresses and phones (i.e., Republicans).
Random Sampling • In the 1950s, most scientific surveys shifted to the use of random sampling • For example, Gallup moved to random selection methods in 1956 and seems to have generated more accurate presidential election forecasts thereafter
Basic Introduction to Sampling • Concept: The population (or universe or target population). • The population is the entire set of units to which a survey will be applied. Individual members of the population are called units or elements.
More on sampling ... • Next, we need a list of population units from which we can draw a sample. • This list is called the SAMPLE FRAME • The basic property of a sample frame is that every unit in the population has some known chance of being selected into the sample by whatever method is used to select units
Then ... • Probability sample: units are selected using a method that ensures that each unit has a known, nonzero probability of being included. • Nonprobability sample: units are selected and inclusion probabilities are unknown (quota sampling …)
Simple Random Sampling • All elements of population have equal probability of being sampled • Cluster sampling: population is divided into clusters or groups, and clusters are sampled. Why? Cost and simplicity. • Stratified sampling: population is divided into subpopulations, or strata, and sampling occurs within strata. Why? Strata might be of interest or require different methods of analysis.
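The three designs on this slide can be sketched in a few lines. This is a minimal illustration, not a production sampler: the function names, the precinct data, and the choice of equal-size draws within each stratum are all assumptions made for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS without replacement: every element has the same chance of selection."""
    return rng.sample(population, n)

def cluster_sample(clusters, n_clusters, rng):
    """Draw whole clusters at random and keep every unit in the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate SRS of fixed size within each stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

if __name__ == "__main__":
    rng = random.Random(120)  # fixed seed so the draws are repeatable

    # Hypothetical population: 5 precincts with 20 voters each.
    precincts = {
        f"precinct_{i}": [f"voter_{i}_{j}" for j in range(20)] for i in range(5)
    }
    everyone = [voter for voters in precincts.values() for voter in voters]

    print(len(simple_random_sample(everyone, 10, rng)))   # 10 voters from anywhere
    print(len(cluster_sample(precincts, 2, rng)))         # all voters in 2 precincts
    print({k: len(v) for k, v in stratified_sample(precincts, 3, rng).items()})
```

The same grouping (precincts) serves as clusters in one call and as strata in another. The difference lies in the design: cluster sampling visits only a few groups and takes everyone in them, while stratified sampling visits every group and samples within each.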
Sampling Error • Best way to think of survey error is in the context of proportions (percent saying “yes” or “no”). • Standard error of a proportion in SRS: • se(p) = sqrt[ ( p(1-p) )/( n - 1 )]
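The formula above translates directly into code. The sketch below uses the slide's n - 1 denominator; the 95% margin of error (z = 1.96) and the example poll are assumptions added for illustration and do not come from the slide.

```python
import math

def se_proportion(p: float, n: int) -> float:
    """Standard error of a sample proportion under SRS: sqrt(p(1-p)/(n-1))."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 2:
        raise ValueError("n must be at least 2")
    return math.sqrt(p * (1.0 - p) / (n - 1))

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error; z = 1.96 (95% confidence) is an assumption."""
    return z * se_proportion(p, n)

if __name__ == "__main__":
    # Hypothetical poll: 50% say "yes" among 1,000 respondents.
    p, n = 0.5, 1000
    print(f"se(p) = {se_proportion(p, n):.4f}")
    print(f"margin of error = {100 * margin_of_error(p, n):.1f} percentage points")
```

For this hypothetical poll the margin of error comes to roughly 3 percentage points. Because n sits under a square root, quadrupling the sample size only halves the standard error.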
NES Response and Refusal Rates Response rate: interviews net of refusals and respondents who cannot provide an interview (e.g., language, etc)
Misreporting: Voting in Recent Federal Elections Note: Percentage of voting age population
Item Nonresponse • A “don’t know” option is necessary in any survey, so that people can tell you if they don’t have an opinion • Item nonresponse arises from respondent uncertainty, vague questions, or unwillingness to answer some questions
Should Gov’t Provide More Services? 1996 NES
Certainty of Responses? Senator Position on Abortion Scale, Alvarez and Franklin 1993
Question Wording and Order? • Would you say that traffic contributes more or less to air pollution than industry? (45% named traffic as the primary contributor vs. 32% industry) • Would you say that industry contributes more or less to air pollution than traffic? (57% named industry as the primary contributor vs. 24% traffic) • Wanke et al. 1995
Types of Surveys • Self-administered questionnaires (mail, web) • cheap • but: • low response rates • uncertainty about who completes questionnaire
Types of Surveys • Telephone: RDD/CATI • quick, random? • Uncertainty about respondent, difficult to ask complex questions, must be short
Types of Surveys • Face-to-face (on doorstep, exit polls) • highly accurate, high response rates • very expensive to implement • interviewer biases are problematic
Internet surveying: the future? • Cheap to implement • Quick in the field, quick with analysis • Can implement complex designs, for example, use multimedia
Basic types of Internet surveys • Probability designs • Nonprobability designs • Mixtures of probability and nonprobability
Probability-based Internet surveys • Intercept-based surveys of visitors to particular web sites • Known email lists (students, etc.)
Nonprobability Internet surveys • Entertainment surveys • Self-selected surveys • Volunteer survey panels
Surveys are not perfect! • Sampling error (difference between sample and population) • Coverage error (deviation between frame and population) • Systematic sampling error; error in frame • Nonresponse (unit) bias • Nonresponse (item) bias • Question wording or ordering effects • Interviewer error; coding mistakes
How do I evaluate survey results? • Sample size • Sampling methodology (probability or non-probability) • Estimated sampling error • Survey response rate • Questionnaire design and question wording • Item response rates • Intuition: do the results make sense?
Caltech’s National Public Relations Initiatives March 11, 2003
Brief recap of survey methodology • Survey conducted by ICR • Wednesday, February 12-Sunday February 15 • Omnibus survey • N=1010 • Tabulation presents weighted results, weighted to map to American adult population
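The slide does not say how ICR built its weights. As a generic illustration of what "weighted to map to the population" can mean, here is a minimal post-stratification sketch: each respondent in a group gets a weight equal to the group's population share divided by its sample share. The group labels, population shares, and responses are all hypothetical.

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Weight for group g = population share of g / sample share of g."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}

def weighted_proportion(responses, groups, weights):
    """Weighted share of True responses, one weight per respondent's group."""
    total = sum(weights[g] for g in groups)
    yes = sum(weights[g] for r, g in zip(responses, groups) if r)
    return yes / total

if __name__ == "__main__":
    # Hypothetical sample of 10: college graduates are overrepresented.
    groups = ["college"] * 7 + ["no_college"] * 3
    responses = [True] * 6 + [False] + [True] + [False] * 2

    # Hypothetical population shares (these would come from Census figures).
    population_shares = {"college": 0.3, "no_college": 0.7}

    weights = poststratification_weights(groups, population_shares)
    unweighted = sum(responses) / len(responses)
    weighted = weighted_proportion(responses, groups, weights)
    print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this made-up sample the unweighted "yes" share is 0.70, while the weighted share is about 0.49, because the overrepresented group answers "yes" far more often. Weighting corrects for known imbalances in group composition; it cannot correct for differences between respondents and nonrespondents within a group.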
Questions • Considering what you might have seen or heard about the California Institute of Technology, also known as Caltech, in Pasadena, California, which of the following best describes your opinion of Caltech’s reputation? Would you say Caltech’s reputation is excellent, good, fair, or poor?
Questions • How did you hear about Caltech? (not asked of those unable to answer question 1) • What do you think Caltech is best known for? (not asked of those unable to answer question 1)
Questions • Now, as I read each of the following topics, please tell me, generally speaking, whether or not you are interested in the topic: voting, the brain, climate changes, astronomy, earthquakes, nano-technology, detecting gravity waves
Questions • And considering those topics in which you said you had an interest, how do you usually get news and information about these topics? (asked to only those who were interested in at least one topic)
Media Relations Focus • National and northeast TV - visit them, pitch them, invite them to campus. • Households with children • Senior-oriented media
Evaluate the Caltech Awareness Survey • Technical evaluation • Substantive evaluation • Policy evaluation