Making Internet Surveys Representative. Willem Saris, ESADE, Universitat Ramon Llull, Barcelona
The fundamental paradigm of survey research • Sampling theory allowed one to estimate the opinion of the population on the basis of a limited number of people • Survey research became a standard procedure to collect information about the opinion in the population. • Probability sampling has been the paradigm of survey research during the last 50 years
The basic advantage of probability sampling • With probability sampling, the uncertainty in the outcomes can be quantified • One can also control this uncertainty by varying the sample size or by using advanced estimation procedures.
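As an illustration of how sample size controls uncertainty, here is a minimal sketch for a proportion. It assumes a simple random sample and the normal approximation, and it ignores the finite population correction; the function name `margin_of_error` is chosen for this example only.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion p
    estimated from a simple random sample of size n (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# Quadrupling the sample size halves the margin of error.
for n in (100, 400, 1600):
    print(n, margin_of_error(0.5, n))
```

None of this carries over to a volunteer panel, because the formula relies on known selection probabilities.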
The disturbing reality • Probability samples are complicated by coverage error, non-response error, and measurement error. • Survey research within this paradigm requires considerable skill (Sudman and Bradburn 1983, Schuman and Presser 1981, Tourangeau 2000, Saris and Gallhofer 2007)
The new paradigms of Web surveys • Couper: Volunteer opt-in panels are the most popular. • On popular websites or via the media, people are asked to participate in surveys on the internet. • Once a “sufficiently large” group of volunteers has been obtained, samples are drawn from this pool of volunteers that match the population of interest on background statistics. • These people are then asked to participate in a specific survey.
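The selection step described above can be sketched as a quota draw from the volunteer pool. This is an illustration under assumptions, not any company's actual procedure: the pool, the single background variable `group`, and the population shares are all hypothetical.

```python
import random

def draw_matched_sample(pool, population_shares, n, seed=0):
    """Draw n volunteers so that the sample's distribution over one background
    variable ("group") matches population_shares. Selection inside each group
    is random among volunteers only, so this is still not a probability
    sample of the population."""
    rng = random.Random(seed)
    sample = []
    for group, share in population_shares.items():
        candidates = [v for v in pool if v["group"] == group]
        quota = round(share * n)
        if quota > len(candidates):
            raise ValueError(f"pool too small for group {group!r}")
        sample.extend(rng.sample(candidates, quota))
    return sample

# Hypothetical pool in which younger volunteers are over-represented.
pool = ([{"id": i, "group": "under_40"} for i in range(700)]
        + [{"id": 700 + i, "group": "40_plus"} for i in range(300)])
shares = {"under_40": 0.4, "40_plus": 0.6}  # assumed population figures
sample = draw_matched_sample(pool, shares, n=200)
```

The sample matches the population on the chosen background variable, but nothing guarantees that it matches on the variable of interest.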
Advantages • Companies can provide results in days where previously weeks were required • For a small fraction of the cost of probability samples, results from even larger samples can be provided.
Disadvantages 1 • There is no statistical basis for generalizing from the sample to the population (Horvitz and Thompson, 1952). • People must have access to the internet (coverage error). These respondents are mostly very different from the non-respondents (see e.g. Bethlehem, 2005; Lensvelt-Mulders & Lugtig 2006). • The number of people who answer the questions relative to the number selected (a kind of response rate) is relatively low (Lozar Manfreda, Bosnjak, Haas, & Vehovar, 2005)
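To show why the missing statistical basis matters, here is a minimal sketch of the Horvitz-Thompson estimator of a population mean. It needs a known inclusion probability for every sampled person, which a volunteer panel cannot supply. The data and the two-stratum design are hypothetical.

```python
def horvitz_thompson_mean(values, inclusion_probs, population_size):
    """Horvitz-Thompson estimate of a population mean: each sampled value is
    weighted by the inverse of its known inclusion probability."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    total = 0.0
    for y, pi in zip(values, inclusion_probs):
        if pi is None or not 0 < pi <= 1:
            raise ValueError("inclusion probability unknown or invalid")
        total += y / pi
    return total / population_size

# Hypothetical design: population of 100, stratum A (80 people, 2 sampled)
# and stratum B (20 people, 2 sampled).
values = [1, 0, 1, 1]
probs = [0.025, 0.025, 0.1, 0.1]
estimate = horvitz_thompson_mean(values, probs, population_size=100)
```

For a self-selected volunteer, the inclusion probability is unknown (represented here by `None`), so the estimator cannot be computed at all.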
Disadvantages 2 • Often insufficient care is given to the formulation of the questions. This means that very different results are possible (Dillman, 2005). • There is no interviewer who can help with difficult questions. • No consistency checking: people simply quit the interview when confronted with questions about inconsistencies (Dillman, 2005)
Complete rejection • Billiet has warned against these new surveys in several publications (Billiet 2004, Abt et al. 2005) • He even criticized his colleagues for legitimizing these methods by participating in the presentation of results of such surveys.
Statistical adjustments • Harris Interactive has applied a statistical approach, weighting by propensity scores, to adjust the voluntary sample to a probability sample (Terhanian, 2001a and 2001b). • The Rand Corporation has studied the possibility of weighting web surveys (Schonlau et al. 2002 and 2004). • Other institutions in the US have done similar work (see, for example, Lee 2006)
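The general idea of propensity score weighting can be sketched as follows. This is a simplified illustration, not the procedure of any of the organizations cited: it assumes a parallel probability sample as reference and a single categorical covariate, so the propensity in each cell is just the share of web cases in the pooled data (equivalent to a saturated logistic regression). All names and numbers are hypothetical.

```python
from collections import Counter

def propensity_weights(web_cells, reference_cells):
    """Per covariate cell, estimate the propensity p of belonging to the web
    sample (web cases / all pooled cases) and return the weight (1 - p) / p."""
    web = Counter(web_cells)
    ref = Counter(reference_cells)
    weights = {}
    for cell, n_web in web.items():
        p = n_web / (n_web + ref.get(cell, 0))
        weights[cell] = (1 - p) / p
    return weights

def weighted_mean(values, cells, weights):
    """Mean of values, each weighted by the weight of its covariate cell."""
    num = sum(weights[c] * y for y, c in zip(values, cells))
    den = sum(weights[c] for c in cells)
    return num / den

# Hypothetical data: the web sample over-represents politically interested people.
web_cells = ["high"] * 80 + ["low"] * 20
web_values = [1] * 60 + [0] * 20 + [1] * 5 + [0] * 15  # 1 = used the program
reference_cells = ["high"] * 40 + ["low"] * 60

w = propensity_weights(web_cells, reference_cells)
unweighted = sum(web_values) / len(web_values)
adjusted = weighted_mean(web_values, web_cells, w)
```

In this toy example the unweighted web estimate is 0.65 and the adjusted estimate is 0.45. The adjustment only removes bias to the extent that the covariates capture the reasons for participation.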
European Statisticians • Statisticians in Europe have also started to look at these approaches (Varedian and Forsman, 2003; Forsman and Isaksson 2003; Danielsson 2004; Isaksson, Danielsson and Forsman 2004). • Comparisons of results of access panels and probability samples have been done (see for example Schoen 2004, Faas and Schoen 2006, Oberski 2006, 2007).
Joint research • Joint research has been done by researchers from the old and the new approaches (Schonlau, Van Soest, Kapteyn, Couper and Winter, 2004)
Results of comparisons • Some of these studies showed that the results of access panels were not significantly different from the results of probability samples, mostly after some correction by weighting • Others showed differences • We do not yet know which outcome will occur under which conditions.
A Dutch study of web surveys • In 2006, Zembla asked about the use of a computer program (Stemwijzer) via an access panel • The estimate of the use of the program was off by as much as 34 percentage points. • Only 20% of the people used the program, while the access panel suggested 54%.
Further research • The TV program asked three different companies to ask their panels whether they agreed or disagreed with the following three statements
Results for the three questions • “The program gives advice which is taken too seriously by many people”: the results varied between 42% and 52% • “The program should emphasize more that it only gives advice”: the results varied between 57% and 73% • “The program has too much influence on the elections”: the results varied between 19% and 44%.
Conclusions • The differences can be rather large • Differences will occur when there is a relationship between the reasons for participation and the variable of interest • The problem is that we do not know when this is the case. • Our knowledge about the reasons for participation in such access panels is still rather limited.
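The link between participation and the variable of interest can be made concrete with the Stemwijzer figures from the Dutch study (20% actual use, 54% in the access panel). The sketch below assumes the gap is caused only by users and non-users joining the panel at different rates; that assumption, and the function names, are illustrative and not part of the study.

```python
def expected_panel_share(true_share, rate_users, rate_nonusers):
    """Expected share of users among panel members when users and non-users
    join the panel at different rates."""
    users = true_share * rate_users
    nonusers = (1 - true_share) * rate_nonusers
    return users / (users + nonusers)

def implied_rate_ratio(true_share, panel_share):
    """Ratio rate_users / rate_nonusers that turns true_share into panel_share."""
    return (panel_share * (1 - true_share)) / ((1 - panel_share) * true_share)

ratio = implied_rate_ratio(0.20, 0.54)
check = expected_panel_share(0.20, ratio, 1.0)
```

Under this assumption, users of the program would have to be roughly 4.7 times as likely as non-users to take part in the panel to produce the observed gap.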
What we know about participation 1 • People without internet access cannot participate • Less participation can be expected from older people, people from non-western countries, people who are less involved in society, people with less interest in the topic (Faas and Schoen 2006), and people who are less politically interested (Vehovar 2002 and Bosnjak 2002)
What we know about participation 2 • The people who participate most are those who do it for money • In the Netherlands, 20% of the people participating in the panels answer 80% of the questions.
Alternative 1: Web panels • Couper (2000) suggests that the best possible option is to use web surveys based on probability samples, providing equipment to the households if necessary • This procedure was already developed in 1986 under the name Telepanel • It is now available at CentERdata and Knowledge Networks.
Advantages and disadvantages • A probability sample is used • A lot of information is available about the respondents • It is a lot of work to manage such a panel • The response rate is much lower than in cross sectional research
Possible corrections • A lot of information is available • But correction for background variables alone is not enough • Correction is also needed for variables related to nonparticipation, such as political interest (Voogt) • However, the reasons for nonparticipation differ between countries (ESS)
Alternative 2: Mixed mode data collection • Draw a probability sample • Ask potential respondents with internet access to fill in a web survey and the others a mail questionnaire • The people who do not reply are contacted by telephone and asked to participate in a telephone or face-to-face interview • If necessary, ask them to answer only a few central questions
Advantages • Probability sample so generalization possible • Higher response rates up to 90% are possible • Sufficient information available so that weighting on central questions will adjust the estimates nearly perfectly (Voogt 2004)
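Weighting on a central question can be sketched as follows. This is a minimal illustration of the idea, not Voogt's (2004) actual procedure: respondents to the full questionnaire are reweighted so that their distribution on the central question matches that of everyone who answered it, including the follow-up respondents. The data and category labels are hypothetical.

```python
from collections import Counter

def central_question_weights(full_answers, followup_answers):
    """Weights for full respondents so that their distribution on the central
    question matches that of all respondents (full + follow-up). Categories
    seen only in the follow-up group cannot be represented and are ignored."""
    full = Counter(full_answers)
    everyone = full + Counter(followup_answers)
    n_full = len(full_answers)
    n_all = n_full + len(followup_answers)
    return {a: (everyone[a] / n_all) / (full[a] / n_full) for a in full}

# Hypothetical central question: "Did you vote?"
full_answers = ["voted"] * 60 + ["not_voted"] * 40       # web/mail respondents
followup_answers = ["voted"] * 20 + ["not_voted"] * 80   # telephone follow-up

weights = central_question_weights(full_answers, followup_answers)
weighted_voted = (sum(weights[a] for a in full_answers if a == "voted")
                  / sum(weights[a] for a in full_answers))
```

After weighting, the share of voters among the full respondents equals the share in the combined group (0.4 in this toy example), and the same weights can then be applied to the other questions.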
Disadvantages • If the data collection process takes too long, mode effects can be expected • Mode effects can also be expected for sensitive or complex questions
Conclusions • The scientific basis for generalization from volunteer opt-in panels is very questionable • Single-mode cross-sectional surveys and Web panels are plagued by low response rates • Cross-sectional research among web users may be possible in the future • At this moment, mixed mode data collection may be a solution for simple, nonsensitive topics