1. 1 Surveys Overview
2. 2
3. 3 Surveys – a visual representation
4. 4 Twenty Questions A Journalist (Researcher) Should Ask About Poll (Survey) Results: Motivating Discussion?
TVNZ Colmar Brunton Poll
Who did the poll?
Who paid for the poll and why was it done?
How many people were interviewed for the survey?
How were those people chosen?
What area (nation, state, or region) or what group (teachers, lawyers, Democratic voters, etc.) were these people chosen from?
Are the results based on the answers of all the people interviewed?
Who should have been interviewed and was not?
When was the poll done?
How were the interviews conducted?
What about phone in polls or polls on the Internet?
5. 5 Twenty Questions A Journalist (Researcher) Should Ask About Poll Results…
What is the sampling error for the poll results?
Who’s on first?
What other kinds of factors can skew poll results?
What questions were asked?
In what order were the questions asked?
What about "push polls"?
What other polls have been done on this topic? Do they say the same thing? If they are different, why are they different?
So I've asked all the questions. The answers sound good. The poll is correct, right?
With all these potential problems, should we ever report poll results?
Is this poll worth reporting?
6. 6 Latest TVNZ Colmar Brunton poll
7. 7 Latest TVNZ Colmar Brunton poll…
8. 8 Latest TVNZ Colmar Brunton poll…
9. 9 Representativeness & Political Polls In a democracy we choose our representatives by allowing the electorate to vote for them or not
A poll attempts to ascertain who people would elect by inspecting a sub-sample of all possible voters
In order for this to be accurate it needs to be representative of this electorate
10. 10 Then, What Is a Survey? Today the word "survey"* is used most often to describe a method of gathering information from a sample of individuals. This "sample" is usually just a fraction of the population being studied.
For example:
a sample of voters is questioned in advance of an election to determine how the public perceives the candidates and the issues ...
a manufacturer does a survey of the potential market before introducing a new product ...
a government body commissions a survey to gather the factual information it needs to evaluate existing legislation or to draft proposed new legislation.
* Note that a poll is just a special type of survey involving public opinion or potential voting behaviour.
11. 11 Then, What Is a Survey?... Not only do surveys have a wide variety of purposes, they also can be conducted in many ways
including over the telephone, by mail, or in person.
Nonetheless, all surveys do have certain characteristics in common.
Unlike a census, where all members of the population are studied, surveys gather information from only a portion of a population of interest
the size of the sample depending on the purpose of the study.
12. 12 Then, What Is a Survey?... In a bona fide survey, the sample is not selected haphazardly or only from persons who volunteer to participate.
It is scientifically chosen so that each person in the population will have a measurable chance of selection.
This way, the results can be reliably projected from the sample to the larger population.
Information is collected by means of standardized procedures so that every individual is asked the same questions in more or less the same way.
The survey's intent is not to describe the particular individuals who, by chance, are part of the sample but to obtain a composite profile of the population.
13. 13 Then, What Is a Survey?... Anonymity:
The industry standard for all reputable survey organizations is that individual respondents should never be identified in reporting survey findings.
All of the survey's results should be presented in completely anonymous summaries, such as statistical tables and charts.
14. 14 Who Conducts Surveys?- some examples Major TV networks rely on surveys to tell them how many and what types of people are watching their programs
Magazine and trade journals use surveys to find out what their subscribers are reading
Local transportation authorities conduct surveys to acquire information on commuting and travel habits
Statistics New Zealand
Fast moving consumer goods customers (eg: Watties, Cerebos Gregg's)
Service Industries (eg. Banking, Insurance)
Health agencies
15. 15 Questions asked with Surveys You can further classify surveys by their content. Some surveys focus on opinions and attitudes (such as a pre-election survey of voters – or poll)
Others are concerned with factual characteristics or behaviours (such as people's health, housing, consumer spending, or transportation habits).
Many surveys combine questions of both types. Respondents may be asked if they have:
heard or read about an issue ...
what they know about it ...
their opinion ...
how strongly they feel and why...
their interest in the issue ...
past experience with it ...
and certain classification information (such as age, gender, marital status, occupation, and place of residence).
16. 16 Questions asked with Surveys… Questions may be open-ended ("Why do you feel that way?") or closed ("Do you approve or disapprove?"). Survey takers may ask respondents to rate a political candidate or a product on some type of scale, or they may ask for a ranking of various alternatives.
The manner in which a question is asked can greatly affect the results of a survey. For example, a recent NBC/Wall Street Journal poll asked two very similar questions with very different results:
(1) Do you favour cutting programs such as social security, Medicare, Medicaid, and farm subsidies to reduce the budget deficit?
The results: 23% favour; 66% oppose; 11% no opinion.
(2) Do you favour cutting government entitlements to reduce the budget deficit?
The results: 61% favour; 25% oppose; 14% no opinion.
17. 17 Questions asked with Surveys… The questionnaire may be very brief -- a few questions, taking five minutes or less -- or it can be quite long -- requiring an hour or more of the respondent's time.
Since it is inefficient to identify and approach a large national sample for only a few items of information, there are "omnibus" surveys that combine the interests of several clients into a single interview. In these surveys, respondents will be asked a dozen questions on one subject, a half dozen more on another subject, and so on.
Because changes in attitudes or behaviour cannot be reliably ascertained from a single interview, some surveys employ a "panel design," in which the same respondents are interviewed on two or more occasions.
Such surveys are often used during an election campaign or to chart a family's health or purchasing pattern over a period of time.
18. 18 What Are Potential Concerns? The quality of a survey is largely determined by its purpose and the way it is conducted.
Most call-in TV inquiries (e.g., 900 "polls") or magazine write-in "polls," for example, are highly suspect.
These and other "self-selected opinion polls (SLOPS)" may be misleading since participants have not been scientifically selected. Typically, in SLOPS, persons with strong opinions (often negative) are more likely to respond.
E.g. is the audience of the ‘Holmes show’ representative of all NZers?
Although this information may still be useful as a ‘weather vane’ of people’s opinions, it cannot be used to represent the wider public’s viewpoint
19. 19 What Are Potential Concerns? Surveys should be carried out solely to develop statistical information about a subject.
They should not be designed to produce predetermined results or as a ruse for marketing and similar activities.
Anyone asked to respond to a public opinion poll or concerned about the results should first decide whether the questions are fair.
Another important violation of integrity occurs when what appears to be a survey is actually a vehicle for stimulating donations to a cause or for creating a mailing list to do direct marketing.
20. 20 The TVNZ Colmar Brunton poll revisited One News/Colmar Brunton Poll
Poll Method Summary
RELEASED: Sunday 11th July 2004
POLL CONDUCTED: Evenings of 5-8 July 2004 inclusive
SAMPLE SIZE: N = 1000, Eligible Voters
SAMPLE SELECTION: Random nationwide selection using a type of stratified sampling to ensure the sample includes the correct proportion of people in urban and rural areas.
SAMPLE ERROR: Based on the total sample of 1000 Eligible Voters, the maximum sampling error estimated is plus or minus 3.2%, expressed at the 95% confidence level.
METHOD: Conducted by CATI (Computer Assisted Telephone Interviewing).
WEIGHTING: The data has been weighted to Department of Statistics Population Estimates to ensure it is representative of the population in terms of age, gender, and household size.
REPORTED FIGURES: Reported bases are weighted. For Party Support, percentages have been rounded up or down to whole numbers, except those less than 5% which are reported to 1 decimal place.
METHODOLOGY The party vote question has been asked unprompted as at February 1997.
NOTE: The data does not take into account the effects of non-voting and therefore cannot be used to predict the outcome of an election.
Undecided voters, non-voters and those who refused to answer are excluded from the data on party support. The results are therefore only indicative of trends in party support, and it would be misleading to report otherwise.
Publication or reproduction of the results of this poll must be acknowledged as the “One News Colmar Brunton Poll”.
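A rough check of the quoted sampling error, assuming a simple random sample and the worst case p = 0.5. The function name and the choice of z values are illustrative and are not taken from the poll's own documentation:

```python
import math

def max_margin_of_error(n, z=1.96):
    """Maximum sampling error for a proportion under simple random sampling.

    The worst case is p = 0.5, where p * (1 - p) is largest.
    """
    return z * math.sqrt(0.5 * 0.5 / n)

moe_exact = max_margin_of_error(1000)          # z = 1.96 for 95% confidence
moe_rounded = max_margin_of_error(1000, z=2)   # common shortcut: 1 / sqrt(n)

print(f"z = 1.96: +/- {100 * moe_exact:.1f}%")
print(f"z = 2:    +/- {100 * moe_rounded:.1f}%")
```

With z = 1.96 this gives about 3.1%; rounding z up to 2 gives about 3.2%, the figure quoted in the summary. How the published figure was actually derived is not stated in the summary.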
21. 21 Some Quotes About Democracy. "...Government of the people, by the people, for the people, shall not perish from the Earth." -Abraham Lincoln, Gettysburg Address, November 19, 1863
"It is a besetting vice of democracies to substitute public opinion for law. This is the usual form in which masses of men exhibit their tyranny." -James Fenimore Cooper
22. 22 The Survey Process Many stages in the survey process
Goal-setting; survey design; sampling; data collection, capture, cleaning, and analysis; reporting survey results; and decision-making based on these results
Different people (and often different organisations) are involved at different stages
Good communication is therefore important
E.g. Don’t want to be recoding data after you have received it
Statistics is useful at many points during a survey
Especially sampling, weighting, and data analysis
23. 23 Statistics in Survey Design and Execution This course mainly focuses on analysis techniques
Excellent background material to be found:
http://www.amstat.org/sections/srms/whatsurvey.html
24. 24 Goal-Setting Vital to define research question or survey objectives
May involve specific accuracy requirements
For example, relative margin of error < 10%
in which case, what would the sample size need to be for a proportion?
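One way to answer the sample size question, sketched under the assumption of a simple random sample and a 95% confidence level. The relative margin of error is taken here to be z * sqrt(p(1 - p)/n) / p, which gives n > z^2 (1 - p) / (p r^2); the function name and the example values of p are hypothetical:

```python
import math

def sample_size_for_relative_moe(p, rel_moe=0.10, z=1.96):
    """Smallest n for which the relative margin of error of a proportion
    falls below rel_moe, assuming simple random sampling."""
    n = z**2 * (1 - p) / (p * rel_moe**2)
    return math.ceil(n)

n_half = sample_size_for_relative_moe(0.5)    # a 50% proportion
n_tenth = sample_size_for_relative_moe(0.1)   # a 10% proportion
print(n_half, n_tenth)
```

The required sample size grows quickly as the proportion gets smaller: about 385 for p = 0.5 but about 3,458 for p = 0.1.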
25. 25 Goal-Setting Level of formality usually depends on:
Whether survey sponsor and researcher are separate people or organisations
Whether the survey process must be transparent
Usually the case when there are many stakeholders with opposing interests; e.g. media or social/govt research
Importance of the survey
26. 26 Survey Design Survey design involves
Choice of data collection methodology
Usually face-to-face in-home or central location, telephone, mail, e-mail, WWW pop-ups, or some combination of these
Questionnaire design
Sample design & analysis planning
Estimating survey costs
Aim to achieve survey objectives cost-effectively
Usually involves trade-offs between cost, speed and accuracy
Note that accuracy reflects both systematic bias and random variance
E.g. in-home face-to-face vs telephone interviewing
27. 27 Sample Design and Selection Necessary unless survey is a census
Many design options
E.g. stratification, clustering, disproportionate sampling
Has major implications for data analysis (and sometimes for other stages)
Discussed in more detail later
28. 28 Survey sampling
29. 29 Survey Sampling Will cover
Basic concepts of sampling
Sampling frames commonly used in NZ
Implications of sample design for weighting and variance estimation
First, some sampling terminology
Probability (random), judgment, quota, and convenience samples
Unit, population, sample, sampling frame
30. 30 Sampling Methods An excellent site is http://www.soc.surrey.ac.uk/samp/
Several general methods exist for selecting samples:
Probability (or random) sampling is where each object has a known, non-zero probability of being selected
Can produce unbiased results (if no non-response)
Allows for calculation of sampling error (if pairwise selection probabilities known)
Most widely accepted sampling method. Strongest acceptance in USA.
Judgment sampling involves choosing objects that it is believed will give accurate results
E.g. three areas (one large city, one town, one rural)
31. 31 Sampling Methods (continued) Quota samples are based on selecting objects until you have a certain number (the quota) of each type
Appeals to idea of a “representative” sample
Can produce substantial bias (e.g. 1992 UK election polls)
Still widely used (especially for telephone surveys with high non-response levels)
Convenience samples are obtained by choosing the easiest objects available
E.g. the first ten people to walk out of a store
32. 32 Sampling Terminology Units are the objects to be surveyed
Usually people, households, businesses (enterprises or geographic units), customers or activities (e.g. trips, phone calls, nights stayed)
Survey population
The collection of units that the survey results should describe or explain
Sample
The subset of the population for which survey data is collected (or is intended to be collected)
Sampling frame
a method of contacting selected sample units, including the information needed to select them
E.g. a list of customers, including the value of their business
Need precise definitions of these for each survey
33. 33 Case Study
34. 34
35. 35 Benefits of sample surveys as a research method
Relatively cheap and fast
Valuable for finding out how people think or how they behave
36. 36 Sampling Frames Simplest example of a sampling frame is a list
For instance, a list of all NZ phone calls made in July
More generally, any procedure and data that effectively enables the selection of a sample
Good frames require development and maintenance efforts
E.g. Statistics NZ runs an annual survey (the Annual Business Frame Update Survey) simply to update their Business Frame
Most frames are imperfect, exhibiting
Undercoverage
Duplicated units (perhaps under different spellings or ID numbers)
Out-of-date or missing data
37. 37 Census
38. 38 Sampling Frame
39. 39 Sample Survey
40. 40 Sampling Frames for Households and Individuals No list exists of all the occupied private dwellings in New Zealand
A variety of sampling frames have been developed to enable sample surveys of NZ households and individuals
Different frames for
Telephone surveys
Face-to-face, in-home surveys
Once households have been selected and contacted, ask who lives there and select eligible individuals from this list
Kish grid technique: see http://www.audiencedialogue.org/kya2c.html
Last birthday method (doesn’t require list of household members)
41. 41 Telephone Sampling of Households Undercoverage is a fundamental problem for telephone surveys of households
Only 92% of households have a land-line
Less than 80% of Maori or Pacific households
Households without phones are also different in other ways; e.g. they are generally low-income households
Duplicates also occur
i.e. some households have more than one phone number, and thus have more chance of being selected
Should ideally correct for this by sub-sampling or weighting the data, or by removing all duplicates from the frame before sample selection
42. 42 Telephone Sampling Frames White Pages
Telecom sells random samples of listed numbers
Unlisted numbers not included
So have lost another 15% of phone numbers
May be cheaper to use paper directories instead, but these are out of date (even when just distributed)
43. 43 Telephone Sampling Frames (continued) Random digit dialing (RDD)
Naïve approach
List all possible numbers, and select at random
Many non-working numbers - success rate <10%
Better approaches
E.g. Mitofsky-Waksberg
Take banks of possible phone numbers, and select phone numbers more intensively from banks that have larger proportions of listed numbers
Increased hit rate to 60% in US
Pseudo-RDD methods using banks centred on valid “seed” phone numbers are sometimes used
44. 44 Household Sampling for In-Home Surveys Multi-stage approach widely used
Area sample
take list of areas and select sample of areas
38,366 meshblocks in NZ Geostatistical System
Household sample
Interviewers list all dwellings within selected meshblocks (following meshblock maps)
Sample of households selected in each area
Variations on this approach exist
Random route within area (i.e. route follows rules from random starting point), or ignoring area boundaries
45. 45 Business Frames Business Directory
Excellent frame held by Statistics NZ
Contained 278,000 non-farming enterprises in Feb ‘01
Not available for market research surveys
Other business frames are marketing databases
Dun & Bradstreet
Few duplicates expected – uses unique DUNS number
Useful auxiliary information
Number of employees, turnover, industry (ANZSIC), age
But substantial amount of missing data on turnover
Has contact names
46. 46 Business Frames (continued) UBD
Has some auxiliary information
Staff size and trade category (not SIC)
Has contact names
Substantial undercoverage – contains 156,000 businesses
Yellow Pages
Has industry (ANZSIC or Yellow Pages category), but not size
Probably has more duplications than other frames
Undercoverage – contains “250,000+” businesses
47. 47 Sampling Frames Summary Several possible frames exist for household, individual and business surveys in NZ
However many have severe flaws
Substantial undercoverage
E.g. 60% coverage for UBD directory
White Pages similar in some demographic groups
Duplicates present in most frames
De-duplication of entire frame, subsampling or reweighting needed
Note that it is not enough to select a sample and then remove duplicates within the sample
Caution is advisable
48. 48 Sampling
49. 49 Sampling Have discussed different types of sampling
Quota, convenience, judgement and probability samples
Will now focus on probability sampling
Theoretical framework
Various probability sampling concepts
Stratification, clustering, unequal selection probabilities
Systematic sampling; multi-stage sampling
50. 50 Simple Random Samples Say we decide to take a sample of size n
If all the possible samples have an equal probability of being chosen, this is called a simple random sample (without replacement), or SRS for short
Can also take a simple random sample with replacement (SRSWR), but this requires a slightly more general sampling theory
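51. 51 Estimating the Mean from an SRS Estimate the mean from the sample as $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$
Then the variance of this estimate is estimated by $\widehat{\mathrm{Var}}(\bar{y}) = \left(1 - \frac{n}{N}\right) \frac{s^2}{n}$, where $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$ is the sample variance and $N$ is the population size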
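A minimal sketch of the SRS estimate of a mean and its standard error, assuming sampling without replacement from a population of known size N. It uses the standard finite population correction (1 - n/N); the function name and the data are hypothetical:

```python
import math
import statistics

def srs_mean_and_se(sample, population_size):
    """Sample mean and its estimated standard error under SRS without replacement."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s2 = statistics.variance(sample)   # sample variance with n - 1 divisor
    fpc = 1 - n / population_size      # finite population correction
    return mean, math.sqrt(fpc * s2 / n)

sample = [2, 4, 4, 6, 9]               # hypothetical observations
mean, se = srs_mean_and_se(sample, population_size=100)
print(mean, se)
```

When n is a small fraction of N the correction is close to 1 and is often ignored.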
52. 52 Means under SRS (cont’d) These formulae can be used to produce valid confidence intervals if n is “large enough”
For approximately normally distributed data, n>50 is probably large enough
Percentages are special cases of means
However $s^2 = pq$ (with $q = 1 - p$) is typically used
Need np>5 and npq>5 for valid CIs
53. 53 Other Sample Designs SRS are often too costly for practical use
Other sample designs are therefore needed
Stratified sampling
Split population into groups or strata
Sample independently within each stratum
Can use different sampling fractions within each stratum (or even various sample designs)
54. 54 Stratified Sampling (continued) Calculate weights as $w_h = N_h / n_h$, where $N_h$ is the population size and $n_h$ the sample size in stratum $h$
Use these weights when analysing sample data
For estimates of totals, can calculate variances for each stratum and add these together to give overall variance
Means require a weighted average of the variances, where the weights are proportional to the square of the stratum size
If the sampling fractions are similar, this variance is usually smaller than the variance for an SRS of the same size
Due to smaller variance between cases within a stratum
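A sketch of stratified estimation of a mean, assuming an SRS within each stratum and known stratum population sizes. The variance is the weighted sum of within-stratum variances, with weights proportional to the square of the stratum size; the function name and data are hypothetical:

```python
import statistics

def stratified_mean(strata):
    """Estimate a population mean and its variance from a stratified sample.

    strata: list of (N_h, sample_values) pairs, one per stratum, where N_h is
    the stratum's population size. Assumes an SRS within each stratum.
    """
    N = sum(N_h for N_h, _ in strata)
    estimate, variance = 0.0, 0.0
    for N_h, values in strata:
        n_h = len(values)
        share = N_h / N                      # stratum share of the population
        s2_h = statistics.variance(values)   # within-stratum sample variance
        fpc = 1 - n_h / N_h                  # finite population correction
        estimate += share * statistics.fmean(values)
        variance += share**2 * fpc * s2_h / n_h
    return estimate, variance

# Hypothetical strata: (population size, sampled values)
strata = [(600, [10, 12, 14]), (400, [20, 24])]
estimate, variance = stratified_mean(strata)
print(estimate, variance)
```

The same point estimate results from a weighted mean in which each sampled unit carries the weight N_h / n_h for its stratum.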
55. 55
56. 56 Assume sample is a simple random sample. Estimate proportion of all women who think that they are overweight. Give a standard error for your estimate.
Suppose that in the population: 10% Asian, 50% European, 25% Maori, 15% Pacific Islanders. Use this information to get improved estimate.
57. 57 Cluster Sampling Typically, face-to-face household surveys involve interviewing several people in each area
This is an example of a cluster sample, where the areas are the clusters
This approach is much less costly than an SRS of the same size
However it will also exhibit higher sampling variability, due to correlations between interviews within a cluster
E.g. similar spending patterns due to similar incomes, or a similar range of products being available locally
58. 58 Variances under Cluster Sampling Variances are inflated under cluster sampling by a factor depending on
Cluster size (denoted m)
Intra-cluster correlation (denoted ρ)
as follows: $\mathrm{deff} = 1 + (m - 1)\rho$
Here the intra-cluster correlation coefficient ρ is defined as the Pearson correlation coefficient between all pairs of distinct units in the same cluster
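A small numerical illustration, using the usual approximation for the variance inflation factor (the design effect), 1 + (m - 1) x intra-cluster correlation, with equal cluster sizes assumed. The function name and the numbers are hypothetical:

```python
def design_effect(cluster_size, icc):
    """Approximate variance inflation from cluster sampling with equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc

deff = design_effect(cluster_size=10, icc=0.05)
effective_n = 1000 / deff   # SRS sample size giving the same precision as 1000 clustered interviews
print(deff, round(effective_n))
```

Even a small intra-cluster correlation matters: with 10 interviews per area and a correlation of 0.05, 1000 interviews are worth roughly 690 from an SRS.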
59. 59 Difference Between Cluster and Stratified Sampling Clusters, like strata, group members of the population.
The selection process is different.
Stratified sampling, compared with SRS, increases precision
Cluster sampling, compared with SRS, decreases precision
Members of the same cluster are more similar than members selected at random
E.g. divide Auckland into areas and sample a few areas – may get retirees, young families…
20 households in the same area are not as likely to mirror diversity as 20 households selected at random. Cluster sampling partially repeats the same information – see map
60. 60 Systematic Sampling Another commonly used sampling technique is systematic sampling
The population is listed in a particular order, then every kth unit is selected
Start at a random point between 1 and k
Here k is chosen so that N ≈ kn
Systematic sampling is a special case of cluster sampling, with only one cluster selected
This makes it hard to estimate sampling variances
Need prior knowledge or assumptions about response patterns
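A minimal sketch of systematic selection from an ordered list. The function and argument names are hypothetical, and the truncation to n units when N is not an exact multiple of k is a simplification rather than a recommended practice:

```python
import random

def systematic_sample(units, n, start=None, rng=random):
    """Select every k-th unit from an ordered list, starting at a random point."""
    N = len(units)
    k = N // n                       # sampling interval, so that N is roughly k * n
    if start is None:
        start = rng.randint(1, k)    # random start between 1 and k inclusive
    # Positions start, start + k, start + 2k, ... (1-based), cut off at n units
    return [units[i] for i in range(start - 1, N, k)][:n]

population = list(range(1, 21))      # units labelled 1..20
selected = systematic_sample(population, n=5, start=3)
print(selected)
```

Only k distinct samples are possible, one per starting point, which is why the design behaves like selecting a single cluster.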
61. 61 More on Systematic Sampling Performance depends strongly on response patterns
Linear trend yields an implicit stratification, and works well
However cyclic variation of period k (or some multiple of k) can result in huge variability
Systematic sampling generalises easily to sampling with probability proportional to size
However large units may need to be placed in a certainty stratum, or selected more than once
62. 62 Multi-stage Sample Designs Many surveys use complex sample designs that combine several of the above elements in a multi-stage sampling framework
For example, face-to-face in-home surveys of people often employ three stages
Systematic pps sampling of areas
Cluster samples of households within areas
Random selection of one person from each household (unequal sampling probabilities)
63. 63 Complex Sample Designs Multi-stage designs may require complex estimation processes, especially for variance estimation
Specialised software is often needed
Different items in a questionnaire may refer to different units, from different sampling stages
E.g. Households and people
E.g. Customers and brands purchased
These will usually require different statistical treatment
E.g. different sets of weights for households and people
64. 64 After you’ve collected the data
65. 65 Data Collection Contact selected respondents
Unless data can be obtained ethically through observation or record linkage/data matching
Obtain completed questionnaires
Structured interview or self-completion
Statistics involved here in design decisions
E.g. quotas, scheduling interview times
Also quality control and improvement role
66. 66 Data Capture and Cleaning Data entry
From paper questionnaires or other records
Typically a (fixed) proportion are re-entered for quality control (QC) purposes – improvement possible here
Coding
Assigning labels (or codes) based on verbal descriptions
Data editing
Eliminate inconsistent data
Identify and treat outliers
Confirm data with respondent, or alter or even delete data
67. 67 Weighting and Imputation Weighting
Attaches a weight to each observation
Used to calculate weighted means, percentages
Often required to reflect sample design
Unweighted results would be biased
Also helps compensate for unit non-response
Unit non-response is when data is not obtained for some units, although they were selected as part of our sample
Weights are adjusted to align survey results with known population figures
Covered in more detail later
68. 68 Imputation Helps deal with item missing data
When certain items in the questionnaire are not available for all respondents, this is known as item missing data
Fills in gaps with sensible values
Allows standard methods for analysis of complete data to be used
More detail given later
69. 69 Data Analysis and Tables Many analysis techniques are available
Cross-tabulation is ubiquitous in market research
Tabulating one categorical variable against another, e.g. intended party vote by age group
Need to calculate random sampling error
Also known as variance estimation
Influenced by sample design and weighting
More on this later
70. 70 Reporting and Decision-Making Reporting results
Important that these are communicated clearly
Statistical input often vital
Should address survey objectives
Decision-making and action
Influenced by survey results (hopefully!)
Actions may include further research
71. 71 Statistics in Survey Research In summary:
Statistics is generally most useful in the design and analysis stages of surveys
Especially sampling, weighting, and data analysis
Also relevant at other stages
Quality control and quality improvement for survey operations
Effect of survey procedures on survey results
Interpretation of survey results
72. 72 Weighting Usually survey weights are calculated for each responding unit
Aim for unbiased weighted survey results
Or at least more accurate than without weights
Survey weights can adjust for
Sample design
Unit non-response
73. 73 Non-response – importance of incentives First Year Statistics Web Survey – Instructions:
Please answer all questions
Completion and submission of this survey by Friday, 12 March puts you into the draw for $50 worth of book vouchers (donated by UBS, the University Book Store)
Your ID is needed to enter you into the draw and it will not be stored with your responses.
I am a student at the University of Auckland.
I agree to take part in this data collection project.
I am over the age of 16 years.
I understand that once I submit my survey, I will not be able to withdraw it.
The information collected from this survey will be used only for data analysis examples and exercises in this course
Response rate ~50%
STATS20x Web Survey - Instructions
Please answer all questions:
Completion and submission of this survey by 4pm Friday 12th March will gain you credit for Assignment 1
Your ID is needed so you can be awarded the marks for Assignment 1
Your ID will not be stored with your responses
The information collected from this survey will be used only for data analysis for examples and exercises in this course
Response rate ~90% (worth 1% max of total grade)
74. 74 Weighting for Sample Design Need to adjust for varying probabilities of selection
No need if selection probabilities are equal
Varying selection probabilities arise from
Stratification
Selecting one person per household
Double sampling, e.g. for booster samples
May need to truncate weights if highly variable
Introduces some bias, but reduces variance markedly
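A sketch of weighting for unequal selection probabilities when one person is interviewed per household. It assumes households were selected with equal probability, so each respondent's weight is proportional to the number of eligible adults in the household; the function name and data are hypothetical:

```python
def weighted_proportion(responses, eligible_adults):
    """Weighted proportion of 1s when one adult is interviewed per household.

    A person in a household with a eligible adults is selected with probability
    proportional to 1 / a, so the inverse-probability weight is proportional to a.
    """
    weights = list(eligible_adults)
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, responses)) / total

responses = [1, 0, 1]          # hypothetical yes/no answers (1 = yes)
eligible_adults = [1, 2, 4]    # hypothetical numbers of eligible adults per household
unweighted = sum(responses) / len(responses)
weighted = weighted_proportion(responses, eligible_adults)
print(unweighted, weighted)
```

Without the weights, people living in large households are under-represented relative to their share of the population.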
75. 75 Weighting for Unit Non-Response Response rates in NZ market research surveys usually between 20% and 60%
Lower for telephone surveys, higher for face-to-face surveys
Gradually decreasing
Non-response can cause bias, if non-respondents would give different answers from respondents, on average
For linear statistics, can express non-response bias as the product of this difference times the non-response rate
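The bias expression for a mean can be written out as follows. The notation is an assumption introduced here: $r$ is the response rate, $\bar{Y}_R$ and $\bar{Y}_{NR}$ are the population means among respondents and non-respondents, and $\bar{Y}$ is the overall population mean.

```latex
\bar{Y} = r\,\bar{Y}_R + (1 - r)\,\bar{Y}_{NR}
\quad\Longrightarrow\quad
\bar{Y}_R - \bar{Y} = (1 - r)\,\bigl(\bar{Y}_R - \bar{Y}_{NR}\bigr)
```

For instance, with a hypothetical 30% response rate, 60% support among respondents and 40% among non-respondents, the bias is $0.7 \times 0.2 = 0.14$, i.e. 14 percentage points.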
76. 76 Post-Stratification Post-stratification is probably the most common method of adjusting for non-response
The sample is divided into a set of post-strata
This is similar to setting up strata for a stratified sample, but is done after data collection is complete, and so can use data collected during the survey
Note: these weights depend on the random sample and so are random themselves
Sample skews relative to known population figures are then corrected, by adjusting the weights to align survey results with the population figures for each post-stratum
This can reduce sampling variability as well as non-response bias
77. 77 Post-Stratification Example
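A small illustration of post-stratification weighting with made-up numbers (the sample, the population shares and the function name are all hypothetical). Each respondent's weight is the post-stratum's population share divided by its share of the sample:

```python
from collections import Counter

def post_stratification_weights(groups, population_shares):
    """One weight per respondent so that weighted group shares match the population."""
    n = len(groups)
    counts = Counter(groups)
    return [population_shares[g] * n / counts[g] for g in groups]

groups = ["F", "F", "F", "M"]                 # sample is 75% female
population_shares = {"F": 0.5, "M": 0.5}      # population is 50% female
weights = post_stratification_weights(groups, population_shares)
print(weights)
```

The single male respondent receives a weight of 2, which shows how a small post-stratum can produce large weights.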
78. 78 Rim Weighting Also known as incomplete post-stratification and raking ratio estimation
Allows control for more than one set of post-strata
Iterative method
Apply post-stratification to each set of post-strata in turn, until all have been aligned once
Repeat last step until all are within allowable tolerances
Both post-stratification and rim weighting can be applied to data with existing weights, such as inverse probability weights
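The iterative steps above can be sketched as follows. This is a bare-bones version that assumes every target category appears in the sample; the function name, variable names and target totals are hypothetical:

```python
def rake(weights, memberships, targets, tol=1e-8, max_iter=100):
    """Rim weighting: align weighted totals with targets for each variable in turn.

    weights:     starting weights, one per respondent
    memberships: {variable: [category of each respondent]}
    targets:     {variable: {category: target weighted total}}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_gap = 0.0
        for var, cats in memberships.items():
            totals = {}
            for wi, c in zip(w, cats):
                totals[c] = totals.get(c, 0.0) + wi
            gap = max(abs(totals[c] - targets[var][c]) for c in totals)
            max_gap = max(max_gap, gap)
            # Post-stratify on this variable alone
            w = [wi * targets[var][c] / totals[c] for wi, c in zip(w, cats)]
        if max_gap < tol:   # every variable was already within tolerance
            break
    return w

memberships = {
    "sex": ["F", "F", "F", "M", "M"],
    "age": ["Y", "O", "O", "Y", "O"],
}
targets = {"sex": {"F": 50, "M": 50}, "age": {"Y": 40, "O": 60}}
raked = rake([1.0] * 5, memberships, targets)
```

Starting weights of 1 are used here; inverse probability weights could be passed in instead.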
79. 79 Weighting and Sampling Error Moderate post-stratification can improve the reliability of survey results (i.e. decrease sampling error)
However using post-strata with small sample sizes can lead to extreme weights and excessively variable survey results
A variety of recommended minimum post-stratum sizes can be found in the literature, ranging from 5 to 30. Caution probably suggests aiming for the upper end of this range (as a minimum).
Similar problems can also affect rim weighting, even if all the explicit post-strata are large
May be due to implied constraints affecting a small number of respondents
80. 80 Data checking and Imputation
81. 81 Data Checking & Editing Consistency checks
Ideally would do this during data collection
Limited real-time checks possible with self-completion questionnaires or pen and paper interviewing (PAPI)
Computer assisted interviewing (CAI) allows broader checks
Checking for outliers
Range checks – based on subject matter expertise
Check % of overall total coming from each case
Multivariate statistics – e.g. MV t-statistics
Cluster analysis – any tiny clusters
82. 82 Editing Data Recontact (if necessary) and ask again
Replace with “unknown” or “missing” code
Replace with sensible values (i.e. impute)
Can be done manually
Sometimes difficult to replicate or interpret results
Several (semi-)automatic methods available
Will discuss these soon
Need to document what was done
83. 83 Missing Data Distinguish unit and item non-response
Unit non-response – no data for some respondents
Item non-response – have some data, but not for all items
Typical causes
Respondent unwilling to provide data – e.g. income
Respondent unable to provide data – e.g. can’t recall
Could not contact desired respondent
Data collection or processing errors
Inconsistent or unbelievable data found through checks
84. 84 Non-Response Models - Notation First, a little notation
Y is the variable of interest
X is other observed data
R is response indicator variable
R=1 if Y observed
R=0 if Y is missing
We are interested in P(R=0 | X,Y)
Non-response probability given X and Y
85. 85 Non-Response Models Data missing completely at random (MCAR)
P(R=0 | X,Y) = p
Non-response probability does not depend on the value of Y or other observed data X
Data missing at random (MAR)
P(R=0 | X,Y) = p(X), where p(X) is some function of X
Non-response probability depends only on other observed data X, not on the value of Y
Both MCAR and MAR are what is known as ignorable non-response models
Non-ignorable non-response is when P(R=0 | X,Y) = p(X,Y)
Non-response probability depends on Y, not just on X
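The three non-response models can be compared with a small simulation. This is only a sketch: the population (a binary X, and Y normal with mean 60 or 40 depending on X, so the true mean of Y is 50) and the missingness probabilities are invented for illustration.

```python
import random

def respondent_mean(mechanism, n=20000, seed=1):
    """Mean of Y among respondents under a given non-response mechanism."""
    rng = random.Random(seed)
    observed = []
    for _ in range(n):
        x = rng.random() < 0.5               # X: observed binary covariate
        y = rng.gauss(60 if x else 40, 10)   # Y depends on X; true mean is 50
        if mechanism == "MCAR":
            p_missing = 0.3                  # constant p
        elif mechanism == "MAR":
            p_missing = 0.5 if x else 0.1    # p(X): depends on X only
        else:
            p_missing = 0.5 if y > 50 else 0.1  # p(X,Y): depends on Y itself
        if rng.random() >= p_missing:        # R = 1, so Y is observed
            observed.append(y)
    return sum(observed) / len(observed)

mcar = respondent_mean("MCAR")           # close to the true mean of 50
mar = respondent_mean("MAR")             # biased overall, but fixable using X
nonignorable = respondent_mean("NONIGNORABLE")  # biased, and X alone cannot fix it
```

Under MAR the unadjusted respondent mean is biased because the X groups respond at different rates, but the respondent means within each X group remain unbiased, which is what makes the model ignorable.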
86. 86 Methods for Missing Data Unit non-response – re-weight data
Rest of this section focuses on item non-response
Listwise deletion of missing data
Delete any observation with a missing value for any of the variables being analysed
Assumes omitted cases are similar to remaining cases – true for MCAR data, but often this assumption doesn’t hold
E.g. Omitting undecided voters implicitly assumes that they will split their votes in the same proportions as voters who have decided
Can be inefficient even if MCAR assumption holds
E.g. multiple regression with 15 predictors, each missing 5-10%
Over 50% of cases omitted from analysis
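The "over 50%" figure can be checked with a little arithmetic. The sketch below assumes that missingness is independent across the 15 predictors, which the slide does not state; with correlated missingness, fewer cases would be lost.

```python
def complete_case_fraction(missing_rates):
    """Fraction of cases with no missing predictor, assuming independence."""
    fraction = 1.0
    for rate in missing_rates:
        fraction *= 1.0 - rate
    return fraction

best_case = complete_case_fraction([0.05] * 15)   # about 0.46 of cases kept
worst_case = complete_case_fraction([0.10] * 15)  # about 0.21 of cases kept
```

So listwise deletion would discard roughly 54% to 79% of cases under this assumption.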
87. 87 Methods for Missing Data Pairwise deletion
Works for analyses that break down into sub-analyses that only use two variables at once
E.g. correlation matrix, factor analysis, CHAID or CART
For each sub-analysis, only remove those cases with missing data for one of the two items used
Can also be severely biased, and even lead to self-contradictory results
Some analyses can handle missing data directly
E.g. latent class models
Can report missing values as an extra row or column in tables – e.g. “Don’t Know” or “Refused”
88. 88 Imputation Methods Impute to fill in missing data, then analyse resulting complete data in the usual way
Ideal: impute once, do many analyses
Imputation requires some statistical expertise
Many imputation methods have been developed
Each method gives unbiased results (for certain analyses), assuming some non-response model holds
Even when main results are unbiased, special methods are needed to get unbiased variance estimates (and confidence intervals etc.)
89. 89 Mean Imputation Mean imputation
Impute mean value for all missing values
Gives sensible overall mean (assuming data MCAR), but distorts distribution
Impute mean + simulated error
Impute mean + random residual
Impute mean within imputation classes
Only assumes data MAR (where X=imputation class) when estimating means
Can generalise all the above methods to incorporate ANOVA or regression models
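A minimal sketch of overall-mean and class-mean imputation follows. The data and function names are made up for illustration; missing values are represented by `None`.

```python
import statistics

def impute_overall_mean(values):
    """Replace every missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    return [mean if v is None else v for v in values]

def impute_class_mean(values, classes):
    """Replace each missing value with the observed mean of its imputation class."""
    sums, counts = {}, {}
    for v, c in zip(values, classes):
        if v is not None:
            sums[c] = sums.get(c, 0.0) + v
            counts[c] = counts.get(c, 0) + 1
    return [sums[c] / counts[c] if v is None else v
            for v, c in zip(values, classes)]

# Hypothetical incomes (thousands) in two imputation classes.
incomes = [30, 34, None, 38, 70, None, 74, 78]
region = ["a", "a", "a", "a", "b", "b", "b", "b"]

overall = impute_overall_mean(incomes)        # both gaps filled with 54
by_class = impute_class_mean(incomes, region) # gaps filled with 34 and 74

observed = [v for v in incomes if v is not None]
# The mean is preserved, but the spread shrinks: the distribution is distorted.
shrunk = statistics.pvariance(overall) < statistics.pvariance(observed)
```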
90. 90 Hot-deck Imputation Random hot-deck imputation
Divide data into imputation classes
Replace each missing value with the data from a randomly chosen donor in the same class
Assumes MAR (where X=imputation class)
Preserves distribution within classes
However only works well for moderately large imputation classes (preferably 30+, depends on nature of Y distribution given X)
Also multivariate (X,Y) relationships are hard to handle
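Random hot-deck imputation within classes might be sketched as below. The function name and data are hypothetical, and the sketch assumes every class contains at least one donor (it raises a `KeyError` otherwise).

```python
import random

def random_hot_deck(values, classes, seed=0):
    """Fill each missing value with a randomly chosen donor from the same class."""
    rng = random.Random(seed)
    donors = {}
    for v, c in zip(values, classes):
        if v is not None:
            donors.setdefault(c, []).append(v)
    # Assumes every class with a missing value has at least one donor.
    return [rng.choice(donors[c]) if v is None else v
            for v, c in zip(values, classes)]

incomes = [30, 34, None, 38, 70, None, 74, 78]
region = ["a", "a", "a", "a", "b", "b", "b", "b"]
completed = random_hot_deck(incomes, region)
```

Because imputed values are real observed values from the same class, the within-class distribution is preserved rather than collapsed onto the class mean.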
91. 91 Hot-deck Imputation Nearest neighbour hot-deck imputation
Choose from most similar donors available, based on a multivariate distance function
Can choose best match, or randomly from k best
Can limit donor usage by including penalty for heavy usage into distance function
Allows for multivariate (X,Y) relationships
Not limited to a specific statistical model
Can be less efficient than methods that do assume a specific model, but is more robust
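The following is one possible sketch of nearest neighbour hot-deck imputation, using plain Euclidean distance on the predictors and a random choice among the k closest donors. The record layout and function name are assumptions for this example; in practice the predictors would usually be rescaled first, and a donor-usage penalty could be added to the distance.

```python
import math
import random

def nn_hot_deck(records, target, predictors, k=1, seed=0):
    """Impute `target` from one of the k nearest donors (Euclidean distance)."""
    rng = random.Random(seed)
    donors = [r for r in records if r[target] is not None]
    completed = []
    for r in records:
        if r[target] is not None:
            completed.append(dict(r))
            continue
        point = [r[p] for p in predictors]
        # Rank donors from most to least similar to the recipient.
        ranked = sorted(donors,
                        key=lambda d: math.dist(point, [d[p] for p in predictors]))
        donor = rng.choice(ranked[:k])
        completed.append({**r, target: donor[target]})
    return completed

people = [
    {"age": 25, "hours": 40, "income": 30},
    {"age": 50, "hours": 45, "income": 70},
    {"age": 27, "hours": 38, "income": None},
]
filled = nn_hot_deck(people, target="income", predictors=["age", "hours"])
# The third person is closest to the first, so receives income 30.
```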
92. 92 Multiple Imputation Aims to allow valid inference when certain imputation methods are used (Rubin 1987)
Method
Impute multiple values using same imputation procedure
Analyse each resulting dataset, recording results including variance estimates
Combine the results to give overall variance estimate, and use this for inference
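The combining step is usually done with Rubin's rules: average the estimates, then add the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). A minimal sketch, with made-up numbers:

```python
import statistics

def combine_multiple_imputations(estimates, variances):
    """Combine m completed-data results using Rubin's rules.

    estimates: point estimate from each imputed dataset
    variances: variance estimate from each imputed dataset
    """
    m = len(estimates)
    pooled_estimate = statistics.mean(estimates)
    within = statistics.mean(variances)        # average within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (divisor m-1)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance

# Hypothetical results from m = 3 imputed datasets.
estimate, variance = combine_multiple_imputations([10.0, 11.0, 12.0],
                                                  [1.0, 1.0, 1.0])
# estimate = 11.0; variance = 1 + (4/3) * 1 = 2.33...
```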
93. 93 Multiple Imputation Whether this works depends on the data, the analysis being carried out and the imputation method used
When it works, the imputation method is called “proper” (for that analysis procedure and dataset)
However it is difficult to know whether it works for a particular analysis
Current advice is to use a wide selection of X variables when imputing, including all possible analysis variables and design factors
Other methods have been developed for correct imputation inference – more details later
94. 94 Variance estimation
95. 95 Variance Estimation Sampling variation depends on the estimator, sample design and sample size
Many market researchers believe it depends only on sample size – e.g. net percentages
Standard variance formulae available for most analysis methods
Typically assume SRS or SRSWR
However, these formulae do not work for the sample designs used in most market research (MR) surveys
96. 96 Classical Approaches Variance formulae have also been developed for some estimators under a wide range of sample designs
See books by authors such as Cochran and Kish
1950s to 1970s
Design effect
Ratio of actual variance to variance assuming SRS of same size
Typically varies from one item to the next
Usually under 2 for household surveys, but sometimes more
Can be much higher for other surveys – e.g. 25 for some items in NZ Adverse Events Study
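As a small worked example of the design effect, the sketch below compares a hypothetical design-based variance with the SRS variance for a proportion. The "effective sample size" n/deff is a common companion quantity that is not on the slide; all numbers are invented.

```python
def design_effect(actual_variance, srs_variance):
    """Ratio of the actual variance to the variance under SRS of the same size."""
    return actual_variance / srs_variance

def effective_sample_size(n, deff):
    """Size of an SRS that would give the same precision (n / deff)."""
    return n / deff

n = 1000
p = 0.4
srs_variance = p * (1 - p) / n     # 0.00024 for a proportion under SRS
actual_variance = 0.00048          # hypothetical design-based estimate

deff = design_effect(actual_variance, srs_variance)   # 2.0
n_eff = effective_sample_size(n, deff)                # 500.0
```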
97. 97 General Methods for Variance Estimation Variance formulae may already be available
If not, there are several general methods for variance estimation for complex surveys
Linearisation
Random groups
Resampling methods
Balanced repeated replication (BRR)
Jackknife
98. 98 Linearisation Derives variances for non-linear statistics from variances (and covariances) for means or totals
Use Taylor’s theorem to approximate around the estimate by a linear function
See Lohr (1999) for formulae and examples
Widely used; common analyses are implemented for many designs in software such as SUDAAN
Only works for smooth functions of totals
E.g. not for medians or other quantiles
Also difficult to apply for complex weights
Can produce variances that are too small for small samples
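To illustrate linearisation, the sketch below estimates the variance of a ratio of two means. It assumes simple random sampling with no finite population correction, which is much simpler than the complex designs discussed here; a real survey would substitute design-based variances of the totals. The first-order Taylor expansion replaces the ratio by the linear residuals e = y - Rx.

```python
import statistics

def ratio_linearised_variance(y, x):
    """Ratio of means and its Taylor-linearised variance (SRS, no fpc assumed)."""
    n = len(y)
    x_bar = statistics.mean(x)
    ratio = statistics.mean(y) / x_bar
    # Linearised variable: residuals about the fitted ratio (they sum to zero).
    residuals = [yi - ratio * xi for yi, xi in zip(y, x)]
    s2 = sum(e * e for e in residuals) / (n - 1)
    return ratio, s2 / (n * x_bar ** 2)

# Tiny made-up sample.
ratio, variance = ratio_linearised_variance([2, 4, 7], [1, 2, 3])
# ratio = 13/6; variance = 7/432
```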
99. 99 Random Groups Original idea (Mahalanobis 1946)
Select several independent samples using the same sample design: “interpenetrating samples”
Calculate survey results for each sample or replicate
Use variation between the results from each of these samples to estimate the variance
Not usually practical to draw enough separate samples
Need >=10 samples to get accurate variance estimates
Instead, draw one sample and divide it into groups with each being a miniature version of the whole
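100. 100 Random Groups Variance estimates given by $\hat{V}(\bar{\theta}) = \frac{1}{R(R-1)} \sum_{r=1}^{R} (\hat{\theta}_r - \bar{\theta})^2$, where $\hat{\theta}_r$ is the estimate from group $r$ of $R$ and $\bar{\theta}$ is the mean of the group estimates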
No special software needed
Works for quantiles and non-parametric statistics
But can be difficult to set up the random groups, and the sample design may restrict how many can be formed
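In its usual textbook form, the random groups estimator is the sample variance of the R group estimates divided by R. The sketch below shows that no special software is needed; the group estimates are invented.

```python
import statistics

def random_groups_variance(group_estimates):
    """Variance of the mean of R group estimates: s^2 / R.

    Equivalent to sum((theta_r - theta_bar)^2) / (R * (R - 1)).
    """
    r = len(group_estimates)
    return statistics.variance(group_estimates) / r

# Hypothetical estimates from R = 4 random groups.
variance = random_groups_variance([10, 12, 11, 13])   # 5/12
```

With only four groups this estimate is itself very unstable, which is why ten or more groups are recommended.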
101. 101 Resampling Methods Take several subsamples from the whole sample
Estimate variances as for random groups (but with different multipliers)
Variations include
Balanced repeated replication (BRR)
Jackknife
Bootstrap
Same procedure used for all statistics for a given sample
Can handle weighting easily, by reweighting the data for each subsample
102. 102 Balanced Repeated Replication Suppose two units are selected from each stratum (in the first stage of sampling)
I.e. 2 primary sampling units (or PSUs) per stratum
More general designs can be accommodated, with some difficulty
Can create 2 random groups, where
the first is formed by randomly selecting one unit from each stratum, the other from the rest
Can create 2^H sets of groups this way, where H is the number of strata
Usually this is many more than necessary
Choose a balanced subset of these groups
Appropriate design matrices given by Wolter
Calculate variance estimates using multiplier 1/R
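A minimal sketch of BRR for an estimated total, with two PSUs per stratum, is given below. It builds the balanced set of half-samples from a Sylvester-type Hadamard matrix, which is one common choice and an assumption here (the slides refer to Wolter for design matrices). Each PSU value is taken to be a weighted PSU total, and each half-sample estimate doubles the selected PSU in every stratum.

```python
def sylvester_hadamard(order):
    """Hadamard matrix of size `order` (a power of two), Sylvester construction."""
    matrix = [[1]]
    while len(matrix) < order:
        matrix = ([row + row for row in matrix]
                  + [row + [-v for v in row] for row in matrix])
    return matrix

def brr_variance_of_total(strata):
    """BRR variance of a total; strata is a list of (psu1_total, psu2_total)."""
    n_strata = len(strata)
    order = 1
    while order <= n_strata:   # need R > H so the all-ones column can be skipped
        order *= 2
    hadamard = sylvester_hadamard(order)
    full = sum(a + b for a, b in strata)
    squared_diffs = []
    for row in hadamard:
        # Column h+1 decides which PSU of stratum h joins this half-sample.
        replicate = sum(2 * (a if row[h + 1] == 1 else b)
                        for h, (a, b) in enumerate(strata))
        squared_diffs.append((replicate - full) ** 2)
    return sum(squared_diffs) / order   # multiplier 1/R

strata = [(10, 14), (20, 18), (7, 10)]
variance = brr_variance_of_total(strata)   # 29.0
```

For a linear statistic such as a total, full balance makes the BRR estimate equal to the standard stratified formula, the sum of squared within-stratum differences (16 + 4 + 9 = 29 here).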
103. 103 Jackknife Groups are formed in the delete-1 jackknife, by deleting each PSU in turn
So if there are l PSUs, l groups are formed
Also usually adjust weights in the current stratum slightly
Variance estimates are calculated using the multiplier (l-1)/l
Several variations available
E.g. delete-a-group jackknife
Adjustments to imputed values, to estimate imputation variance
Easily handles designs with >2 PSUs per stratum
Works well for smooth functions of means or totals
But does not work well for quantiles
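The delete-1 jackknife can be sketched as follows for an unstratified design, treating each value as one PSU and using the usual textbook multiplier (l - 1)/l. Stratified designs apply the same idea within each stratum, with the weight adjustments mentioned above; that refinement is omitted here.

```python
import statistics

def jackknife_variance(psu_values, estimator):
    """Delete-1 jackknife variance for an unstratified set of l PSUs."""
    l = len(psu_values)
    full = estimator(psu_values)
    # Recompute the estimate with each PSU deleted in turn.
    replicates = [estimator(psu_values[:j] + psu_values[j + 1:])
                  for j in range(l)]
    return (l - 1) / l * sum((rep - full) ** 2 for rep in replicates)

psus = [4, 8, 6, 10]
variance = jackknife_variance(psus, statistics.mean)
# For the mean this reproduces the usual s^2 / l = 5/3.
```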
104. 104 Bootstrap Take many samples (with replacement) from within each stratum
These should be drawn independently, in a way that reflects the original sample design
Usually some reweighting is needed
Applying the bootstrap to complex samples is still relatively new, and much research is still being done on how best to use it
Works for non-smooth statistics such as quantiles
But requires many more replicates than BRR or the jackknife
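A deliberately naive sketch of a stratified bootstrap is shown below: units are resampled with replacement independently within each stratum, and the variance of the replicate estimates is reported. It ignores the reweighting and rescaling that complex designs usually need, so it should be read as an illustration of the idea only; the data, function name and replicate count are invented.

```python
import random
import statistics

def stratified_bootstrap_variance(strata, estimator, replicates=500, seed=0):
    """Naive bootstrap variance: resample with replacement within each stratum."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        resample = []
        for units in strata:
            # Same number of units drawn as originally sampled in the stratum.
            resample.extend(rng.choices(units, k=len(units)))
        estimates.append(estimator(resample))
    return statistics.variance(estimates)

strata = [[1, 2, 3, 4, 5], [10, 12, 14, 16, 18]]
# A median is a non-smooth statistic, where the bootstrap is most useful.
median_variance = stratified_bootstrap_variance(strata, statistics.median)
```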
105. 105 Variance Estimation Software SUDAAN – mainly uses linearisation methods
WESVAR – mainly uses BRR and the jackknife
SAS – now handles some common statistics and sample designs, using linearisation methods
VPLX – free software, based on replication approaches (primarily the jackknife)
Several other packages available
See http://www.fas.harvard.edu/~stats/survey-soft/survey-soft.html for details
106. 106 Variance Estimation - Summary Important to realise that sample design affects sampling variation
Many methods to calculate correct sampling errors
Have given a quick overview here
Off-the-shelf software can handle some common situations
More complex estimators or weighting methods, or situations involving imputation, will usually require customised approach
Be careful – this can be quite a technical area
Easy to make significant mistakes
Best to get good advice when beginning to plan the survey if variance estimation is needed (i.e. at the sample design stage)