RDD Sampling for Telephone Surveys • Nick Moon, GfK NOP Social Research • GfK. Growth from Knowledge
Agenda • 1 Early History of RDD • 2 First successful method • 3 Becoming mainstream • 4 A new challenge • 5 Future challenges
The Theoretical Case • Telephone generally cheaper than face to face • Unclustered no dearer than clustered • Possible advantages for sensitive questions • More questions per minute • Little evidence of mode effects • Shortage of face to face fieldwork
Telephone Interviewing • Took off from early 1980s • Rapidly increased proportion of total research volume • But very rarely used for social research • Initially concerns about penetration • penetration >95% • Concerns also about sampling • Need for probability sampling
Probability Sampling • Needs a sample frame with 100% coverage • Only frame for telephone is the telephone directories • Considerably <100% coverage • High and growing ex-directory rate • UK 36% overall, London 50%+ (1999)
Ex-Directory Subscribers • Urban • Younger • Female • Lowest social class groups • Higher income • Smaller households
A theoretical alternative • While addresses or name lists are infinite, the number of telephone numbers is finite • It should therefore be possible to generate numbers entirely at random from the known list of possible numbers
But not so good in practice • A ten digit telephone numbering system allows 1 billion numbers • only 25 million households in the UK • many numbers are business only and not residential • vast majority of possible numbers are not in use • The cost of finding live numbers can be prohibitive • Foreman and Collins (1991) • Lack of information about the numbering system
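The arithmetic behind this slide can be sketched in a few lines. This is only an illustration: it uses the slide's two figures and assumes one residential line per household and uniform random dialling, neither of which the slides state. The function name is made up for the example.

```python
def expected_dials_per_live_number(possible_numbers: int, live_numbers: int) -> float:
    """Mean dials needed to reach one live number when dialling uniformly at random.

    The count of dials up to the first hit is geometric, so its mean is 1 / hit rate.
    """
    if not 0 < live_numbers <= possible_numbers:
        raise ValueError("live_numbers must be between 1 and possible_numbers")
    return possible_numbers / live_numbers


# Figures from the slide; one landline per household is an assumption.
POSSIBLE_NUMBERS = 10**9
UK_HOUSEHOLDS = 25_000_000

hit_rate = UK_HOUSEHOLDS / POSSIBLE_NUMBERS
dials_per_hit = expected_dials_per_live_number(POSSIBLE_NUMBERS, UK_HOUSEHOLDS)
print(f"hit rate {hit_rate:.1%}, about {dials_per_hit:.0f} dials per live number")
```

Under these assumptions roughly one dial in forty reaches a household, before allowing for business-only lines or refusals.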
The first compromise • RDD is not cost-effective • Directory sampling is biased • Directory plus 1 in theory compensates for this bias • Later developed into directory plus n
Problems of Directory plus n • Unlisted numbers tend to be clustered • If 10 consecutive numbers are unlisted then directory + any number from the first one will yield no listed number • the chance of any number being selected depends on the proportion of the previous nine numbers that are in the directory
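The dependence on the previous nine numbers can be made concrete with a small sketch. It assumes a directory number is drawn uniformly, n is drawn uniformly from 1 to 9, and numbers are treated as plain integers; the tiny directory below is invented for illustration.

```python
from fractions import Fraction


def plus_n_chances(listed, working, max_n=9):
    """Chance that each working number is generated by one 'directory plus n' draw.

    A number w can only be generated from a listed number w - n with 1 <= n <= max_n,
    so its chance is proportional to how many of the preceding max_n numbers are listed.
    """
    listed = set(listed)
    per_draw = Fraction(1, len(listed) * max_n)
    return {
        number: per_draw * sum((number - n) in listed for n in range(1, max_n + 1))
        for number in sorted(working)
    }


# Invented example: four listed numbers followed by a cluster of unlisted ones.
listed = {100, 101, 102, 120}
unlisted_cluster = set(range(103, 116))
chances = plus_n_chances(listed, listed | unlisted_cluster)

for number, chance in chances.items():
    print(number, chance)
```

In this example the unlisted numbers more than nine places beyond the last listed number (112 to 115) can never be generated, which is the clustering problem the slide describes.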
Number Propagation • Developed by BMRB in 1991 • Collect telephone numbers from all respondents on their Omnibus • This includes ex-directory numbers (though some will still refuse) • Generate numbers from n-3 through to n+3 • Has advantages over directories, but still effectively plus n rather than RDD • BMRB now use RDD
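A minimal sketch of the n-3 to n+3 step, assuming numbers are held as digit strings with a leading zero and that the seed number itself is left out because that household has already been interviewed. The slides do not say how the seed is treated, so the `include_seed` switch is an assumption, and the seed number below is made up.

```python
def propagate(seed: str, spread: int = 3, include_seed: bool = False) -> list[str]:
    """Generate the neighbours of a collected number, from seed - spread to seed + spread."""
    width = len(seed)            # keep leading zeros when converting back to text
    base = int(seed)
    numbers = []
    for offset in range(-spread, spread + 1):
        if offset == 0 and not include_seed:
            continue
        numbers.append(str(base + offset).zfill(width))
    return numbers


# Hypothetical seed number collected from a respondent.
neighbours = propagate("01727850002")
print(neighbours)
```

Because every generated number sits within three of a known working number, this behaves like plus n seeded from a better list rather than like true RDD.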
Telephone Interviewing • The 01727 exchange allows 1 million numbers, for a town of 50,000 people • But all start with either 8 or 7 • This immediately reduces total possible to 200,000 • More information now available from Ofcom over and above exchange listings • Can identify blocks of 100,000 numbers that you know exist • Working blocks still contain huge numbers of non-allocated numbers that must be winnowed out
2 The First Success
The US example • Random Digit Dialling (RDD) used widely in the US • Initially Mitofsky-Waksberg • Subsequently list assisted
Mitofsky Waksberg 1 • All US numbers are in format 123-456-7890 • All possible combinations 123-456 known • Draw random sample of area code + local exchange + random number from 00 to 99 as primary sampling units • The “100 Block” • Generate at random a number from 00 to 99 to produce one full number per psu • Telephone each of these numbers
Mitofsky Waksberg 2 • If that number rings, assume the block is in use, and keep in sample • If that number doesn’t ring, assume the block is not in use, and reject from sample • Because the chance of one selected number ringing is higher the more working numbers there are, then this is effectively pps sampling
Mitofsky Waksberg 3 • Since psus are selected pps, there should be a constant number of sample members per psu • Generate more random numbers from 00 to 99 to produce more numbers within the same 100 Block • Ring these numbers until the prescribed number of WORKING numbers has been reached • These are then treated in the same way as any other random sample, with call-backs, no replacement etc
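The three Mitofsky Waksberg slides can be read as a two-stage procedure, sketched below. This is a simplified illustration, not the published method in full: `is_working` stands in for actually dialling a number, blocks already retained are skipped rather than re-drawn, and the prefixes and working numbers are invented.

```python
import random


def mitofsky_waksberg(prefixes, is_working, n_blocks, per_block, rng, max_attempts=100_000):
    """Two-stage sketch: keep a 100 block only if one random number in it works,
    then dial further random numbers in each kept block until per_block work."""
    retained = {}  # block prefix -> working numbers found so far
    attempts = 0
    # Stage 1: blocks with more working numbers are more likely to be kept (pps).
    while len(retained) < n_blocks and attempts < max_attempts:
        attempts += 1
        prefix = rng.choice(prefixes)
        if prefix in retained:
            continue  # simplification: each block is used at most once
        first = f"{prefix}{rng.randrange(100):02d}"
        if is_working(first):
            retained[prefix] = [first]
    # Stage 2: a fixed number of working numbers per retained block.
    for prefix, found in retained.items():
        candidates = [f"{prefix}{suffix:02d}" for suffix in range(100)]
        rng.shuffle(candidates)
        for number in candidates:
            if len(found) >= per_block:
                break
            if number not in found and is_working(number):
                found.append(number)
    return retained


# Invented universe: one dense block, one empty block, one sparse block.
rng = random.Random(2024)
dense, empty, sparse = "0172785", "0172786", "0172787"
working = {f"{dense}{s:02d}" for s in range(60)} | {f"{sparse}{s:02d}" for s in range(30)}
sample = mitofsky_waksberg([dense, empty, sparse], working.__contains__,
                           n_blocks=2, per_block=5, rng=rng)
print(sample)
```

The empty block can never pass the first stage, so no second-stage dials are wasted on it. A block with fewer than `per_block` working numbers would simply run out of candidates, which is the practical difficulty with very sparse blocks.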
A quick quiz Paper title: “Forty eight red, white and blue shoestrings” Why?
Forty eight red, white and blue shoestrings • 'Mac the Finger said to Louis the King, I got forty eight red, white and blue shoe strings, And a thousand telephones that will not ring, Tell me where I can get rid of these things' • Bob Dylan, Highway 61 Revisited
The Moon–Noble experiment • Translation of Mitofsky Waksberg to UK hampered by lack of transparency about UK numbering system • BT cited concerns about privacy • Also irregular number length (9 digit, 10 digit) • Empirical mapping of number system using large scale tele-marketing databases • 15 million numbers, so can map nearly the whole system • Identifies nearly all 100-banks currently in use • Random sample of 100-banks for Mitofsky Waksberg approach • Yielded 66% working lines at second stage
Moon Noble Experiment conclusion • Clearly a success • Showed true probability telephone sampling could be done cost-effectively in the UK • Won Best Technical Paper award at 1998 MRS Conference • Sat back and waited for glory
3 Becoming Mainstream
A Very Short-lived Glory • Growth of competition in the telephone supply market • Increased role for Oftel (now Ofcom) • Far greater transparency about the number system • And standardisation of length • Now possible for US techniques to be applied to the UK • Commercial suppliers now make RDD samples available to all • Specialist agencies now sell RDD samples • Epsem • List-assisted • Pre-dialled with auto-dialler to weed out non-working numbers
Two Different Approaches • True epsem • Does not take number of directoried lines into account • List Assisted • Glorified version of directory plus n • Directory minus n digits with last n digits randomised • List assisted will always be biased against blocks with large numbers of working numbers of which very few are directoried • This may or may not matter • Epsem will always produce far more non-working numbers • Use of pre-dialling or “pinging” to remove non-working lines
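The difference between the two approaches can be sketched as two number generators. Both are illustrations under assumptions: numbers are digit strings, blocks are identified by their leading digits (the "stem"), and the directory entries and block stems below are invented. Neither function is any supplier's actual method.

```python
import random


def randomise_tail(stem: str, n_digits: int, rng: random.Random) -> str:
    """Append n_digits random digits to a block stem."""
    return stem + f"{rng.randrange(10 ** n_digits):0{n_digits}d}"


def list_assisted(directory, size, n_digits, rng):
    """Directory minus n digits, last n digits randomised.

    Blocks are reached in proportion to their number of directory entries."""
    return [randomise_tail(rng.choice(directory)[:-n_digits], n_digits, rng)
            for _ in range(size)]


def epsem(block_stems, size, n_digits, rng):
    """Every number in every known block has the same chance, listed or not."""
    return [randomise_tail(rng.choice(block_stems), n_digits, rng)
            for _ in range(size)]


# Invented data: the third block is in use but has no directory entries.
directory = ["01727850012", "01727850047", "01727850090", "01727860005"]
block_stems = ["017278500", "017278600", "017278700"]

rng = random.Random(11)
la_sample = list_assisted(directory, size=200, n_digits=2, rng=rng)
epsem_sample = epsem(block_stems, size=200, n_digits=2, rng=rng)
print(sorted({number[:-2] for number in la_sample}))
print(sorted({number[:-2] for number in epsem_sample}))
```

The list-assisted generator can never produce a number in the block with no directory entries, while the epsem generator covers it at the price of dialling more non-working numbers.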
Issues of Geography • Less than perfect match between telephone numbers and geography • Fine at exchange code level • Progressively less good at more detailed levels • No official record of geography of numbers • Postcodes no longer printed in telephone directories • Commercial suppliers will sell samples based on various geographical levels • Assign numbers to postcodes based on available databases • Reasonable at constituency/local authority level • Poor at ward level • Need to oversample for ineligibles and allow for screening • People don’t always know where they live
Implications of Geographical Issues • No real problem for national surveys • Doesn’t really matter if isolated case is in the “wrong” psu • Considerable implications for local studies • Lose some of the cost advantages of telephone interviewing • Especially problematic for clustered samples • Such as locating areas of high density of BME populations
Evidence from the Staffordshire SE experiment • Allocation of numbers not geographically rational • The vast majority of numbers were in the format EEEEE-Lxxxxx • Where EEEEE was the exchange and all started with the same sub-exchange number L • A small number were in the format EEEEE-Nxxxxx • Presumably the L sub-area was filling up and a new sub-area N came into use • One might expect these new numbers to run sequentially from EEEEE-N00000, EEEEE-N00001, EEEEE-N00002 etc • In fact they were scattered over a huge range of numbers
Implications of this number scatter • The core of any RDD sampling approach is blocks of 100 or 1000 numbers • We have already examined the implications of different densities of directoried numbers per block • Relative density of working numbers per block is also important • Scatter of the Staffordshire kind will lead to a density distribution with a hugely long tail • In theory pps sampling should take care of this • A small proportion of very low-density blocks will still get selected • But most RDD systems rely on finding a fixed, and non-trivial, number of working numbers per block • Nicolaas (2001) suggests very low density blocks might be safely ignored
Sampling Individuals • A telephone is usually a household item • But we want a sample of individuals • Quota approach can take whoever answers the phone • Old social rules on who answers phone have disappeared • New ones appeared in their place • Especially in households with teenagers • Random selection is ideal
Random Selection of Individuals • Kish Grid is the gold standard • Enumerate household in set order • Take nth person listed according to randomised procedure • Requires enumeration at very beginning of interview • Felt to cause potentially higher refusal rates • NatCen experiment suggests this may not be so • Next/last birthday method a simpler compromise • NatCen experiment shows similar profile to Kish grid • The Rizzo/Brick/Park variant even easier • How many adults in household? • If one continue interview • If two randomly select either the person who answered the phone or the other one • If three or more go through next birthday method
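The Rizzo/Brick/Park rule as summarised on this slide can be sketched as a single function. It follows the slide's wording only; the function name and the returned labels are made up for the example.

```python
import random


def rbp_select(n_adults: int, rng: random.Random) -> str:
    """Decide who to interview, following the three cases listed on the slide."""
    if n_adults < 1:
        raise ValueError("household must contain at least one adult")
    if n_adults == 1:
        return "informant"                      # continue with the person on the phone
    if n_adults == 2:
        # Coin toss between the person who answered and the other adult.
        return "informant" if rng.random() < 0.5 else "other adult"
    return "next birthday"                      # fall back to the birthday method


rng = random.Random(7)
two_adult_draws = [rbp_select(2, rng) for _ in range(1000)]
print(two_adult_draws.count("informant"), two_adult_draws.count("other adult"))
```

Only households with three or more adults need any question beyond the adult count, which is why the variant is easier to administer than a full enumeration.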
Weighting the data • People in larger households have less of a chance of selection than those in smaller ones • Same principle as face to face surveys using PAF • Effective sample size only c. 80% of actual one • Households with >1 line have >1 chance of selection • Need question on number of lines • But if the line is used only for a modem/fax then it shouldn’t count • Need question on lines that ring • All adds to the cost
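A minimal sketch of the weighting this slide implies. It assumes one adult is selected at random per household, so the design weight is adults divided by lines that ring, and it uses Kish's approximation for effective sample size. The six respondents are invented, and the function names are illustrative.

```python
def design_weight(adults: int, ringing_lines: int) -> float:
    """Inverse of relative selection chance: more adults lowers an individual's chance,
    more voice lines raises the household's chance."""
    if adults < 1 or ringing_lines < 1:
        raise ValueError("need at least one adult and one ringing line")
    return adults / ringing_lines


def effective_sample_size(weights) -> float:
    """Kish's approximation: (sum of weights)^2 / sum of squared weights."""
    weights = list(weights)
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Invented respondents: (adults in household, lines that ring).
respondents = [(1, 1), (2, 1), (2, 1), (3, 1), (2, 2), (4, 1)]
weights = [design_weight(adults, lines) for adults, lines in respondents]
n_eff = effective_sample_size(weights)
print(f"actual n = {len(weights)}, effective n = {n_eff:.2f}")
```

Lines used only for a modem or fax are left out of `ringing_lines`, which is why the questionnaire needs to ask about lines that ring rather than lines installed.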
A Cautionary Note • There’s more to random sampling than just sampling • Many agencies buy RDD samples but treat them as leads for a quota sample • If you get a response rate of 15% does it matter if the sample is random?
4 A New Challenge
2003 Communications Act • Gives Ofcom the power to deal with silent calls • Primarily directed at tele-marketers • Unexpected impact on RDD
Silent calls • Caused by use of auto-diallers • Machine dials number • When it is answered machine hands it over to any available interviewer/salesman • If all are busy respondent gets “silent call” • Not cut off but no-one there
Impact of Silent Calls • Silent call rate can be set to any level on dialler • Above that rate and system stops making calls • The higher the rate the less interviewer/salesman dead time • But more pissed-off recipients of silent calls • Lots of tele-marketers don’t care, MR firms generally do • GfK NOP silent call rate always 1% or under
Consequences for RDD • The only supplier of epsem samples temporarily ceased to do so • They were concerned pinging may be against the new law • Without pinging epsem is not cost-effective • Now supplying pinged sample again - BUT • The main clients for true random samples are government bodies • Concern about seeming to be against spirit of legislation
5 Future Challenges
Mobile Phones • Only 1% of households have no phones • But 8% have only mobiles and no landline • These are being missed from RDD surveys • Does it matter?
Problems of Mobile Phones • Mobiles are individual • Landlines are household • Mobiles may be provided by employer • Mobiles may be answered while driving • NB in the US many cell phone users pay to receive calls
Mobile-only households • Younger • 38% of those with a mobile but no landline are 18-24 • Down-market • 58% of those with a mobile but no landline are social class DE • More likely to be in full time education • More likely to be unemployed
Dealing with Mobiles • Mobile-only households will reach a level where they can’t be ignored • Mobile numbers identifiable • Could sample in the same way as landlines for RDD • Multiple chances of selection • Need to ask for number of landlines and number of mobiles • No geographical data
The Future’s Orange/Virgin/O2/Vodafone • Further Changes to the Numbering System • Particularly influenced by competition among suppliers • Ofcom keen to reduce barriers to competition • Number Portability • Must be same broad area BUT • VoIP • Mobile/landline hybrids • Wireless Office
Implications of future changes • The link between telephone number and geography may break down completely • National samples can still be drawn by simple random sampling, but with no benefits of stratification • Local sampling by telephone may become impossible • Cost may force local surveys onto postal or online methodologies
The Paradox • The demand for high quality high volume social research is beginning to exceed the supply of suitable face to face fieldwork • This is diverting to telephone social research that previously would only ever have been conducted face to face • This research demands high quality epsem samples, at the very time that it is becoming more difficult to achieve them • What’s wrong with paper self-completion anyway?