Sampling rare and hard-to-reach groups in the UK Patten Smith Ipsos MORI. Session outline. Briefly summarise major ways in which samples can be hard-to-reach Outline approaches that can be taken for each Emphasis will be on methods which reflect principles of random probability sampling
Sampling rare and hard-to-reach groups in the UK Patten Smith Ipsos MORI
Session outline • Briefly summarise major ways in which samples can be hard-to-reach • Outline approaches that can be taken for each • Emphasis will be on methods which reflect principles of random probability sampling • Will focus mainly on methods for identifying Ethnic Minority samples
What makes a group hard to sample? • Three main reasons: 1 Group is wholly or partly excluded from available sampling frames 2 Group is covered by a sampling frame that is hard to access 3 Group is on a sampling frame but is relatively rare and not separately identified
Group is excluded from sample frame • May be because do not appear on useable frame at all, usually because of mobility - eg: • Rough sleepers • Travellers • Hostel residents • Or because potential frames not available at time needed – eg locally administered Govt. funded projects
If excluded from sample frame • What to do: • Create new sample frame • Use non-probability methods (last resort)
Creating own frame • How to do this will depend on sample type: • eg for rough sleepers in a town, might divide town into grid squares and draw sample of these: then enumerate all eligibles identified in sampled squares • For museum visitors cover all exits and take every nth; no list used – but implied sample frame is all people passing through the exits • For hostel residents list bed spaces and sample occupants via these • For young people having appointments with a Connexions Personal Advisor (PA) – recruit PAs and ask them to complete short sampling form for each young person they have an appointment with during a defined reference period
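The "take every nth" approach for museum visitors is a form of systematic sampling. The following is a minimal sketch only, assuming visitors can be counted in order as they leave; the function name, the random start and the visitor labels are illustrative assumptions, not part of the original presentation.

```python
import random


def systematic_sample(stream, n, seed=None):
    """Take every nth unit from an ordered stream, from a random start.

    The random start (an assumption of this sketch) gives every unit
    the same 1-in-n chance of selection.
    """
    rng = random.Random(seed)
    start = rng.randrange(n)  # position of the first selected unit, 0..n-1
    return [unit for i, unit in enumerate(stream) if i % n == start]


# Hypothetical count of 100 people passing through the exits.
visitors = [f"visitor_{i}" for i in range(100)]
sample = systematic_sample(visitors, n=10, seed=1)
```

No list of visitors exists in advance: the implied frame is simply the ordered flow of people through the exits.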
Group is covered by a sampling frame that is hard to access 1 • Examples: • hospital patients (requirement for ethical clearance) • University students – ethical concerns by Higher Ed. institutions • children in care (lists held by local authorities) • Convicted offenders • Members of support groups - eg AA, Narcotics Anonymous • Residents of old people’s homes
Group is covered by a sampling frame that is hard to access 2 • What to do: • Depends on the frame, who holds it and why it is hard to access • Requires negotiation! • Sometimes can use frame if use opt-out procedure • Sometimes frame holder will do the sampling and initial contacting on your behalf • Sometimes can gain access with suitable government department / other high-level support - eg use of Child Benefit records and IDBR • NHS work: formal ethical clearance - can take some months
Group is covered by a sampling frame that is hard to access 3 • If still no access, use screening procedures or resort to non-probability methods
Group is on sample frame but rare and not separately identified • Probably the most common situation in practice • Examples: • ethnic minority groups • unemployed people • males aged 85 and over • people with low qualifications only
Group is rare, on sample frame but not separately identified • Requires screening • Need to start with a sampling frame with good coverage of the population to be sampled • Select large sample of units from frame and screen into eligible and ineligible units • Several screening methods can be used • Mostly focus on screening for general population samples of ethnic minorities, but much of this can be generalised
Screening methods • Use earlier survey, omnibus survey, access panel, etc • For some ethnic / religious groups can use list name matching (but no longer feasible for national samples) • Screen in the field: • post/telephone • door to door • focused enumeration
Using earlier surveys • If you have access to recent survey that identifies eligible people can follow these up – for example: • using British Crime Survey to identify victims of domestic violence • using 1999 Health Survey for England respondents to identify sample for EMPIRIC survey (Ethnic Minority Psychiatric Illness Rates in the Community)
Use omnibus surveys or access panels • Add a question to identify eligible people - eg vegetarians, parents of children who live apart from the other parent • But omnibus surveys / access panels often do not use random probability sampling methods
Screening names on lists 1 • Scan lists for names that are associated with particular ethnic / religious groups • Effective for some groups - eg those of Indian, Pakistani and Bangladeshi origin
Screening names on lists 2 • But: • Only works for groups with distinguishable names (eg not for those of Caribbean origin) • Assumes minimal intermarriage involving name change • Nowadays, cannot use this method for samples taken from population at large; no suitable list of names since Electoral Registers' use for surveys was restricted
Screening in the field: postal/telephone methods • Postal: • Post short screening questionnaires to large start sample (eg PAF) • Often not done because of low response rates: but can be used in combination with other methods • May prove difficult for some types of screening (eg for ethnic minority people) because no interviewer to reassure/explain • Telephone screening: • Although can now draw random (RDD) population samples, response rates generally low • Telephone methods might work better on special populations – eg membership lists
Door to door screening 1 • Generate general population sample from PAF • Interviewer visits each sampled address to establish eligibility of occupants • Either interview those identified as eligible there and then or return on another occasion (latter allows sub-sampling, interviewer matching, etc)
Door to door screening 2 • Safest way of identifying eligible people • But expensive - especially in low concentration areas: eg to obtain sample representative of non-white HHs in Britain would require approx. 30 or more addresses to be issued for screening for each achieved HH interview • Improve cost-effectiveness in two ways: • taking advantage of concentration • focused enumeration
Taking advantage of concentration 1 • Costs of door to door screening less if higher eligibility rate • For example, with 10% deadwood addresses, 80% screen response rate, 75% main interview response rate: • if 5% of HHs eligible, issue 38 to achieve 1 • if 20% of HHs eligible, issue 9 to achieve 1
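The arithmetic behind these issue rates can be sketched as a simple product of rates. This is an assumed model (each rate applied independently to every issued address), not necessarily the exact calculation used for the slide; it gives roughly 37 and 9 addresses per achieved interview, close to the quoted 38 and 9, with the small difference presumably down to rounding or a slightly different assumption.

```python
def addresses_per_interview(eligible_rate, deadwood=0.10,
                            screen_rr=0.80, interview_rr=0.75):
    """Addresses to issue for one achieved interview (simple product model).

    Assumes each rate applies independently to every issued address:
    live address -> screened -> eligible -> interviewed.
    """
    yield_per_address = (1 - deadwood) * screen_rr * eligible_rate * interview_rr
    return 1 / yield_per_address


low_concentration = addresses_per_interview(0.05)   # 5% of HHs eligible
high_concentration = addresses_per_interview(0.20)  # 20% of HHs eligible
print(round(low_concentration, 1), round(high_concentration, 1))
```

Because the eligibility rate enters as a straight multiplier, quadrupling it from 5% to 20% cuts the number of addresses to issue by a factor of four.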
Taking advantage of concentration 3 • Table shows, for example, that 86% of ethnic minority individuals lived in wards in which 5%+ of population ethnic minority • If prepared to limit findings to 86% of the ethnic minority population, can reduce number of addresses screened from 30 to 12 times achieved sample • But sample biased: no coverage of the 14% living in low concentration areas • Also method only as good as concentration figures: 2001 Census out-of-date
Taking advantage of concentration 4 • In principle can apply this logic to any characteristic which varies in concentration across different areas • For example: sample from high unemployment / high deprivation areas to identify low income families
Focused enumeration 1 • Involves screening by proxy - from neighbouring addresses • Significantly cheaper than door-to-door screening in areas of lower concentration • Can be used for any visible minority; mostly on ethnic minorities, but could be used for (say) households containing children • Used in a number of high profile surveys, notably the Fourth National Survey of Ethnic Minorities, the British Crime Survey and the Home Office Citizenship Survey • Various versions of the method have now been used
Focused enumeration 2 • 4th National Survey method: • Draw sample comprising large clusters of adjacent addresses • Visit every nth (eg 6th) address (“location” addresses) and ask about ethnic origins of people living (i) at location addresses (ii) the n-1 addresses to the left and the n-1 to the right • Substitutions for location addresses allowed under defined circumstances
Focused enumeration 3 • If positive enumeration given for any address to the left or to the right, the interviewer calls at all intervening addresses in the relevant direction • Each address screened twice, once from each of two location addresses • Visit intervening addresses if positive identification from either address or if two “don’t knows”
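The visiting rule can be written as a small decision function. This is one reading of the rule, sketched under assumptions: the two proxy reports for a stretch of intervening addresses are coded "yes", "no" or "dk" (don't know), and both the coding and the function name are illustrative rather than taken from the survey documentation.

```python
def needs_visit(report_a, report_b):
    """Decide whether the interviewer calls at the intervening addresses.

    report_a and report_b are the proxy answers given at the two location
    addresses either side: "yes", "no" or "dk" (hypothetical coding).
    Visit if either report is positive, or if both are "don't know".
    """
    reports = (report_a, report_b)
    if "yes" in reports:
        return True
    return reports == ("dk", "dk")


# A single "don't know" paired with a "no" does not trigger a visit.
decisions = {
    ("yes", "no"): needs_visit("yes", "no"),
    ("dk", "dk"): needs_visit("dk", "dk"),
    ("dk", "no"): needs_visit("dk", "no"),
    ("no", "no"): needs_visit("no", "no"),
}
```

Screening each stretch from two location addresses is what makes the "two don't knows" condition possible: one firm "no" is treated as enough to skip the visit.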
Focused enumeration 5 • Once addresses containing eligible people identified, more detailed information collected about occupants • Special rules for street corners, flats, rural areas, etc
Focused enumeration 6 • Basic method adapted to allow ethnic minority boost sample to be added to an existing survey – now much more commonly used • This involves asking at main survey sample address about eligibility of those living at the n addresses to the left and at the n addresses to the right (n is commonly 2 – eg on BCS) • Either interviewers identify neighbouring addresses or pre-select from PAF • Note, each address only asked about once - not twice
Focused enumeration 7 Example of focused enumeration from a main sample address Interviewer screens addresses 3, 4, 6 and 7 from main sample address 5
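The example can be reproduced with a few lines of code. This is an illustrative sketch that assumes addresses are numbered consecutively along the street, which real streets often are not (odd and even sides, flats, corners); the function name is hypothetical.

```python
def addresses_to_screen(main_address, n=2):
    """Return the n addresses to the left and n to the right of a main
    sample address, assuming consecutive numbering (a simplification)."""
    left = [main_address - k for k in range(n, 0, -1)]
    right = [main_address + k for k in range(1, n + 1)]
    return left + right


neighbours = addresses_to_screen(5, n=2)
print(neighbours)  # [3, 4, 6, 7]
```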
Focused enumeration 8 • Independent analyses by me and by NatCen indicate that focused enumeration fails to identify c. 30% of eligible households • Still probably better than not covering ethnic minorities at all in low concentration areas
Non-probability methods • If group not on frame, or frame is not useable, may have to resort to non-probability methods • Three main approaches: • Quota sampling • Snowballing • Sampling at known points of congregation / through organisations
Quota sampling 1 • Use group defining feature (eg ethnic origin) in conjunction with other demographic characteristics (eg age, sex, working status) to set quotas • Common to select randomly clusters and then select respondents using quotas
Quota sampling 2 • Relatively easy and cheap to implement, but problems: • Risk of unquantifiable bias as with all quota samples • Bias depends on (mainly unknown) correlations between quota variables, survey variables and propensity to be interviewed • Population totals (upon which quotas based) not always available and can quickly become out of date (but can use survey data - eg LFS as substitute)
Snowballing 1 • Interview eligible individuals obtained from any source • At end of interview ask for names and contact details of other eligibles • Add any newly identified people to sample • Continue until interviews attempted with everyone and no new names identified
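The snowballing steps amount to following referral chains until no new names appear. The sketch below is a minimal illustration using made-up referral data; the function name and the dictionary of who names whom are assumptions for the example only.

```python
from collections import deque


def snowball(seeds, referrals):
    """Follow referrals from seed respondents until no new names appear.

    referrals maps each person to the eligibles they name at the end of
    their interview (hypothetical data for illustration).
    """
    sample = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    seen = set(sample)
    queue = deque(sample)
    while queue:
        person = queue.popleft()          # "interview" the next person
        for named in referrals.get(person, []):
            if named not in seen:         # only add newly identified people
                seen.add(named)
                sample.append(named)
                queue.append(named)
    return sample


# E names A but is named by nobody in the chain, so E is never reached.
referrals = {"A": ["B", "C"], "B": ["A", "D"], "D": [], "E": ["A"]}
reached = snowball(["A"], referrals)
print(reached)  # ['A', 'B', 'C', 'D']
```

The process only stops when the queue is empty, that is, when interviews have been attempted with everyone named and no new names have come up.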
Snowballing 2 • Problems: • Because can only ask for names and details after interview, only works if prepared to interview whole eligible population in area - cannot be used to generate frame from which sample selected • (If stopped snowballing after enough interviews achieved, sample would be grossly biased) • Cross-checking names administratively complex • Bias against those who do not mix with other eligible people • But: • Work on "Respondent Driven sampling" in past few years offers possibility of better estimates using snowball like method
Sampling at known points of congregation • Examples: • Visit gay bars to find sample of gay men • Visit Muslim community centres to obtain sample of Muslims • Visit organisations for people with disabilities for sample of people with disabilities • Problems: • Those who visit points of congregation will be different from those who don’t
General conclusion • General moral: there is no cheap and easy way to get good quality minority samples - unless they are pre-identified on a sample frame • Which is why so many surveys of minorities use poor quality samples – with potentially very misleading results