Lecture 11 of 47C5 Social Research Process I: Sampling in Quantitative Research I
Paul Lambert, 14.10.03, 4-5pm
47C5: Survey research lectures
Resources for lectures 8,9,11,12
• Lecture slides on WebCT site
• 2 Reading lists:
  • Initial list in 47C5 unit outlines
  • Some additions on further list on WebCT site
Web Resources for lectures 8,9,11,12
• Slides and additional reading list also at:
  http://staff.stir.ac.uk/paul.lambert/teaching.htm
• Some other internet resources (cf De Vaus 2002):
  http://trochim.human.cornell.edu/kb/
  http://statcomp.ats.ucla.edu/survey
Part 1: Role of sampling in survey research
• Surveys can be censuses
• More often they are samples from a wider population
• Several sampling methods can be used to select cases
• Aim: a sample representative of the wider population
Inference
• Key idea is inference = confidence in our ability to generalise
• Sampling inference = application of statistical theories in order to estimate the probability that a sample result is ‘likely to have been unrepresentative’
Theories of sampling methods
Sampling and probability theories tell us that any particular random sample is most likely to have properties close to those of the wider population. We can then estimate the probability that sample results of a particular nature could have arisen by chance, rather than because they reflect the population result.
If the cases in a sample survey were selected at random, then we can use sampling theories and thus ‘inference’
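The claim that a random sample is most likely to resemble the population can be checked by simulation. The sketch below is illustrative only: the population of 10,000 cases with a 60% attribute rate, the sample size of 200 and the thresholds are all assumptions chosen for the example, not figures from the lecture.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 cases, 60% of which hold some attribute (coded 1)
population = [1] * 6000 + [0] * 4000

def sample_proportion(n):
    """Draw one simple random sample of size n and return the sample proportion."""
    return sum(rng.sample(population, n)) / n

# Repeat the survey 1,000 times and see how often the sample is near the truth
results = [sample_proportion(200) for _ in range(1000)]
close = sum(abs(r - 0.6) <= 0.05 for r in results) / len(results)
far = sum(abs(r - 0.6) > 0.10 for r in results) / len(results)

print(f"within 5 points of the population value: {close:.1%}")
print(f"more than 10 points away:                {far:.1%}")
```

Most repeated samples land close to the population value and very few land far away, which is what lets us attach a probability to a result having arisen by chance.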
‘Inferential data analysis’
• Variable-by-case matrix data analysis for generalising findings to population
• Often distinguished from ‘descriptive’ data analysis (results of sample only)
• Key: confidence about generalisations increases with the joint influence of
  • 1) size of sample
  • 2) strength of data pattern
Statistical inference
..causes confusion; one of the hardest parts of survey data analysis to understand..
Phrases: ‘significance level’, ‘p-value’, ‘confidence interval’, ‘hypothesis testing’, ..
Meaning: whether results would probably generalise to a larger population (if sample is treated as random)
See: Refs for L11 part 1 (supplementary list)
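A rough illustration of the joint influence of sample size and strength of pattern, using the standard normal-approximation confidence interval for a proportion. The percentages and sample sizes are invented for the example, and the function name `proportion_ci` is a hypothetical helper, not part of any library.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Approximate 95% confidence interval for a sample proportion
    (normal approximation; assumes a simple random sample)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# Same weak pattern (55% vs an even split), increasing sample size
for n in (100, 1000):
    low, high = proportion_ci(0.55, n)
    print(f"55% of n={n}: interval {low:.3f} to {high:.3f}")

# Stronger pattern (70%) with the small sample
low, high = proportion_ci(0.70, 100)
print(f"70% of n=100: interval {low:.3f} to {high:.3f}")
```

With 55% in a sample of 100 the interval still includes 50%, so an even split in the population cannot be ruled out; either a larger sample or a stronger pattern moves the whole interval above 50%.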
Critiques of survey generalisation
1) Part of the ‘fall of survey methods’ (1960s):
• Sampling is not representative
  • Sampling is systematically biased
• Inferential conclusions too carelessly made and too strongly stated
• See for example Cicourel 1964
Critiques of survey generalisation
2) Deconstructing inference (1980s)
• Inferential methods over-relied upon
  • Survey analysis becomes theory-free hunt for ‘significant’ patterns
• Inference needed less than often suggested
• Bad variable analysis (operationalisation) affects inference results, eg (non-)parametric variables; data clustering; …
• See for example Rose and Sullivan 1996 p192-5
Contemporary survey research
Tends to use 2 strategies to address critiques:
• Large scale, often secondary, rigorous methods
or
• Small scale, primary, claims carefully qualified
Terms in sample survey analysis
• Population: all cases of interest
• Sampling frame: list of all potential cases
• Sample: cases selected for analysis
• Sampling method: technique for selecting cases from sampling frame
• Sampling fraction: proportion of cases from population selected for sample (n/N)
Part 2: Sampling methods and techniques
= Ways of selecting cases from population
2.1a Simple Random Sample
• A statistical method used to choose cases randomly (eg random numbers)
  • Every case in population has exactly the same chance of being in sample
• Most data analysis techniques initially designed for simple random samples
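A minimal sketch of a simple random sample drawn with random numbers. The sampling frame of 5,000 case identifiers and the sample size of 250 are invented for illustration; `simple_random_sample` is a hypothetical helper name.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n cases without replacement; every case has the same chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame of N = 5000 case identifiers
frame = [f"case_{i:04d}" for i in range(1, 5001)]

sample = simple_random_sample(frame, 250, seed=1)
sampling_fraction = len(sample) / len(frame)  # n / N

print(sample[:5])
print(f"sampling fraction n/N = {sampling_fraction}")
```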
2.1b Systematic Random Sample
• Like the SRS, select cases from anywhere in the whole population
• An easier selection method: choose every (k)th person for the sample
• Danger of ‘periodicity’: bias if the original population order has any structure
Problems with sample methods selecting from whole population
• The ‘random’ part means it is always possible to get a population coverage quite different from known structures
• If total population is large or dispersed, then coverage of random parts of it is expensive and time consuming: few surveys use random sampling from whole of UK
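A sketch of systematic selection and of how periodicity can go wrong. The frame is hypothetical: 1,000 addresses listed so that every tenth one is a corner property, which is an assumption made purely to show the problem.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every k-th case after a random start, where k = N // n."""
    k = len(frame) // n                       # sampling interval
    start = random.Random(seed).randrange(k)  # random start within first interval
    return frame[start::k][:n]

# Hypothetical frame with structure: every 10th address is a corner property
frame = [(i, "corner" if i % 10 == 0 else "mid-terrace") for i in range(1000)]

sample = systematic_sample(frame, 100, seed=3)
kinds = {kind for _, kind in sample}

print(len(sample), kinds)
```

Because the sampling interval (10) matches the period in the list, the sample contains only one kind of property, either all corner or all mid-terrace, although corner properties are 10% of the frame.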
2.1c Stratified random samples
• Modifies random sample to ensure even (or ‘boost’) coverage of population groups
  • split sampling frame by stratification factors
  • select random samples within each factor
  • final sample has correct proportions of each
• Example: select 490 M and 510 F
• Properties: proportionate sample, correct representations; but more expensive & complex, should use ‘weights’ for analysis
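The three steps above can be sketched as a proportionate stratified sample. The frame of 4,900 men and 5,100 women is an assumption chosen so that a sample of 1,000 reproduces the 490 M / 510 F example; `stratified_sample` is a hypothetical helper.

```python
import random
from collections import Counter

def stratified_sample(frame, stratum_of, n, seed=None):
    """Proportionate stratified sample: split the frame by stratum, then draw a
    simple random sample in each stratum in proportion to its size."""
    rng = random.Random(seed)
    strata = {}
    for case in frame:
        strata.setdefault(stratum_of(case), []).append(case)
    sample = []
    for label, cases in sorted(strata.items()):
        # Rounding can make the total differ slightly from n in general
        n_stratum = round(n * len(cases) / len(frame))
        sample.extend(rng.sample(cases, n_stratum))
    return sample

# Hypothetical frame: 4900 men and 5100 women
frame = [("M", i) for i in range(4900)] + [("F", i) for i in range(5100)]

sample = stratified_sample(frame, lambda case: case[0], 1000, seed=2)
counts = Counter(sex for sex, _ in sample)
print(counts)
```

A ‘boosted’ design would replace the proportional `n_stratum` with a larger fixed number for a small group, which is the situation where weights are needed at the analysis stage.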
2.1d Multistage cluster samples
• i) Select clusters of population at random
• ii) Sample randomly within clusters
• Eg: clusters = local authorities in UK
• With qualifications, may still be treated as ‘random’ for analysis purposes
• Big reduction in costs if face-to-face contacts
  • Most widely favoured sampling method in large scale survey collections
Example: Multistage cluster sample
• Interest: attitudes of Scottish school pupils
• Resources: 400 interviews with pupils
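All areas, interviews per area (total 400):
Shetlands 2 • Highlands 40 • Islands 20 • Moray 20 • Aberdeen 40 • Perth 20 • Edinburgh 100 • Argyll 24 • Borders 10 • Glasgow 124
Selected clusters, interviews per cluster (total 400):
Moray 40 • Stirling 60 • Edinburgh 150 • Glasgow 150
Within the Stirling cluster (60 interviews):
30 young people at Balfron School and 30 young people at Stirling High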
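A sketch of the multistage logic behind the school pupils example: pick areas at random, then schools within those areas, then pupils within those schools. The frame below (10 areas, 5 schools each, 200 pupils per school) is entirely hypothetical, and for simplicity it takes equal numbers per cluster, whereas the example above allocates different numbers to each area.

```python
import random

def multistage_sample(frame, n_areas, n_schools, n_pupils, seed=None):
    """Three-stage cluster sample: areas, then schools, then pupils."""
    rng = random.Random(seed)
    sample = []
    for area in rng.sample(sorted(frame), n_areas):              # stage 1
        schools = frame[area]
        for school in rng.sample(sorted(schools), n_schools):    # stage 2
            for pupil in rng.sample(schools[school], n_pupils):  # stage 3
                sample.append((area, school, pupil))
    return sample

# Hypothetical frame: area -> school -> list of pupil identifiers
frame = {
    f"area_{a}": {
        f"school_{a}_{s}": [f"pupil_{a}_{s}_{p}" for p in range(200)]
        for s in range(5)
    }
    for a in range(10)
}

sample = multistage_sample(frame, n_areas=4, n_schools=2, n_pupils=50, seed=7)
print(len(sample), sorted({area for area, _, _ in sample}))
```

All 400 interviews fall in 8 schools across 4 areas, which is where the saving in face-to-face fieldwork costs comes from.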
2.1e Longitudinal random samples
• Longitudinal = interest in study over time
• ‘Panel’ and ‘cohort’ samples
  • recontact an initially random sample
  • Problems of attrition
• Retrospective sample
  • Rely on recall evidence of random selection
  • Problems of selective recall
Issues in random sampling
• Only as good as underlying sampling frame (a good one may not be available, or not be as good as we think)
• Data analysis methods need adapting for stratified / clustered designs
• Other survey factors interact with sample selection issues, eg poor interviewers may discourage certain cases from response
2.2) Opportunistic sampling
• More often in social research, sample design is ‘opportunistic’ (‘purposive’)
  • Random sampling is expensive
  • Random sampling is complex
  • Some purported random samples are actually purposive (understanding of ‘random’)
2.2a Quota sampling
• Fill up quotas of groups of interest
• Quotas can ensure:
  • overall representation (cf systematic)
  • broad topic coverage (eg types of voter)
• Example: market researchers in street; telephone call centres vetting contacts
• Biases: issues in how a quota ‘fills up’
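A sketch of how a quota ‘fills up’, in the spirit of the street interviewer example. The stream of 2,000 passers-by, the three groups, their arrival weights and the quota sizes are all invented for illustration.

```python
import random
from collections import Counter

def quota_sample(stream, group_of, quotas):
    """Accept people in arrival order until every group's quota is full."""
    remaining = dict(quotas)
    sample = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):  # all quotas filled
            break
    return sample

# Hypothetical stream of passers-by; groups arrive at unequal rates
rng = random.Random(5)
groups = ["employed", "retired", "student"]
stream = [(i, rng.choices(groups, weights=[6, 3, 1])[0]) for i in range(2000)]

quotas = {"employed": 20, "retired": 20, "student": 10}
sample = quota_sample(stream, lambda person: person[1], quotas)
filled = Counter(group for _, group in sample)
print(dict(filled))
```

The quotas guarantee the group totals, but within each group the cases are simply whoever arrived first, so selection is not random.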
2.2b Snowball sampling
• Also ‘focussed enumeration’
• Technique for contacting cases from populations rare / difficult to access
• Ask first obtained contact for suggested further contacts
  • snowball gathers size…
• Eg: smaller ethnic minority groups
• Problem: social networks are non-random!
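A sketch of snowballing through a contact network, starting from one initial contact and following referrals. The names and the network are made up; the two people at the end form a separate network, included to show why non-random social ties matter.

```python
from collections import deque

def snowball_sample(contacts, seeds, target_size):
    """Follow referrals outward from the seed contacts until the target
    size is reached or no new referrals remain."""
    sample, queue = [], deque(seeds)
    seen = set(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        for referred in contacts.get(person, []):
            if referred not in seen:
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referral network: person -> people they would suggest
contacts = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dev"],
    "cat": ["eve"],
    "dev": [],
    "eve": ["ann"],
    "fay": ["gus"],  # separate network, never reachable from "ann"
    "gus": ["fay"],
}

sample = snowball_sample(contacts, ["ann"], target_size=10)
print(sample)  # ['ann', 'bob', 'cat', 'dev', 'eve']
```

The snowball stops at five cases even though ten were wanted: anyone outside the first contact's network has no chance of selection.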
2.2c Convenience sampling
• Samples whatever cases from population were easiest to reach, eg personal contacts
• Often no other sampling strategy involved
• Biases likely in convenience process
• Examples: …most social research survey examples are ‘convenience’..!
Random vs Opportunistic
• Random difficult and expensive: mainly government funded resources
• Most people who have conducted a survey have conducted an opportunistic one
• Much data analysis / inference assumes a random sample, but is often not applied to one
• But opportunistic data is often robust…
More on sampling methods
• Refs for sampling methods / properties: eg Gilbert 2001 ch. 5; De Vaus 2001 ch. 6; Bryman 2001 ch. 4.
• Research reports: most important is documentation of sampling process / issues
  • To be open about research
  • To consider unintentional mistakes
Summary on sampling methods
• Good sampling not a panacea
  • Other elements of surveying equally crucial
• Many statistical methods assume random sample
• For good sampling, use secondary data..
• All samples have some value, but non-random ones need careful context