
Types of Surveys



Presentation Transcript


  1. Types of Surveys Cross-sectional • surveys a specific population at a given point in time • will have one or more of the design components • stratification • clustering with multistage sampling • unequal probabilities of selection Longitudinal • surveys a specific population repeatedly over a period of time • panel • rotating samples

  2. Cross Sectional Surveys Sampling Design Terminology

  3. Methods of Sample Selection Basic methods • simple random sampling • systematic sampling • unequal probability sampling • stratified random sampling • cluster sampling • two-stage sampling

  4. Simple Random Sampling Why? • basic building block of sampling • sample from a homogeneous group of units How? • physically make draws at random of the units under study • computer selection methods: R, Stata
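
As a minimal sketch of the computer-selection route, here is a simple random sample drawn in Python rather than R or Stata. The frame of 100 unit labels and the function name `simple_random_sample` are made up for illustration.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n distinct units, each subset of size n being equally likely."""
    rng = random.Random(seed)          # seeded generator for reproducibility
    return rng.sample(list(units), n)  # sampling without replacement

population = list(range(1, 101))       # hypothetical frame: unit labels 1..100
sample = simple_random_sample(population, 10, seed=1)
```

Fixing the seed makes the draw repeatable, which is useful when a sample has to be documented or audited.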

  5. Systematic Sampling Why? • easy • can be very efficient depending on the structure of the population How? • get a random start in the population • sample every kth unit for some chosen number k
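
The two "How?" steps above can be sketched as follows. The frame is hypothetical, and the sketch assumes the frame length is a multiple of k so that every start gives the same sample size.

```python
import random

def systematic_sample(units, k, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random start: 0, 1, ..., k-1
    return units[start::k]     # every kth unit from the start onward

frame = list(range(100))       # hypothetical ordered frame of 100 units
sys_sample = systematic_sample(frame, k=10, seed=3)
```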

  6. Additional Note Simplifying assumption: • in terms of estimation a systematic sample is often treated as a simple random sample Key assumption: • the order of the units is unrelated to the measurements taken on them

  7. Unequal Probability Sampling Why? • may want to give greater or lesser weight to certain population units • two-stage sampling with probability proportional to size at the first stage and equal sample sizes at the second stage provides a self-weighting design (all units have the same chance of inclusion in the sample) How? • with replacement • without replacement
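
A small sketch of both ideas on this slide: drawing primary units with probability proportional to size (here with replacement), and the arithmetic behind the self-weighting claim. The sizes are invented, and the second function assumes a first-stage inclusion probability of n × size / total, which requires n × size ≤ total for every unit.

```python
import random

def pps_with_replacement(sizes, n, seed=None):
    """Draw n unit indices; on each draw, unit i has probability sizes[i] / sum(sizes)."""
    rng = random.Random(seed)
    return rng.choices(range(len(sizes)), weights=sizes, k=n)

def two_stage_inclusion_prob(size, total_size, n_primary, m_secondary):
    """Overall chance that one secondary unit is sampled under the stated assumption."""
    first_stage = n_primary * size / total_size   # primary unit selected (pps)
    second_stage = m_secondary / size             # unit selected within it (equal sample size)
    return first_stage * second_stage

sizes = [50, 100, 150, 200]                       # hypothetical primary-unit sizes
draws = pps_with_replacement(sizes, n=2, seed=7)
probs = [two_stage_inclusion_prob(s, sum(sizes), 2, 10) for s in sizes]
```

The size cancels in the product, so every entry of `probs` equals n × m / total = 0.04, up to floating-point rounding. That cancellation is what "self-weighting" means here.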

  8. With or Without Replacement? • in practice sampling is usually done without replacement • the formula for the variance based on without replacement sampling is difficult to use • the formula for with replacement sampling at the first stage is often used as an approximation Assumption: the population size is large and the sample size is small – sampling fraction is less than 10%
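
The slides do not say which with-replacement formula is meant. One standard choice is the Hansen-Hurwitz estimator of a population total, sketched below under that assumption. The data values are invented.

```python
def hansen_hurwitz(y, p):
    """Estimate a total from n with-replacement draws.

    y[i] is the value observed on draw i, p[i] its per-draw selection probability.
    Returns (estimated total, estimated variance of that estimate).
    """
    n = len(y)
    z = [yi / pi for yi, pi in zip(y, p)]   # each draw's own estimate of the total
    t_hat = sum(z) / n                      # average of the per-draw estimates
    var_hat = sum((zi - t_hat) ** 2 for zi in z) / (n * (n - 1))
    return t_hat, var_hat

# Hypothetical draws in which y is exactly proportional to p,
# so every per-draw estimate is 100 and the variance estimate is 0.
t_hat, var_hat = hansen_hurwitz([10, 20, 40], [0.1, 0.2, 0.4])
```

The variance formula needs only the per-draw estimates, which is why it is so much easier to use than the without-replacement version.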

  9. Stratified Random Sampling Why? • for administrative convenience • to improve efficiency • estimates may be required for each stratum How? • independent simple random samples are chosen within each stratum
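
A minimal sketch of the "How?" step: one independent simple random sample per stratum. The stratum names, unit labels, and sample sizes are hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw an independent simple random sample within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, n_per_stratum[name])
            for name, units in strata.items()}

strata = {                                   # hypothetical frame split into two strata
    "urban": [f"u{i}" for i in range(60)],
    "rural": [f"r{i}" for i in range(40)],
}
strat_sample = stratified_sample(strata, {"urban": 6, "rural": 4}, seed=5)
```

The allocation of 6 and 4 is proportional to the stratum sizes. Other allocations only change the `n_per_stratum` argument.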

  10. Example: Survey of Youth in Custody • first U.S. survey of youths confined to long-term, state-operated institutions • complemented existing Children in Custody censuses. • companion survey to the Surveys of State Prisons • the data contain information on criminal histories, family situations, drug and alcohol use, and peer group activities • survey carried out in 1989 using stratified systematic sampling

  11. SYC Design strata • type (a) groups of smaller institutions • type (b) individual larger institutions sampling units • strata type (a) • first stage – institution by probability proportional to size of the institution • second stage – individual youths in custody • strata type (b) • individual youths in custody • individuals chosen by systematic random sampling

  12. Cluster Sampling Why? • convenience and cost • the frame or list of population units may be defined only for the clusters and not the units How? • take a simple random sample of clusters and measure all units in the cluster
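
As a sketch of one-stage cluster sampling, the code below needs only a list of clusters as its frame. The household labels and members are invented.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Take a simple random sample of clusters and keep every unit in each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # SRS of cluster labels
    return {c: list(clusters[c]) for c in chosen}       # all units in chosen clusters

households = {                                          # hypothetical clusters
    "hh1": ["a", "b"],
    "hh2": ["c"],
    "hh3": ["d", "e", "f"],
    "hh4": ["g", "h"],
}
clus_sample = cluster_sample(households, 2, seed=11)
```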

  13. Two-Stage Sampling Why? • cost and convenience • lack of a complete frame How? • take either a simple random sample or an unequal probability sample of primary units and then, within each selected primary unit, take a simple random sample of secondary units
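
A sketch of the simple-random-sample variant of two-stage sampling. The area names and dwelling labels are hypothetical. An unequal probability first stage would replace only the first `rng.sample` call.

```python
import random

def two_stage_sample(primaries, n_primary, m_secondary, seed=None):
    """SRS of primary units, then an SRS of secondary units within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(primaries), n_primary)        # first stage
    return {p: rng.sample(primaries[p],                      # second stage
                          min(m_secondary, len(primaries[p])))
            for p in chosen}

areas = {                                                    # hypothetical primary units
    "area1": [f"a1-d{i}" for i in range(8)],
    "area2": [f"a2-d{i}" for i in range(5)],
    "area3": [f"a3-d{i}" for i in range(12)],
}
ts_sample = two_stage_sample(areas, n_primary=2, m_secondary=3, seed=2)
```

Unlike cluster sampling, only some units in each selected primary unit are measured, so a complete frame is needed only inside the selected primary units.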

  14. Synthesis to a Complex Design Stratified two-stage cluster sampling Strata • geographical areas First stage units • smaller areas within the larger areas Second stage units • households Clusters • all individuals in the household

  15. Why a Complex Design? • better coverage of the entire region of interest (stratification) • efficient for interviewing: less travel, less costly Problem: estimation and analysis are more complex

  16. Ontario Health Survey • carried out in 1990 • health status of the population was measured • data were collected relating to the risk factors associated with major causes of morbidity and mortality in Ontario • survey of 61,239 persons was carried out in a stratified two-stage cluster sample by Statistics Canada

  17. OHS Sample Selection • strata: public health units – divided into rural and urban strata • first stage: enumeration areas defined by the 1986 Census of Canada and selected by pps • second stage: dwellings selected by SRS • cluster: all persons in the dwelling

  18. Longitudinal Surveys Sampling Design

  19. Schematic Representation

  20. Schematic Representation

  21. British Household Panel Survey Objectives of the survey • to further understanding of social and economic change at the individual and household level in Britain • to identify, model and forecast such changes, their causes and consequences in relation to a range of socio-economic variables.

  22. BHPS: Target Population and Frame Target population • private households in Great Britain Survey frame • small users Postcode Address File (PAF)

  23. BHPS: Panel Sample • designed as an annual survey of each adult (16+) member of a nationally representative sample • 5,000 households approximately • 10,000 individual interviews approximately. • the same individuals are re-interviewed in successive waves • if individuals split off from original households, all adult members of their new households are also interviewed. • children are interviewed once they reach the age of 16 • 13 waves of the survey from 1991 to 2004

  24. BHPS: Sampling Design Uses implicit stratification embedded in two-stage sampling • postcode sector ordered by region • within a region postcode sector ordered by socio-economic group as determined from census data and then divided into four or five strata Sample selection • systematic sampling of postcode sectors from ordered list • systematic sampling of delivery points (≈ addresses or households)

  25. BHPS: Schema for Sampling

  26. Survey Weights

  27. Survey Weights: Definitions initial weight • equal to the inverse of the inclusion probability of the unit final weight • initial weight adjusted for nonresponse, poststratification and/or benchmarking • interpreted as the number of units in the population that the sample unit represents
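
The two definitions can be sketched as follows. The nonresponse step shown is one common weighting-class adjustment and is an assumption. The slides do not say which adjustment the surveys used, and poststratification and benchmarking are not shown.

```python
def initial_weight(inclusion_prob):
    """Initial (design) weight: inverse of the unit's inclusion probability."""
    return 1.0 / inclusion_prob

def nonresponse_adjust(weights, responded):
    """Within one weighting class, inflate respondents' weights so that they
    carry the total initial weight of everyone sampled in the class."""
    factor = sum(weights) / sum(w for w, r in zip(weights, responded) if r)
    return [w * factor if r else 0.0 for w, r in zip(weights, responded)]

# Hypothetical class: four sampled units, each with inclusion probability 0.1,
# of which the third did not respond.
design_w = [initial_weight(0.1)] * 4
final_w = nonresponse_adjust(design_w, [True, True, False, True])
```

Each design weight is 10, so the class represents 40 population units. After adjustment the three respondents still sum to 40, about 13.3 each.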

  28. Interpretation Interpretation • the survey weight for a particular sample unit is the number of units in the population that the unit represents

  29. Effect of the Weights • Example: age distribution, Survey of Youth in Custody

  30. Unweighted Histogram

  31. Weighted Histogram

  32. Weighted versus Unweighted

  33. Observations • the histograms are similar in overall shape but differ noticeably in detail • the design probably utilized approximate proportional allocation • the distribution of ages in the unweighted case tends to be shifted to the right when compared to the weighted case • older ages are over-represented in the dataset

  34. Survey Data Analysis Issues and Simple Examples from Graphical Methods

  35. Basic Problem in Survey Data Analysis

  36. Issues iid (independent and identically distributed) assumption • the assumption does not hold in complex surveys because of correlations induced by the sampling design or because of the population structure • blindly applying standard programs to the analysis can lead to incorrect results

  37. Example: Rank Correlation Coefficient Pay equity survey dispute: Canada Post and PSAC • two job evaluations on the same set of people (and same set of information) carried out in 1987 and 1993 • rank correlation between the two sets of job values obtained through the evaluations was 0.539 • assumption to obtain a valid estimate of correlation: pairs of observations are iid

  38. Scatterplot of Evaluations • Rank correlation is 0.539

  39. A Stratified Design with Distinct Differences Between Strata • the pay level increases with each pay category (four in number) • the job value also generally increases with each pay category • therefore the observations are not iid

  40. Scatterplot by Pay Category

  41. Correlations within Level Correlations within each pay level • Level 2: –0.293 • Level 3: –0.010 • Level 4: 0.317 • Level 5: 0.496 Only Level 4 is significantly different from 0
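
To illustrate how pooling across strata can produce a positive rank correlation that no single stratum shows, here is a toy computation. The six data points are invented and are not the Canada Post/PSAC data. The Spearman formula used assumes no tied values.

```python
def ranks(values):
    """Ranks 1..n of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)), no ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Two hypothetical pay levels; within each, the second evaluation
# reverses the ordering given by the first.
level_a = ([1, 2, 3], [3, 2, 1])
level_b = ([4, 5, 6], [6, 5, 4])

within_a = spearman(*level_a)
within_b = spearman(*level_b)
pooled = spearman(level_a[0] + level_b[0], level_a[1] + level_b[1])
```

Within each level the correlation is -1, yet the pooled value is positive (about 0.54) because both variables rise from one level to the next. The pooled figure reflects the difference between strata, not agreement within them.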

  42. Graphical Displays first rule of data analysis • always try to plot the data to get some initial insights into the analysis common tools • histograms • bar graphs • scatterplots

  43. Histograms unweighted • height of the bar in the ith class is proportional to the number in the class weighted • height of the bar in the ith class is proportional to the sum of the weights in the class
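
A minimal sketch of the two bar-height rules. The ages, class boundaries, and weights are invented for illustration.

```python
def histogram_heights(values, edges, weights=None):
    """Bar heights per class [edges[i], edges[i+1]).

    Unweighted: the count in the class. Weighted: the sum of weights in the class.
    """
    if weights is None:
        weights = [1.0] * len(values)        # every unit counts once
    heights = [0.0] * (len(edges) - 1)
    for v, w in zip(values, weights):
        for i in range(len(edges) - 1):
            if edges[i] <= v < edges[i + 1]:
                heights[i] += w
                break
    return heights

ages = [12, 15, 17, 18]                      # hypothetical sample values
w = [5.0, 5.0, 1.0, 1.0]                     # hypothetical survey weights
unweighted = histogram_heights(ages, [10, 16, 20])
weighted = histogram_heights(ages, [10, 16, 20], w)
```

Here the unweighted heights are 2 and 2, while the weighted heights are 10 and 2. The older class looks as large as the younger one only because its units were sampled at a higher rate.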

  44. Body Mass Index measured by • weight in kilograms divided by square of height in meters • 7.0 < BMI < 45.0 • BMI < 20: health problems such as eating disorders • BMI > 27: health problems such as hypertension and coronary heart disease
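
The definition above as a short sketch, using the cut-offs of 20 and 27 quoted on this slide. The function names and the example person are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by squared height in metres."""
    return weight_kg / height_m ** 2

def bmi_group(value):
    """Group a BMI value using the slide's cut-offs of 20 and 27."""
    if value < 20:
        return "below 20"
    if value > 27:
        return "above 27"
    return "20 to 27"

example = bmi(70.0, 1.75)        # hypothetical person: 70 kg, 1.75 m
group = bmi_group(example)
```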

  45. BMI: Women

  46. BMI: Men

  47. BMI: Comparisons

  48. Bar Graphs Same principle as histograms unweighted • size of the ith bar is proportional to the number in the class weighted • size of the ith bar is proportional to the sum of the weights in the class

  49. Ontario Health Survey
