

Design of Cross-sectional Surveys using Cluster Sampling: an Overview with Australian Case-studies. Jane Hocking (1), John B. Carlin (2). (1) Public Health Training Scheme, North Western Health, Victoria.

Presentation Transcript


  1. Design of Cross-sectional Surveys using Cluster Sampling: an Overview with Australian Case-studies. Jane Hocking (1), John B. Carlin (2). (1) Public Health Training Scheme, North Western Health, Victoria. (2) Clinical Epidemiology and Biostatistics Unit, Royal Children’s Hospital Research Institute and University of Melbourne Department of Paediatrics, Victoria.

  2. Objectives • to outline cluster sampling • to discuss the concepts of design effect and intracluster correlation and their role in sample size estimation for cluster-based surveys • to analyse a large Melbourne school-based survey

  3. Background • cross-sectional surveys are important in epidemiological research • surveys based on simple random samples are easy to analyse, BUT they are • often not feasible, because it is hard to obtain a sampling frame that lists all individuals in the population • often not cost-effective to perform

  4. Cluster Sampling • identify a sampling frame of clusters of individuals, e.g. school classes, local government areas, community health centres • two-stage sampling OR multistage sampling • clusters can be selected by: • simple random sampling, OR • sampling with probability proportional to size (PPS) • BUT the exact size of the clusters is often unknown at the time of sampling, so a weighted analysis is needed
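
The PPS option mentioned above can be illustrated with a minimal sketch of systematic PPS selection. This is one common way to do it, not necessarily the method the authors used; the function name, the school names and the enrolment figures are all hypothetical.

```python
import random

def pps_systematic_sample(cluster_sizes, n_clusters, seed=None):
    """Select clusters with probability proportional to size (systematic PPS).

    cluster_sizes: dict mapping cluster name -> size measure (e.g. enrolment).
    Returns a list of selected cluster names. A cluster larger than the
    sampling interval can be selected more than once.
    """
    rng = random.Random(seed)
    total = sum(cluster_sizes.values())
    step = total / n_clusters              # sampling interval
    start = rng.uniform(0, step)           # random start within the first interval
    targets = [start + i * step for i in range(n_clusters)]

    selected = []
    cumulative = 0.0
    remaining = iter(targets)
    target = next(remaining, None)
    for name, size in cluster_sizes.items():
        cumulative += size
        # A cluster is picked once for every target that falls inside its range.
        while target is not None and target <= cumulative:
            selected.append(name)
            target = next(remaining, None)
    return selected

# Hypothetical sampling frame: school -> approximate enrolment.
schools = {"School A": 120, "School B": 300, "School C": 80,
           "School D": 500, "School E": 200}
print(pps_systematic_sample(schools, 3, seed=1))
```

If the size measure used at the selection stage is only approximate, the selection probabilities differ from the intended ones, which is why a weighted analysis is needed afterwards.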

  5. Sample Size Requirements • balance between precision and cost • individuals within a cluster tend to be more alike than those in different clusters, resulting in larger standard errors than a simple random sample of the same size would give • this loss of precision must be anticipated at the design stage by increasing the sample size

  6. Design Effect (deff) • useful when considering the precision of cluster sample surveys • measures the performance of a particular sampling method against that of a simple random sample of the same size • deff may be estimated for measures of association as well as for measures of prevalence (or mean values)
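
The comparison described above is usually written as a ratio of variances. The slides do not give a formula, so the following is the standard textbook definition, with \hat{\theta} standing for whatever is being estimated (a prevalence, a mean, or a measure of association):

```latex
\mathrm{deff}
  = \frac{\operatorname{Var}_{\text{cluster design}}(\hat{\theta})}
         {\operatorname{Var}_{\text{SRS}}(\hat{\theta})}
```

Both variances refer to samples with the same total number of individuals; a deff of 2 means the cluster sample needs twice as many individuals as a simple random sample to reach the same precision.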

  7. Intracluster Correlation (ρ) • deff depends on the size of the clusters and the strength of correlation within clusters • ρ provides a measure of the degree of homogeneity amongst cluster subjects for the particular outcome under investigation
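
The dependence of deff on cluster size and within-cluster correlation is commonly summarised by the approximation below. It is a standard result for roughly equal-sized clusters and is not taken from the slides; here \bar{m} denotes the average cluster size and \rho the intracluster correlation. The worked line uses the average of about 43 students per school from the survey in this talk (3104 students in 72 schools) together with a purely hypothetical \rho = 0.02.

```latex
\mathrm{deff} \approx 1 + (\bar{m} - 1)\,\rho
\qquad\text{e.g.}\qquad
1 + (43 - 1)\times 0.02 = 1.84
```

Even a small intracluster correlation can therefore produce a substantial design effect when clusters are large.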

  8. Sample Size • use standard sample size estimation methods to obtain a suitable sample size under simple random sampling, then scale this value up using an estimated deff • this needs the average cluster size and ρ, which is a feature of the population under study
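
The two-step calculation above can be sketched as follows. This assumes the outcome is a prevalence estimated to within a chosen margin of error, and it uses the common approximation deff = 1 + (m - 1) × ICC for roughly equal-sized clusters; the function names and the input values are illustrative, not figures from the talk.

```python
import math

def srs_sample_size_for_proportion(p, margin, z=1.96):
    """Sample size under simple random sampling to estimate a proportion p
    to within +/- margin (z = 1.96 corresponds to 95% confidence)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def cluster_sample_size(n_srs, avg_cluster_size, icc):
    """Scale an SRS sample size up by deff = 1 + (m - 1) * icc.

    Returns (deff, total individuals needed, clusters needed).
    """
    deff = 1 + (avg_cluster_size - 1) * icc
    n_total = math.ceil(n_srs * deff)
    n_clusters = math.ceil(n_total / avg_cluster_size)
    return deff, n_total, n_clusters

# Illustrative inputs: expected prevalence 50%, margin of error 5 percentage
# points, 40 children sampled per school, assumed ICC of 0.05.
n_srs = srs_sample_size_for_proportion(p=0.5, margin=0.05)
deff, n_total, n_clusters = cluster_sample_size(n_srs, avg_cluster_size=40, icc=0.05)
print(n_srs, round(deff, 2), n_total, n_clusters)
```

With these inputs the simple random sample size of 385 grows to 1136 individuals, or 29 schools of 40 children, because the assumed design effect is close to 3.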

  9. Traffic Exposure Survey • walking and cycling activity of children aged 6-9 years was surveyed in Melbourne • two-stage random cluster sampling design • 72 schools sampled, 3104 students • outcomes: • 1) prevalence measures - proportion of children walking to school, the number of streets crossed and selected socio-economic variables; • 2) measures of association - various factors with walking to school

  10. Analysis • analysed using Stata, which has a family of commands designed for data from survey samples; these allow valid adjustment for clustering, stratification and sampling weights • Stata provides a direct estimate of deff for each outcome • Reference: Siddiqui, Hedeker, et al. Intracluster correlation estimates in a school based smoking prevention study. Am J Epidemiol. 1996;144:425-433
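
To show what a deff estimate for a prevalence involves, here is a minimal Python sketch. It is not Stata's implementation: it assumes equal sampling weights, no stratification and no finite population correction, treats schools as the primary sampling units, and uses a ratio-estimator (linearisation) variance. The function name and the data are hypothetical.

```python
def cluster_deff_for_proportion(clusters):
    """Estimate a prevalence and its design effect from clustered binary data.

    clusters: list of (n_positive, cluster_size) tuples, one per cluster.
    Returns (prevalence, cluster-based variance, deff).
    """
    k = len(clusters)
    if k < 2:
        raise ValueError("need at least two clusters")
    n = sum(size for _, size in clusters)
    p = sum(pos for pos, _ in clusters) / n
    if p in (0.0, 1.0):
        raise ValueError("deff is undefined when the prevalence is 0 or 1")

    # Between-cluster variability of the ratio estimator p = sum(y) / sum(m).
    resid_ss = sum((pos - p * size) ** 2 for pos, size in clusters)
    var_cluster = (k / (k - 1)) * resid_ss / n ** 2

    # Variance that a simple random sample of the same size would give.
    var_srs = p * (1 - p) / n
    return p, var_cluster, var_cluster / var_srs

# Hypothetical data: (children walking to school, children surveyed) per school.
data = [(10, 40), (30, 40), (20, 40), (20, 40)]
p, var_cluster, deff = cluster_deff_for_proportion(data)
print(round(p, 3), round(deff, 2))
```

In this toy example the schools differ widely in the proportion walking, so the estimated deff is well above 1; schools with similar proportions would give a value close to 1.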

  11. Results - Main Outcome Prevalence Measures

  12. Results - Socio-economic Factors

  13. Results - Measures of Association

  14. Conclusion • cluster sampling methodology is becoming more common • sample size requirements and the subsequent analysis both need careful consideration • the required sample size depends on the purpose of the survey, i.e. prevalence vs association • results need to be published
