Developing a dynamic sampling algorithm for cohort studies. M.H.P. Hof, A.C.J. Ravelli, M.B. Snijder, K. Stronks, A.H. Zwinderman
Setting • Increasing number of non-Dutch inhabitants • Welfare, health, and illness vary between ethnic groups • Why? • Unclear whether current healthcare and treatment guidelines (mainly based on the Dutch Caucasian population) can be used • Source: O+S Amsterdam
Setting • HELIUS (HEalthy Life in an Urban Setting) Study • Large multi-ethnic cohort study among • Moroccan • Surinamese (Creole and Hindustani) • Turkish • West-African • Dutch/Caucasian • Group size ± 10,000 individuals • Participants will undergo extensive interviews and medical investigations, and biomaterial will be collected • Recruitment period: ± 1 year
Problem Definition • High generalizability • Representativeness • Sample size • Recruitment period of great importance • Sampling Design
Current Sampling Designs • (Restricted) randomized sampling • Double stage sampling • Stage 1: Sample a large group and obtain distributions of characteristics • Stage 2: Use stratified randomization with stage 1 results
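The two-stage design above can be sketched as follows. This is a minimal illustration, not the procedure used in any particular study: the function name `two_stage_sample`, the toy population, and the proportional allocation rule are assumptions, and the sketch assumes the stratum of every individual can be looked up in stage 2.

```python
import random
from collections import Counter, defaultdict

def two_stage_sample(population, key, stage1_size, stage2_size, seed=0):
    """Hypothetical two-stage design: estimate stratum shares, then stratify."""
    rng = random.Random(seed)
    # Stage 1: simple random sample to estimate the distribution of strata.
    stage1 = rng.sample(population, stage1_size)
    counts = Counter(key(p) for p in stage1)
    shares = {s: c / stage1_size for s, c in counts.items()}
    # Stage 2: stratified random sample, allocated proportionally to stage 1.
    by_stratum = defaultdict(list)
    for person in population:
        by_stratum[key(person)].append(person)
    sample = []
    for stratum, share in shares.items():
        k = min(round(share * stage2_size), len(by_stratum[stratum]))
        sample.extend(rng.sample(by_stratum[stratum], k))
    return shares, sample

# Toy population (assumed data): four age categories, two genders.
population = [{"id": i, "age": i % 4, "gender": i % 2} for i in range(1000)]
shares, sample = two_stage_sample(
    population, key=lambda p: (p["age"], p["gender"]),
    stage1_size=200, stage2_size=100,
)
```

Because the stage 2 allocation is fixed by the stage 1 estimates, any error in those estimates carries over to the final sample, which is one way to read the dependence on prior assumptions.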
Current Sampling Designs • Problems: • Expensive • Differences in non-response between subgroups go undetected • Limited number of strata possible • Results depend heavily on prior assumptions
Stepwise Sampling Algorithm • Development of a stepwise sampling algorithm • Actively invite participants with certain characteristics • Minimize the difference between population and sample • The HELIUS study focuses on representativeness on 4 categorized variables • Known for each individual • x1 = Age (4 categories) • x2 = Gender (2 categories) • Unknown for each individual • x3 = Household situation (7 categories) • x4 = Income (5 categories)
Stepwise Sampling Algorithm • Problems of active selection • Joint distribution of population composition f(x1, x2, x3, x4) unavailable • Estimation of population composition • With prior knowledge: f(x1, x2) * f(x3) * f(x4) • Without prior knowledge • Updated with sample composition f(x1, x2, x3, x4) • Individuals can only be selected on x1 and x2 • x1 = Age • x2 = Gender • x3 = Household situation • x4 = Income
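One way to picture the population estimate is sketched below: a starting joint table built from the known age-by-gender distribution and the two marginals, assuming x3 and x4 are independent of each other and of (x1, x2), followed by an update with observed sample counts. The blending rule and the `prior_weight` parameter are assumptions made for illustration; the slides do not specify how the update is computed, and this simple blend ignores that the sample is itself selected on x1 and x2.

```python
def independence_prior(f12, f3, f4):
    """Joint estimate f(x1, x2, x3, x4) = f(x1, x2) * f(x3) * f(x4)."""
    return {
        (x1, x2, x3, x4): p12 * f3[x3] * f4[x4]
        for (x1, x2), p12 in f12.items()
        for x3 in f3
        for x4 in f4
    }

def update_estimate(prior, sample_counts, prior_weight=100.0):
    """Blend the prior (as pseudo-counts) with sample counts (hypothetical rule)."""
    n = sum(sample_counts.values())
    total = prior_weight + n
    return {
        cell: (prior_weight * p + sample_counts.get(cell, 0)) / total
        for cell, p in prior.items()
    }

# Toy inputs (assumed values), with two categories per variable for brevity.
f12 = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
f3 = {0: 0.5, 1: 0.5}
f4 = {0: 0.4, 1: 0.6}
prior = independence_prior(f12, f3, f4)
posterior = update_estimate(prior, {(0, 0, 0, 0): 100})
```

With a small `prior_weight` the estimate follows the sample quickly, which mimics the case without prior knowledge; a large value keeps the estimate close to the starting table.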
Stepwise Sampling Algorithm • Recruitment period has n iterations • Each iteration: • Individuals with optimal characteristics were invited, based on f(x1, x2) and the estimated f(x3) and f(x4) • Minimizing differences between sample composition and estimated population composition • Weighted for response and participation chance • The population estimate was updated with f(x1, x2, x3, x4) from the sample • x1 = Age • x2 = Gender • x3 = Household situation • x4 = Income
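A single iteration of this kind of allocation might look like the sketch below. It is a simplified greedy rule, not the authors' optimisation: it balances only the selectable cells (x1, x2), treats the response rates as known, and the names `allocate_invitations`, `target`, and `response_rate` are hypothetical.

```python
def allocate_invitations(target, sample_counts, response_rate, n_invites):
    """Greedily assign invitations to the most under-represented cells."""
    # Expected sample counts, starting from what has been recruited so far.
    expected = {c: float(sample_counts.get(c, 0)) for c in target}
    invites = {c: 0 for c in target}
    for _ in range(n_invites):
        total = sum(expected.values())

        def gap(cell):
            # Target share minus the share currently expected in the sample.
            share = expected[cell] / total if total > 0 else 0.0
            return target[cell] - share

        best = max(target, key=gap)
        invites[best] += 1
        # An invitation adds its response probability to the expected count.
        expected[best] += response_rate[best]
    return invites

# Toy example (assumed values): two cells, equal target, unequal response.
target = {("18-44", "F"): 0.5, ("18-44", "M"): 0.5}
sample_counts = {("18-44", "F"): 50, ("18-44", "M"): 50}
response_rate = {("18-44", "F"): 0.5, ("18-44", "M"): 0.25}
invites = allocate_invitations(target, sample_counts, response_rate, 30)
```

Cells with a lower response rate receive more invitations, which is the effect of weighting for response and participation chance.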
Stepwise Sampling Algorithm • Hypothesis: Stepwise Sampling yields a more representative sample than Random Sampling
Simulation Setting • Stepwise Sampling Algorithm (with prior knowledge and without prior knowledge) versus Random Sampling • Recruitment period consists of 50 iterations; the desired sample size is 10,000 per ethnic group • Population • O+S Research and Statistics Amsterdam, data from 2009 • Five ethnic groups • Dutch (largest) • Surinamese • Moroccan • Turkish • Antillean (smallest) • Response rates vary across all characteristics • Invited persons responded and participated one iteration later • Non-responders were sent one reminder • Performance measured by • Representativeness compared to sample size
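A heavily reduced version of such a simulation is sketched below. All numbers except the 50 iterations are assumptions: four cells with equal target shares, made-up response rates, 200 invitations per iteration, immediate responses, and no reminders. Representativeness is measured here by total variation distance, which is a choice made for this sketch and not necessarily the measure used in the study.

```python
import random

def tv_distance(counts, target):
    """Total variation distance between sample shares and target shares."""
    n = sum(counts.values())
    return 0.5 * sum(abs(counts[c] / n - target[c]) for c in target)

def greedy_allocation(target, counts, response, n_invites):
    """Send each invitation to the cell furthest below its target share."""
    expected = {c: float(counts[c]) for c in target}
    invites = dict.fromkeys(target, 0)
    for _ in range(n_invites):
        total = sum(expected.values())
        best = max(
            target,
            key=lambda c: target[c] - (expected[c] / total if total else 0.0),
        )
        invites[best] += 1
        expected[best] += response[best]
    return invites

def simulate(strategy, target, response, iterations=50, batch=200, seed=1):
    """Recruit for a number of iterations and return the sample counts."""
    rng = random.Random(seed)
    cells = list(target)
    counts = dict.fromkeys(cells, 0)
    for _ in range(iterations):
        if strategy == "random":
            drawn = rng.choices(cells, weights=[target[c] for c in cells], k=batch)
            invites = {c: drawn.count(c) for c in cells}
        else:
            invites = greedy_allocation(target, counts, response, batch)
        for c in cells:
            # Each invited person participates with the cell's response rate.
            counts[c] += sum(rng.random() < response[c] for _ in range(invites[c]))
    return counts

# Assumed toy setting: equal target shares, unequal response rates.
target = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}
response = {"A": 0.6, "B": 0.4, "C": 0.3, "D": 0.2}
tv_random = tv_distance(simulate("random", target, response), target)
tv_stepwise = tv_distance(simulate("stepwise", target, response), target)
```

In this toy setting random invitation inherits the imbalance in response rates, while the adaptive allocation corrects for it in every iteration.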
Discussion • Stepwise Sampling Algorithm • Strengths • Non-response adjustment • Better representativeness and sample size • Representative on a large number of characteristics • Less dependent on prior knowledge • Weaknesses • High burden of registration during recruitment • No increase in representativeness of individually unknown characteristics
Conclusion • The Stepwise Sampling Algorithm outperforms Random Sampling on representativeness