RDS Data Analysis and Estimation of Design Effect: an application among FCSW in Brazil. Vienna, July 20, 2010. Célia Landmann Szwarcwald, celials@cict.fiocruz.br
Proposed Estimation Method
• Respondent-driven sampling (RDS) is a chain-referral method widely used to recruit most-at-risk populations. Because recruitment is respondent-driven, the observations are dependent.
• In this paper, we propose a method for estimating the variance of the HIV prevalence rate, based on the Markov transition probabilities.
• Using statistical procedures appropriate for the analysis of data collected by complex sample designs, we accounted for the homophily effect, the intra-class correlation among participants recruited by the same person, and the unequal selection probabilities resulting from different network sizes.
The FCSW study, Brazil, 2009
• The method was applied to a FCSW study carried out in 10 Brazilian cities in 2008. The total sample size was 2523 women. The study included a behavior questionnaire and rapid tests for HIV and syphilis.
• The question used to measure network size was: "How many CSW that work in the city do you know personally?"
• Additionally, because the study was conducted in 10 cities, the sample was calibrated to the relative size of the female population aged 18-59 years in order to provide results for the total, with each city treated as a stratum. In the weights, i represents the participant, j represents the city, δ is the degree (the reported network size) and mj is the proportion of the female population aged 18-59 years in city j.
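A minimal sketch of one weighting scheme consistent with the definitions above. The specific form is an assumption, not taken from the slide: each participant's weight is proportional to the inverse of her degree δ, and the weights within city j are rescaled to sum to mj. The function name `calibrated_weights` and the example cities, degrees, and shares are hypothetical.

```python
from collections import defaultdict


def calibrated_weights(records, city_share):
    """Return one weight per participant.

    records    : list of (city, degree) pairs, degree > 0 (reported network size)
    city_share : dict mapping city -> m_j, the city's share of women aged 18-59

    Assumed form (not from the slide):
        w_ij = m_j * (1 / delta_ij) / sum_i (1 / delta_ij)
    """
    # Sum of inverse degrees within each city (the normalising constant).
    inv_degree_sum = defaultdict(float)
    for city, degree in records:
        inv_degree_sum[city] += 1.0 / degree

    # Inverse-degree weight, rescaled so that city j's weights sum to m_j.
    return [
        city_share[city] * (1.0 / degree) / inv_degree_sum[city]
        for city, degree in records
    ]


# Hypothetical toy data: two cities, three participants.
example_records = [("A", 10), ("A", 40), ("B", 20)]
example_shares = {"A": 0.6, "B": 0.4}
example_weights = calibrated_weights(example_records, example_shares)
```

Under this assumed form, participants reporting large networks are down-weighted, and the weights sum to 1 across all cities whenever the mj do.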
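Estimation of HIV Prevalence (p) and Variance of p
• Let p = P(HIV+) be the parameter to be estimated. By the Markov equilibrium equation:
$$p = p\,(1 - p_{0.1}) + (1 - p)\,p_{1.0} \quad\Longrightarrow\quad p = \frac{p_{1.0}}{p_{1.0} + p_{0.1}},$$
where $p_{1.0}$ and $p_{0.1}$ are the transition probabilities, reading $p_{1.0}$ as the probability that an HIV-negative recruiter recruits an HIV-positive participant and $p_{0.1}$ as the probability that an HIV-positive recruiter recruits an HIV-negative participant.
• Let $x = \ln(p_{1.0}/p_{0.1})$. Then p can be written as:
$$p = \frac{e^{x}}{1 + e^{x}}$$
• Using the delta method, with $dp/dx = p(1-p)$, the variance of p is estimated by:
$$\widehat{\mathrm{var}}(p) = \left[p\,(1 - p)\right]^{2}\,\widehat{\mathrm{var}}(x),$$
where the variance of x, treating the two estimated transition probabilities as uncorrelated, is estimated by:
$$\widehat{\mathrm{var}}(x) = \frac{\widehat{\mathrm{var}}(p_{1.0})}{p_{1.0}^{2}} + \frac{\widehat{\mathrm{var}}(p_{0.1})}{p_{0.1}^{2}}$$
• Here var(p0.1) and var(p1.0) are the variances of the conditional probabilities. They should be estimated as in cluster sampling, to account for the intra-class correlation among participants recruited by the same person.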
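The estimation steps on this slide can be sketched as follows. This is a hedged illustration rather than the author's code. It assumes that p1.0 is the probability that an HIV-negative recruiter recruits an HIV-positive participant, that p0.1 is the probability that an HIV-positive recruiter recruits an HIV-negative participant, and that the two estimates are uncorrelated. The function name and the numeric inputs are hypothetical, and the cluster-adjusted variances are taken as given.

```python
import math


def rds_markov_prevalence(p_pos_given_neg, p_neg_given_pos,
                          var_pos_given_neg, var_neg_given_pos):
    """Equilibrium prevalence and its delta-method variance.

    p_pos_given_neg : P(participant HIV+ | recruiter HIV-)   (p1.0, assumed reading)
    p_neg_given_pos : P(participant HIV- | recruiter HIV+)   (p0.1, assumed reading)
    var_*           : variances of those two estimates, already adjusted
                      for clustering by recruiter
    """
    # x = ln(p1.0 / p0.1), which equals logit(p) at equilibrium.
    x = math.log(p_pos_given_neg / p_neg_given_pos)
    p = math.exp(x) / (1.0 + math.exp(x))

    # Delta method for the log of a ratio of two uncorrelated estimates.
    var_x = (var_pos_given_neg / p_pos_given_neg ** 2
             + var_neg_given_pos / p_neg_given_pos ** 2)

    # dp/dx = p(1 - p), so var(p) = [p(1 - p)]^2 * var(x).
    var_p = (p * (1.0 - p)) ** 2 * var_x
    return p, var_p


# Hypothetical inputs, not the study's values.
p_hat, var_p_hat = rds_markov_prevalence(0.04, 0.80, 0.0001, 0.0025)
ci_95 = (p_hat - 1.96 * math.sqrt(var_p_hat),
         p_hat + 1.96 * math.sqrt(var_p_hat))
```

Working on the log-ratio scale keeps the derivative simple, which is why the variance of p reduces to a single multiplicative factor applied to var(x).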
Table 1: HIV test results of participants according to the HIV test result of the corresponding recruiter, after sample weighting. FCSW, Brazil, 2009
*Variances estimated taking into account the intra-class correlation among participants invited by the same CSW.
The probability that an HIV-positive recruiter invites an HIV-positive participant is 5 times the probability that an HIV-negative recruiter invites an HIV-positive participant.
p = 4.8%; 95% CI (3.4%, 6.1%); DEFF = 2.62; OR = 5.8 (p-value < 0.0001)
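As a rough consistency check on the reported figures, the design effect can be approximated from the confidence interval. The sketch assumes that DEFF is the ratio of the estimated variance to the variance under simple random sampling, that the interval is a normal-approximation interval, and that n is the full sample of 2523 women. The function name is hypothetical, and because the slide figures are rounded the result is only approximate.

```python
def design_effect_from_ci(p, ci_low, ci_high, n, z=1.96):
    """Approximate DEFF = var_design(p) / var_srs(p) from a reported CI."""
    # Standard error implied by a symmetric normal-approximation interval.
    se = (ci_high - ci_low) / (2.0 * z)
    # Variance of a proportion under simple random sampling with size n.
    var_srs = p * (1.0 - p) / n
    return se ** 2 / var_srs


# Figures reported on the slide: p = 4.8%, 95% CI (3.4%, 6.1%), n = 2523.
deff_check = design_effect_from_ci(0.048, 0.034, 0.061, 2523)
```

With these inputs the ratio comes out close to the reported DEFF of 2.62.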
Figure legend: large symbols represent the seeds; symbols distinguish HIV-negative, HIV-positive, and not HIV-tested participants.
Conclusions
• The proposed estimation method is equivalent to the logistic regression model logit(p(x)) = a + b x, where p(x) is the probability that a participant is HIV+ and x = 1 if the recruiter is HIV+, 0 otherwise. Therefore, taking the complex sample design into account, the odds ratio (OR = e^b) can be used to test for homophily.
• In the analysis of FCSW, the homophily effect was highly significant, showing the need to consider the dependence among observations in the data analysis.
• The large design effect suggests that an RDS study carried out in only one city would need a very large sample size.
• Stratification by city or neighborhood was adequate to reduce the design effect and may be adopted in other studies, as long as the stratum weights are known.
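The link between the transition probabilities and the logistic model in the first conclusion can be sketched as follows. With a single binary covariate the model is saturated, so a and b follow directly from the two conditional probabilities. The function names and the numeric inputs are hypothetical, and the probabilities are read as P(participant HIV+ | recruiter HIV-) and P(participant HIV- | recruiter HIV+), which is an assumption about the slide's notation.

```python
import math


def logit(q):
    """Log-odds of a probability q in (0, 1)."""
    return math.log(q / (1.0 - q))


def homophily_logit_params(p_pos_given_neg, p_neg_given_pos):
    """Intercept a, slope b and odds ratio of logit(p(x)) = a + b*x."""
    # x = 0: recruiter HIV-, so p(0) = P(participant HIV+ | recruiter HIV-).
    a = logit(p_pos_given_neg)
    # x = 1: recruiter HIV+, so p(1) = 1 - P(participant HIV- | recruiter HIV+).
    b = logit(1.0 - p_neg_given_pos) - a
    # exp(b) is the odds ratio; a value above 1 indicates homophily.
    return a, b, math.exp(b)


# Hypothetical inputs, not the study's values.
a_hat, b_hat, odds_ratio = homophily_logit_params(0.04, 0.80)
```

In practice the significance test for b would come from a survey-weighted logistic regression with recruiters as clusters; this sketch only shows the point estimates.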