Issues in Analysis of Time Location Sampling (TLS). Keith Sabin, PhD, Strategic Information and Research, WHO/HIV Department. Citation: John M. Karon. The Analysis of Time-location Sampling Study Data. ASA Section on Survey Research Methods. (Available for free on the internet.)
Citation
John M. Karon. The Analysis of Time-location Sampling Study Data. ASA Section on Survey Research Methods. (Available for free on the internet.)
Young Men’s Survey, Phase II (YMS)
• Survey of MSM in 6 US cities, 1998-2000
• Four cities’ data used in presented analysis
Statistical Foundation of TLS
• Mimics multistage cluster sampling
• Venues are Primary Units/Clusters
• Individuals are Elementary Units
• If enumerations conducted, probability proportional to size possible
• Self-weighting for venue attendance
• Needs to be reconfirmed at sampling event
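As a minimal sketch of the probability-proportional-to-size idea above, the snippet below draws venues with probability proportional to their enumerated attendance counts. The function name, the venue labels, and the choice to sample with replacement are illustrative assumptions, not details taken from the presentation.

```python
import random

def select_venues_pps(venue_counts, n_draws, seed=None):
    """Draw venues with probability proportional to enumerated size.

    venue_counts: dict mapping a venue label to its enumeration count
                  (hypothetical data, e.g. people counted during a
                  standard time period).
    n_draws:      number of sampling events to schedule.
    Sampling is WITH replacement (an assumption), so a large venue
    can be scheduled more than once.
    """
    rng = random.Random(seed)
    names = list(venue_counts)
    sizes = [venue_counts[name] for name in names]
    # random.choices performs weighted sampling with replacement.
    return rng.choices(names, weights=sizes, k=n_draws)

if __name__ == "__main__":
    counts = {"bar_A": 120, "club_B": 300, "cafe_C": 30}  # hypothetical
    print(select_venues_pps(counts, n_draws=5, seed=1))
```

Because the enumeration counts drive the selection probabilities, they would need to be rechecked at each sampling event, as the slide notes.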
Challenge
• Developing a valid mechanism to adjust for unequal selection probabilities of individuals
Underlying challenge of TLS Analysis
• Produces a probability sample of visits to venues included
• Therefore visit is correct unit of analysis
• Participants will revisit venues
• Multiple chances of selection
• However “Visitor” is main interest
• Ergo, unequal selection probabilities
To Weight or Not to Weight
• Weights derived from participants’ self-reported frequency of attendance at venues (not necessarily those sampled)
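To make the weighting idea concrete, here is a minimal sketch that assumes each participant's analysis weight is the inverse of their self-reported attendance frequency, so frequent attendees (who have more chances of selection) count for less. The inverse-frequency form, the function name, and the data are assumptions for illustration; the slide does not give the exact weight formula.

```python
def weighted_prevalence(outcomes, visits_per_month):
    """Weighted prevalence with weights = 1 / attendance frequency.

    outcomes:         list of 0/1 indicators (e.g. 1 = positive).
    visits_per_month: self-reported attendance frequency per person
                      (must be > 0); hypothetical input.
    """
    weights = [1.0 / v for v in visits_per_month]
    total_weight = sum(weights)
    weighted_cases = sum(w * y for w, y in zip(weights, outcomes))
    return weighted_cases / total_weight

if __name__ == "__main__":
    outcomes = [1, 0, 0, 1]   # hypothetical data
    visits = [1, 2, 4, 4]     # hypothetical data
    print("unweighted:", sum(outcomes) / len(outcomes))
    print("weighted:  ", weighted_prevalence(outcomes, visits))
```

In this toy data the infrequent attendee with a positive outcome dominates the weighted estimate, which illustrates how a weighted analysis can differ from an unweighted one.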
Analysis weight and % of participants, by reported frequency of attendance at bars and night clubs
Alternative analyses of HIV prevalence, and standard errors of estimates
Note: the design effect (DE) applies to the clustered, weighted analysis only.
Alternative clustered analyses of the prevalence of Hepatitis B and unprotected anal intercourse, and standard errors for these estimates
Note: the design effect (DE) applies to the clustered, weighted analysis only.
Design effects and affecting factors
Notes: HBV: Hepatitis B virus. CVw2: square of the coefficient of variation of the analysis weights. NC: algorithm did not converge. The p-values are from a logistic regression mixed model.
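The CVw2 quantity defined in the notes above can be computed directly from the analysis weights. The sketch below also shows Kish's well-known approximation for the variance inflation due to unequal weighting, 1 + CVw2. That approximation is brought in here for illustration only; the slide does not say how its design effects were calculated, and the approximation ignores the clustering of participants within venues.

```python
from statistics import fmean, pvariance

def cv_squared(weights):
    """Squared coefficient of variation of the weights (CVw2)."""
    mean_w = fmean(weights)
    # Population variance of the weights divided by the squared mean.
    return pvariance(weights, mu=mean_w) / (mean_w * mean_w)

def kish_weighting_deff(weights):
    """Kish's approximate design effect from unequal weighting alone.

    Equivalent to n * sum(w^2) / (sum(w))^2. Venue clustering is
    NOT accounted for here.
    """
    return 1.0 + cv_squared(weights)

if __name__ == "__main__":
    w = [1.0, 0.5, 0.25, 0.25]  # hypothetical analysis weights
    print("CVw2:", cv_squared(w))
    print("approx. weighting design effect:", kish_weighting_deff(w))
```

With equal weights CVw2 is 0 and the approximate design effect is 1, so more variable weights mean a larger loss of precision.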
99% confidence intervals for HIV prevalence by venue, city C, conditional on number of men sampled at a venue
(Figure legend: observed prevalence; overall prevalence)
Unprotected anal intercourse (UAI) last 6 months as risk factor for HIV & Hepatitis B in City C, alternative logistic models
Number of men and odds ratios for the association between Hepatitis B prevalence and UAI in the last 6 months, by frequency of attendance at bars and clubs
Conclusions
• Unweighted analysis = convenience
• Should not be used for size estimates
• Sampling fractions for weights = survey of visits
• Clustering effects of venues should always be examined
Recommendations
• Collect information on frequency persons in the population of interest attend venues in the sampling frame
• Proxy: how frequently a person attends each type of venue in the sampling frame
• Applies to ALL venues in the frame
• Account for design effects in sample size calculations
• Get a good statistician!
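One common way to account for a design effect in a sample size calculation is to compute the sample size for a simple random sample and multiply it by the anticipated design effect. The sketch below uses the standard normal-approximation formula for estimating a proportion. The function name, default z-value, and example inputs are illustrative assumptions, not figures from the presentation.

```python
import math

def sample_size_for_proportion(p, margin, deff=1.0, z=1.96):
    """Sample size to estimate a proportion, inflated by a design effect.

    p:      anticipated prevalence (0 < p < 1).
    margin: desired half-width of the confidence interval.
    deff:   anticipated design effect (1.0 = simple random sample).
    z:      normal quantile (1.96 corresponds to about 95% confidence).
    """
    n_srs = (z ** 2) * p * (1.0 - p) / (margin ** 2)
    # Round up: a fractional participant cannot be recruited.
    return math.ceil(n_srs * deff)

if __name__ == "__main__":
    # Hypothetical planning values: 10% prevalence, +/- 3 points.
    print(sample_size_for_proportion(0.10, 0.03, deff=1.0))
    print(sample_size_for_proportion(0.10, 0.03, deff=2.0))
```

A design effect of 2 roughly doubles the required sample, which is why it should be anticipated at the planning stage rather than discovered during analysis.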