Detecting Spatial Clustering in Matched Case-Control Studies. Andrea Cook, MS. Collaboration with: Dr. Yi Li. November 4, 2004.
Outline • Motivation: petrochemical exposure in relation to childhood brain cancer and leukemia • Cumulative Geographic Residuals: unconditional; conditional • Simulation Results: Type I error; power calculations • Application: childhood leukemia; childhood brain cancer • Software • Discussion: limitations; future research
Taiwan Petrochemical Study Matched Case-Control Study • 3 controls per case • Matched on Age and Gender • Resided in one of 26 of the overall 38 administrative districts of Kaohsiung County, Taiwan • Controls selected using national identity numbers (not dependent on location).
Study Population Due to dropout, approximately 50% of matched sets have 3 to 1 matching, 40% have 2 to 1 matching, and 10% have 1 to 1 matching.
Cumulative Residuals • Unconditional (Independence): model definition using logistic regression; extension to cluster detection • Conditional (Matched Design): model definition using conditional logistic regression; extension to cluster detection
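Logistic Model Assume the logistic model $\operatorname{logit}\{\mu_i(\beta)\} = \beta^\top X_i$, where $\mu_i(\beta) = P(Y_i = 1 \mid X_i)$ and the link function is $\operatorname{logit}(\mu) = \log\{\mu / (1 - \mu)\}$. Therefore the likelihood score function for $\beta$ is $U(\beta) = \sum_{i=1}^{n} X_i \{Y_i - \mu_i(\beta)\}$, with information matrix $I(\beta) = \sum_{i=1}^{n} \mu_i(\beta)\{1 - \mu_i(\beta)\} X_i X_i^\top$.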
Residual Formulation Then define a residual as $e_i = Y_i - \mu_i(\hat\beta)$, where $\mu_i(\hat\beta)$ is the fitted probability that $Y_i = 1$ and $\hat\beta$ is the solution to the score equation $U(\beta) = 0$. If the model is correctly specified, the residuals show no pattern. => Use residuals to test for misspecification. Cumulative Residuals for Model Checking; Lin, Wei, Ying 2002
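The fitting step and the residuals can be sketched in a few lines. This is a minimal illustration, not the authors' software: it assumes a single covariate plus an intercept, solves the score equation by Newton-Raphson, and uses simulated data whose sample size and coefficients (500, -0.5, 1.0) are hypothetical.

```python
import math
import random

def fit_logistic(x, y, n_iter=25):
    """Newton-Raphson for logit P(Y=1|x) = b0 + b1*x; returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        mu = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score U(beta) = sum_i X_i (Y_i - mu_i), with X_i = (1, x_i)
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information I(beta) = sum_i mu_i (1 - mu_i) X_i X_i^T
        w = [mi * (1.0 - mi) for mi in mu]
        i00 = sum(w)
        i01 = sum(wi * xi for wi, xi in zip(w, x))
        i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i00 * i11 - i01 * i01
        # Newton step beta <- beta + I^{-1} U, with the 2x2 inverse written out
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (-i01 * u0 + i00 * u1) / det
    return b0, b1

def residuals(x, y, b0, b1):
    """e_i = Y_i - fitted probability."""
    return [yi - 1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi, yi in zip(x, y)]

# Hypothetical simulated data: true intercept -0.5, true slope 1.0
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(500)]
y = [int(rng.random() < 1.0 / (1.0 + math.exp(-(-0.5 + 1.0 * xi)))) for xi in x]
b0, b1 = fit_logistic(x, y)
e = residuals(x, y, b0, b1)
```

At the solution the residuals satisfy the score equation, so they sum to zero and are orthogonal to the covariate. Any spatial pattern has to be looked for in how they are arranged by location, not in their overall total.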
Hypothesis Test Hypothesis of interest: geographic location (ri, ti) is independent of the outcome Yi given Xi. Under this hypothesis, the Cumulative Geographic Residual Moving Block Process is patternless.
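Unconditional Cluster Detection Define the Cumulative Geographic Residual Moving Block Process as the sum of the residuals of all individuals whose locations (ri, ti) fall inside a block that moves across the study region.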
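One way to write such a process is given below. It is a sketch under assumptions that do not come from the slide: the block is taken to be a square of half-width $b$ centred at $(r, t)$, the scaling is the usual $n^{-1/2}$ of cumulative-residual processes, and $e_i$ denotes the logistic residual (observed outcome minus fitted probability).

```latex
W(r, t) = n^{-1/2} \sum_{i=1}^{n} B_i(r, t)\, e_i,
\qquad
B_i(r, t) = I\left( |r_i - r| \le b,\ |t_i - t| \le b \right)
```

A large positive value of $W(r, t)$ indicates more cases near $(r, t)$ than the fitted model predicts.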
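Asymptotic Distribution The asymptotic null distribution of the moving block process is difficult to simulate directly. It has been shown to be equivalent to the distribution, conditional on the observed data, of a perturbed version of the process in which each residual is multiplied by an independent random variable Gi.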
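A perturbed process of this kind, following the general cumulative-residual construction of Lin, Wei, and Ying, can be sketched as below. The notation is an assumption and may differ from the slide: $B_i(r, t)$ is the indicator that individual $i$ lies in the block centred at $(r, t)$, $e_i = Y_i - \mu_i(\hat\beta)$ is the residual, $I(\beta) = \sum_i \mu_i(\beta)\{1 - \mu_i(\beta)\} X_i X_i^\top$ is the information matrix, and the $G_i$ are independent with mean zero and variance one (for example standard normal).

```latex
\hat{W}(r, t) = n^{-1/2} \sum_{i=1}^{n}
  \left\{ B_i(r, t) - \eta(r, t; \hat\beta)^\top I^{-1}(\hat\beta)\, X_i \right\} e_i\, G_i,
\qquad
\eta(r, t; \beta) = \sum_{j=1}^{n} B_j(r, t)\, \mu_j(\beta)\{1 - \mu_j(\beta)\}\, X_j
```

The first term perturbs the residuals inside the block. The second term accounts for the extra variability from estimating $\beta$, and comes from a first-order expansion of $e_i$ around the true parameter.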
Significance Test Testing the NULL • Simulate N realizations of the perturbed process by repeatedly simulating the Gi, while fixing the data at their observed values. • Calculate the P-value by comparing the observed process with the simulated realizations.
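The resampling steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' R code: the model is intercept-only (so the correction for estimating the intercept reduces to subtracting the block's share of the total), blocks are squares of half-width `b`, the test statistic is the supremum of the absolute process over a grid of block centres, and the Gi are standard normal. The function name `cluster_test`, the grid, and the hotspot data are hypothetical.

```python
import math
import random

def cluster_test(coords, y, centers, b, n_sim=200, seed=0):
    """Return (observed sup statistic, simulated p-value) for an intercept-only model."""
    rng = random.Random(seed)
    n = len(y)
    ybar = sum(y) / n
    e = [yi - ybar for yi in y]  # residuals from the intercept-only logistic fit
    # Indices of the points falling inside each square block
    members = [[i for i, (ri, ti) in enumerate(coords)
                if abs(ri - r) <= b and abs(ti - t) <= b] for (r, t) in centers]
    root_n = math.sqrt(n)
    observed = max(abs(sum(e[i] for i in idx)) / root_n for idx in members)
    exceed = 0
    for _ in range(n_sim):
        # Perturb the residuals with G_i ~ N(0, 1); the data stay fixed
        z = [ei * rng.gauss(0.0, 1.0) for ei in e]
        total = sum(z)
        # Block sum minus the block's share of the total (intercept correction)
        sim = max(abs(sum(z[i] for i in idx) - len(idx) / n * total) / root_n
                  for idx in members)
        if sim >= observed:
            exceed += 1
    return observed, exceed / n_sim

# Hypothetical data: elevated case probability in a square around (5, 5)
rng = random.Random(1)
coords = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(600)]
y = [int(rng.random() < (0.9 if abs(r - 5) <= 1 and abs(t - 5) <= 1 else 0.1))
     for r, t in coords]
centers = [(r, t) for r in range(1, 10) for t in range(1, 10)]
observed, p_value = cluster_test(coords, y, centers, b=1.0)
print(observed, p_value)
```

Block membership is computed once and reused across all realizations, so each simulation costs only a pass over the blocks.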
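Conditional Logistic Model Type of Matching: 1 case to $M_s$ controls in stratum $s$. Data Structure: $(Y_{is}, X_{is}, r_{is}, t_{is})$ for individual $i = 0, \dots, M_s$ in stratum $s = 1, \dots, S$. Assume that, conditional on $\alpha_s$, an unobserved stratum-specific intercept, and given the logit link, $\operatorname{logit} P(Y_{is} = 1 \mid X_{is}, \alpha_s) = \alpha_s + \beta^\top X_{is}$. The conditional likelihood, conditioning on $\sum_{i=0}^{M_s} Y_{is} = 1$ in every stratum, is $L_c(\beta) = \prod_{s=1}^{S} \prod_{i=0}^{M_s} \pi_{is}(\beta)^{Y_{is}}$, where $\pi_{is}(\beta) = \exp(\beta^\top X_{is}) / \sum_{j=0}^{M_s} \exp(\beta^\top X_{js})$.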
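Score and Information Denote the conditional likelihood score as $U_c(\beta) = \sum_{s=1}^{S} \sum_{i=0}^{M_s} X_{is} \{Y_{is} - \pi_{is}(\beta)\}$, where $\pi_{is}(\beta) = \exp(\beta^\top X_{is}) / \sum_{j=0}^{M_s} \exp(\beta^\top X_{js})$, with information matrix $I_c(\beta) = \sum_{s=1}^{S} \left\{ \sum_{i=0}^{M_s} \pi_{is}(\beta) X_{is} X_{is}^\top - \bar{X}_s(\beta) \bar{X}_s(\beta)^\top \right\}$ and $\bar{X}_s(\beta) = \sum_{i=0}^{M_s} \pi_{is}(\beta) X_{is}$.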
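Conditional Residual Then define a residual as $e_{is} = Y_{is} - \pi_{is}(\hat\beta)$, where $\pi_{is}(\hat\beta)$ is the fitted conditional probability that individual $i$ is the case in stratum $s$ and $\hat\beta$ is the solution to the conditional score equation $U_c(\beta) = 0$. Residuals within a stratum sum to zero and are therefore correlated. => Use these correlated residuals to test for patterns based on location.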
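A small sketch of the conditional fit and residuals follows. It assumes a single scalar covariate and solves the conditional score equation by Newton-Raphson. The three matched sets are hypothetical toy data, chosen only to mirror the 3 to 1, 2 to 1, and 1 to 1 matching in the study.

```python
import math

def set_probs(matched_set, beta):
    """Conditional probability that each member of the set is the case."""
    w = [math.exp(beta * x) for _, x in matched_set]
    total = sum(w)
    return [wi / total for wi in w]

def fit_conditional(strata, n_iter=50):
    """Newton-Raphson for the conditional logistic score with a scalar covariate."""
    beta = 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for s in strata:
            p = set_probs(s, beta)
            xbar = sum(pi * x for pi, (_, x) in zip(p, s))
            score += sum(x * (y - pi) for pi, (y, x) in zip(p, s))
            info += sum(pi * x * x for pi, (_, x) in zip(p, s)) - xbar * xbar
        beta += score / info
    return beta

def conditional_residuals(strata, beta):
    """e_is = Y_is - fitted conditional probability, one list per matched set."""
    return [[y - pi for pi, (y, _) in zip(set_probs(s, beta), s)] for s in strata]

# Hypothetical matched sets of (y, x) pairs, exactly one case (y = 1) per set
strata = [
    [(1, 2.0), (0, 1.0), (0, 0.5), (0, 1.5)],  # 3 to 1 matching
    [(1, 0.2), (0, 0.9), (0, 0.4)],            # 2 to 1 matching
    [(0, 1.1), (1, 1.3)],                      # 1 to 1 matching
]
beta_hat = fit_conditional(strata)
res = conditional_residuals(strata, beta_hat)
```

Because the fitted probabilities sum to one within each matched set, each inner list of `res` sums to zero. This is the within-stratum correlation that the conditional test has to respect.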
Conditional Cumulative Residual Define the Conditional Cumulative Residual Moving Block Process analogously, using the conditional residuals. It has been shown to be asymptotically equivalent to a perturbed process built from random variables Gis that are independent of the observed data.
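Significance Test Testing the NULL • Simulate N realizations of the perturbed conditional process by repeatedly simulating the Gis, while fixing the data at their observed values. • Calculate the P-value by comparing the observed process with the simulated realizations.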
Simulation • Choice of Gi or Gis: normal or discrete • Unconditional: normal and discrete Gi • Conditional: normal and discrete Gis, under 1 to 1, 2 to 1, and 3 to 1 matching • Type I error • Power Calculations
Type I error • Unconditional: generate N coordinates xi and yi from Unif(0,10); type I error is the percentage of simulated datasets in which a significant cluster is found. • Conditional: generate N coordinates xis and yis from Unif(0,10); type I error is the percentage of simulated datasets in which a significant cluster is found.
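The type I error calculation for the unconditional design can be sketched as a loop over null datasets. The example is only a harness. The test plugged in is a simple stand-in (a two-sided z-test comparing case proportions west and east of x = 5), not the cumulative geographic residual test, and the sample size, case probability, and number of datasets are hypothetical.

```python
import math
import random

def null_dataset(n, p_case, rng):
    """Coordinates uniform on [0, 10]; outcomes drawn independently of location."""
    coords = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]
    y = [int(rng.random() < p_case) for _ in range(n)]
    return coords, y

def half_region_p_value(coords, y):
    """Stand-in test (NOT the cumulative residual test): two-sided z-test
    comparing the case proportion west and east of x = 5."""
    west = [yi for (xi, _), yi in zip(coords, y) if xi < 5]
    east = [yi for (xi, _), yi in zip(coords, y) if xi >= 5]
    if not west or not east:
        return 1.0
    pooled = sum(y) / len(y)
    se = math.sqrt(pooled * (1 - pooled) * (1 / len(west) + 1 / len(east)))
    if se == 0:
        return 1.0
    z = (sum(west) / len(west) - sum(east) / len(east)) / se
    return math.erfc(abs(z) / math.sqrt(2))

def type1_error(p_value_fn, n_datasets=400, n=200, p_case=0.3, alpha=0.05, seed=0):
    """Fraction of null datasets in which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_datasets):
        coords, y = null_dataset(n, p_case, rng)
        if p_value_fn(coords, y) < alpha:
            rejections += 1
    return rejections / n_datasets

rate = type1_error(half_region_p_value)
print(rate)
```

Any function that maps `(coords, y)` to a p-value can be passed as `p_value_fn`, so the same harness applies to a cluster detection test. A well-calibrated test should give a rate close to `alpha`.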
Type I error • Results for the Unconditional and Conditional tests
Power Calculations • Two power calculations: Single Hotspot and Multiple Hotspots • Results for the Unconditional and Conditional tests
Application • Study: Kaohsiung, Taiwan Matched Case-Control Study • Method: Conditional Cumulative Geographic Residual Test (Normal and Mixed Discrete)
Results • Odds Ratio (p-values) • Marginally significant clustering for both outcomes without adjusting for smoking history.
Software • R macro to handle both unconditional and conditional data • Dataset: X and Y coordinates of each participant; case/control variable; covariate matrix; stratum variable for conditional data • Takes just a few minutes to run!
Discussion • Cumulative Geographic Residuals: unconditional and conditional methods for binary outcomes; can find multiple significant hotspots while holding type I error at appropriate levels; not computationally intensive compared to other cluster detection methods • Taiwan Study: found a possible relationship between childhood leukemia and petrochemical exposure, but not between childhood brain cancer and petrochemical exposure.
Discussion Future Research • Failure Time Data • Recurrent Events • Relocation of Study Participants • Surveillance