This article discusses the use of statistical methods for detecting sudden disease outbreaks and the challenges of detecting outbreaks in non-specified geographical areas. The example of thyroid cancer incidence in New Mexico is used to illustrate the application of these methods. The article also explains the use of space-time scan statistics and likelihood ratio tests for detecting emerging clusters of diseases.
Early Detection of Disease Outbreaks: Prospective Surveillance
For a pre-specified geographical area, purely temporal statistical methods already exist for detecting a sudden disease outbreak. Two Important Issues: • Such methods can be applied simultaneously to multiple geographical areas, but that leads to multiple testing, producing more false alarms than the nominal significance level reflects. • Disease outbreaks may not conform to the pre-specified geographical areas.
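The multiple-testing issue can be made concrete with a minimal sketch. It assumes, purely for illustration, that the tests for the different areas are independent; the function name is hypothetical and not part of the original slides.

```python
def familywise_false_alarm_probability(alpha: float, n_areas: int) -> float:
    """Probability of at least one false alarm when n_areas tests are run.

    Assumption: the tests are independent and each has level alpha.
    Real neighbouring areas are usually correlated, so this is only a guide.
    """
    return 1.0 - (1.0 - alpha) ** n_areas


if __name__ == "__main__":
    # One area, a handful of areas, and one test per county in a 32-county state.
    for n_areas in (1, 5, 32):
        p = familywise_false_alarm_probability(0.05, n_areas)
        print(f"{n_areas:2d} areas: P(at least one false alarm) = {p:.3f}")
```

Under that independence assumption, running a separate 0.05-level test in each of 32 areas gives a chance of roughly 0.8 of at least one false alarm, far above the nominal level.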
Example: Thyroid Cancer Incidence in New Mexico • Data Source: New Mexico Tumor Registry • Time Period: 1973-1992 • Gender: Male • Population: 580,000 • Annual Incidence Rate: 2.8/100,000 • Aggregation Level: 32 Counties • Adjustments for: Age and Temporal Trends • Monte Carlo Replications: 999
Example: Thyroid Cancer • Median age at diagnosis: 44 years • United States (SEER) incidence: 4.5 / 100,000 • United States mortality: 0.3 / 100,000 • Five-year survival: 95% • Known risk factors: • Radiation treatment for head and neck conditions. • Radioactive fallout (Hiroshima/Nagasaki, Chernobyl, Marshall Islands). • Work as radiologic technician (USA) or x-ray operator (Sweden).
Detecting Emerging Clusters • Instead of a circular window in two dimensions, we use a cylindrical window in three dimensions. • The base of the cylinder represents space, while the height represents time. • The cylinder is flexible in its circular base and starting date, but we only consider those cylinders that reach all the way to the end of the study period. Hence, we are only considering ‘alive’ clusters.
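The set of candidate cylinders described above can be sketched as follows. This is an illustrative enumeration only, not the SaTScan implementation: the centroid coordinates, the `max_radius` cap, and all names are assumptions introduced here.

```python
from math import hypot
from typing import Dict, FrozenSet, Iterator, Tuple

# (areas covered by the circular base, start year, end year)
Cylinder = Tuple[FrozenSet[str], int, int]


def alive_cylinders(
    centroids: Dict[str, Tuple[float, float]],
    first_year: int,
    last_year: int,
    max_radius: float,
) -> Iterator[Cylinder]:
    """Yield every distinct cylinder that reaches the end of the study period.

    Circular bases are centred on each area centroid and grown one area at a
    time, nearest first. Every start year is allowed, but the end year is
    always last_year, so only 'alive' clusters are considered.
    """
    seen = set()
    for cx, cy in centroids.values():
        ranked = sorted(
            (hypot(x - cx, y - cy), name) for name, (x, y) in centroids.items()
        )
        base = []
        for distance, name in ranked:
            if distance > max_radius:
                break
            base.append(name)
            zone = frozenset(base)
            for start in range(first_year, last_year + 1):
                if (zone, start) not in seen:  # skip bases reached from another centre
                    seen.add((zone, start))
                    yield zone, start, last_year


if __name__ == "__main__":
    toy = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (5.0, 0.0)}  # hypothetical centroids
    for zone, start, end in alive_cylinders(toy, 1990, 1992, max_radius=1.5):
        print(sorted(zone), start, end)
```

Fixing the end of every cylinder at the last year is what restricts the search to emerging clusters; historical clusters that have already ended are never evaluated.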
Hypothesis Test • Find the Likelihood for Each Choice of Cylinder • Through Maximum Likelihood Estimation, Find the Most Likely Cluster • Apply the Likelihood Ratio Test • Evaluate Significance Through Monte Carlo Simulation
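A minimal sketch of the two computational pieces in these steps, assuming a Poisson model in which the expected counts are scaled so that they sum to the total number of observed cases. The function names are hypothetical, and the numbers in the demo are made up for illustration.

```python
import math
from typing import Sequence


def poisson_llr(cases: float, expected: float, total_cases: float) -> float:
    """Log likelihood ratio for one cylinder under a Poisson model.

    Assumes expected counts over all cylinders' complement sum with this
    one to total_cases, and that 0 < expected < total_cases.
    Only an excess of cases inside the cylinder counts as evidence.
    """
    if cases <= expected:
        return 0.0
    llr = cases * math.log(cases / expected)
    outside_cases = total_cases - cases
    if outside_cases > 0:
        llr += outside_cases * math.log(outside_cases / (total_cases - expected))
    return llr


def monte_carlo_p_value(observed_max: float, simulated_maxima: Sequence[float]) -> float:
    """Rank of the observed maximum LLR among the simulated maxima.

    Each simulated value is the largest LLR over all cylinders in one data
    set generated under the null hypothesis.
    """
    rank = 1 + sum(1 for s in simulated_maxima if s >= observed_max)
    return rank / (len(simulated_maxima) + 1)


if __name__ == "__main__":
    # Hypothetical cylinder: 9 cases observed, 1.2 expected, 100 cases in total.
    print(poisson_llr(9, 1.2, 100))
    # Hypothetical simulation output: none of 999 replicas reaches the observed value.
    print(monte_carlo_p_value(10.0, [1.0] * 999))
```

The most likely cluster is the cylinder with the largest log likelihood ratio. With 999 replications the smallest attainable p-value is 1/1000, which is why the replication count sets the resolution of the reported p-values.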
Space-Time Scan Statistic: Alive Clusters

| Years | Most Likely Cluster | Cluster Period | Cases | Expected | RR | p |
|-------|---------------------|----------------|-------|----------|----|---|
| 73-78 | Bernalillo + 7 counties West | 75-78 | 48 | 36 | 1.4 | 0.60 |
| 73-79 | Los Alamos, Rio Arriba | 75-79 | 9 | 3.3 | 2.7 | 0.58 |
| 73-80 | Los Alamos, Rio Arriba | 75-80 | 10 | 3.8 | 2.6 | 0.54 |
| 73-81 | North Central – San Miguel | 75-81 | 72 | 53 | 1.4 | 0.19 |
| 73-82 | North Central – San Miguel | 75-82 | 85 | 62 | 1.4 | 0.08 |
| 73-83 | Bernalillo, Valencia | 73-83 | 84 | 62 | 1.4 | 0.13 |
| 73-84 | North Central | 73-84 | 113 | 90 | 1.3 | 0.14 |
| 73-85 | Lincoln | 85 | 3 | 0.2 | 13.8 | 0.23 |
| 73-86 | North Central + Colfax, Harding | 73-86 | 129 | 108 | 1.2 | 0.49 |
| 73-87 | North Central + Colfax, Harding | 73-87 | 142 | 117 | 1.2 | 0.21 |
| 73-88 | North Central – San Miguel | 73-88 | 143 | 115 | 1.2 | 0.08 |
| 73-89 | North Central + Colfax, Harding | 73-89 | 165 | 134 | 1.2 | 0.06 |
| 73-90 | Los Alamos, Rio Arriba, Santa Fe, Taos | 79-90 | 41 | 22 | 1.8 | 0.06 |
| 73-91 | Los Alamos | 89-91 | 7 | 0.9 | 7.6 | 0.02 |
| 73-92 | Los Alamos | 89-92 | 9 | 1.2 | 7.4 | 0.002 |

North Central Counties = Bernalillo, Los Alamos, Mora, Rio Arriba, Sandoval, San Miguel, Santa Fe and Taos.
Adjusting for Yearly Surveillance: The Los Alamos Cluster • 1991 Analysis: p=0.13 (unadjusted p=0.02) • 1992 Analysis: p=0.016 (unadjusted p=0.002)
Los Alamos cases
Thyroid Cancer in Los Alamos • The New Mexico Department of Health has investigated the individual nature of all 17 male thyroid cancer cases reported in Los Alamos during 1970-1995. All were confirmed cases.
Thyroid Cancer in Los Alamos • 3/17 had a history of therapeutic ionizing radiation treatment to the head and neck. • 8/17 had been regularly monitored for exposure to ionizing radiation due to their particular work at the Los Alamos National Laboratory. • 2/17 had had significant workplace-related exposure to ionizing radiation from atmospheric weapons testing fieldwork. A known risk factor, ionizing radiation, is hence a likely explanation for the observed cluster.
Practical Considerations • Chronic or infectious diseases. • Known or unknown etiology. • Daily, weekly, monthly, or yearly data, depending on the type of disease. • It is not possible to detect clusters much smaller than the level of data aggregation. • Data quality control. • Help prioritize areas for deeper investigation. • P-values should be used as a general guideline, rather than in a strict sense.
Limitations • Space-time clusters may occur for reasons other than disease outbreaks. • Automated detection systems do not replace the observant eyes of physicians and other health workers. • Epidemiological investigations by public health departments are needed to confirm or dismiss the signals.
Conclusions • The space-time scan statistic can serve as an important tool in prospective systematic time-periodic geographical surveillance for the early detection of disease outbreaks. • It is possible to detect emerging clusters, and we can adjust for the multiple tests performed over the years. • The method can be used for different diseases.
Practical Considerations (cont.) • Possible to specify 0.05 probability of a false alarm: - since start - during last 20 years - during last 5 years ( ~ one false alarm per 100 years) - during last year ( ~ one false alarm per 20 years) - during last 18 days (~ one false alarm per year)
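The approximate false-alarm rates listed above follow from dividing the length of the window by the false-alarm probability. A small sketch of that arithmetic, with a hypothetical function name:

```python
def expected_years_between_false_alarms(window_years: float, alpha: float = 0.05) -> float:
    """Approximate spacing of false alarms.

    If the probability of a false alarm within a window of window_years is
    alpha, then on average about one false alarm occurs every
    window_years / alpha years.
    """
    return window_years / alpha


if __name__ == "__main__":
    windows = {
        "last 5 years": 5.0,
        "last year": 1.0,
        "last 18 days": 18.0 / 365.0,
    }
    for label, years in windows.items():
        spacing = expected_years_between_false_alarms(years)
        print(f"{label}: about one false alarm per {spacing:.1f} years")
```

A shorter window therefore makes the system more sensitive but raises the long-run rate of false alarms, which is the trade-off behind choosing among the options on the slide.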
Computing Time Each analysis took between 5 and 75 seconds to run on a 400 MHz Pentium Pro.
References Kulldorff M. Prospective time-periodic geographical disease surveillance using a scan statistic. Journal of the Royal Statistical Society, A164:61-72, 2001. Software: Kulldorff M et al. SaTScan v.3.1. http://www.satscan.org/