This research paper assesses the performance of statistical methods for syndromic surveillance using real data from emergency rooms. It compares univariate and multivariate detection algorithms and evaluates their effectiveness in detecting outbreaks.
Statistical Methods for Syndromic Surveillance: An Empirical Assessment Ron Fricker, Mike Stoto, Arvind Jain November 20, 2004
The Problem: Assessing Statistical Methods for Syndromic Surveillance • Syndromic surveillance: “…surveillance using health-related data that precede diagnosis and signal a sufficient probability of a case or an outbreak to warrant further public health response.” [1] • Research goal: develop and characterize performance of univariate and multivariate detection algorithms • Many health departments and CDC already have some type of syndromic surveillance system in place • CDC’s BioSense program active as of April 2004 [2] [1] CDC (www.cdc.gov/epo/dphsi/syndromic.htm) [2] www.cdc.gov/phin/Webinars/BioSense.htm
Idea of Syndromic Surveillance Source: Michael Wagner, University of Pittsburgh
An Empirical Assessment Using Real Data • Three years of emergency room data (980 days) • Data from 7 DC hospital ERs from 9-11-01 to present • Focus on 4 chief complaints recorded by ER: -- Gastro-intestinal -- Rash -- Respiratory -- Unspecified infection • Restrict data to non-flu season (527 days) • Data generally low to no correlation: -- Syndromes w/in hospitals: 0.0 < r < 0.28 (non-winter), 0.0 < r < 0.29 (winter) -- By syndrome between hospitals: 0.0 < r < 0.29 (non-winter), 0.0 < r < 0.52 (winter)
Statistical Process Control (SPC) for Syndromic Surveillance • In a manufacturing setting, SPC is used to monitor production and test for a change in the level of quality • Have the parameter(s) of the quality characteristic shifted? • In syndromic surveillance, the goal is to monitor whether a pathogen has been released • Has the distribution of leading indicators shifted in some meaningful (i.e., worrisome) way? • For syndromic surveillance, SPC method(s) must be robust because the manifestation of the threat may vary
Basic Univariate Statistical Process Control (SPC) Methods • Shewhart (1931) • Stop when observation exceeds pre-defined threshold • Better for detecting large shifts/changes • CUSUM (Page, 1954) • Stop when cumulative sum of observations exceeds threshold • Recursion is S_i = max(0, S_{i-1} + X_i - k), with S_0 = 0 and reference value k • Better for detecting small shifts/changes
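A minimal sketch of the two univariate stopping rules, assuming the standard one-sided CUSUM recursion S_i = max(0, S_{i-1} + X_i - k) with S_0 = 0. The function names, the example series, and the values of the threshold, k, and h are illustrative assumptions, not taken from the slides.

```python
from typing import Optional, Sequence


def shewhart_first_flag(x: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first observation above the threshold, or None."""
    for i, xi in enumerate(x):
        if xi > threshold:
            return i
    return None


def cusum_first_flag(x: Sequence[float], k: float, h: float) -> Optional[int]:
    """One-sided CUSUM: S_i = max(0, S_{i-1} + x_i - k); flag when S_i > h."""
    s = 0.0
    for i, xi in enumerate(x):
        s = max(0.0, s + xi - k)
        if s > h:
            return i
    return None


# Hypothetical standardized counts: a small sustained shift starts at index 5.
series = [0.1, -0.3, 0.2, 0.0, -0.1, 1.2, 1.1, 1.3, 1.0, 1.2]

# Assumed tuning values, chosen only for illustration.
shewhart_day = shewhart_first_flag(series, threshold=3.0)
cusum_day = cusum_first_flag(series, k=0.5, h=2.0)
print(shewhart_day, cusum_day)
```

With these assumed values no single observation crosses the Shewhart threshold, while the CUSUM accumulates the small shift and flags on the third shifted day, which is the behavior the bullets describe.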
Picturing the Shewhart [Figure: observed values (Xi) over time, with the threshold and the flag marked]
Picturing the CUSUM [Figure: observed values (Xi) and CUSUM (Si) over time, with the Shewhart threshold, the CUSUM threshold, and the flag marked]
Multivariate Statistical Process Control (SPC) Methods • Hotelling’s T² • Stop when distance to observation exceeds threshold • Like Shewhart, good at detecting large shifts • Crosier’s MCUSUM (1988) • Cumulates vectors componentwise • As with CUSUM, good at detecting small shifts • We modified both methods to look only for positive changes
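As a rough illustration of the distance-based rule, the sketch below computes the textbook (unmodified) Hotelling statistic T² = (x - mean)' Σ⁻¹ (x - mean) and flags when it exceeds h. It assumes the in-control mean and the inverse covariance matrix are already known; the function names, data, and h are hypothetical. The slides do not say how the authors restricted the statistic to positive changes, so that modification is not attempted here.

```python
from typing import List, Sequence


def quad_form(v: Sequence[float], sigma_inv: Sequence[Sequence[float]]) -> float:
    """Compute v' * sigma_inv * v for a vector and an inverse covariance matrix."""
    n = len(v)
    return sum(v[r] * sigma_inv[r][c] * v[c] for r in range(n) for c in range(n))


def t2_flags(
    obs: Sequence[Sequence[float]],
    mean: Sequence[float],
    sigma_inv: Sequence[Sequence[float]],
    h: float,
) -> List[bool]:
    """Flag each observation whose squared statistical distance from the mean exceeds h."""
    flags = []
    for x in obs:
        d = [xi - mi for xi, mi in zip(x, mean)]
        flags.append(quad_form(d, sigma_inv) > h)
    return flags


# Hypothetical two-series example with identity covariance (assumed).
identity = [[1.0, 0.0], [0.0, 1.0]]
observations = [(1.0, 1.0), (3.0, 1.0), (-3.0, -1.0)]
flags = t2_flags(observations, mean=(0.0, 0.0), sigma_inv=identity, h=9.0)
print(flags)
```

The third observation is a large decrease, yet the unmodified statistic flags it just like the increase, which shows why a method aimed at outbreaks needs to be restricted to positive changes.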
Picturing Modified MCUSUM (2 dims) (1) Time i, get new obs, add to Si-1 (Si*) (2) Calculate Si according to whether: (a) Ci < k (b) Ci > k (c) Need to bound by 0 (3) Test if Yi > h [Figure: panels (a), (b), and (c) showing Si*, Si, Ci, Yi, k, and h in two dimensions, with the flag marked]
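The three steps can be sketched as follows, assuming Crosier's usual definitions: Si* = Si-1 + Xi - mean, Ci = sqrt(Si*' Σ⁻¹ Si*), Si = 0 when Ci <= k and Si = Si*(1 - k/Ci) otherwise, and Yi = sqrt(Si' Σ⁻¹ Si). Bounding each component at zero is one reading of the "bound by 0" step; the function names, data, and the values of k and h are hypothetical.

```python
import math
from typing import Optional, Sequence


def quad_form(v: Sequence[float], sigma_inv: Sequence[Sequence[float]]) -> float:
    """Compute v' * sigma_inv * v."""
    n = len(v)
    return sum(v[r] * sigma_inv[r][c] * v[c] for r in range(n) for c in range(n))


def mcusum_first_flag(
    obs: Sequence[Sequence[float]],
    mean: Sequence[float],
    sigma_inv: Sequence[Sequence[float]],
    k: float,
    h: float,
) -> Optional[int]:
    """Return the first index i with Y_i > h, or None if the chart never flags."""
    p = len(mean)
    s = [0.0] * p
    for i, x in enumerate(obs):
        # Step 1: add the new centered observation to the previous sum.
        s_star = [s[j] + x[j] - mean[j] for j in range(p)]
        c = math.sqrt(quad_form(s_star, sigma_inv))
        # Step 2: reset when C_i <= k, otherwise shrink toward zero by k / C_i.
        if c <= k:
            s = [0.0] * p
        else:
            shrink = 1.0 - k / c
            # Assumed form of the modification: bound each component below by 0.
            s = [max(0.0, v * shrink) for v in s_star]
        # Step 3: compare the length of the cumulated vector with h.
        y = math.sqrt(quad_form(s, sigma_inv))
        if y > h:
            return i
    return None


identity = [[1.0, 0.0], [0.0, 1.0]]
increase = [(0.2, -0.1), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0)]
decrease = [(-1.0, -1.0)] * 6
up_day = mcusum_first_flag(increase, (0.0, 0.0), identity, k=0.5, h=2.0)
down_day = mcusum_first_flag(decrease, (0.0, 0.0), identity, k=0.5, h=2.0)
print(up_day, down_day)
```

In this toy run the sustained increase is flagged on its third day, while the sustained decrease never accumulates because every component is held at zero.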
Empirical Approach: Simulate an Attack [Figure: number of cases (0 to 15) by day (0 to 40)] • Seed: 1, 2, … 10 extra cases over 10 days • Cycle through data, starting attack on each day • Aggregate results to estimate performance
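The simulation loop might look like the sketch below: add the seed to a baseline series at each possible start day, run a detector, and record the share of attacks flagged within each number of days. The baseline counts, the simple threshold detector, and all names are hypothetical; the sketch also ignores false alarms raised before the attack begins, which a full evaluation would need to handle.

```python
from typing import Callable, List, Sequence

SEED = list(range(1, 11))  # 1, 2, ..., 10 extra cases over 10 days


def inject_attack(
    baseline: Sequence[float], start: int, seed: Sequence[int] = SEED, scale: int = 1
) -> List[float]:
    """Return a copy of baseline with scale * seed added from day `start` onward."""
    out = list(baseline)
    for offset, extra in enumerate(seed):
        day = start + offset
        if day < len(out):
            out[day] += scale * extra
    return out


def detection_curve(
    baseline: Sequence[float],
    flagger: Callable[[Sequence[float]], List[bool]],
    seed: Sequence[int] = SEED,
) -> List[float]:
    """Fraction of simulated attacks flagged on or before each attack day."""
    n_starts = len(baseline) - len(seed) + 1
    detected_by = [0] * len(seed)
    for start in range(n_starts):
        flags = flagger(inject_attack(baseline, start, seed))
        for delay in range(len(seed)):
            if flags[start + delay]:
                # Count this attack as detected for this delay and all later ones.
                for d in range(delay, len(seed)):
                    detected_by[d] += 1
                break
    return [count / n_starts for count in detected_by]


def threshold_flagger(series: Sequence[float]) -> List[bool]:
    """Hypothetical detector: flag any day with more than 8 cases."""
    return [value > 8 for value in series]


# Made-up daily counts standing in for one hospital/syndrome series.
baseline = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3, 5, 4, 3, 2, 4]
curve = detection_curve(baseline, threshold_flagger)
print(curve)
```

Each entry of `curve` estimates the probability of detection by that day of the attack, which is the kind of aggregate the last bullet refers to. The `scale` argument is an added convenience for multiplying the seed.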
Attack Scenarios • Scenario A: Add seed to all series • 7 hospitals x 4 chief complaints • Scenario B: Add seed to “unspecified infection” only across all hospitals • Scenario BA: Add 4 x seed to “unspecified infection” only (i.e., seed: 4, 8, … 40) • Scenario C: Add seed to all syndromes in one mid-sized hospital • Scenario CA: Add 7 x seed to one mid-sized hospital only (i.e., seed: 7, 14, … 70)
General Performance Summary • No method best in all situations, but CUSUM methods generally better • Multivariate CUSUM slightly preferable to simultaneous univariate CUSUMs overall • Univariate CUSUMs slightly preferable to multivariate CUSUM in Scenario B (most likely?) • Simultaneous Shewharts and T² generally worse • Other ad hoc methods much less effective • Pooling p-values • Combining data • CUSUM w/ EWMA-estimated mean
MCUSUM Performance Summary • MCUSUM slightly better or equivalent in all but Scenario B (and only slightly worse there) • Exceeds 50% sensitivity on • Day 1 in Scenarios BA and CA • Day 3 in Scenario A • Day 5 in Scenarios B and C • In “flu season” • False positive rate ~50% for simultaneous univariate CUSUMs, ~15% for MCUSUM • Both reach 80% probability of detection at roughly the same time, except for Scenario C
Conclusions • Modified Crosier MCUSUM method looks promising for monitoring general, overall trends in a larger area • Seems to be the most robust to a variety of shifts • Less affected by false alarms in flu season • Simulation assessment consistent with empirical evaluation • Syndromic surveillance using MCUSUM has reasonable probability of detecting outbreak within 3-5 days of start of outbreak
Further Research • Detection algorithms • Different values of k, other parameters • Trading off Type I and Type II error • How large of an outbreak is detectable • Realistic disease patterns • Further analyses of DC ER data • Apply MCUSUM • Monitor flu in 2005
Selected References • Chang, J.T., and R.D. Fricker, Jr., “Detecting When a Monotonically Increasing Mean has Crossed a Threshold,” Journal of Quality Technology, 31, 1999. • Crosier, R.B., “Multivariate Generalizations of Cumulative Sum Quality Control Schemes,” Technometrics, 30, 1988. • Healy, J.D., “A Note on Multivariate CUSUM Procedures,” Technometrics, 29, 1987. • Hotelling, H., “Multivariate Quality Control, Illustrated by the Air Testing of Sample Bombsights,” Techniques of Statistical Analysis, eds. C. Eisenhart, M.W. Hastay, and W.A. Wallis, McGraw-Hill Book Co., pp. 111-184, 1947. • Kulldorff, M., “Prospective Time Periodic Geographical Disease Surveillance Using a Scan Statistic,” Journal of the Royal Statistical Society, A164, 2001. • Page, E.S., “Continuous Inspection Schemes,” Biometrika, 41, 1954. • Pignatiello, Jr., J.J., and G.C. Runger, “Comparisons of Multivariate CUSUM Charts,” Journal of Quality Technology, 22, 1990. • Shewhart, W.A., Economic Control of Quality of Manufactured Product, New York, NY: D. Van Nostrand Company, Inc., 1931.