
Statistical Methods for Syndromic Surveillance: An Empirical Assessment

This research paper assesses the performance of statistical methods for syndromic surveillance using real data from emergency rooms. It compares univariate and multivariate detection algorithms and evaluates their effectiveness in detecting outbreaks.




Presentation Transcript


  1. Statistical Methods for Syndromic Surveillance: An Empirical Assessment. Ron Fricker, Mike Stoto, Arvind Jain. November 20, 2004

  2. The Problem: Assessing Statistical Methods for Syndromic Surveillance • Syndromic surveillance: “…surveillance using health-related data that precede diagnosis and signal a sufficient probability of a case or an outbreak to warrant further public health response.” [1] • Research goal: develop and characterize performance of univariate and multivariate detection algorithms • Many health departments and CDC already have some type of syndromic surveillance system in place • CDC’s BioSense program active as of April 2004 [2] [1] CDC (www.cdc.gov/epo/dphsi/syndromic.htm) [2] www.cdc.gov/phin/Webinars/BioSense.htm

  3. Idea of Syndromic Surveillance Source: Michael Wagner, University of Pittsburgh

  4. An Empirical Assessment Using Real Data • Three years of emergency room data (980 days) • Data from 7 DC hospital ERs from 9-11-01 to present • Focus on 4 chief complaints recorded by ER: -- Gastro-intestinal -- Rash -- Respiratory -- Unspecified infection • Restrict data to non-flu season (527 days) • Data generally show low to no correlation: -- Syndromes within hospitals: 0.0 < r < 0.28 (non-winter), 0.0 < r < 0.29 (winter) -- By syndrome between hospitals: 0.0 < r < 0.29 (non-winter), 0.0 < r < 0.52 (winter)

  5. Hard to see any outbreaks through the noise

  6. Statistical Process Control (SPC) for Syndromic Surveillance • In a manufacturing setting, SPC is used to monitor production and test for a change in the level of quality: -- Have the parameter(s) of the quality characteristic shifted? • In syndromic surveillance, the goal is to monitor whether a pathogen has been released: -- Has the distribution of leading indicators shifted in some meaningful (i.e., worrisome) way? • For syndromic surveillance, SPC method(s) must be robust because the manifestation of the threat may vary

  7. Basic Univariate Statistical Process Control (SPC) Methods • Shewhart (1931): -- Stop when observation exceeds pre-defined threshold -- Better for detecting large shifts/changes • CUSUM (Page, 1954): -- Stop when cumulative sum of observations exceeds threshold -- Recursion (standard one-sided form) is S_i = max(0, S_{i-1} + X_i - k), with S_0 = 0 -- Better for detecting small shifts/changes
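
The two univariate stopping rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the standard one-sided CUSUM recursion S_i = max(0, S_{i-1} + X_i - k) with reference value k and threshold h, and the function names, daily counts, and parameter values are hypothetical.

```python
def shewhart_first_flag(xs, threshold):
    """Return the 0-based index of the first observation above threshold, or None."""
    for i, x in enumerate(xs):
        if x > threshold:
            return i
    return None


def cusum_first_flag(xs, k, h):
    """One-sided CUSUM: S_i = max(0, S_{i-1} + x_i - k); flag when S_i > h."""
    s = 0.0
    for i, x in enumerate(xs):
        s = max(0.0, s + x - k)  # accumulate only the excess over k, never below 0
        if s > h:
            return i
    return None


# Hypothetical daily counts: steady near 5, then a small sustained rise to 7.
counts = [5, 4, 6, 5, 5, 7, 7, 7, 7, 7, 7]

print(shewhart_first_flag(counts, threshold=9))  # no single day exceeds 9
print(cusum_first_flag(counts, k=6, h=4))        # small excesses accumulate
```

With these made-up numbers the Shewhart rule never fires, while the CUSUM accumulates the small daily excess and eventually crosses h, which is the sense in which CUSUM is better suited to small shifts.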

  8. Picturing the Shewhart (figure: observed values (Xi) over time, with the threshold and the flag marked)

  9. Picturing the CUSUM (figure: observed values (Xi) and CUSUM (Si) over time, with the Shewhart threshold, the CUSUM threshold, and the flag marked)

  10. Multivariate Statistical Process Control (SPC) Methods • Hotelling’s T²: -- Stop when distance to observation exceeds threshold -- Like Shewhart, good at detecting large shifts • Crosier’s MCUSUM (1988): -- Cumulates vectors componentwise -- As with CUSUM, good at detecting small shifts • We modified both methods to look only for positive changes

  11. Picturing the Modified T² (in 2 dims) (figure: axes X and Y, with the flag marked)

  12. Picturing Modified MCUSUM (2 dims) (figure: panels (a), (b), (c) showing Si-1, Si*, Si, Ci, Yi, k, h, and the flag) (1) At time i, get new obs, add to Si-1 (giving Si*) (2) Calculate Si according to whether: (a) Ci<k (b) Ci>k (c) Need to bound by 0 (3) Test if Yi>h
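
The three steps on this slide can be sketched as follows. This is a hedged illustration based on Crosier's (1988) published MCUSUM, not the authors' code. It assumes Ci is the statistical distance of Si* from zero, that Si is reset to zero when Ci <= k and otherwise shrunk by the factor (1 - k/Ci), and that "bound by 0" means clipping each component at zero. The mean vector `mu`, inverse covariance `sigma_inv`, and the values of k and h are hypothetical inputs.

```python
import math


def quad_form(v, a_inv):
    """Return v' A^{-1} v for a vector v and an inverse covariance matrix a_inv."""
    n = len(v)
    return sum(v[r] * a_inv[r][c] * v[c] for r in range(n) for c in range(n))


def mcusum_first_flag(xs, mu, sigma_inv, k, h):
    """Modified Crosier-style MCUSUM; returns 0-based index of first flag, or None."""
    s = [0.0] * len(mu)
    for i, x in enumerate(xs):
        # (1) Add the new (centered) observation to S_{i-1} to get S_i*.
        s_star = [sj + xj - mj for sj, xj, mj in zip(s, x, mu)]
        c = math.sqrt(quad_form(s_star, sigma_inv))
        # (2) Reset if C_i <= k, otherwise shrink toward 0 and clip at 0
        #     (the clipping is this sketch's reading of "bound by 0").
        if c <= k:
            s = [0.0] * len(mu)
        else:
            shrink = 1.0 - k / c
            s = [max(0.0, v * shrink) for v in s_star]
        # (3) Flag when the length of S_i exceeds h.
        y = math.sqrt(quad_form(s, sigma_inv))
        if y > h:
            return i
    return None


# Hypothetical two-series example with identity covariance.
identity = [[1.0, 0.0], [0.0, 1.0]]
data = [[5, 5], [5, 5], [7, 7], [7, 7], [7, 7]]
print(mcusum_first_flag(data, mu=[5, 5], sigma_inv=identity, k=1.0, h=3.0))
```

Clipping at zero means that decreases in the counts never accumulate, which matches the stated intent of looking only for positive changes.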

  13. Empirical Approach: Simulate an Attack (figure: number of cases by day) • Seed: 1, 2, … 10 extra cases over 10 days • Cycle through data, starting attack on each day • Aggregate results to estimate performance
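
The seeding procedure described above can be sketched like this. It is a simplified illustration under stated assumptions: the detector is restarted at the first day of each simulated attack and sees only the 10-day attack window, and the helper names, the `scale` parameter, and the simple threshold detector are hypothetical stand-ins rather than anything taken from the study.

```python
def seed_attack(series, start, scale=1, length=10):
    """Return a copy of series with scale*1, scale*2, ... extra cases from day start."""
    out = list(series)
    for d in range(length):
        if start + d < len(out):
            out[start + d] += scale * (d + 1)
    return out


def threshold_detector(window, threshold=8):
    """Stand-in detector: 0-based index of the first count above threshold, or None."""
    for i, x in enumerate(window):
        if x > threshold:
            return i
    return None


def detection_by_day(series, detector, length=10):
    """Fraction of attack start days that are flagged on or before each attack day."""
    hits = [0] * length
    starts = range(len(series) - length + 1)
    for start in starts:  # cycle through the data, starting the attack on each day
        attacked = seed_attack(series, start, length=length)
        flag = detector(attacked[start:start + length])
        if flag is not None:
            for d in range(flag, length):
                hits[d] += 1
    return [h / len(starts) for h in hits]


# Hypothetical flat baseline of 5 cases per day for 20 days.
baseline = [5] * 20
print(detection_by_day(baseline, threshold_detector))
```

Passing a larger `scale` to `seed_attack` would give the same ramp multiplied by a constant, for example 4, 8, … 40.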

  14. Attack Scenarios • Scenario A: Add seed to all series • 7 hospitals x 4 chief complaints • Scenario B: Add seed to “unspecified infection” only across all hospitals • Scenario BA: Add 4 x seed to “unspecified infection” only (i.e., seed: 4, 8, … 40) • Scenario C: Add seed to all syndromes in one mid-sized hospital • Scenario CA: Add 7 x seed to one mid-sized hospital only (i.e., seed: 7, 14, … 70)

  15. Specific Results

  16. Specific Results

  17. Specific Results

  18. Specific Results

  19. Specific Results

  20. General Performance Summary • No method best in all situations, but CUSUM methods generally better • Multivariate CUSUM slightly preferable to simultaneous univariate CUSUMs overall • Univariate CUSUMs slightly preferable to multivariate CUSUM in Scenario B (most likely?) • Simultaneous Shewharts and T² generally worse • Other ad hoc methods much less effective: -- Pooling p-values -- Combining data -- CUSUM w/ EWMA-estimated mean

  21. MCUSUM Performance Summary • MCUSUM slightly better or equivalent in all but Scenario B (and only slightly worse there) • Exceeds 50% sensitivity on • Day 1 in Scenarios BA and CA • Day 3 in Scenario A • Day 5 in Scenarios B and C • In “flu season” • False positive rate ~50% for simultaneous univariate CUSUMs, ~15% for MCUSUM • Both reach 80% probability of detection at roughly the same time, except for Scenario C

  22. Conclusions • Modified Crosier MCUSUM method looks promising for monitoring general, overall trends in a larger area • Seems to be the most robust to a variety of shifts • Less affected by false alarms in flu season • Simulation assessment consistent with empirical evaluation • Syndromic surveillance using MCUSUM has reasonable probability of detecting outbreak within 3-5 days of start of outbreak

  23. Further Research • Detection algorithms • Different values of k, other parameters • Trading off Type I and Type II error • How large of an outbreak is detectable • Realistic disease patterns • Further analyses of DC ER data • Apply MCUSUM • Monitor flu in 2005

  24. Selected References • Chang, J.T., and R.D. Fricker, Jr., “Detecting When a Monotonically Increasing Mean has Crossed a Threshold,” Journal of Quality Technology, 31, 1999. • Crosier, R.B., “Multivariate Generalizations of Cumulative Sum Quality Control Schemes,” Technometrics, 30, 1988. • Healy, J.D., “A Note on Multivariate CUSUM Procedures,” Technometrics, 29, 1987. • Hotelling, H., “Multivariate Quality Control, Illustrated by the Air Testing of Sample Bombsights,” Techniques of Statistical Analysis, eds. C. Eisenhart, M.W. Hastay, and W.A. Wallis, McGraw-Hill Book Co., pp. 111-184, 1947. • Kulldorff, M., “Prospective Time-periodic Geographical Disease Surveillance Using a Scan Statistic,” J Roy Stat Soc., A164, 2001. • Page, E.S., “Continuous Inspection Schemes,” Biometrika, 41, 1954. • Pignatiello, Jr., J.J., and G.C. Runger, “Comparisons of Multivariate CUSUM Charts,” Journal of Quality Technology, 22, 1990. • Shewhart, W.A., Economic Control of Quality of Manufactured Product, New York, NY: D. Van Nostrand Company, Inc., 1931.
