Statistical Methods for Alerting Algorithms in Biosurveillance. Howard S. Burkom The Johns Hopkins University Applied Physics Laboratory National Security Technology Department Washington Statistical Society Seminar February 3, 2006 National Center for Health Statistics Hyattsville, MD.
ESSENCE Biosurveillance Systems • ESSENCE: An Electronic Surveillance System for the Early Notification of Community-based Epidemics • Monitoring health care data from ~800 military treatment facilities since Sept. 2001 • Evaluating data sources • Civilian physician visits • OTC pharmacy sales • Prescription sales • Nurse hotline/EMS data • Absentee rate data • Developing & implementing alerting algorithms
Outline of Talk • Prospective Syndromic Surveillance: introduction, challenges • Algorithm Evaluation Approaches • Statistical Quality Control in Health Surveillance • Data Modeling and Process Control • Regression Modeling Approach • Generalized Exponential Smoothing • Comparison Study • Summary & Research Directions
Required Disciplines: Medical/Epi • Medical/Epidemiological • filtering/classifying clinical records => syndromes • interpretation/response to system output • coding/chief complaint interpretation
Required Disciplines: Informatics • Information Technology • surveillance system architecture • data ingestion/cleaning • interface between health monitors and system
Required Disciplines: Analytics • Analytical • Statistical hypothesis tests • Data mining/automated learning • Adaptation of methodology to background data behavior
Essential Task Interaction in Volatile Data Background • Medical/Epidemiological • filtering/classifying clinical records => syndromes • interpretation/response to system output • coding/chief complaint interpretation • Information Technology • surveillance system architecture • data ingestion/cleaning • interface between health monitors and system • Analytical • Statistical hypothesis tests • Data mining/automated learning • Adaptation of methodology to background data behavior
The Multivariate Temporal Surveillance Problem Varying Nature of the Data: • Scale, trend, day-of-week, seasonal behavior depending on grouping: Multivariate Nature of Problem: • Many locations • Multiple syndromes • Stratification by age, gender, other covariates Surveillance Challenges: • Defining anomalous behavior(s) • Hypothesis tests--both appropriate and timely • Avoiding excessive alerting due to multiple testing • Correlation among data streams • Varying noise backgrounds • Communication with/among users at different levels • Data reduction and visualization
Data issues affecting monitoring Most suitable for modeling without data-specific information • Statistical properties • Scale and random dispersion • Periodic effects • Day-of-week effects, seasonality • Delayed (often variably) availability in monitoring system • Trends: long/short term: many causes, incl. changes in: • Population distribution or demographic composition • Data provider participation • Consumer health care behavior • Coding or billing practices • Prolonged data drop-outs, sometimes with catch-ups • Outliers unrelated to infectious disease levels • Often due to problems in data chain • Inclement weather • Media reports (example: the “Clinton effect”)
Rash Syndrome Grouping of Diagnosis Codes www.bt.cdc.gov/surveillance/syndromedef/word/syndromedefinitions.doc
Chief Complaint Query Simulated Data
Dynamic Detection Simulated Data
Threshold Example with Detection Statistic Plot Injected Cases Presumed Attributable to Outbreak Event
Comparing Alerting Algorithms Criteria: • Sensitivity • Probability of detecting an outbreak signal • Depends on effect of outbreak in data • Specificity (1 – false alert rate) • Probability(no alert | no outbreak) • May be difficult to prove no outbreak exists • Timeliness • Once the effects of an outbreak appear in the data, how soon is an alert expected?
Modeling the Signal as Epicurve of Primary Cases • Need “data epicurve”: time series of attributable counts above background • Plausible to assume it is proportional to the epidemic curve of infected cases • Sartwell lognormal model gives idealized shape for a given disease type Sartwell, PE. The distribution of incubation periods of infectious disease. Am J Hyg 1950; 51:310-318
Signal Modeling: Realizations of Smallpox Epicurve Each symptomatic case a random draw “maximum likelihood” epicurve
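The random-draw idea on this slide can be sketched as follows: each symptomatic case gets an incubation period drawn from a lognormal distribution, and the draws are binned into a daily "data epicurve". This is a minimal sketch; the function name and the parameter values (median of 12 days, dispersion factor of 1.3, 100 cases) are illustrative assumptions, not figures taken from the talk.

```python
import numpy as np

def sample_epicurve(n_cases, median_days, dispersion, n_days, seed=0):
    """Draw lognormal incubation periods and bin them into daily case counts."""
    rng = np.random.default_rng(seed)
    # Lognormal parameterized by its median exp(mu) and dispersion factor exp(sigma)
    mu, sigma = np.log(median_days), np.log(dispersion)
    onset = rng.lognormal(mean=mu, sigma=sigma, size=n_cases)
    days = np.floor(onset).astype(int)
    days = days[days < n_days]          # drop cases that fall beyond the window
    return np.bincount(days, minlength=n_days)

# One stochastic realization; the values below are hypothetical, not disease-specific
curve = sample_epicurve(n_cases=100, median_days=12.0, dispersion=1.3, n_days=30)
```

Changing `seed` gives different realizations of the same idealized shape, which is what makes repeated injection into background data possible.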
Assessing Algorithm Performance • Summary processing: measure dependence of sensitivity or timeliness on false alert rate (ROC or AMOC curves, or key sample values at practical rates) • Sensitivity/specificity as a function of threshold: Receiver Operating Characteristic (ROC) curve, plotting detection probability (sensitivity) against false alert rate (1 – specificity) as the threshold varies • Timeliness/specificity as a function of threshold: Activity Monitor Operating Characteristic (AMOC) curve, plotting a timeliness score (e.g. mean or median time to alert) against false alert rate (1 – specificity) as the threshold varies
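As a rough illustration of how ROC and AMOC points might be computed, the sketch below sweeps a threshold over detection statistics from outbreak-free days and from simulated outbreak runs. The input layout (one row per injected outbreak, one column per day since the signal began) and the function name are assumptions made for this example.

```python
import numpy as np

def roc_amoc(stat_background, stat_outbreaks, thresholds):
    """Return (false_alert_rate, detection_prob, mean_delay_days) per threshold.

    stat_background: 1-D array of daily statistics with no outbreak present.
    stat_outbreaks:  2-D array, one row per simulated outbreak,
                     columns indexed by days since the signal started.
    """
    rows = []
    for h in thresholds:
        far = float(np.mean(stat_background > h))     # 1 - specificity
        hits = stat_outbreaks > h
        detected = hits.any(axis=1)
        pd = float(detected.mean())                   # sensitivity
        # First day each detected outbreak crosses the threshold
        delays = hits.argmax(axis=1)[detected]
        delay = float(delays.mean()) if detected.any() else float("nan")
        rows.append((far, pd, delay))
    return rows

background = np.array([0.0, 1.0, 2.0, 3.0])
outbreaks = np.array([[0.0, 2.0, 5.0],
                      [1.0, 1.0, 1.0]])
points = roc_amoc(background, outbreaks, thresholds=[1.5, 4.0])
```

Plotting detection probability against false alert rate gives the ROC view; plotting mean delay against false alert rate gives the AMOC view.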
Quality Control Charts and Health Surveillance • Benneyan JC, Statistical Quality Control Methods in Infection Control and Hospital Epidemiology, Infection Control and Hospital Epidemiology, Vol. 19, (3)194-214 • Part I: Introduction and Basic Theory • Part II: Chart use, statistical properties, and research issues • 1998 survey article gives 135 references • Many applications: monitoring surgical wound infections, treatment effectiveness, general nosocomial infection rate, … • Monitoring process for “special causes” of variation • Organize data into fixed-size groups of observations • Look for out-of-control conditions by monitoring mean, standard deviation,… • General 2-phase procedure: • Phase I: Determine mean m, standard deviation s of process from historical “in-control” data; control limits often set to m ± 3s • Phase II: Apply control limits prospectively to monitor process graphically
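The two-phase procedure can be sketched in a few lines: Phase I estimates m and s from historical in-control data and sets limits at m ± 3s, and Phase II applies those fixed limits to new observations. The function names and the sample values are hypothetical.

```python
import numpy as np

def phase1_limits(in_control, k=3.0):
    """Phase I: estimate control limits m +/- k*s from in-control history."""
    m = float(np.mean(in_control))
    s = float(np.std(in_control, ddof=1))   # sample standard deviation
    return m - k * s, m + k * s

def phase2_flags(new_values, limits):
    """Phase II: flag prospective observations outside the fixed limits."""
    lo, hi = limits
    x = np.asarray(new_values, dtype=float)
    return (x < lo) | (x > hi)

history = [10, 12, 11, 9, 10, 11, 10, 9]     # illustrative in-control data
limits = phase1_limits(history)
flags = phase2_flags([10, 14, 7], limits)
```

Because the limits are frozen after Phase I, this classical form does not adapt when the background changes, which is the concern raised in the slides that follow.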
Adaptation of Traditional Process Control to Early Outbreak Detection On adapting statistical quality control to biosurveillance: Woodall, W.H. (2000). “Controversies and Contradictions in Statistical Process Control”, Journal of Quality Technology 32, pp. 341-378. • “Researchers rarely…put their narrow contributions into the context of an overall SPC strategy. There is a role for theory, but theory is not the primary ingredient in most successful applications.” Woodall, W.H. (2006, in press). “The Use of Control Charts in Health Care Monitoring and Public Health Surveillance” • “In industrial quality control it has been beneficial to carefully distinguish between the Phase I analysis of historical data and the Phase II monitoring stage” • “It is recommended that a clearer distinction be made in health-related SPC between Phase I and Phase II…” Does infectious disease surveillance require an “ongoing Phase I” strategy to maintain robust performance?
Statistical Process Control in Advanced Disease Surveillance Key application issues: • Background data characteristics change over time • Hospital/clinic visits, consumer purchases not governed by physical science, engineering • But monitoring requires robust performance: algorithms must be adaptive • Target signal: effect of infectious disease outbreak • Transient signal, not a mean shift • May be sudden or gradual
The Challenge of Data Modeling for Daily Health Surveillance • Conventional scientific application of regression • Do covariates such as age, gender affect treatment? Does treatment success differ among sites if we control for covariates? • Studies use static data sets with exploratory analysis • In surveillance, we model to predict data levels in the absence of the signal of interest • Need reliable estimates of expected levels to recognize abnormal levels • Data sets are dynamic: covariate relationships change
The Challenge of Data Modeling for Daily Health Surveillance, cont’d Modeling to generate expected data levels • Predictive accuracy matters, not just strength of association or overall goodness-of-fit • For a gradual outbreak, recent data can “train” model to predict abnormal levels Alerting decisions based on model residuals Residual = observed value – modeled value Conventional approach: • assume residuals fit a known distribution (normal, Poisson,…) • hypothesis test for membership in that distribution For surveillance, can also apply control-chart methods to residuals
Monitoring Data Series with Systematic Features • Problem: How to account for short-term trends, cyclic data features in alerting decisions? • Approaches • Data Modeling • Regression: GLM, ARIMA, others & combinations • Signal Processing • LMS filters and wavelets • Exponential Smoothing: generalizes EWMA
Example: OTC Purchasing Behavior Influenced by Many Factors Loglinear Regression Example: Tracking Daily Sales of Flu Remedies $\log(Y) = b_0 + \sum_{i=1}^{6} b_i d_i + b_7 t + b_8 h_1 + b_9 h_2 + b_{10} w + b_{11} p + e$, where • $Y$ = daily count of anti-flu sales • $d_1,\dots,d_6$ = day of week (6 indicators) • $t$ = linear trend • $h_1, h_2$ = harmonic (seasonal) terms • $w$ = weather (temp.) • $p$ = sales promotion (indicator) • $e$ = deviation (Poisson dist.)
Recent Surveillance Method Based on Loglinear Regression Modeling emergency department visit patterns for infectious disease complaints: results and application to disease surveillance. Judith C Brillman, Tom Burr, David Forslund, Edward Joyce, Rick Picard and Edith Umland. BMC Medical Informatics and Decision Making 2005, 5:4, pp 1-14 http://www.biomedcentral.com/content/pdf/1472-6947-5-4.pdf Modeling visit counts on day d: Let $S(d) = \log(\mathrm{visits}(d) + 1)$, the “started log” $S(d) = \left[\sum_{i=1}^{7} c_i I_i(d)\right] + \left[c_8 + c_9 d\right] + \left[c_{10}\cos(kd) + c_{11}\sin(kd)\right]$, $k = 2\pi/365.25$ • $c_1$–$c_7$: day-of-week effects • $c_9$: long-term trend • $c_{10}$–$c_{11}$: seasonal harmonic terms Training period: 3036 days ~ 8.33 years Test period: 1 year
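A minimal least-squares sketch of the started-log model is shown below. It is not the authors' code: the helper names are hypothetical, day of week is assumed to be the day index modulo 7, and the separate constant c8 is dropped because seven day-of-week indicators already span an intercept.

```python
import numpy as np

def design_matrix(days):
    """Columns: 7 day-of-week indicators, linear trend, annual cosine and sine."""
    days = np.asarray(days)
    d = days.astype(float)
    k = 2.0 * np.pi / 365.25
    dow = np.eye(7)[days % 7]            # I_i(d); assumes integer day indices
    return np.column_stack([dow, d, np.cos(k * d), np.sin(k * d)])

def fit_started_log(days, visits):
    """Ordinary least squares fit of S(d) = log(visits + 1)."""
    S = np.log(np.asarray(visits, dtype=float) + 1.0)
    coef, *_ = np.linalg.lstsq(design_matrix(days), S, rcond=None)
    return coef

def predict_visits(coef, days):
    """Invert the started log to get expected visit counts."""
    return np.exp(design_matrix(days) @ coef) - 1.0
```

Alerting would then compare the observed count on a test day with `predict_visits` for that day, using the training residuals to judge how large a gap is unusual.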
EWMA Monitoring • Exponentially Weighted Moving Average • Average with most weight on recent $X_k$: $S_k = w X_k + (1-w) S_{k-1}$, where $0 < w < 1$ • Test statistic: $S_k$ compared to expectation from sliding baseline • Basic idea: monitor $(S_k - m_k) / s_k$ • Added sensitivity for gradual events • Larger w means less smoothing
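The monitoring idea can be sketched as below, taking w as the weight on the newest observation so that a larger w means less smoothing. The baseline length (28 days), the 2-day gap before the test day, and the use of the steady-state factor sqrt(w / (2 - w)) to scale the baseline standard deviation are assumptions for this example, not settings from the talk.

```python
import numpy as np

def ewma_alert_stat(x, w=0.4, baseline=28, guard=2):
    """Standardized EWMA: (S_k - m_k) / s_k with a sliding baseline."""
    x = np.asarray(x, dtype=float)
    S = np.empty_like(x)
    S[0] = x[0]
    for k in range(1, len(x)):
        S[k] = w * x[k] + (1 - w) * S[k - 1]   # weight w on the newest value
    z = np.full(len(x), np.nan)
    scale = np.sqrt(w / (2 - w))               # steady-state sd ratio of S to X
    for k in range(baseline + guard, len(x)):
        base = x[k - guard - baseline : k - guard]
        m, s = base.mean(), base.std(ddof=1)
        if s > 0:
            z[k] = (S[k] - m) / (s * scale)
    return z

series = np.array([10.0, 12.0] * 20 + [30.0])  # flat background, spike on last day
z = ewma_alert_stat(series)
```

The first `baseline + guard` entries stay undefined because no full baseline exists yet; an alert rule would compare later values of `z` with a chosen threshold.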
EWMA Concept & Smoothing Constant Brown, R.G. and Meyer, R.F. (1961), "The Fundamental Theorem of Exponential Smoothing," Operations Research, 9, 673-685. Exponential smoothing represents “an elementary model of how a person learns”: $S_k = S_{k-1} + w (X_k - S_{k-1})$, where $0 < w < 1$ • Equivalently, for the smoothed value $S_k$: $S_k = w X_k + (1-w) S_{k-1}$ • The variance of $S_k$ is $\sigma_S^2 = [w / (2 - w)]\, \sigma_X^2$ • So a smaller w is preferred because it gives a more stable $S_k$; values between 0.1 and 0.3 are often used • But Chatfield: changes in global behavior will result in a larger optimal w
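The variance factor w / (2 − w) follows from a short calculation, sketched here under the added assumptions that the $X_k$ are independent with common variance $\sigma_X^2$, that w is the weight on the newest observation, and that the recursion has run long enough to reach steady state:

```latex
\begin{align*}
S_k &= w X_k + (1-w) S_{k-1} = w \sum_{j=0}^{\infty} (1-w)^j X_{k-j}, \\
\operatorname{Var}(S_k) &= w^2 \sigma_X^2 \sum_{j=0}^{\infty} (1-w)^{2j}
  = \frac{w^2 \sigma_X^2}{1 - (1-w)^2}
  = \frac{w}{2 - w}\, \sigma_X^2 .
\end{align*}
```

For example, w = 0.2 gives a factor of 0.2 / 1.8 ≈ 0.11, so the smoothed series has roughly one ninth of the variance of the raw series.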
Generalized Exponential Smoothing Holt-Winters Method: modeling level, trend, and seasonality Forecast Function: [equation not captured in this transcript] Source: http://www.statistics.gov.uk/iosmethodology/downloads/Annex_B_The_Holt-Winters_forecasting_method.pdf where: • mj = level at time j • bj = trend at time j • cj = periodic multiplier at time j • s = periodic interval • k = number of steps ahead and mj, bj, cj are updated by exponential smoothing
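The forecast equation did not survive in this transcript. The standard multiplicative Holt-Winters forecast, written with the slide's symbols, is given below as a reconstruction; it is assumed rather than confirmed to match what the slide displayed, and the hat notation for the forecast is introduced here.

```latex
\[
\hat{y}_{j+k \mid j} = \left( m_j + k\, b_j \right) c_{j+k-s},
\qquad 1 \le k \le s .
\]
```

The level is extrapolated along the current trend for k steps and then scaled by the most recent periodic multiplier available for the target position in the cycle.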
Holt-Winters Updating Equations (multiplicative method) • Level at time t • Slope at time t • Periodic multiplier at time t • Initial values m0, b0, c0, …, cs-1 should be calculated from available data
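The updating equations themselves are missing from this transcript. The sketch below uses the textbook multiplicative Holt-Winters recursions, with smoothing parameters named alpha, beta and gamma (names assumed here) and a simple initialization from the first two periods (also an assumption). It needs positive data and at least 2s observations.

```python
import numpy as np

def holt_winters_mult(y, s, alpha, beta, gamma):
    """One-step-ahead forecasts from multiplicative Holt-Winters smoothing."""
    y = np.asarray(y, dtype=float)
    # Crude initial values from the first two periods (an assumption)
    m = y[:s].mean()                                  # level m_0
    b = (y[s:2 * s].mean() - y[:s].mean()) / s        # slope b_0
    c = list(y[:s] / m)                               # multipliers c_0..c_{s-1}
    forecasts = np.full(len(y), np.nan)
    for t in range(s, len(y)):
        c_old = c[t - s]
        forecasts[t] = (m + b) * c_old                # forecast made at t-1
        m_new = alpha * y[t] / c_old + (1 - alpha) * (m + b)        # level
        b = beta * (m_new - m) + (1 - beta) * b                     # slope
        c.append(gamma * y[t] / m_new + (1 - gamma) * c_old)        # multiplier
        m = m_new
    return forecasts

pattern = np.array([1.0, 1.2, 0.8, 1.0, 1.1, 0.9, 1.0])   # weekly multipliers
y = 100.0 * np.tile(pattern, 6)
fc = holt_winters_mult(y, s=7, alpha=0.4, beta=0.1, gamma=0.2)
```

On a purely periodic series with no trend, as in this toy input, the forecasts reproduce the data; residuals `y - fc` on real data are what an alerting rule would examine.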
Forecasting Local Linearity: Automatic vs Nonautomatic Methods Chatfield, C. (1978), "The Holt-Winters Forecasting Procedure," Applied Statistics, 27, 264-279. Chatfield, C. and Yar, M. (1988), "Holt-Winters Forecasting: Some Practical Issues," The Statistician, 37, 129-140. • “Modern thinking favors local linearity rather than global linear regression in time…” • “Local linearity is also implicit in ARIMA modelling…” • Simple EWMA ~ ARIMA(0,1,1) • EWMA + trend ~ ARIMA(0,2,2) • Multiplicative Holt-Winters has no ARIMA equivalent • “Practical considerations rule out [Box-Jenkins] if there are insufficient observations or …expertise available” • “Box-Jenkins… requires the user to identify an appropriate… [ARIMA] model” • For a “fair” comparison of H-W to B-J, both should be automatic or both nonautomatic. • Assertion: The simplicity of H-W permits easier classification, requiring less historic data. • Can an automatic B-J give robust forecasting over a range of input series types?
Regression vs Holt-Winters Ongoing study with Galit Shmueli, U. of MD, and Sean Murphy, JHU/APL • 30 time series, 700 days’ data • 5 cities • 3 data types • 2 syndromes • Respiratory: seasonal & day-of-week behavior • Gastrointestinal: day-of-week effects
Temporal Aggregation for Adaptive Alerting Data stream(s) to monitor in time, divided into three consecutive intervals: • Baseline interval: used to get some estimate of normal data behavior • Mean, variance • Regression coefficients • Expected covariate distribution (spatial, age category, % of claims/syndrome) • Guardband: avoids contamination of baseline with outbreak signal • Test interval: counts to be tested for anomaly • Nominally 1 day • Longer to reduce noise, test for epicurve shape • Will shorten as data acquisition improves
Candidate Methods 1. Global loglinear regression of Brillman et al. 2. Holt-Winters exponential smoothing • Fixed sets of smoothing parameters for data: • with both day-of-week & seasonal behavior • with only day-of-week behavior 3. Adaptive regression: Log(Y) = b0 + b1-6 d + b7 t + b8 hol + b9 posthol + e • 56-day baseline, 2-day guardband • b1-6 = day-of-week indicator coefficients • b7 = centered ramp coefficient • b8 = coefficient for holiday indicator • b9 = coefficient for post-holiday indicator • 1-day-ahead and 7-day-ahead predictions
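The adaptive regression (method 3) can be sketched as a model refit for each test day on the 56 days that end before a 2-day guardband. This sketch simplifies the slide's model: the holiday and post-holiday indicators are omitted because no calendar is available here, log(count + 1) is used to guard against zero counts, and day of week is assumed to be the day index modulo 7. The function name is hypothetical.

```python
import numpy as np

def adaptive_forecast(counts, t, baseline=56, guard=2):
    """Predict the count on day t from a regression fit to the sliding baseline."""
    idx = np.arange(t - guard - baseline, t - guard)   # baseline day indices

    def X(days):
        days = np.asarray(days)
        dow = np.eye(7)[days % 7][:, 1:]               # 6 day-of-week indicators
        ramp = (days - idx.mean()) / baseline          # centered ramp
        return np.column_stack([np.ones(len(days)), dow, ramp])

    y = np.log(np.asarray(counts, dtype=float)[idx] + 1.0)
    coef, *_ = np.linalg.lstsq(X(idx), y, rcond=None)
    return float(np.exp(X(np.array([t])) @ coef)[0] - 1.0)
```

Calling `adaptive_forecast(counts, t)` for successive values of t (each at least baseline + guard) gives forecasts whose coefficients track the recent background. The guardband is the same gap that a 1-day-ahead or 7-day-ahead comparison would vary.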
Respiratory Visit Count Data [Plot legend: Data, Holt-Winters, Regression, Adaptive Regr.] All series display this autocorrelation; good test for published regression model
GI Visit Count Data [Plot legend: Data, Holt-Winters, Regression, Adaptive Regr.]
Stratified Residual Comparisons [Plot legend: Data, Holt-Winters, Regression, Adaptive Regr.]
Mean Residual Comparison • When mean residuals favor regression, difference is small, and this difference results from largest residuals • If the holiday terms in adaptive regression are removed, H-W means uniformly smaller
Residual Autocorrelation Comparison [Plot legend: Data, Holt-Winters, Regression, Adaptive Regr.]
Summary • Data-adaptive methods are required for robust prospective surveillance • Appropriate algorithm selection requires an automated data classification methodology, often with little data history • Statistical expertise is required to manage practical issues to maintain required detection performance as datasets evolve: • stationarity (causes rooted in population behavior, evolving informatics, others) • late reporting • data dropouts
Research Directions • Classification of time series for automatic forecasting • Easier for Holt-Winters than for Box-Jenkins? • Determining reliable discriminants: • Autocorrelation coefficients • Simple means/medians • Goodness-of-fit measures • How little startup data history required? • Most effective alerting algorithm using residuals, given signal of interest • Apply control chart to residuals? • Need to detect both sudden, gradual signals • Detection performance constraints: • Minimum detection sensitivity • Maximum background alert rate