Three Stages of Evaluation for Syndromic Surveillance from Chief Complaint Classification
Wendy W Chapman, PhD; John N Dowling, MD, MS; Oleg Ivanov, MD, MPH, MS; Bob Olszewski, PhD; Michael M Wagner, MD, PhD
Introduction
Syndromic surveillance from chief complaints is becoming common
• Chief complaints are ubiquitous
• Chief complaints are available early
Can we detect outbreaks by monitoring chief complaints that are classified into syndromic categories?
Outline
• Describe a three-stage approach for answering that question
• Describe a body of research applying the three-stage approach
• Discuss what we have learned by applying the three-stage approach
Stages of Evaluation in Medical Technology Development
• Technical Accuracy: Does the system do what it is trained to do?
• Diagnostic Accuracy: Does the system diagnose patients correctly?
• Outcome Efficacy: Does the system improve outcomes?
After each of the first two stages: Yes, continue to the next stage; No, stop.
Three Stages of Evaluation in Syndromic Surveillance
• Technical Accuracy: Does the CC classifier accurately assign syndromic categories?
• Case Detection: Does the syndromic category represent the patient’s state?
• Outbreak Detection: Can we detect outbreaks from chief complaints?
After each of the first two stages: Yes, continue to the next stage; No, stop.
Stage 1: Technical Accuracy
Can we accurately classify a chief complaint string into a syndromic category?
• Determine whether the automated application performs its task
• Reference Standard
  • IS: Expert classification of chief complaint string
  • IS NOT: Patient’s actual syndrome
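[Diagram: each chief complaint is classified twice, once by the chief complaint classifier (classifier syndromic category) and once by the gold standard (gold standard syndromic category). The two categories are compared to measure CC classifier performance.]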
Stage 2: Case Classification
Does the syndromic classification from the chief complaint accurately represent the patient’s clinical state?
• Results reflect quality of
  • Chief complaint classifier
  • Chief complaint content
• Reference Standard is patient’s actual syndrome
• Bulk of our work has been in Case Classification
  • More informative than Technical Accuracy
  • Easier to evaluate than Outbreak Detection
[Diagram: the chief complaints of the test cases go to the chief complaint classifier (classifier syndromic classification). The medical records of the test cases go to the gold standard (gold standard syndromic classification), which is different than in the evaluation of Technical Accuracy. The two classifications are compared to measure diagnostic accuracy.]
Stage 3: Outbreak Detection
Can we detect outbreaks by monitoring chief complaint classifications?
• Outcome metrics
  • Accuracy
  • Timeliness
• Reference Standard is an outbreak
• Most difficult evaluation to perform: outbreaks are rare
[Diagram: chief complaints for the population go to the chief complaint classifier, which yields syndromic categories for chief complaints; syndromic outbreak detection algorithms turn these into a detected outbreak. Standard outbreak detection methods applied to the same population yield a defined outbreak. The two are compared to measure accuracy and timeliness of outbreak detection.]
Chief Complaint Classifier
CoCo: naïve Bayesian classifier
Example: CoCo output for the chief complaint “SOB/cough”
  Respiratory 0.97
  GI 0.00
  Constitutional 0.01
  Rash 0.00
  Hemorrhagic 0.00
  Botulinic 0.00
  Neurological 0.00
  Other 0.02
• CoCo is open source
  • openrods.sourceforge.net
• CoCo can be trained on any syndromic categories using manually classified chief complaints
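To make the idea of a naïve Bayesian chief complaint classifier concrete, here is a minimal sketch. It is not CoCo's implementation: the tokenizer, the add-one smoothing, and the tiny training set are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(chief_complaint):
    # Lowercase and split on anything that is not a letter or digit,
    # so "SOB/cough" becomes ["sob", "cough"]. (Assumed preprocessing.)
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in chief_complaint)
    return cleaned.split()


def train(labeled_complaints):
    """Count words per syndrome from (chief complaint, syndrome) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, syndrome in labeled_complaints:
        class_counts[syndrome] += 1
        for word in tokenize(text):
            word_counts[syndrome][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return a probability for every syndrome, summing to 1."""
    class_counts, word_counts, vocabulary = model
    total_complaints = sum(class_counts.values())
    log_scores = {}
    for syndrome, n_complaints in class_counts.items():
        total_words = sum(word_counts[syndrome].values())
        score = math.log(n_complaints / total_complaints)  # prior
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            count = word_counts[syndrome][word]
            # Add-one smoothing keeps unseen word/syndrome pairs above zero.
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        log_scores[syndrome] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    unnormalized = {s: math.exp(v - top) for s, v in log_scores.items()}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical, manually classified training examples.
TRAINING = [
    ("cough", "Respiratory"),
    ("SOB", "Respiratory"),
    ("shortness of breath", "Respiratory"),
    ("vomiting", "GI"),
    ("diarrhea", "GI"),
    ("fever", "Constitutional"),
    ("ankle injury", "Other"),
]

model = train(TRAINING)
probabilities = classify("SOB/cough", model)
best_syndrome = max(probabilities, key=probabilities.get)
```

With this toy training set the most probable category for “SOB/cough” is Respiratory, but the probabilities themselves will not match the CoCo output shown on the slide, which comes from a much larger training set.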
Results • Technical Accuracy • Case Classification • Outbreak Detection
Technical Accuracy | Case Classification | Outbreak Detection
How well can we classify chief complaints into syndromes?
• Text processing application: CoCo
• Test set: 28,990 chief complaints from Utah
• Reference standard: physician classifications of chief complaints
• Outcome measure: area under the ROC curve (AUC)
* Olszewski RT. Bayesian classification of triage diagnoses for the early detection of epidemics. In: Recent Advances in Artificial Intelligence: Proceedings of the Sixteenth International FLAIRS Conference; 2003:412-416.
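As a reminder of what the AUC outcome measure captures, the sketch below computes it as the probability that a randomly chosen positive complaint receives a higher classifier score than a randomly chosen negative one, with ties counted as one half. The function name and the example scores are assumptions for illustration, not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: classifier probabilities for one syndrome
    labels: 1 if the reference standard assigns that syndrome, else 0
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four chief complaints.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The pairwise loop is quadratic in the number of cases, which is fine for an illustration; a rank-based calculation would be the usual choice for a test set of tens of thousands of complaints.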
Results: General Syndromes
How well can we identify specific findings in chief complaints?
• Text processing application: keyword searches
• Reference standard: physician identification of findings in chief complaints
• Outcome measure: standard test statistics
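A keyword search over chief complaints can be sketched as follows. The keyword lists here are hypothetical; the slides do not give the terms that were actually searched for.

```python
import re

# Hypothetical keyword lists, one per finding.
KEYWORDS = {
    "fever": ["fever", "febrile"],
    "diarrhea": ["diarrhea"],
    "vomiting": ["vomiting", "vomit", "emesis"],
}


def find_findings(chief_complaint):
    """Return the set of findings whose keywords appear as whole words."""
    text = chief_complaint.lower()
    found = set()
    for finding, terms in KEYWORDS.items():
        for term in terms:
            # \b restricts the match to whole words.
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found.add(finding)
                break
    return found


example_findings = find_findings("Fever/vomiting x2 days")
```

A search like this only finds what the chief complaint literally says, so abbreviations or misspellings that are not in the keyword list are missed.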
Results: Specific Syndromes and Findings
How Well Can We Identify Syndromic Cases from Chief Complaints?
• Text processing application: CoCo
• Test set: 527,228 patients at University of Pittsburgh Medical Center (UPMC)
• Reference standard: primary ICD-9 discharge diagnosis; syndromic lists of ICD-9 codes
• Outcome measure: standard test statistics
Results: Syndromic Case Classification
Table columns: Syndrome | # pos. cases | Accuracy | Sensitivity | Specificity | PPV | NPV
Mean = 62%
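The standard test statistics named in the table columns can be computed from the four cells of a two-by-two table. The sketch below assumes one binary prediction and one binary reference label per patient for a single syndrome; the function name and the example data are hypothetical.

```python
def diagnostic_statistics(predicted, actual):
    """Accuracy, sensitivity, specificity, PPV and NPV for one syndrome.

    predicted: 1 if the classifier assigned the syndrome, else 0
    actual:    1 if the reference standard assigned the syndrome, else 0
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    fp = sum(1 for p, a in pairs if p and not a)      # false positives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as None.
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical classifications for eight patients.
example_stats = diagnostic_statistics(
    predicted=[1, 1, 0, 0, 0, 1, 0, 0],
    actual=[1, 0, 1, 0, 0, 1, 0, 0],
)
```

Sensitivity and specificity describe the classifier itself, while PPV and NPV also depend on how common the syndrome is among the test cases.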
How Well Can We Identify Cases of Specific Syndromes from Chief Complaints?
• Text processing application: CoCo + keyword searches
• Reference standard: physician review of ED report
• Outcome measure: standard test statistics
Results: Case Classification of Specific Syndromes
How Well Can We Identify Outbreaks from Chief Complaints?
• Text processing application: CoCo*
• Reference standard
  • Pediatric respiratory illness outbreaks (bronchiolitis, RSV)
  • Pediatric gastrointestinal illness outbreaks (Rotavirus)
• Outcome measure
  • Exponentially Weighted Moving Average (EWMA) detection algorithm for timeliness
  • Standard test statistics for accuracy
* Ivanov O, Gesteland PH, Hogan W, Mundorff MB, Wagner MM. Detection of pediatric respiratory and gastrointestinal outbreaks from free-text chief complaints. AMIA Annu Symp Proc. 2003:318-22.
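An EWMA detector applied to daily syndrome counts can be sketched as below. This is a generic EWMA control chart, not the configuration used in the cited study: the smoothing weight, the three-standard-deviation threshold, the baseline window, and the daily counts are all assumptions for illustration.

```python
import math
import statistics


def ewma_alarms(daily_counts, baseline_days, smoothing=0.4, threshold_sd=3.0):
    """Return the indices of days on which the EWMA exceeds its control limit.

    The first `baseline_days` counts (at least two) estimate the expected
    mean and standard deviation; monitoring starts on the following day.
    """
    baseline = daily_counts[:baseline_days]
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    # Asymptotic upper control limit of an EWMA chart.
    limit = mean + threshold_sd * sd * math.sqrt(smoothing / (2.0 - smoothing))

    smoothed = mean  # start the moving average at the baseline mean
    alarms = []
    for day in range(baseline_days, len(daily_counts)):
        smoothed = smoothing * daily_counts[day] + (1.0 - smoothing) * smoothed
        if smoothed > limit:
            alarms.append(day)
    return alarms


# Hypothetical daily counts of respiratory chief complaints.
counts = [10, 12, 9, 11, 10, 12, 11, 9, 10, 11, 10, 12, 18, 25, 30]
alarm_days = ewma_alarms(counts, baseline_days=10)
```

Timeliness can then be expressed as the number of days between the first alarm and the date on which the reference standard says the outbreak was detected.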
Respiratory Outbreaks (n = 3)
• Timeliness: 10.3 days earlier
• Sensitivity: 100%
• Specificity: 100%
GI Outbreaks (n = 3)
• Timeliness: 29 days earlier
• Sensitivity: 100%
• Specificity: 100%
Discussion
Are classified chief complaints good enough to use for syndromic surveillance?
Three stages of evaluation are important in answering this question
Technical Accuracy Evaluations
How well do classification methods perform?
• CoCo is quite good and very simple
• Keyword searches for fever, diarrhea, and vomiting have perfect accuracy
Case Classification Evaluations
How well can we identify syndromic cases from chief complaints?
• Sensitivity: 30% (botulinic) to 75% (hemorrhagic)
• PPV: 12% to 44%
• Outbreak must be larger to be detected
Which syndromes are best?
• Respiratory syndrome: sensitivity 63%
  • Febrile respiratory syndrome: sensitivity 22%
• GI syndrome: sensitivity 69%
  • Diarrhea: sensitivity 11%
  • Vomiting: sensitivity 15%
• Should not make syndromic definitions too narrow
Outbreak Detection Evaluations
Can we detect outbreaks with classified chief complaints?
• Can detect pediatric respiratory and GI outbreaks
• Chief complaints contain signal for outbreaks
• Chief complaint signal is earlier than that of ICD-9 diagnoses
Summary
• Our research over the last few years has aimed at answering the question of how well we can detect outbreaks from chief complaints
• Three stages of evaluation are important in understanding the answer
  • Optimize and focus effort
  • Make evaluation more feasible
  • Evaluate the question from different angles
  • Provide insight into related practical questions, such as: Which syndromes are best?
Thank You http://rods.health.pitt.edu/ • NLP
Future Work
Chief Complaints
• Improve CoCo
  • Synonym replacement, spell checking
  • Split into multiple chief complaints
• Different chief complaint classification methods
  • M+ outperforms CoCo in technical accuracy
  • Have not tested M+ in diagnostic accuracy yet
• Improve reference standard for Diagnostic Accuracy evaluations
  • 1,600 cases of 7 syndromes with physician judgment from ED notes
• Increase number of outcome efficacy studies
  • Real outbreaks
Other Clinical Data
• Chest radiograph reports
• ED reports
Technical Accuracy | Diagnostic Accuracy | Outcome Efficacy
How Well Can We Identify Influenza Cases from Chief Complaints?
• Text processing applications
  • CoCo (Respiratory)
  • CoCo (Constitutional)
• Reference standard
  • Primary ICD-9 discharge diagnoses for Influenza
• Outcome measure
  • Standard test statistics