Multivariate Time Series Analysis of Clinical and Physiological Data Patricia Ordóñez Rozo PhD Candidate University of Maryland, Baltimore County
Overview • Motivation • Hypothesis • Visualization Work • Related Work • Proposed Similarity Metric • Evaluation Plan
Motivation • Technical advances in medicine • 15 - 350 vital signs and lab results per patient (physiological and clinical data) • Need for personalized medicine • Individual differences among humans • Preset ‘general’ thresholds misleading • Methods of data analysis not multivariate
The Hypothesis We hypothesize that it may be possible to: • Create a visualization that will assist providers in examining multivariate patient data over time more accurately and efficiently than current tabular visualizations, • Identify hidden patterns in medical data that would signal significant medical events (such as organ failure) hours in advance, and
The Hypothesis (continued) • Develop a measure of similarity for multivariate time series representations of physiological and clinical electronic data allowing physicians to identify patients with similar events and/or phenotypes for the purpose of predicting patient outcomes.
Pilot Study • Asked 14 residents at St. Agnes Hospital to predict whether the 10 patients went into an episode of acute hypotension • Each used tables and visualization for five patients
Results • Accuracy with tables: 57.5% • Accuracy with visualization: 52.2% • PhysioNet Challenge 2009 (28 submissions): 13 had 100% accuracy, 9 had 80% accuracy, 5 had 60% accuracy, 1 had 20% accuracy
Publications on Visualization • Patricia Ordóñez, Marie desJardins, Michael Lombardi, Christoph U. Lehmann, Jim Fackler, An Animated Multivariate Visualization for Physiological and Clinical Data in the ICU in Proceedings of First ACM International Health Informatics Symposium (IHI), Arlington, VA, November 11-12, 2010, to appear. • Christoph U. Lehmann, Patricia Ordóñez, Jim Fackler, Kathryn Holmes Practical Visualization of Multivariate Time Series Data in a Neonatal ICU in Proceedings of Visual Analytics of Health Care (VAHC) Workshop at VisWeek 2010, Salt Lake City, UT, October 24, 2010, to appear. • Patricia Ordóñez, Marie desJardins, Carolyn Feltes, Christoph U. Lehmann, James Fackler, Visualizing Multivariate Time Series Data to Detect Specific Medical Conditions in Proceedings of AMIA (American Medical Informatics Association) 2008 Annual Symposium, 6:530-534(2008). Paper nominated for Student Paper Competition.
Finding Hidden Patterns • Develop a symbolic representation of multivariate time series based on SAX and BOP for univariate time series and other work on multivariate time series data • Create a similarity metric for the representation
Related Work • SAX by Jessica Lin and Eamonn Keogh at UC Riverside • BOP by Jessica Lin at George Mason University • Novel similarity metric for a multivariate time series representation based on wavelets by Mohammed Saeed and Roger Mark at the MIT Laboratory for Computational Physiology
Symbolic Aggregate ApproXimation (SAX) • First convert the time series to a Piecewise Aggregate Approximation (PAA) representation. • Then convert the PAA to SAX symbols. [Figure: a time series over time steps 0 to 120, its PAA segments, and the symbols a, b, c assigned from three regions of probability ⅓ each, giving the word baabccbc.] Thanks to Eamonn Keogh and Jessica Lin for use of slide
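The two steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes a three-letter alphabet with the standard normal cut into three equiprobable regions (breakpoints near ±0.43), a z-normalized input, and a series length that divides evenly by the number of segments. The function names `paa` and `sax` are hypothetical.

```python
import numpy as np

# Breakpoints that split a standard normal into 3 equiprobable regions
# (assumption: alphabet size 3, as in the slide's a/b/c example).
BREAKPOINTS_3 = np.array([-0.4307, 0.4307])


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment.

    Assumes len(series) is divisible by n_segments.
    """
    x = np.asarray(series, dtype=float)
    return x.reshape(n_segments, -1).mean(axis=1)


def sax(series, n_segments, breakpoints=BREAKPOINTS_3, alphabet="abc"):
    """Z-normalize, reduce with PAA, then map each segment mean to a symbol."""
    x = np.asarray(series, dtype=float)
    std = x.std()
    # A constant series has no shape; treat it as all zeros after normalization.
    z = (x - x.mean()) / std if std > 0 else np.zeros_like(x)
    segment_means = paa(z, n_segments)
    # searchsorted returns the index of the region each mean falls into.
    indices = np.searchsorted(breakpoints, segment_means)
    return "".join(alphabet[i] for i in indices)


word = sax([0, 0, 0, 0, 5, 5, 5, 5, 10, 10, 10, 10], n_segments=3)
```

For the rising toy series above, `word` is "abc": one symbol per PAA segment, from the lowest region to the highest.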
Bag-of-Patterns Representation • Lin and Li • SSDBM 2009 Thanks to Jessica Lin for use of these images
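A rough sketch of the bag-of-patterns idea: slide a window over a univariate series, turn each window into a SAX word, and count how often each word occurs. The window length, the three-letter alphabet, and the choice to skip consecutive repeated words (numerosity reduction) are assumptions made for illustration; the helper names `sax_word` and `bag_of_patterns` are hypothetical and are not taken from the Lin and Li paper.

```python
from collections import Counter

import numpy as np

BREAKPOINTS = np.array([-0.4307, 0.4307])  # assumption: 3 equiprobable regions
ALPHABET = "abc"


def sax_word(window, n_segments):
    """SAX word for one window (window length must divide by n_segments)."""
    w = np.asarray(window, dtype=float)
    std = w.std()
    z = (w - w.mean()) / std if std > 0 else np.zeros_like(w)
    means = z.reshape(n_segments, -1).mean(axis=1)
    return "".join(ALPHABET[i] for i in np.searchsorted(BREAKPOINTS, means))


def bag_of_patterns(series, window, n_segments):
    """Histogram of SAX words taken from every sliding window of the series."""
    bag = Counter()
    previous = None
    for start in range(len(series) - window + 1):
        word = sax_word(series[start:start + window], n_segments)
        # Numerosity reduction (an assumption here): count a run of
        # identical consecutive words only once.
        if word != previous:
            bag[word] += 1
        previous = word
    return bag


bag = bag_of_patterns([0, 1, 2, 3, 2, 1, 0], window=4, n_segments=2)
```

The resulting histogram ignores where in the series a word occurred, which is what makes two series of different lengths comparable.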
Novel Similarity Metric • Saeed and Mark • AMIA 2006 • Finds similar multi-parameter physiological time series using a wavelet-based symbolic representation at different levels of granularity • Used HR, SBP and cardiac output to predict hemodynamic deterioration
Novel Similarity Metric (cont.) • Used modified information retrieval methods for finding similar time series • Term Frequency Vector (TFV) • Inverse Document Frequency (IDF) • Ignored temporal patterns • Emphasized multi-scale analysis
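To make the information retrieval terms concrete, here is a sketch of textbook TF-IDF weighting and cosine similarity applied to word histograms. It is the standard formulation, not the modified version used by Saeed and Mark. Each "document" is assumed to be one patient's bag of symbolic words, and the function names and toy words are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(bags):
    """Weight each word by term frequency times inverse document frequency."""
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())  # number of documents containing each word
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vectors.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in bag.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy bags of words for three hypothetical patients.
bags = [
    Counter({"aab": 2, "abc": 1}),
    Counter({"aab": 1, "abc": 2}),
    Counter({"ccc": 3}),
]
vectors = tfidf_vectors(bags)
similar = cosine(vectors[0], vectors[1])
unrelated = cosine(vectors[0], vectors[2])
```

A word that appears in every document gets an IDF of zero, so it contributes nothing to the similarity. Because only word counts are compared, the order in which patterns occurred is lost, which is the "ignored temporal patterns" point on the slide.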
Proposed Similarity Metric • Multivariate BOP that crosses the time series • HR: BCCBBACB • RR: AABAABAB • DeltaBP: CCBACCBA • MAP: BBBBBBBB • Words formed across the series: BACB, CACB,… • Histogram of word frequencies using a modified form of TF/IDF incorporating personalized and standardized representations
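One way to read "crosses the time series" is that each word takes one symbol from every variable at the same time step. That reading is an assumption, inferred from the slide's example words BACB and CACB, which are the first two columns of the four symbol strings. The sketch below builds those words and their histogram; the function name `cross_series_words` is hypothetical, and the TF/IDF modification mentioned on the slide is not modeled.

```python
from collections import Counter


def cross_series_words(symbol_series):
    """Build one word per time step by reading across the variables in order.

    symbol_series maps a variable name to its symbol string; all strings
    must have the same length.
    """
    strings = list(symbol_series.values())
    if len({len(s) for s in strings}) != 1:
        raise ValueError("all symbol strings must have the same length")
    return ["".join(column) for column in zip(*strings)]


# Symbol strings copied from the slide.
patient = {
    "HR": "BCCBBACB",
    "RR": "AABAABAB",
    "DeltaBP": "CCBACCBA",
    "MAP": "BBBBBBBB",
}
words = cross_series_words(patient)
histogram = Counter(words)
```

With the slide's strings, `words` starts with "BACB" and "CACB", and "BACB" appears twice in the histogram. The histogram can then be weighted and compared like any other bag of words.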
Evaluation Plan • Study on Patent Ductus Arteriosus (PDA) in neonatal patients to evaluate the final visualization and the similarity and information retrieval methods using data from NICU patients and by surveying residents • Convert 4000+ annotated ICU patients of the MIMIC II database to our representation and evaluate the modified IR methods in a larger medical database
Good Research Requires Good Support • Advisors • Drs. Marie desJardins and Tim Oates (Computer Science) • Dr. Jim Fackler (Medicine) • Committee • Drs. Jessica Lin and Penny Rheingans • Advocates/Mentors • Drs. Wendy Carter, Michael Grasso, Anupam Joshi, Christoph U. Lehmann, Roger Mark, Daniel J. Scott, Janet Rutledge, Renetta Tull, Jorge H. Ordóñez-Smith • Maple, Coral and eBiquity lab mates and classmates • National Science Foundation