
Department of Computer Science and Engineering

Real-Time Clinical Warning for Hospitalized Patients via Data Mining. Department of Computer Science and Engineering: Yixin Chen (陈一昕), Yi Mao, Minmin Chen, Rahav Dor, Greg Hackermann, Zhicheng Yang, Chengyang Lu. School of Medicine


Presentation Transcript


  1. Real-Time Clinical Warning for Hospitalized Patients via Data Mining • Department of Computer Science and Engineering: Yixin Chen (陈一昕), Yi Mao, Minmin Chen, Rahav Dor, Greg Hackermann, Zhicheng Yang, Chengyang Lu • School of Medicine: Kelly Faulkner, Kevin Heard, Marin Kollef, Thomas Bailey

  2. Background • Direct costs per day for ICU survivors are between six and seven times those for non-ICU care. • Unlike ICU patients, general hospital ward (GHW) patients are not under extensive electronic monitoring and nursing care. • Clinical research has found that 4–17% of patients undergo cardiopulmonary or respiratory arrest while in the GHW.

  3. Project mission • Sudden deteriorations (e.g., septic shock, cardiopulmonary or respiratory arrest) of GHW patients can be severe and life-threatening. • Goal: provide early detection and intervention based on data mining to prevent these serious, often life-threatening events, using both clinical data and wireless body sensor data. • An NIH-ICTS funded project, currently under clinical trials at Barnes-Jewish Hospital, St. Louis, MO.

  4. What exactly do we predict? Is the patient going to die?

  5. What exactly do we predict? Is the patient going to the ICU?

  6. System Architecture • Tier 1: EWS (early warning system) • Clinical data, lab tests, manually collected, low frequency • Tier 2: RDS (real-time data sensing) • Body sensor data, automatically collected, wirelessly transmitted, high frequency

  7. Agenda • Background and overview • Early warning system (EWS) • Real-time data sensing (RDS) • Future work

  8. Medical Record • 34 vital signs: pulse, temperature, oxygen saturation, shock index, respirations, age, blood pressure, … • [Figure: vital-sign time series; x-axis: time (seconds)]

  9. Related Work • Medical data mining based on medical knowledge: SCAP and PSI; Acute Physiology Score, Chronic Health Score, and APACHE score (used to predict renal failure); Modified Early Warning Score (MEWS) • Medical data mining based on machine learning methods: decision trees, neural networks, SVM • Main problem: most previous work uses a snapshot method that takes all the features at a given time as input to a model, discarding the temporal evolution of the data.

  10. Overview of EWS Goal: Design a data mining algorithm that automatically identifies patients at risk of clinical deterioration based on the time series in their existing electronic medical records. Challenges: • Classification of high-dimensional time-series data • Irregular data gaps • Measurement errors • Class imbalance

  11. Key Techniques in the EWS Algorithm • Temporal bucketing • Discriminative classification • Bootstrap aggregating (bagging) • Exploratory under-sampling • Exponential moving average smoothing • Kernel-density estimation

  12. Workflow of the System

  13. Data Preprocessing • Outlier removal • Normalization

  14. Temporal Bucketing • We retain data in a sliding window covering the last 24 hours and divide it evenly into 6 buckets (Bucket 1 to Bucket 6). • To capture temporal variations, we compute several feature values for each bucket, including the minimum, maximum, and average.
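
The bucketing step on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `bucket_features`, the `(timestamp_in_hours, value)` input format, and the choice to return `None` for empty buckets are all assumptions made for the example.

```python
from statistics import mean

def bucket_features(readings, now, window_hours=24.0, n_buckets=6):
    """Return (min, max, mean) per bucket for readings in the last window.

    readings: list of (timestamp_in_hours, value) pairs, possibly irregular.
    Empty buckets yield None so a later step can decide how to fill the gap.
    """
    width = window_hours / n_buckets
    start = now - window_hours
    buckets = [[] for _ in range(n_buckets)]
    for t, v in readings:
        if start < t <= now:
            # Index 0 is the oldest bucket, n_buckets - 1 the most recent.
            idx = min(int((t - start) / width), n_buckets - 1)
            buckets[idx].append(v)
    return [(min(b), max(b), mean(b)) if b else None for b in buckets]

# Hypothetical pulse readings: (hour, beats per minute).
pulse = [(1.0, 72), (3.5, 75), (9.0, 80), (10.0, 90), (23.0, 110), (24.0, 118)]
features = bucket_features(pulse, now=24.0)
```

Repeating this for every vital sign and concatenating the per-bucket values would give one fixed-length feature vector per patient and time point, which is what a discriminative classifier needs despite the irregular sampling.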

  15. Discriminative Classification • Logistic regression (LR) • Support vector machine (SVM) • Use the max, min, and avg of each bucket and each vital sign as the input features (~400 features in total). • Use the training data to learn the model parameters. • Pipeline: clinical data → data preprocessing → temporal bucketing → classification algorithm → output (model, threshold)

  16. Bootstrap Aggregating (Bagging) Advantages: 1. Handles outliers 2. Avoids over-fitting 3. Better model quality

  17. Biased Bucket Bagging

  18. Exploratory Undersampling

  19. Exponential Moving Average (EMA)

  20. Evaluation Criteria AUC (area under the receiver operating characteristic (ROC) curve) represents the probability that a randomly chosen positive example is rated with greater suspicion than a randomly chosen negative example.
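
The probabilistic reading of AUC given on this slide can be computed directly by comparing every positive example with every negative one. The sketch below does exactly that; the function name `auc`, the half-credit rule for ties, and the toy scores are assumptions for illustration.

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical risk scores; label 1 marks a patient who deteriorated.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
result = auc(scores, labels)  # 8 of the 9 pairs are ordered correctly
```

The pairwise loop is quadratic, which is fine for illustration; a rank-based formula gives the same value on large datasets.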

  21. Results on Historical Database (at specificity = 0.95) • 1: bucketing + logistic regression • 2: bucketing + logistic regression + bagging • 3: bucketing + logistic regression + bucket bagging • 4: bucketing + logistic regression + biased bucket bagging • 5: bucketing + logistic regression + biased bucket bagging + exploratory undersampling

  22. Comparison of various models

  23. Clinical Trial at Barnes-Jewish Hospital Alerts have already triggered early preventive measures that may have prevented deaths.

  24. Agenda • Background and related work • Early warning system (EWS) • Real-time data sensing (RDS) • Future work

  25. Overview of RDS • A challenging problem • Classification based on multiple high-frequency real-time time-series (heart rate, pulse, oxygen sat., CO2, temperature, etc.)

  26. Wireless Sensor Network at BJH

  27. Overview of Learning Algorithm Key techniques: • Feature extraction from multiple time series • Feature selection • Classification algorithms • Exploratory undersampling

  28. A Large Pool of Features Features: • Detrended fluctuation analysis (DFA) features • Approximate entropy (ApEn) • Spectral features • First-order features • Second-order features • Cross-sign features

  29. Detrended Fluctuation Analysis (DFA) • DFA is a method for quantifying the statistical self-affinity of a time-series signal (see, e.g., Peng et al. 1994). • Applicable to both pulse rate and SpO2

  30. Spectral Analysis (FFT) For each signal, we use the component values of VLF (<0.04 Hz), LF (0.04–0.15 Hz), and HF (0.15–0.4 Hz), and the ratio LF/HF.
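
A rough sketch of these band features is shown below. It assumes an evenly sampled signal, uses a naive discrete Fourier transform in place of an FFT library, and interprets "component value" as summed spectral power, which the slide does not specify. The names `BANDS` and `band_powers` and the test signal are hypothetical.

```python
import cmath
import math

# Band edges in Hz, taken from the slide.
BANDS = {"VLF": (0.0, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.4)}

def band_powers(signal, fs):
    """Sum DFT power per band for an evenly sampled signal (fs in Hz)."""
    n = len(signal)
    avg = sum(signal) / n
    centered = [x - avg for x in signal]  # remove the DC component
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                powers[name] += abs(coeff) ** 2
    hf = powers["HF"]
    powers["LF/HF"] = powers["LF"] / hf if hf > 0 else float("inf")
    return powers

# Hypothetical signal sampled at 1 Hz: a 0.1 Hz component (LF band)
# plus a weaker 0.25 Hz component (HF band).
sig = [math.sin(2 * math.pi * 0.1 * i) + 0.5 * math.sin(2 * math.pi * 0.25 * i)
       for i in range(200)]
p = band_powers(sig, fs=1.0)
```

Because the LF component has twice the amplitude of the HF component, its power is four times larger, so `p["LF/HF"]` comes out close to 4 for this signal.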

  31. Other Features • Approximate entropy (ApEn): quantifies the unpredictability of fluctuations in a time series. A low value indicates a deterministic series; a high value indicates an unpredictable one. • First-order features: mean, standard deviation, skewness (symmetry of the distribution), kurtosis (peakedness of the distribution) • Second-order features: related to the co-occurrence of patterns. First quantize the time series into Q discrete bins, then construct a pattern matrix and compute energy (E), entropy (S), correlation (COR), inertia (F), and local homogeneity (LH). • Cross-sign features: link multiple vital signs together. Correlation: the degree of departure of two signals from independence. Coherence: amplitude and phase information about the frequencies held in common between two signals.
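
As an illustration of the first item, here is a small sketch of approximate entropy using the common textbook definition (embedding dimension `m`, tolerance `r`, self-matches included). The slides do not give the parameters the authors used, so the defaults and the test series below are assumptions.

```python
import math
import random

def approximate_entropy(x, m=2, r=0.2):
    """ApEn(m, r) of sequence x; r is an absolute tolerance on values."""
    def phi(dim):
        n = len(x) - dim + 1
        vecs = [x[i:i + dim] for i in range(n)]
        total = 0.0
        for a in vecs:
            # Count vectors within Chebyshev distance r (includes a itself).
            count = sum(
                1 for b in vecs
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(count / n)
        return total / n
    return phi(m) - phi(m + 1)

# A perfectly regular series versus a hypothetical noisy one.
regular = [1.0, 2.0] * 25
rng = random.Random(0)
noisy = [rng.random() for _ in range(50)]
apen_regular = approximate_entropy(regular)
apen_noisy = approximate_entropy(noisy)
```

The regular series should score near zero and the noisy one noticeably higher, matching the "low value: deterministic, high value: unpredictable" reading. In practice `r` is often set relative to the signal's standard deviation.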

  32. Forward Feature Selection • Start with an empty feature set. • Evaluate each of the remaining features against the current feature set. • Pick one feature to add to the set. • Repeat; if no remaining feature yields an improvement, stop with the final feature set.
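
The loop on this slide can be sketched as a generic greedy procedure. The scoring function is left abstract: in the setting described here it could be a cross-validated AUC, but that is an assumption, and the feature names and `toy_score` below are purely hypothetical.

```python
def forward_select(features, score):
    """Greedy forward selection.

    features: iterable of candidate feature names.
    score: callable mapping a list of feature names to a number
           (higher is better), e.g. a cross-validated AUC.
    """
    selected = []
    best = float("-inf")
    remaining = list(features)
    while remaining:
        # Evaluate each remaining feature when added to the current set.
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials)
        if top_score <= best:
            break  # no improvement: stop with the current set
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best

# Hypothetical per-feature gains, with a small cost per added feature.
GAIN = {"dfa_hr": 0.30, "apen_spo2": 0.20, "lf_hf": 0.03, "noise": -0.02}

def toy_score(subset):
    return 0.5 + sum(GAIN[f] for f in subset) - 0.04 * len(subset)

selected, best = forward_select(GAIN, toy_score)
```

With these toy numbers the procedure adds `dfa_hr` and `apen_spo2`, then stops because neither remaining feature improves the score.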

  33. Experimental Setup • Dataset: MIMIC-II (Multiparameter Intelligent Monitoring in Intensive Care II), a public-access ICU database • The data model can be used for both GHW patients with sensors and ICU patients • Our data: collected between 2001 and 2008 from a variety of ICUs (medical, surgical, coronary care, and neonatal) • Prediction goal: death or survival • Real-time vital signs: heart rate and oxygen saturation • Class imbalance: most patients survived • Evaluation: 10-fold cross-validation

  34. Result – Linear and Nonlinear Classification • Classifiers: LSVM (linear SVM), LR (logistic regression), KSVM (RBF-kernel SVM) • Features: 1: DFA of heart rate; 2: DFA of oxygen saturation

  35. Result – Feature Combinations

  36. Result – Feature Selection LR is our first choice: better AUC, interpretability, efficiency

  37. Result – Our Final Model • Method 1: logistic regression + all features • Method 2: logistic regression + all features + exploratory undersampling • Method 3: logistic regression + feature selection + exploratory undersampling

  38. Current Work: Density-based LR • Standard logistic regression uses φk(x) = xk: P(y=1|x) = 1/(1 + exp(−∑ wk xk)) • The probability of an event (e.g., ICU transfer, death) then increases or decreases monotonically with each feature. • This is not true in many cases, e.g., ICU transfer rate vs. age. • Idea: transform each feature xk
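
The monotonicity point can be seen in a few lines. The sketch below evaluates the standard logistic model for a single feature; the coefficient, the intercept (`bias`, which the slide's formula omits), and the age values are hypothetical.

```python
import math

def lr_probability(weights, x, bias=0.0):
    """P(y=1 | x) under standard logistic regression."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

weights = [0.05]  # hypothetical coefficient for one feature, e.g. age
ages = (20, 40, 60, 80)
probs = [lr_probability(weights, [age], bias=-3.0) for age in ages]
```

Whatever the sign of the weight, the predicted probability moves in one direction as the feature grows, so a risk that is high for both very young and very old patients cannot be represented without transforming the feature first.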

  39. Current Work: Density-based LR • Use a kernel-density estimator to estimate p(xk, y=1) and p(xk, y=0) for each feature xk • This results in a nonlinear separating surface that conforms to the true distribution of the data • Advantages over KLR and SVM: efficiency, interpretability
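
One way to picture the feature transform is sketched below. The slides do not state the exact form of the transform, so the log-ratio of the two estimated joint densities used here is an assumption, as are the Gaussian kernel, the fixed bandwidth, the function names, and the toy age data.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` at a point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )
    return density

def density_transform(values, labels, bandwidth):
    """Build phi(x) = log( p(x, y=1) / p(x, y=0) ) for one feature."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    n = len(values)
    kde_pos = gaussian_kde(pos, bandwidth)
    kde_neg = gaussian_kde(neg, bandwidth)
    eps = 1e-12  # avoids log(0) far from the data
    def phi(x):
        joint_pos = (len(pos) / n) * kde_pos(x)  # estimate of p(x, y=1)
        joint_neg = (len(neg) / n) * kde_neg(x)  # estimate of p(x, y=0)
        return math.log((joint_pos + eps) / (joint_neg + eps))
    return phi

# Hypothetical data: events (label 1) occur at both ends of the age range.
ages = [1, 2, 3, 40, 45, 50, 55, 60, 88, 90, 92]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
phi = density_transform(ages, labels, bandwidth=10.0)
```

Here `phi` is positive for very young and very old ages and negative in between, so a logistic model fitted on `phi(age)` instead of `age` can express the non-monotone risk while keeping one interpretable weight per feature.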

  40. Example of Density-based LR [Figure: test data; panels: original LR, density-based LR]

  41. Future Work • Distance-based classification algorithms for multi-dimensional time series: dynamic time warping, information distance • Combination of feature-based and distance-based classification algorithms: include distance information in the objective function • Combining Tier-1 and Tier-2 data: multi-kernel methods • Interpretation of alerts: based on the magnitude and sign of model coefficients

  42. Acknowledgement

  43. Real-Time Simulation on Historical Data @ Specificity=0.95

  44. (Assuming feature Independence)

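  45. Why Bagging Works? Let each bucket sample be drawn independently from the underlying data distribution, and let a predictor be trained on each sample. The aggregated predictor is the expectation of these predictors over the samples. Comparing the average prediction error of the individual predictors with the error of the aggregated predictor, and using the inequality (E[Z])² ≤ E[Z²], shows that the aggregated predictor's error is no larger than the average error.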
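
The symbols on this slide did not survive in the transcript. Its wording closely follows the standard argument from Breiman's bagging paper, so the derivation below is a reconstruction under that assumption, written in Breiman's notation rather than necessarily the authors'. Here $L$ is a training sample drawn from the data distribution, $\varphi(x, L)$ is the predictor trained on $L$, and $(y, x)$ is an independent test case.

```latex
\begin{align*}
\varphi_A(x) &= \mathbb{E}_{L}\,\varphi(x, L)
  && \text{(aggregated predictor)} \\
e &= \mathbb{E}_{L}\,\mathbb{E}_{y,x}\bigl(y - \varphi(x, L)\bigr)^2
  && \text{(average error of single predictors)} \\
e_A &= \mathbb{E}_{y,x}\bigl(y - \varphi_A(x)\bigr)^2
  && \text{(error of the aggregated predictor)} \\
e &= \mathbb{E}\,y^2 - 2\,\mathbb{E}\bigl[y\,\varphi_A(x)\bigr]
     + \mathbb{E}_{y,x}\,\mathbb{E}_{L}\,\varphi^2(x, L) \\
  &\ge \mathbb{E}\,y^2 - 2\,\mathbb{E}\bigl[y\,\varphi_A(x)\bigr]
     + \mathbb{E}_{y,x}\,\varphi_A^2(x) = e_A
\end{align*}
```

The last step applies $(\mathbb{E}_L Z)^2 \le \mathbb{E}_L Z^2$ with $Z = \varphi(x, L)$. The gap between the two sides is the variance of $\varphi(x, L)$ over training samples, which is why bagging helps most when the individual models vary a lot.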

  46. Algorithm Details – Biased Bucket Bagging (BBB) [Figure: standard deviation] A critical factor deciding how much bagging will improve accuracy is the variance of the bootstrap models. We see that BBB with 4 buckets has the largest difference between […] and […]. BBB with 4 buckets also has the highest standard deviation in its prediction results, so we choose BBB with 4 buckets as the final method.

  47. Algorithm Details – Bucket Bagging

  48. Result on Real-Time System: Cross-Validation for the EMA Parameter All cases attain their best performance when the EMA parameter is around 0.06, showing that the choice of this parameter is robust. This small optimal value shows that historical records play an important role in prediction.
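
To make the role of this parameter concrete, here is a minimal sketch of exponential moving average smoothing. It assumes the common convention in which the parameter (called `alpha` here) is the weight on the newest value; the slides do not say which convention the authors use or what exactly is smoothed, so the series of risk scores below is hypothetical.

```python
def ema(values, alpha):
    """Exponential moving average; alpha is the weight on the newest value."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    current = None
    for v in values:
        # The first value seeds the average; later values are blended in.
        current = v if current is None else alpha * v + (1.0 - alpha) * current
        smoothed.append(current)
    return smoothed

# Hypothetical risk scores with one isolated spike.
risk = [0.10, 0.10, 0.90, 0.10, 0.10]
out = ema(risk, alpha=0.06)
```

Under this convention a value near 0.06 means each new observation moves the smoothed score only slightly, so the output is dominated by the patient's history, which is consistent with the slide's remark about historical records.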
