Alan Jović¹, Karla Brkić¹, Nikola Bogunović¹
¹ University of Zagreb, Faculty of Electrical Engineering and Computing, Unska 3, 10000 Zagreb, Croatia
E-mail: {alan.jovic, karla.brkic, nikola.bogunovic}@fer.hr
Decision tree ensembles in biomedical time-series classification

Processing pipeline:
Biomedical time-series -> Transformations and feature extraction -> Biomedical time-series prepared datasets

Transformations:
- Fourier transform
- Hilbert transform
- Wavelet transform

Features:
- Morphological
- Statistical
- Frequency
- Time-frequency
- Nonlinear
- Personal data (in addition to the extracted features)

Biomedical time-series datasets

Characteristics:
- Binary class or multiclass
- From several features to several hundred features
- The number of feature vectors varies
- Very few open, referential datasets are available

Comparison of results is difficult because studies use:
- Different data
- Different disorders
- Different classifiers

Goal: Demonstrate the potential of decision tree ensembles in biomedical time-series classification and compare them to SVM (results are still preliminary).

Three datasets:
- Arrhythmia dataset (UCI repository): 13 classes, 279 features, 452 instances
- HRV-based arrhythmia (PhysioNet, two databases) (HRV): 9 classes, 230 features, 8843 instances
- HRV-based heart disorder (PhysioNet, six databases) (CHF): 3 classes (normal, arrhythmic, CHF), 237 features, 3317 instances

Seven classifiers:
- AdaBoost+C4.5 (AB)
- MultiBoost+C4.5 (MB)
- Random forest (RF)
- Rotation forest (RTF)
- SVM, SMO-based, linear kernel
- SVM, SMO-based, squared polynomial kernel
- SVM, SMO-based, radial kernel

Classification results

Table: Statistically significant win/loss/tie, α = 0.05, Student's paired t-test for 9x10-fold cross-validation (the first 10-fold iteration was used for finding optimal model parameters).

Table: Average classification model construction times (in seconds) for the three datasets.

Conclusion

Preliminary results strongly support the use of decision tree ensembles to improve model accuracy in biomedical time-series classification, especially AdaBoost+C4.5 and MultiBoost+C4.5. Further investigations are necessary.
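
The win/loss/tie counts reported under the classification results come from a Student's paired t-test at α = 0.05 over cross-validation folds. A minimal sketch of how such a decision could be computed is given below, using only the Python standard library. It is not the authors' implementation: the function names (`t_pdf`, `two_sided_p`, `compare`), the numerical integration of the t density, and the accuracy values are all assumptions made for illustration, and the sketch uses the plain (uncorrected) paired t-test.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t_stat, df):
    """Two-sided p-value via Simpson integration of the density on [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    steps = 2 * max(1000, int(upper * 100))  # even number of intervals
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def compare(acc_a, acc_b, alpha=0.05):
    """Return 'win', 'loss' or 'tie' for classifier A against classifier B.

    acc_a and acc_b hold per-fold accuracies measured on the same folds.
    """
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    avg, sd = mean(diffs), stdev(diffs)
    if sd == 0.0:
        # All fold differences are identical: decide on the sign alone.
        if avg == 0.0:
            return "tie"
        return "win" if avg > 0 else "loss"
    t_stat = avg / (sd / math.sqrt(len(diffs)))
    if two_sided_p(t_stat, len(diffs) - 1) >= alpha:
        return "tie"
    return "win" if t_stat > 0 else "loss"


# Hypothetical per-fold accuracies for 9 x 10 = 90 folds (synthetic values,
# not results from the poster).
acc_a = [0.80 + 0.01 * (i % 3) for i in range(90)]
acc_b = [0.70 + 0.01 * (i % 5) for i in range(90)]
print(compare(acc_a, acc_b))
```

Folds from repeated cross-validation share training data, so the plain paired t-test can be optimistic; whether a corrected variant was used is not stated in the source.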