Machine learning methods in solving the old problem: heart rate variability in healthy people. Danuta Makowiec * and Joanna Wdowczyk** * Institute of Theoretical Physics and Astrophysics University of Gdańsk **First Chair and Clinic of Cardiology Medical University of Gdańsk.
Brief summary
Aims:
• to examine whether machine learning methods (exploratory data analysis and classification) can be successful in explaining the meaning of heart rate variability measures and in showing how they change with biological aging.
Conclusions:
• the new methods offer us versatile validation of our old results and intuitions
• they allow us to practice comprehensively and to test many aspects in an unlimited way, also on many time scales and in different models
• they offer an attractive presentation of results, which is extremely profitable
• the results of the exploratory factor analysis indeed satisfy our expectations
• the classification methods we used seem to fail.
Danuta Makowiec, 32nd Marian Smoluchowski Symposium, Krakow, 18-20 September 2019
Signals: heartbeat time series. Phenomenon: heart rate variability.
• RR-intervals: the times between consecutive heartbeats
• RR-actions: the changes between consecutive RR-intervals, used to symbolize the signals by actions
• Furthermore, quantified actions span the space of two subsequent actions and the space of three subsequent actions
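The symbolization of a signal by actions can be sketched as follows. This is only an illustration under stated assumptions: an action is taken to be the difference between consecutive RR-intervals, the symbols `a` (acceleration), `d` (deceleration) and `0` (no action) follow the index names used in this talk, and the 8 ms resolution is a hypothetical value, not one given on the slides.

```python
from typing import List


def rr_actions(rr_ms: List[float]) -> List[float]:
    """Differences between consecutive RR-intervals (assumed definition of an action)."""
    return [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]


def symbolize(actions: List[float], resolution_ms: float = 8.0) -> str:
    """Map each action to 'a', 'd' or '0'.

    'a': the RR-interval shortens (acceleration),
    'd': the RR-interval lengthens (deceleration),
    '0': the change is below the assumed recording resolution (no action).
    """
    symbols = []
    for delta in actions:
        if abs(delta) < resolution_ms:
            symbols.append("0")
        elif delta < 0:
            symbols.append("a")
        else:
            symbols.append("d")
    return "".join(symbols)


def words(symbols: str, length: int) -> List[str]:
    """All overlapping words of the given length: pairs or triples of subsequent actions."""
    return [symbols[i:i + length] for i in range(len(symbols) - length + 1)]


example_rr = [800.0, 790.0, 790.0, 810.0, 805.0]
example_symbols = symbolize(rr_actions(example_rr))
example_pairs = words(example_symbols, 2)
```

Words of length 2 and 3 built this way are one possible reading of the spaces of two and three subsequent actions mentioned above.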
Dynamic landscape: entropic measures of the diversity of quantified actions
• Shannon entropy
• Dynamic entropy: transition rates, self-transfer entropy
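A minimal sketch of the entropic measures, assuming the signal has already been turned into a string of action symbols. The Shannon entropy is computed over words of a chosen length, and the transition rates are estimated as conditional frequencies of the next symbol given the current one. Whether these correspond exactly to the indices E_1, E_2, E_3 and S_T listed in the feature space is an assumption; the self-transfer entropy is not sketched here.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable


def shannon_entropy(items: Iterable[str]) -> float:
    """Shannon entropy (natural logarithm) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def block_entropy(symbols: str, length: int) -> float:
    """Entropy of overlapping words of the given length (1, 2 or 3 subsequent actions)."""
    return shannon_entropy(
        symbols[i:i + length] for i in range(len(symbols) - length + 1)
    )


def transition_rates(symbols: str) -> Dict[str, Dict[str, float]]:
    """Estimated probability of the next symbol given the current one."""
    pair_counts: Dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(symbols, symbols[1:]):
        pair_counts[current][following] += 1
    return {
        current: {nxt: n / sum(counter.values()) for nxt, n in counter.items()}
        for current, counter in pair_counts.items()
    }
```

For a strictly alternating string such as `"adad"`, the single-symbol entropy is ln 2 while every transition rate equals 1, which shows how the two kinds of measure capture different aspects of the dynamics.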
The dynamic landscape measures have proven to be informative in discerning heartbeat abnormalities in
• patients after heart transplantation (J Wdowczyk et al., Frontiers in Physiology (2018) 9:274)
• elderly healthy people (D Makowiec et al., Computers in Cardiology (2018) 45:243)
Fragmentation of heart rate variability
MD Costa, RB Davis, AL Goldberger (Frontiers in Physiology (2017) 8:255) proposed investigating the statistics of breaks in a signal. A new concept for describing heart rate variability arose, called fragmentation. Breaks can be quantified by counting short segments made of acceleration and deceleration actions.
It turned out that the degree of fragmentation of heartbeat time series increases with participant age in a group of healthy elderly individuals. This phenomenon is called the erratic rhythm (after PK Stein et al., Computers in Cardiology (2002) 669).
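As an illustration of counting breaks, the sketch below computes a percentage of inflection points, which is how the PIP index is commonly understood: a break is counted wherever two consecutive actions change sign or one of them is zero. The choice of denominator (the number of consecutive action pairs) is an assumption of this sketch and may differ from the normalization used in the cited paper.

```python
from typing import List


def percentage_of_inflection_points(rr_ms: List[float]) -> float:
    """Share (in percent) of consecutive action pairs at which the rhythm breaks."""
    actions = [later - earlier for earlier, later in zip(rr_ms, rr_ms[1:])]
    pairs = list(zip(actions, actions[1:]))
    if not pairs:
        raise ValueError("at least three RR-intervals are needed")
    # A break: acceleration turns into deceleration (or vice versa), or an action is zero.
    breaks = sum(1 for first, second in pairs if first * second <= 0)
    return 100.0 * breaks / len(pairs)


steady = percentage_of_inflection_points([800.0, 810.0, 820.0, 830.0])
alternating = percentage_of_inflection_points([800.0, 810.0, 800.0, 810.0])
```

A steadily lengthening series has no breaks, while a strictly alternating one breaks at every step, so the two examples mark the two ends of the scale.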
Structure of qualified fragmentation segments by partial entropy
Novel proposition: partial entropy of QF (figure; panels: PIP, PAS, PSS)
Aims:
• Whether, and if so what, new information about heartbeat dynamics we can learn from the entropy (total or partial) of quantified fragmentation segments
• Especially, what new information about the indices of heartbeat series we can discover by using machine learning methods
Feature space: the space of considered HRV indices (F = 33)
Standard HRV:
• General (2): meanRR, meanHR
• Long term (2): stdRR, sd2
• Short term (4): pNN50, pNN20, RMSSD, sd1
• Frequency (4): total, rVLF, rLF, rHF
Further indices:
• Fragmentation (3): PIP, PAS, PSS
• Partial fragmentation (6): ad, da, ada, dad, aaa, ddd
• Dynamic landscape (5): E_3, E_2, E_1, S_T, sTE
• Partial entropy (6): eaaa, eddd, edad, eada, ead, eda
• No action (1): 0
Data acquisition
M. Zarczyńska-Buchowiecka, J. Wdowczyk, First Chair and Clinic of Cardiology, Medical University of Gdańsk
181 series of 24-hour Holter recordings from healthy people of different ages and both sexes. Four hours of nocturnal RR-intervals with normal heartbeats were extracted by highly qualified cardiologists.

Age group   Females   Males
20s         17        13
30s         11        10
40s         13        20
50s         13        18
60s         12        15
70s         10        12
80s         11         6

Editing procedure:
• An action larger than 300 ms was replaced by the same action of size 300 ms
• Gaps in the data of size 1 or 2 were filled with medians from the surrounding [-3, +3] neighbors
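The two editing steps can be sketched as below. The details are assumptions made for illustration: "the same action" is read as keeping the sign of the action while limiting its size, missing values are marked with `None`, and the median is taken over the available values within three positions on either side of the gap.

```python
import statistics
from typing import List, Optional


def clip_actions(actions: List[float], limit_ms: float = 300.0) -> List[float]:
    """Replace any action larger than the limit by an action of the same sign and limit size."""
    return [max(-limit_ms, min(limit_ms, action)) for action in actions]


def fill_short_gaps(
    values: List[Optional[float]], max_gap: int = 2, reach: int = 3
) -> List[Optional[float]]:
    """Fill runs of at most max_gap missing values with the median of nearby known values."""
    filled = list(values)
    i, n = 0, len(values)
    while i < n:
        if values[i] is not None:
            i += 1
            continue
        j = i
        while j < n and values[j] is None:
            j += 1  # j ends at the first known value after the gap
        if j - i <= max_gap:
            for k in range(i, j):
                window = values[max(0, k - reach):k + reach + 1]
                known = [v for v in window if v is not None]
                if known:
                    filled[k] = statistics.median(known)
        i = j
    return filled
```

Longer gaps are left untouched in this sketch, which matches the slide's restriction to gaps of size 1 or 2.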
The space = (time series × features): H × F, with H = 181 and F = 33

            age   sex   meanRR   …   no action
healthy_1   20    f     1032     …   0.137
healthy_2   20    f     888      …   0.062
healthy_H   80    m     934      …   0.192

We can study features of full recordings (240 minutes) or features of parts of recordings (60 min, 30 min, 15 min, 5 min, 2 min, 1 min):
1. Statistics of features in 240-minute recordings
2. Statistics of features in 5-minute segments: 48 items from the same person
3. Statistics of extremes of 5-minute features corresponding to min(HR), the case of deep sleep, and min(stdRR), the case of basic ANS control
4. All statistics were tested on shuffled signals
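Cutting a recording into 5-minute segments and picking the extreme segments could look like the sketch below. It assumes that each beat is assigned to a window by its start time and that the population standard deviation is used for stdRR; neither choice is stated on the slides.

```python
import statistics
from typing import List


def split_by_time(rr_ms: List[float], window_min: float = 5.0) -> List[List[float]]:
    """Group RR-intervals into consecutive windows of the given duration."""
    window_ms = window_min * 60_000.0
    segments: List[List[float]] = []
    elapsed = 0.0
    for rr in rr_ms:
        index = int(elapsed // window_ms)  # window in which this beat starts
        while len(segments) <= index:
            segments.append([])
        segments[index].append(rr)
        elapsed += rr
    return segments


def mean_hr(segment: List[float]) -> float:
    """Mean heart rate in beats per minute."""
    return 60_000.0 / statistics.mean(segment)


def std_rr(segment: List[float]) -> float:
    """Standard deviation of RR-intervals within a segment."""
    return statistics.pstdev(segment)


# Synthetic 10-minute example: 5 minutes at 60 bpm followed by 5 minutes at 75 bpm.
toy_rr = [1000.0] * 300 + [800.0] * 375
toy_segments = split_by_time(toy_rr)
min_hr_segment = min(toy_segments, key=mean_hr)    # candidate for deep sleep
min_std_segment = min(toy_segments, key=std_rr)    # candidate for basic ANS control
```

With 240 minutes of signal and 5-minute windows this gives the 48 items per person mentioned above.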
Factor Analysis to display stochastic relations between features: 240 min segments
factor_analyzer package, Jeremy Biggs (jbiggs@ets.org)

Physiological meaning of the factors:
• Overall system stability
• Humoral mechanisms
• Sympathetic NS activity
• Balance in ANS
• Vagal NS activity (respiration)

Factor loadings:
Feature   Factor 1   Factor 2   Factor 3   Factor 4   Factor 5
meanRR     0.375      0.281      0.022      0.772      0.185
meanHR    -0.365     -0.246     -0.015     -0.785     -0.111
stdRR      0.502      0.220      0.275      0.206      0.757
sd2        0.462      0.216      0.286      0.215      0.772
total      0.689     -0.089      0.200      0.057     -0.434
rVLF      -0.699      0.007     -0.272      0.300      0.367
rLF       -0.066     -0.121      0.754     -0.092     -0.058
rHF        0.697      0.073     -0.367     -0.191     -0.274
pNN50      0.910      0.154     -0.012      0.092      0.235
pNN20      0.923      0.157      0.087      0.291      0.057
RMSSD      0.929      0.183      0.058      0.029      0.248
sd1        0.929      0.183      0.058      0.029      0.248
E_3        0.914      0.162      0.206      0.251      0.048
E_2        0.945      0.160      0.142      0.204      0.114
E_1        0.947      0.184      0.126      0.188      0.120
S_T        0.939      0.134      0.159      0.220      0.107
sTE        0.925      0.111     -0.028      0.071      0.277
n_zero    -0.848     -0.170     -0.147     -0.353      0.027
PSS       -0.096      0.353     -0.926     -0.035     -0.066
ddd        0.018     -0.465      0.775      0.104     -0.036
eddd       0.135     -0.427      0.797      0.142      0.009
eaaa       0.294     -0.175      0.861      0.001      0.208
aaa        0.160     -0.210      0.896     -0.043      0.165
PAS        0.073      0.955     -0.263      0.082      0.080
edad       0.341      0.851     -0.221      0.182      0.081
dad        0.128      0.890     -0.296      0.150      0.077
ada        0.024      0.931     -0.230      0.021      0.082
eada       0.241      0.912     -0.167      0.071      0.083
PIP        0.530      0.693     -0.390      0.247      0.016
ead        0.809      0.457     -0.227      0.257      0.076
ad         0.514      0.646     -0.427      0.302      0.004
da         0.534      0.726     -0.343      0.183      0.030
eda        0.813      0.526     -0.145      0.145      0.096

Groups of features:
• General: meanRR, meanHR
• Long term: stdRR, sd2
• Frequency
• Short term, standard: pNN50, pNN20, RMSSD, sd1
• Short term, entropic: E_3, E_2, E_1, S_T, sTE, ead, eda
• Antifragmentation: rLF, PSS, aaa, ddd, eddd, eaaa
• Fragmentation: PAS, PIP, ad, da, dad, ada
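One simple way to read a loading table is to assign every feature to the factor on which it loads most strongly in absolute value. The sketch below does this for a handful of rows copied from the slide; the column numbering 1 to 5 and the rule itself are conventions of this sketch, not a procedure described in the talk.

```python
from typing import Dict, List

# A few rows of the loading table shown on the slide (columns: factors 1..5).
LOADINGS: Dict[str, List[float]] = {
    "meanRR": [0.375, 0.281, 0.022, 0.772, 0.185],
    "stdRR": [0.502, 0.220, 0.275, 0.206, 0.757],
    "rLF": [-0.066, -0.121, 0.754, -0.092, -0.058],
    "RMSSD": [0.929, 0.183, 0.058, 0.029, 0.248],
    "PAS": [0.073, 0.955, -0.263, 0.082, 0.080],
    "PSS": [-0.096, 0.353, -0.926, -0.035, -0.066],
}


def dominant_factor(loadings: List[float]) -> int:
    """1-based index of the factor with the largest absolute loading."""
    strongest = max(range(len(loadings)), key=lambda i: abs(loadings[i]))
    return strongest + 1


def group_by_factor(table: Dict[str, List[float]]) -> Dict[int, List[str]]:
    """Collect feature names under their dominant factor."""
    groups: Dict[int, List[str]] = {}
    for name, row in table.items():
        groups.setdefault(dominant_factor(row), []).append(name)
    return groups


groups = group_by_factor(LOADINGS)
```

Under this rule PSS and rLF fall into the same group even though their loadings have opposite signs, which is consistent with both being listed under antifragmentation.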
Visualization of stochastic relations between features (figure; panels: true signals, shuffled signals)
Graph of relations: real vs statistics (figure; correlation distance threshold = 0.2; panels: true signals, shuffled signals)
Groups marked in the graph: General, Fragmentation, Frequency, Antifragmentation, Long term, Entropic (E_3, E_2, E_1, S_T, sTE, ead, eda), Short term (pNN50, pNN20, RMSSD, sd1)
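A graph of relations with a distance threshold can be built as in the sketch below. It assumes that the correlation distance is 1 minus the Pearson correlation and that two features are linked when their distance is below the threshold; the feature names and values are toy data, not results from the study.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def relation_graph(
    features: Dict[str, List[float]], threshold: float = 0.2
) -> List[Tuple[str, str]]:
    """Edges between features whose correlation distance is below the threshold."""
    edges = []
    for (name_a, col_a), (name_b, col_b) in combinations(features.items(), 2):
        if 1.0 - pearson(col_a, col_b) < threshold:
            edges.append((name_a, name_b))
    return edges


# Toy columns: u and v are almost proportional, w is unrelated to both.
toy_features = {
    "u": [1.0, 2.0, 3.0, 4.0],
    "v": [2.0, 4.0, 6.0, 8.1],
    "w": [4.0, 1.0, 3.0, 2.0],
}
toy_edges = relation_graph(toy_features)
```

Running the same construction on shuffled signals gives a reference graph against which the links found in true signals can be compared.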
Visualization of stochastic relations between features: 5 min segments (figure; panels: true signals, shuffled signals)
33 features and 48 × 181 = 8688 "patients".
Graph of relations between entropic indices (figure; panels: 5 min, 240 min, 5 min min std, shuffled, min HR)
Conclusions from the exploratory data analysis
• the new methods have offered us not only versatile validation of our old results and intuitions but have also provided new insights into data gathering:
  • Five factors identified by EFA have physiological meaning
  • Deep sleep can be associated with low HR
  • Life means order, but to see this order we need sufficiently long signals
  • Short signals may overestimate the role of fluctuations
  • Short signals can only be allowed under controlled conditions
• they allow you to practice comprehensively and to test many aspects in an unlimited way, also on many time scales and in different models
• they offer an attractive presentation of results, which is extremely profitable
Support Vector Machines as a tool for data classification
From scikit-learn: sklearn.svm.SVC, sklearn.decomposition.IncrementalPCA, sklearn.manifold.Isomap
Our goal: to assess the accuracy of popular machine learning procedures in establishing classification rules for aging in healthy people.
Projection to a lower-dimensional space by:
• PCA (Principal Component Analysis): a linear method for dimensionality reduction (singular value decomposition of the variability of features)
• Isomap (isometric mapping): a nonlinear dimensionality reduction method that uses geodesic distances as the similarity between data points (not features); a k-nearest-neighbors graph approximates the manifold, and the multidimensional scaling algorithm is then applied
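The combination of projection and classification can be sketched with the three scikit-learn classes named above. Everything else here is an assumption made for illustration: the data are synthetic, the seven classes stand in for age decades, and the scaling step, the two-dimensional projection, the RBF kernel and the 70/30 split are hypothetical choices, not the settings used in the study.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA
from sklearn.manifold import Isomap
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the (time series x features) table: 33 features, 7 classes.
rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 210, 33, 7
y = np.repeat(np.arange(n_classes), n_samples // n_classes)
X = rng.normal(size=(n_samples, n_features)) + 0.5 * y[:, None]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def fit_and_score(reducer) -> float:
    """Standardize, project to a lower-dimensional space, classify with an SVM."""
    model = make_pipeline(StandardScaler(), reducer, SVC(kernel="rbf", C=1.0))
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)  # fraction of correctly classified test samples


pca_score = fit_and_score(IncrementalPCA(n_components=2))
isomap_score = fit_and_score(Isomap(n_neighbors=10, n_components=2))
```

The score returned by each pipeline is the share of correctly classified samples, which is the kind of quantity reported as a percentage on the following slides.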
SVM classification based on all features
• 5 min: score 2918 (8688) = 36%
• min_std: score 88 (181) = 48%
• min_HR: score 83 (181) = 45%
SVM classification: inside (score = 40%)
SVM classification based on entropic features: E_3, E_2, E_1, S_T, sTE, eddd, eaaa, edad, eada, ead, eda
Conclusions from the classification
• The results of automatic classification are not satisfactory and seem to be incidental.
• Perhaps this is the effect of
  • the limited sizes of the considered classes
  • complex nonlinear relations involved in the real classification
Does it mean that we are disappointed with the methods used? Not at all! Biological aging seems to be too complicated for standard statistical methods. There is still much work for us, HUMANS.
The work is in progress. I hope to present the results at the next Smoluchowski meeting.
Thank you!