Learning rules from multisource data for cardiac monitoring
Elisa Fromont*, René Quiniou, Marie-Odile Cordier
DREAM, IRISA, France
* Work supported by the French National Network for Health Technologies as a member of the Cepica project
AIME'05
Context (ICCU / [Calicot03])
Calicot: Cardiac Arrhythmias Learning for Intelligent Classification of On-line Tracks
[Figure: Calicot architecture]
• On line: signal abstraction turns raw data (ECG) into symbolic descriptions such as P(normal), QRS(normal), QRS(abnormal) located at times t0 … t4; chronicle recognition uses the chronicle base to identify an arrhythmia
• Off line: symbolic transformation of the signal database [Moody96], then inductive learning; rule base and chronicle base
Motivations
• Why learn rules? A knowledge acquisition module can relieve experts of this time-consuming task [Morik00]
• Why use Inductive Logic Programming (ILP)?
  • First-order rules are easily understood by doctors
  • Relational learning can take temporal constraints into account (→ chronicles)
• Why use multiple sources? Information from a single source is not always sufficient for a precise diagnosis (noise, complementary information, etc.)
⇒ Update Calicot to handle multiple sources
Multisource data
2 ECG channels and 1 hemodynamic channel: 3 views of the same phenomenon
• Sensor 1: ECG channel II (P, QRS)
• Sensor 2: ECG channel V (QRS)
• Sensor 3: ABP channel
Monosource learning with ILP
• From:
  • A set of examples $E$ defined on $L_E$, each labeled by a class $c \in C$
  • For each class $c$, $E^+ = \{(e_k, c) \mid k = 1, \dots, m\}$ are the positive examples and $E^- = \{(e_k, c') \mid k = 1, \dots, m,\ c \neq c'\}$ are the negative examples
  • A bias $B$ that defines the language $L_H$ of the hypotheses sought
  • Background knowledge $BK$ defined on $L = L_H \cup L_E$
• Find, for each class, a set of hypotheses $H$ such that:
  • $H \wedge BK \models E^+$ ($H$ covers all the positive examples)
  • $H \wedge BK \not\models E^-$ ($H$ covers no negative example)*
* In practice this property is relaxed.
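The two conditions on H (cover every positive example, cover no negative example) can be illustrated with a small coverage check. This is only a propositional sketch in Python, not the ICL system: it assumes examples are sets of ground facts written as strings and a hypothesis is a list of rules, each rule being a set of facts that must all be present. Real ILP coverage also binds variables, which is ignored here. The names `covers`, `is_complete`, `is_consistent` and the toy facts are hypothetical.

```python
from typing import FrozenSet, List

Example = FrozenSet[str]  # ground facts describing one example (assumed encoding)
Rule = FrozenSet[str]     # facts that must all be present for the rule to fire


def covers(hypothesis: List[Rule], example: Example) -> bool:
    """A hypothesis covers an example if at least one of its rules fires."""
    return any(rule <= example for rule in hypothesis)


def is_complete(hypothesis: List[Rule], positives: List[Example]) -> bool:
    """True if every positive example is covered."""
    return all(covers(hypothesis, e) for e in positives)


def is_consistent(hypothesis: List[Rule], negatives: List[Example]) -> bool:
    """True if no negative example is covered."""
    return not any(covers(hypothesis, e) for e in negatives)


# Invented toy data, for illustration only.
positives = [
    frozenset({"qrs_abnormal", "rr_short", "p_normal"}),
    frozenset({"qrs_abnormal", "rr_short"}),
]
negatives = [frozenset({"qrs_normal", "rr_normal", "p_normal"})]
hypothesis = [frozenset({"qrs_abnormal", "rr_short"})]
```

With this toy data, `is_complete(hypothesis, positives)` and `is_consistent(hypothesis, negatives)` both hold. Relaxing the second condition, as the footnote mentions, would amount to tolerating a few covered negatives instead of requiring none.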
Declarative bias [Bias96]
• A grammar that defines:
  • the language (the vocabulary to use)
  • the length of the hypotheses sought
  • the order in which literals are considered
• Mandatory for ILP systems such as ICL [ICL95]
Example of learned monosource rule
[Figure: examples labeled bigeminy (e11, e21, e31) and examples of other classes X … Z ≠ bigeminy (e41, …, en1) are given, together with the bias B and the background knowledge BK, to the induction step; the ECG trace is annotated with R0, P1, R1, R2]
rule(bigeminy) :-
    qrs(R0, anormal),
    p_wav(P1, normal), suc(P1,R0),
    qrs(R1, normal), suc(R1,P1),
    qrs(R2, anormal, R1), suc(R2,R1),
    rr1(R1, R2, short).
Multisource learning: 2 approaches (example on two sources for one class)
• Separate monosource learning:
  • source 1: examples e11, e12, e13 (bigeminy) and e14, …, e1n (other classes X … Z); induction + B1 + BK1 gives H1
  • source 2: examples e21, e22, e23 (bigeminy) and e24, …, e2n (other classes X … Z); induction + B2 + BK2 gives H2
  • vote between H1 and H2?
• Naive multisource learning: the examples of both sources are aggregated; induction + B + BK gives H
• Consistency: $\forall i\, \forall j,\ (e_{i,k}, c) \wedge (e_{j,k}, c') \Rightarrow c = c'$
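The aggregation step of the naive approach, together with the consistency condition, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: each source is assumed to be a dictionary mapping an example index k to a pair (set of facts, class label), and the function name `aggregate` and the toy facts are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

LabeledExample = Tuple[FrozenSet[str], str]  # (facts, class label), assumed encoding


def aggregate(
    source1: Dict[int, LabeledExample],
    source2: Dict[int, LabeledExample],
) -> Dict[int, LabeledExample]:
    """Merge the k-th examples of two sources into one aggregated example.

    Raises ValueError when the two sources label the same example
    differently, i.e. when the consistency condition is violated.
    """
    aggregated = {}
    for k in sorted(source1.keys() & source2.keys()):
        facts1, label1 = source1[k]
        facts2, label2 = source2[k]
        if label1 != label2:
            raise ValueError(f"example {k}: labels {label1!r} and {label2!r} differ")
        aggregated[k] = (facts1 | facts2, label1)
    return aggregated


# Invented toy data, for illustration only.
ecg = {1: (frozenset({"qrs(r0,abnormal)"}), "bigeminy")}
abp = {1: (frozenset({"systole(s0,normal)"}), "bigeminy")}
merged = aggregate(ecg, abp)
```

Here `merged[1]` holds the union of the ECG and ABP facts with the shared label `"bigeminy"`. Because every aggregated example carries the facts of all sources, the data volume grows with the number of sources.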
Naive multisource learning problems
When the number of sources increases:
• the volume of data increases (aggregation of examples)
• the expressiveness of the language increases ⇒ the hypothesis search space defined by B is larger than both search spaces defined by B1 and B2
⇒ too much computation time
⇒ poor results, because of the heavy pruning applied when looking for hypotheses in the search space
Idea: biased multisource learning
• Bias multisource learning efficiently by using:
  • the rules learned on each single source
  • the aggregated examples
• Such a bias is difficult to define without background knowledge on the problem ⇒ create a multisource bias automatically!
Algorithm (on two sources)
[Figure: H1 in L1, H2 in L2, and bt1 … bt5 marking the resulting search space Lb within L]
• L: naive multisource language
• Lb: biased multisource language
• Li: language of source i
How to construct bti? (toy example)
• H1: class(x) :- p_wave(P0,normal), qrs(R0,normal), pr1(P0,R0,normal), suc(R0,P0).
• H2: class(x) :- diastole(D0,normal), systole(S0,normal), suc(S0,D0).
Rule fusion + new relational literals:
• class(x) :-
      p_wave(P0,normal),
      diastole(D0,normal),
      suci(D0,P0), qrs(R0,normal),
      systole(S0,normal),
      suci(S0,R0), pr1(P0,R0,normal),
      suc(R0,P0), suc(S0,D0).
• …
• class(x) :-
      p_wave(P0,normal),
      qrs(R0,normal), pr1(P0,R0,normal), suc(R0,P0),
      diastole(D0,normal), suci(D0,R0), systole(S0,normal),
      suc(S0,D0).
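One way to picture the fusion step is to enumerate the ways the events of H1 and H2 can be ordered on a common time line while keeping the order each rule already imposes. The slides do not say how the bti are enumerated, so the Python sketch below is an assumption: it generates one candidate per order-preserving interleaving and leaves out the generation of the new relational literals (`suci`). The function name `interleavings` is hypothetical.

```python
from typing import Iterator, List, Tuple


def interleavings(a: List[str], b: List[str]) -> Iterator[Tuple[str, ...]]:
    """Yield every merge of a and b that keeps the internal order of each list."""
    if not a:
        yield tuple(b)
        return
    if not b:
        yield tuple(a)
        return
    # Either the next event comes from a ...
    for rest in interleavings(a[1:], b):
        yield (a[0],) + rest
    # ... or it comes from b.
    for rest in interleavings(a, b[1:]):
        yield (b[0],) + rest


# Event variables of the toy example, in the order fixed by each monosource rule.
ecg_events = ["P0", "R0"]  # from H1: suc(R0,P0)
abp_events = ["D0", "S0"]  # from H2: suc(S0,D0)
orders = list(interleavings(ecg_events, abp_events))
```

For two events per source this gives six candidate orders, among them `("P0", "D0", "R0", "S0")` and `("P0", "R0", "D0", "S0")`. Each order could then be turned into a clause by keeping the literals of H1 and H2 and adding cross-source ordering literals, which is the part this sketch does not attempt.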
Properties of the biased multisource search space
• Rules learned with the biased multisource method have an accuracy equal to or higher than that of the monosource rules learned for the same class (in the worst case: vote)
• The biased multisource search space is smaller than the naive multisource search space (DLAB [DLAB97])
• There is no guarantee that biased multisource learning finds the best multisource solution
Examples of learned rules

class(svt) :-   % biased multi
    qrs(R0), qrs(R1), suc(R1,R0),
    qrs(R2), suc(R2,R1), rr1(R1,R2,short), rythm(R,R1,R2,regular),
    qrs(R3), suc(R3,R2), rr1(R2,R3,short),
    systole(S0), suci(S0,R3),
    qrs(R4), suci(R4,S0), suc(R4,R3),
    systole(S1), suc(S1,S0), suci(S1,R4), amp_ss(S0,S1,normal).

class(svt) :-   % ECG (covers 2 neg)
    qrs(R0), qrs(R1), suc(R1,R0),
    qrs(R2), suc(R2,R1), rr1(R1,R2,short), rythm(R,R1,R2,regular),
    qrs(R3), suc(R3,R2), rr1(R2,R3,short),
    qrs(R4), suc(R4,R3), rr1(R3,R4,short).

class(svt) :-   % ABP (covers 1 neg, does not cover 1 pos)
    systole(S0), systole(S1), suc(S1,S0), amp_ss(S0,S1,normal),
    systole(S2), suc(S2,S1), amp_ss(S1,S2,normal), ss1(S1,S2,short).

class(svt) :-   % naive multi (covers 12 neg)
    qrs(R0), systole(S0), suc(S0,R0),
    qrs(R1), suc(R1,S0),
    systole(S1), suc(S1,R1), suc(R1,R0), rr1(R1,R2,short).
Results on the whole database

arrhyt1 bigeminy    Mono source                        Multi source
                    Source 1 (ECG)   Source 2 (ABP)    Naive     Biased
ACC                 1                0.998             0.916     1
TestACC             1                0.84              0.7       1
Nb Rules            1                2                 2         1
Cardiac cycles      5                3/2               4/2       5
Nb Nodes            1063             1023              22735     657
CPU time            26.99            14.27             3100      363.86*
* includes monosource computation times

• Database:
  • small (50)
  • not noisy
  • sources are redundant for the studied arrhythmias
• Biased multisource learning is much more efficient than naive multisource learning
• No significant improvement from monosource to biased multisource
Less informative database (new results without multisource cross-validation problems and with a new constraint on ABP monosource learning)
[Two overlaid result tables, arrhyt2 svt and arrhyt1 ves, each with columns Mono source: Source 1 (ECG), Source 2 (ABP) and Multi source: Naive, Biased; the cells of the two tables are interleaved in each row below]
ACC:            0.96  0.44  0.962  0.94  0.76  0.945  0.98  0.99
TestACC:        0.4   0.94  0.84   0.86  0.76  0.64   0.92  0.9
Rules(H):       1  1  1  1  3  1  1  2
Cardiac cycles: 5  4  3  5  2  4/4/6  5  8/5
Conclusion
• Biased multisource vs. monosource:
  • better or equal accuracy
  • less complex rules (fewer rules or fewer literals)
• Biased multisource method vs. naive method:
  • better accuracy
  • narrower search space
  • reduced computation time
• Multisource learning can improve the reliability of diagnosis (particularly on complementary data)
• The biased method allows scalability
References
[Calicot03] G. Carrault, M-O. Cordier, R. Quiniou, F. Wang. Temporal abstraction and inductive logic programming for arrhythmia recognition from ECG. AIMed 2003
[Moody96] G.B. Moody et al. A database to support development and evaluation of intensive care monitoring. Computers in Cardiology 96
[ICL95] L. De Raedt and W. Van Laer. Inductive Constraint Logic (ILP). Inductive Logic Programming 95
[Bias96] Nedellec et al. Declarative bias in ILP. Advances in ILP 96
[DLAB97] L. De Raedt, L. Dehaspe. Clausal discovery. Machine Learning 97
[Morik00] K. Morik et al. Knowledge discovery and knowledge validation in intensive care. AIMed 2000
Property on aggregated examples
Let $H_i^c$ be a hypothesis induced by learning from source $i$, $i \in [1, s]$, for the class $c \in C$.
• For all $k \in [1, p]$, if $H_i^c$ covers $(e_{i,k}, c)$, then it also covers the aggregated example $(e_k, c)$
• For all $k \in [1, n]$ and all $c' \in C \setminus \{c\}$, if $H_i^c$ does not cover $(e_{i,k}, c')$ and if $L_i \cap L_j = \emptyset$ for all $j \neq i$, then $H_i^c$ does not cover the aggregated negative example $(e_k, c')$
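Both properties can be checked on a toy case. The Python sketch below is a propositional simplification under stated assumptions: a rule is a set of facts, coverage is set inclusion, an aggregated example is the union of the facts of each source, and the ECG and ABP vocabularies are disjoint. All names and facts are invented for illustration.

```python
from typing import FrozenSet


def covers(rule: FrozenSet[str], facts: FrozenSet[str]) -> bool:
    """Propositional coverage: every fact required by the rule is present."""
    return rule <= facts


# Disjoint vocabularies: ECG facts never appear in ABP examples and vice versa.
ecg_positive = frozenset({"qrs_abnormal", "rr_short"})
ecg_negative = frozenset({"qrs_normal", "rr_short"})
abp_part = frozenset({"systole_normal", "ss_short"})

# Hypothesis learned on the ECG source only.
h_ecg = frozenset({"qrs_abnormal", "rr_short"})

# Aggregated examples: union of the facts of both sources.
agg_positive = ecg_positive | abp_part
agg_negative = ecg_negative | abp_part

covered_mono = covers(h_ecg, ecg_positive)
covered_agg = covers(h_ecg, agg_positive)
rejected_mono = not covers(h_ecg, ecg_negative)
rejected_agg = not covers(h_ecg, agg_negative)
```

Adding facts can only keep a covered example covered, which gives the first property. The second relies on the disjointness: the ABP facts cannot supply the ECG fact that the rule is missing, so the aggregated negative example stays uncovered.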
Electrical activity of the heart: the elements used for learning (channels II and V)
Hemodynamic channel
Attributes:
- diastole/systole amplitude
- amplitude difference between diastole and systole
- time interval between diastole and systole (sd, ds, dd, ss, …)