1 / 28

W07 The Discovery Challenge on Thrombosis Data

W07 The Discovery Challenge on Thrombosis Data. Data Provider Katsuhiko TAKABAYASHI MD Chiba University Hospital, Japan. A nti- P hospholipid antibody S yndrome (APS). Anti-cardiolipin antibodies (aCL) Lupus anticoagulant (LAC) Induces thrombotic events

kaia
Download Presentation

W07 The Discovery Challenge on Thrombosis Data

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. W07 The Discovery Challenge on Thrombosis Data Data Provider Katsuhiko TAKABAYASHI MD Chiba University Hospital, Japan

  2. Anti-Phospholipid antibody Syndrome (APS) • Anti-cardiolipin antibodies (aCL)Lupus anticoagulant (LAC) • Induces thrombotic events (such as AMI, Stroke, deep venous thrombosis, miscarriage, pulmonary hypertension etc.) • sometimes positive in other collagen diseases (Lupus, Sjoegren syndrome)

  3. TF XI XII VII VIIa IX TF/VIIa XIa XIIa IXa Ca++ VIIIa VIII X Xa Ca++ PL Va V prothrombin thrombin fibrinogen fibrin XIIIa XIII Ca++ Ca++ Mg++ Ca++ Mg++ PLCa++ Mg++ ProteinC ProteinC

  4. Collagen disease SLE APS RA

  5. CollagenDiseases • Autoimmune disease • Rheumatic disease • Connective tissue disease autoantibodies APS Thombosis ; Vessel stasis by blood clots Myocardial Infarction, Stroke etc.

  6. Thrombosis • Whoin APS ? • When ? • Which laboratory data have relations with thrombosis as well as anti-cardiolipin antibodies ?

  7. The Goal of this trial (1) Assessment of validity of each study • If a data mining technique can point out important key factors (aCL, LAC, PT, APTT) which are already known to be related with thrombosis properly from many variants we provided.

  8. The Goal of this trial (2) The Results to expect 1) to identify high risk patients who have no history of thrombosis so far. 2) to predict the time of thrombosis or detect the change of some variants in the course of thrombosis from the series of temporal data.

  9. Evaluation of the results From the current medical point of view,. • Common sense results (positive control) • Probable results • Possible results • unclear results, difficult to evaluate • Nonsense results (negative control)

  10. We cannot judge what we do not know ! The study most results of which have good accordance with current knowledge The study most results of which have low accordance with current knowledge Domain researchers cannot say that other unclear results are also true. Domain researchers cannot believe the rest of unclear results ! Assessment in domain field

  11. MedicalData Set • Medical data set here is from 1241 patients with collagen diseases and 7 basic laboratory data for aCL from 806 cases were provided. • As for temporal laboratory data, 41 items in 57,543 tests totally in 17 years were prepared. • Seventy-six cases had some thrombotic events in their clinical course.

  12. Evaluations from medical aspects The bridge theory ; GeneticProgramming Coursac I et al • It can predict patients’ health state from spe-exams and lab-exams in 99.28%. • CNS lupus has a relation with anti-DNA Ab level and IgM type aCL. • aCL IgM and anti-DNA Ab levels are related independently with the thrombosis in the future.

  13. Evaluations from medical aspects Boulicaut et al δ- strong classification rules • a lot of rules with 100% confidence, but most of them were not useful. • A rule that aCL >2.4 and range of aCL IgM from 1.9 to 2.7 and KCT (-) is SLE. • The rule that sex is M and ANA is 0 is Behcet. • We would like to look at the other rules not written here to find attractive ones.

  14. Evaluations from medical aspects CRISP ( cross-industry standard process ) Jensen S et al • LAC, ANA, U-pro, centromere-type, SSA, SSB,RNP,SM,SCl-70 were strong contributors to predict the presence of thrombosis. • Other possibilities of thrombosis without aCL antibodies.

  15. Evaluations from medical aspects Jensen S et al • Sequential analysis for temporal data did not show interesting results. • It might be difficult to predict the time of thrombosis. One possibility is that the data might be modified by the treatment or prophylaxis.

  16. Bias by physicians • Modification of treatment • Selection of the cases, laboratory data

  17. Evaluations from medical aspects • determined a discriminate function that separates occurrences of thrombosis with very low false negatives. • However, .....is it possible to translate the meaning and make us understood ? • Weightening? Werner J and Fogarty T genetic programming

  18. When the Results Beyond Expert’s Knowledge ability • Complicated relations might be difficult to be explained. No drug relations for three items were tried. • The results through a black box might be ignored by the experts simply because it can not make them understood!

  19. Evaluations from medical aspects SQL ; cross contingency classification • reasonable results as Infozoom. • ANA pattern analysis • Patients with severe attacks have more possibilities of other attacks. • Thrombosis related with the level of aCLs. • Alveolar hemorrhage and CNS attacks are not associated with milder attacks. Zytkow J and Gupta S

  20. Evaluations from medical aspects • Beilken and Spenke (InfoZoom) : by using user friendly interface, easy to understand their test results. They could choose the reasonable and interesting rules. • Levin: by using Wizwhy producing 7356 rules. Complicated rules are difficult to comment because of its complexity. • Taylor : from temporal data missing data disturbed the analysis. Only common sense findings were selected.

  21. Evaluations from medical aspects SQL ; cross contingency classification • reasonable results as Infozoom. • ANA pattern analysis • Patients with severe attacks have more possibilities of other attacks. • Thrombosis related with the level of aCLs. • Alveolar hemorrhage and CNS attacks are not associated with milder attacks. Zytkow J and Gupta S

  22. To obtain the good results efficiently (1) Cleaning of data • Preprocessing the data is very essential by domain researchers who concerned with the database to minimize the noises. • Definition, classification, adjustment etc. • Recognition of the modification by the treatment or prophylaxis. • Indication to treat missing data

  23. To obtain the good results efficiently (2) • To involve medical knowledge as possible with the data set in the beginning • To cooperate with domain researchers to obtain domain knowledge during data mining. Introduction of the domain knowledge

  24. Causal Relation Misjudge in temporary meaning Backward and non-objective relationships Bacteria invades Pneumonia occurs Bacteria has invaded Pneumonia occurs Bacteria will invade Pneumonia occurs

  25. To obtain the good results efficiently (3) • An interactive technique will avoid user’s discontent of a black box and assist to drive to the right direction. • Hypothetico-deductive method will be easily accepted by physicians. Cooperation with domain researchers

  26. Causal Relation Misjudge in temporary meaning Backward and non-objective relationships It rains The road is wet. It rains The road is wet. It will rain The road is wet.

  27. Data mining • Retrospective approach ; not arranged, many noises. • Data ; More genuine and adequate data set must be prepared. Terms, definitions and background must be introduced beforehand. • Rules ; Complicated rules (relations between more than 3 items) found by this analysis cannot be explained nor proved whether they are true from medical approach. • 3種の薬剤の治験はない

  28. Bias By Physicians • Modification of treatment • Selection of the cases, laboratory data • change of the disease; before and after the events (thrombosis) By Accident

More Related