
Intelligent Data Analysis (IDA)

by Josipa Kern, PhD, Andrija Stampar School of Public Health, Medical School, University of Zagreb, Zagreb, Croatia


Presentation Transcript


  1. Intelligent Data Analysis (IDA) by Josipa Kern, PhD, Andrija Stampar School of Public Health, Medical School, University of Zagreb, Zagreb, Croatia

  2. Interest and Excitement for Intelligent Data Analysis • Decision making requires information and knowledge • Data processing can provide them • Multidimensional problems call for methods of adequate, in-depth data processing and analysis

  3. Learning Objectives • To understand the concept of IDA • To become familiar with websites and literature on IDA • To become familiar with some tools for IDA • To learn how to use IDA tools and how to validate IDA results

  4. Performance Objectives • Recognize problems that call for IDA • Prepare data and carry out the analysis • Validate and interpret the results of IDA

  5. IDA is … • an interdisciplinary study concerned with the effective analysis of data; • used for extracting useful information from large quantities of online data; • used for extracting desirable knowledge or interesting patterns from existing databases.

  6. IDA or … • Data mining • Knowledge acquisition from data • Genetic algorithm-based rule discovery • Knowledge discovery • Learning classifier system • Machine learning • etc.

  7. IDA gives knowledge …

  8. Knowledge is … • the distillation of information that has been collected, classified, organized, integrated, abstracted and value-added; • at a level of abstraction higher than the data and information on which it is based, so it can be used to deduce new information and new knowledge; • usually understood in the context of human expertise used in solving problems.

  9. Knowledge acquisition … • The process of eliciting, analyzing, transforming, classifying, organizing and integrating knowledge and representing that knowledge in a form that can be used in a computer system.

  10. Knowledge in a domain can be expressed as a number of rules

  11. A rule is … a formal way of specifying a recommendation, directive, or strategy, expressed as "IF premise THEN conclusion" or "IF condition THEN action".

  12. How to discover rules hidden in the data?

  13. Some tools for IDA … • See5: a program for analyzing data and generating classifiers in the form of decision trees and/or rule sets. http://www.rulequest.com

  14. Some tools for IDA … • Cubist: analyzes data and generates rule-based piecewise linear models, that is, collections of rules, each with an associated linear expression for computing a target value. http://www.rulequest.com
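
To make "rules, each with an associated linear expression" concrete, here is a minimal Python sketch of a rule-based piecewise linear model. It is not Cubist's implementation: the attribute name `age`, the thresholds, and the coefficients are all hypothetical, and averaging the predictions of all matching rules is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Case = Dict[str, float]


@dataclass
class LinearRule:
    condition: Callable[[Case], bool]   # IF part: when the rule applies
    intercept: float                    # THEN part: a linear expression
    coefficients: Dict[str, float]

    def predict(self, case: Case) -> float:
        # intercept + sum of coefficient * attribute value
        return self.intercept + sum(
            weight * case[name] for name, weight in self.coefficients.items()
        )


# Hypothetical rules: two linear pieces that meet at age = 60.
RULES: List[LinearRule] = [
    LinearRule(lambda c: c["age"] <= 60, intercept=100.0, coefficients={"age": 0.5}),
    LinearRule(lambda c: c["age"] > 60, intercept=70.0, coefficients={"age": 1.0}),
]


def predict(case: Case, rules: List[LinearRule] = RULES) -> float:
    """Average the linear predictions of every rule whose condition holds."""
    matches = [rule.predict(case) for rule in rules if rule.condition(case)]
    if not matches:
        raise ValueError("no rule covers this case")
    return sum(matches) / len(matches)


print(predict({"age": 50.0}))
print(predict({"age": 70.0}))
```

Each rule selects a region of the attribute space, and within that region the target is a straight-line function of the attributes, which is what makes the overall model piecewise linear.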

  15. Some tools for IDA … • ILLM: constructs classification models in the form of rules that represent knowledge about relations hidden in the data. http://dms.irb.hr

  16. Some tools for IDA … • Magnum Opus: finds association rules, providing competitive advantage by revealing underlying interactions between factors within the data. http://www.rulequest.com

  17. Evaluation of IDA results • Absolute & relative accuracy • Sensitivity & specificity • False positive & false negative • Error rate • Reliability of rules • Etc.

  18. Example of IDA: an illustration of IDA using See5

  19. See5…application… • application.names: lists the classes to which cases may belong and the attributes used to describe each case. • Attributes are of two types: discrete attributes have a value drawn from a set of possibilities, and continuous attributes have numeric values.

  20. See5…application… • application.data: provides information on the training cases from which See5 will extract patterns. • The entry for each case consists of one or more lines that give the values for all attributes.

  21. See5…application… • application.test: provides information on the test cases (used for evaluating the results). • The entry for each case consists of one or more lines that give the values for all attributes.

  22. See5…application…example… • Epidemiological study (1970-1990) • Sample: examinees who died from cardiovascular diseases during the period • Question: Did they know they were ill? 1 – they were healthy; 2 – they were ill (drug treatment, positive clinical and laboratory findings)

  23. See5…application…example… • application.names – example
      Goal.
      gender: M,F
      activity: 1,2,3
      age: continuous
      smoking: No,Yes
      …
      Goal: 1,2
      …

  24. See5…application…example… • application.data – example
      M,1,59,Yes,0,0,0,0,119,73,103,86,247,87,15979,?,?,?,1,73,2.5
      M,1,66,Yes,0,0,0,0,132,81,183,239,?,783,14403,27221,19153,23187,1,73,2.6
      M,1,61,No,0,0,0,0,130,79,148,86,209,115,21719,12324,10593,11458,1,74,2.5
      …
      …
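
The data file shown above is plain comma-separated text in which `?` stands for an unknown value. As a rough illustration of how such cases could be read outside See5, the Python sketch below parses the three example lines and counts the missing values per case. The function name `parse_case` is hypothetical, and because most attribute names are elided on the slide, the values are kept as a positional list rather than mapped to names.

```python
from typing import List, Optional


def parse_case(line: str) -> List[Optional[str]]:
    """Split one comma-separated case; '?' (unknown value) becomes None."""
    values = [value.strip() for value in line.split(",")]
    return [None if value == "?" else value for value in values]


# The three example cases from the slide, copied verbatim.
SAMPLE = """\
M,1,59,Yes,0,0,0,0,119,73,103,86,247,87,15979,?,?,?,1,73,2.5
M,1,66,Yes,0,0,0,0,132,81,183,239,?,783,14403,27221,19153,23187,1,73,2.6
M,1,61,No,0,0,0,0,130,79,148,86,209,115,21719,12324,10593,11458,1,74,2.5
"""

cases = [parse_case(line) for line in SAMPLE.splitlines() if line.strip()]

# How many attribute values are unknown in each case
missing_per_case = [sum(value is None for value in case) for case in cases]

for case, missing in zip(cases, missing_per_case):
    print(f"{len(case)} values, {missing} missing")
```

A quick count like this is a useful sanity check before training: every case should have the same number of values as there are attributes in the names file.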

  25. See5…application…example… • Results – example
      Rule 1: (cover 26)
          gender = M
          SBP > 111
          oil_fat > 2.9
          -> class 1 [0.929]

  26. See5…application…example… • Results – example
      Rule 4: (cover 14)
          smoking = Yes
          SBP > 131
          glucose > 93
          glucose <= 118
          oil_fat <= 2.9
          -> class 2 [0.938]
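
Rules 1 and 4 above are direct instances of the "IF premise THEN conclusion" form: every listed condition must hold for the rule to apply. The Python sketch below encodes the two rules as data and checks them against a single case. The representation and the function name `rule_fires` are illustrative choices rather than anything See5 produces, and the example case is invented (it is not taken from the study data). Unknown values, which See5 can handle, are not covered here.

```python
import operator
from typing import Any, Dict, List, Tuple

# (attribute, operator, value), mirroring one condition line of a rule
Condition = Tuple[str, str, Any]

OPERATORS = {"=": operator.eq, ">": operator.gt, "<=": operator.le}

# Rules 1 and 4 as printed on the slides
RULES: Dict[str, Dict[str, Any]] = {
    "Rule 1": {
        "conditions": [("gender", "=", "M"), ("SBP", ">", 111), ("oil_fat", ">", 2.9)],
        "predicted_class": 1,
    },
    "Rule 4": {
        "conditions": [
            ("smoking", "=", "Yes"),
            ("SBP", ">", 131),
            ("glucose", ">", 93),
            ("glucose", "<=", 118),
            ("oil_fat", "<=", 2.9),
        ],
        "predicted_class": 2,
    },
}


def rule_fires(conditions: List[Condition], case: Dict[str, Any]) -> bool:
    """IF part: the rule applies only when every condition holds."""
    return all(OPERATORS[op](case[attr], value) for attr, op, value in conditions)


# Hypothetical case, invented for illustration
case = {"gender": "M", "smoking": "Yes", "SBP": 140, "glucose": 100, "oil_fat": 2.5}

fired = {
    name: rule["predicted_class"]
    for name, rule in RULES.items()
    if rule_fires(rule["conditions"], case)
}
print(fired)
```

For this invented case only Rule 4 applies: Rule 1 fails on `oil_fat > 2.9`, while all five conditions of Rule 4 are met, so the case would be assigned class 2.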

  27. See5…application…example… • Results – example
      Rule 15: (cover 2)
          SBP <= 111
          oil_fat > 2.9
          -> class 2 [0.750]
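
The bracketed number after each rule is a confidence estimate. The See5 documentation describes it as the Laplace ratio (n - m + 1) / (n + 2), where n is the number of training cases the rule covers and m is how many of those do not belong to the predicted class. The Python sketch below checks that reading against the three rules shown. The slides do not report m, so the values used here are inferred and should be treated as an assumption.

```python
def laplace_confidence(n_covered: int, n_wrong: int) -> float:
    """Laplace ratio (n - m + 1) / (n + 2) for a rule covering n cases, m of them wrong."""
    return (n_covered - n_wrong + 1) / (n_covered + 2)


# (rule, cover from the slide, ASSUMED number of misclassified cases, reported confidence)
EXAMPLES = [
    ("Rule 1", 26, 1, 0.929),
    ("Rule 4", 14, 0, 0.938),
    ("Rule 15", 2, 0, 0.750),
]

for name, cover, assumed_wrong, reported in EXAMPLES:
    estimate = laplace_confidence(cover, assumed_wrong)
    print(f"{name}: cover={cover}, assumed errors={assumed_wrong}, "
          f"estimate={estimate:.4f}, reported={reported}")
```

Under these assumptions the estimates agree with the reported values to three decimals. Rule 15 is the instructive case: it covers only two training cases, so even with no errors its confidence is 0.75 rather than 1.0, which is a reason to treat low-cover rules with caution.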

  28. See5…application…example… • Results – example
      Evaluation on training data (199 cases):
           (a)   (b)    <-classified as
          ----  ----
           107     3    (a): class 1
            17    72    (b): class 2

  29. See5…application…example… • Results – example (training set) Sensitivity=0.97 Specificity=0.81

  30. See5…application…example… • Results – example
      Evaluation on test data (73 cases):
           (a)   (b)    <-classified as
          ----  ----
            43     1    (a): class 1
             3    26    (b): class 2

  31. See5…application…example… • Results – example (test set) Sensitivity=0.98 Specificity=0.90
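
The sensitivity and specificity figures on slides 29 and 31 can be recomputed from the two confusion matrices, along with the accuracy and error rate listed among the evaluation measures. The Python sketch below does this under one assumption inferred from the reported numbers: class 1 is treated as the "positive" class, with rows read as the actual class and columns as the predicted class. The function name `evaluate` is illustrative.

```python
from typing import Dict


def evaluate(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Standard measures from a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly recognized
        "specificity": tn / (tn + fp),   # negatives correctly recognized
        "accuracy": (tp + tn) / total,
        "error_rate": (fn + fp) / total,
    }


# Assumption: class 1 is the positive class.
# Row (a) = actual class 1 -> tp, fn; row (b) = actual class 2 -> fp, tn.
training = evaluate(tp=107, fn=3, fp=17, tn=72)   # 199 training cases
test = evaluate(tp=43, fn=1, fp=3, tn=26)         # 73 test cases

for label, result in (("training", training), ("test", test)):
    summary = ", ".join(f"{name}={value:.2f}" for name, value in result.items())
    print(f"{label}: {summary}")
```

With this reading the training matrix gives sensitivity 0.97 and specificity 0.81, and the test matrix gives 0.98 and 0.90, matching the slides. The test-set figures are the ones that indicate how the rules behave on unseen cases.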

  32. All the suggested IDA tools are available at the URLs mentioned, at least as demo versions. Try your own IDA… Thank you!
