
Machine Learning Group

Machine Learning Group. group leader: Prof. Olga Štěpánková members: Dr. Jiří Kléma, Dr. Filip Železný, Lenka Nováková, Michal Jakob, Pavel Novák http://gerstner.felk.cvut.cz/machine-learning/. The Gerstner laboratory for intelligent decision making and control, Czech Technical University.


Presentation Transcript


  1. Machine Learning Group group leader: Prof. Olga Štěpánková members: Dr. Jiří Kléma, Dr. Filip Železný, Lenka Nováková, Michal Jakob, Pavel Novák http://gerstner.felk.cvut.cz/machine-learning/ The Gerstner laboratory for intelligent decision making and control, Czech Technical University Workshop on Intelligent and Adaptive Systems in Medicine, Mar 31-Apr 1, 2003

  2. Introduction • Research: centers on Machine Learning and its applications in Data Mining; we teach the principles of both fields in several courses. • Theory: basic ML principles, such as Instance-Based Learning, Inductive Logic Programming and various probabilistic/randomization techniques. • Applications: several real-life projects, namely in the medical domain (heart-surgery mortality prediction), in industry (intelligent fault diagnosis), and in telecommunications (tracing patterns in callers' behaviour). • Development: ML/DM systems for both practical and experimental purposes.

  3. Research Streams • Probabilistic Reasoning in Relational Learning - learning hypotheses in first-order logic (the field known as Inductive Logic Programming) and Bayesian Inference • Data Preprocessing for Machine Learning and Data Mining - adapting a data preprocessing tool for Inductive Logic Programming and other ML tools • Learning in Multi-Agent Systems - the ability to improve the future performance of the whole MA system, a part of it, or a single agent • Instance-Based Learning - automated optimization of IBL predictive and classification systems

  4. Projects • Data Mining and Decision Support for Business Competitiveness: A European Virtual Enterprise, Sol-Eu-Net (IST-1999-11495), January 2000 - March 2003, http://soleunet.ijs.si (O. Štěpánková) • Data-Mining and Decision Support Integration, CTU 0209013, 2002 (J. Kléma) • KDnet (team member: CTU) • European Network on Intelligent Technologies for Smart Adaptive Systems, Eunite (IST-2000-29207), 2001-2003 (team member: CTU) • Inductive Logic Programming Network of Excellence, ILPNet2 (INCO Network of Excellence 977 102) (team member: CTU)

  5. Selected Publications (1) • O. Štěpánková, J. Kléma, P. Mikšovský: Applying Sumatra TT and RAMSYS: Prediction of Resources for a Health Farm. To appear in Data Mining and Decision Support: Integration and Collaboration. To be published by Kluwer in 2003. • O. Štěpánková, P. Aubrecht, Z. Kouba, P. Mikšovský: Preprocessing for Data Mining and Decision Support. To appear in Data Mining and Decision Support: Integration and Collaboration. To be published by Kluwer in 2003. • J. Kléma, F. Železný, O. Štěpánková: Strojové učení a dobývání znalostí z dat, chapter in Artificial Intelligence (4) book, In Czech. To be published by Academia Publishers in 2003. • F. Železný: Two probabilistic approaches to first-order theory induction (PhD. Thesis, 2003). • J. Kléma: Prototype Applications of Instance-Based Reasoning (PhD. Thesis, 2002).

  6. Selected Publications (2) • J. Kléma, J. Kubalík, J. Palouš: Optimized Model Tuning in Medical Systems. In: Proceedings - Computer-Based Medical Systems. New York: IEEE Computer Society Press, 2002, vol. 1. • F. Železný, O. Štěpánková: Efektivní převod multirelační database na jednorelační reprezentaci. In: Proceedings Znalosti 2003, Ostrava: VŠB-TUO, 2003, vol. 1. • O. Štěpánková, J. Kléma, Š. Lauryn, P. Mikšovský, L. Nováková: Data Mining for Resource Allocation: A Case Study. In: Knowledge and Technology Integration in Production and Services. New York: Kluwer Academic / Plenum Publishers, 2002. • F. Železný: Learning Functions from Imperfect Positive Data. In: Inductive Logic Programming. Berlin: Springer, 2001, vol. 1, p. 248-259. ISBN 3-540-42538-1.

  7. Research Partners • Rockwell Automation, USA - pump fault diagnostics • Grundfos, Denmark - intelligent pump diagnostics • TeleDataElectronics, Germany - prediction of gas consumption • CertiCon, CZ - OPS and SPS predictive tools, medical diagnostics • IKEM Prague, CZ - heart-surgery mortality prediction • Atlantis Telecom, CZ - data mining in telephony • University of Maribor, System Design Laboratory, Slovenia - decision support in medical systems

  8. Developed Systems • iBARET (Instance-BAsed REasoning Tool) - a universal tool for modelling and predicting in domains described by a vector of numeric or symbolic values • PreDO (PREcisely Defined Objects) - a system that generates experimental data for training and testing of ML algorithms • CIDeT (Clustering and Induction of DEcision Trees) - a system for unsupervised learning • RSD - First-Order Feature Construction and Relational Subgroup Discovery
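The slide describes iBARET only as an instance-based tool for domains described by a vector of numeric or symbolic values. As an illustration of that general idea, and not of iBARET's actual implementation, the following minimal sketch shows a k-nearest-neighbour classifier over mixed attribute vectors. The function names, the range-normalised distance and the majority vote are all assumptions chosen for the example.

```python
from collections import Counter


def mixed_distance(a, b, ranges):
    """Sum per-attribute distances: range-normalised absolute difference for
    numeric attributes, 0/1 mismatch for symbolic ones (illustrative choice)."""
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if isinstance(x, (int, float)) and isinstance(y, (int, float)):
            total += abs(x - y) / r if r else 0.0
        else:
            total += 0.0 if x == y else 1.0
    return total


def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    stored instances. `train` is a list of (attribute_tuple, label) pairs."""
    ranges = []
    for i in range(len(query)):
        numeric = [f[i] for f, _ in train if isinstance(f[i], (int, float))]
        ranges.append(max(numeric) - min(numeric) if numeric else 0.0)
    neighbours = sorted(
        train, key=lambda item: mixed_distance(item[0], query, ranges)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (numeric reading, symbolic category) -> class label.
examples = [
    ((1.0, "a"), "x"),
    ((1.2, "a"), "x"),
    ((5.0, "b"), "y"),
    ((5.5, "b"), "y"),
]
print(knn_predict(examples, (1.1, "a"), k=3))
```

No model is built in advance: the stored instances themselves are the model, which is what distinguishes instance-based methods from, for example, decision-tree induction.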

  9. ML and KDD applications (1) • Resource allocation at a spa • Input: relational data (patients ~ 20,000, procedures ~ 40, procedure prescriptions ~ 1,500,000, forbidden procedure combinations ~ x10) • Goals: • project start: exploratory analysis to find interesting patterns or regularities that can help improve resource allocation and control of the facilities • after analysis: predict in advance the overall number of prescriptions of specific health procedures within a specific time period, and identify previously unknown groups of clients exhibiting characteristic behavior or requirements for procedures. • Algorithms and tools: • preprocessing was a demanding task, handled with SumatraTT • "regression per partes" used for prediction • collaborative task solved by several remote teams

  10. ML and KDD applications (1) • Results: • accurate and timely prediction (88% accuracy based on a vague client description) • understandable knowledge gained during prediction (groups of clients) • Practical utilization: • application/modification for other similar facilities • incorporated into the information system developed by Lauryn v.o.s.

  11. ML and KDD applications (2) • Data mining in telecommunications • Task • Analyze the logging file of an enterprise branch telephone exchange • Create descriptions of recognized events • Discover frequent patterns in events • Visualize data • Solution • Learn event descriptions from generated event examples • Decompose structured logging data into multiple relations • Apply descriptive and predictive multi-relational machine learning algorithms, such as Inductive Logic Programming, as well as visualization techniques
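The "discover frequent patterns in events" step can be illustrated with a much simpler technique than the multi-relational learning named on the slide. The sketch below counts contiguous event subsequences across call sessions and keeps those that reach a minimum support. It is a plain sequence-counting example, not Inductive Logic Programming, and the event names and the function `frequent_patterns` are hypothetical.

```python
from collections import Counter


def frequent_patterns(sequences, length=2, min_support=2):
    """Return contiguous event patterns of the given length that occur in at
    least `min_support` sequences (each sequence counts a pattern once)."""
    support = Counter()
    for seq in sequences:
        seen = {
            tuple(seq[i:i + length])
            for i in range(len(seq) - length + 1)
        }
        support.update(seen)
    return {p: c for p, c in support.items() if c >= min_support}


# Hypothetical reconstructed events, one list per call session.
sessions = [
    ["ring", "answer", "hangup"],
    ["ring", "answer", "transfer", "hangup"],
    ["ring", "busy"],
]
print(frequent_patterns(sessions, length=2, min_support=2))
```

Counting each pattern at most once per session makes the support a count of sessions rather than of raw occurrences, so one long, repetitive session cannot dominate the result.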

  12. ML and KDD applications (2) • [Diagram with boxes: Telecomm. Traffic, Logging Data, Telephone Exchange, Rules, Event Reconstruction, Event Descriptions, Prediction] • Interconnection of parts of the DM/DS system. Machine learning algorithms are applied in the red boxes.

  13. ML and KDD applications (2) • Results • Most events successfully recognized • Insight into telecommunication habits in the enterprise • Predictive rules plugged in for decision support of the exchange operator
