Introduction to Machine Learning


  1. Introduction to Machine Learning
     Alejandro Ceccatto, Instituto de Física Rosario, CONICET-UNR
     1st Red ProTIC School, Tandil, April 18-28, 2006

  2. Bibliography
     • Machine Learning, Tom Mitchell (McGraw-Hill, 1997)
     • Principal Component Analysis, Ian Jolliffe (Springer-Verlag, 2002)
     • An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Nello Cristianini and John Shawe-Taylor (Cambridge University Press, 2000)
     • The Elements of Statistical Learning, Trevor Hastie, Robert Tibshirani, and Jerome Friedman (Springer, 2001)

  3. Machine Learning
     • The field of Machine Learning is concerned with the question of how to construct computer programs that automatically improve with experience.
     • The purpose of this course is to present the key algorithms and theory that form the core of Machine Learning.

  4. Machine Learning
     • Interdisciplinary nature of the material: Statistics, Artificial Intelligence, Information Theory, etc.
     • Basic question: how can computers be programmed to learn?

  5. Machine Learning
     Intelligent Data Analysis:
     • Intelligent application of data-analytic tools (Statistics)
     • Application of “intelligent” data-analytic tools (Machine Learning)
     Modern world: a data-driven world (industrial, commercial, financial, and scientific activities)

  6. Why Machine Learning?
     • Recent progress in algorithms and theory
     • Growing flood of online data
     • Computational power available

  7. Why Machine Learning?
     • Niches for Machine Learning:
       • Data mining: using historical data to improve decisions (medical records → medical knowledge)
       • Software applications we can’t program by hand (autonomous driving, speech recognition)
       • Self-customizing programs (a newsreader that learns user interests)

  8. Why Machine Learning?
     • Data Mining
       • Data: recorded facts
       • Information: the set of patterns, or expectations, that underlie the data
       • Data mining: extraction of implicit, previously unknown, and potentially useful information from data
       • Machine Learning: provides the technical basis of data mining

  9. Why Machine Learning?
     • Typical data mining tasks
       • Risk of emergency cesarean section
         Given:
         • 9714 patient records, each describing a pregnancy and birth
         • Each patient record contains 215 features
         Learn to predict:
         • Classes of patients at high risk for emergency cesarean section

  10. Why Machine Learning?

  11. Why Machine Learning?
      One of the learned rules:
      IF no previous vaginal delivery, and abnormal 2nd-trimester ultrasound, and malpresentation at admission
      THEN probability of emergency C-section is 0.6
      • Over training data: 26/41 = 0.63
      • Over test data: 12/20 = 0.60
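      The probability attached to a rule like the one on slide 11 can be read as the fraction of records covered by the IF-part that actually had the outcome. The following is a minimal sketch of that calculation, not the method used in the original study. The field names, the `make_record` helper, and the records themselves are hypothetical; the synthetic data is built only so that the covered records reproduce the test-set count quoted on the slide (12 of 20).

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, bool]


def rule_matches(record: Record) -> bool:
    """IF-part of the slide-11 rule. Field names are hypothetical."""
    return (
        not record["previous_vaginal_delivery"]
        and record["abnormal_2nd_trimester_ultrasound"]
        and record["malpresentation_at_admission"]
    )


def rule_confidence(
    records: List[Record],
    matches: Callable[[Record], bool],
    outcome: str = "emergency_c_section",
) -> Tuple[int, int]:
    """Return (positives, covered) for the records the rule applies to."""
    covered = [r for r in records if matches(r)]
    positives = sum(1 for r in covered if r[outcome])
    return positives, len(covered)


def make_record(matching: bool, positive: bool) -> Record:
    """Build a synthetic record that does or does not satisfy the rule."""
    return {
        "previous_vaginal_delivery": not matching,
        "abnormal_2nd_trimester_ultrasound": True,
        "malpresentation_at_admission": True,
        "emergency_c_section": positive,
    }


# Synthetic stand-in for a test set: 20 records covered by the rule
# (12 with the outcome, 8 without) plus 30 records the rule ignores.
test_records = (
    [make_record(True, True) for _ in range(12)]
    + [make_record(True, False) for _ in range(8)]
    + [make_record(False, i % 5 == 0) for i in range(30)]
)

positives, covered = rule_confidence(test_records, rule_matches)
print(f"{positives}/{covered} = {positives / covered:.2f}")  # prints 12/20 = 0.60
```

      Running the same function on training and held-out records separately gives the two figures a slide like this reports; similar values on both sets suggest the rule is not just fitting noise in the training data.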

  12. Why Machine Learning?
      • Credit risk analysis

  13. Why Machine Learning?
      • Customer retention

  14. Why Machine Learning?
      • Problems too difficult to program by hand

  15. Why Machine Learning?
      • Software that customizes itself to the user

  16. Where is This Headed?
      Today: tip of the iceberg
      • First-generation algorithms: neural nets, decision trees, regression, ...
      • Applied to well-formatted databases
      Tomorrow: enormous impact
      • Learn across mixed-media data and multiple databases
      • Learn by active experimentation
      • Learn decisions rather than predictions
      • Cumulative, lifelong learning

  17. Where is This Headed?
      Autonomous entities?
      “I'm sorry Dave; I can't let you do that.” (HAL 9000 in 2001: A Space Odyssey, by Arthur C. Clarke)
