Machine Learning Theory



Presentation Transcript


  1. Machine Learning Theory Lecture 1, August 23rd 2011 Maria-Florina (Nina) Balcan

  2. Machine Learning: Image Classification, Document Categorization, Speech Recognition, Protein Classification, Spam Detection, Branch Prediction, Fraud Detection, Playing Games, Computational Advertising.

  3. Goals of Machine Learning Theory
     Develop and analyze models to:
     • understand what kinds of tasks we can hope to learn, and from what kind of data
     • understand what types of guarantees we might hope to achieve
     • prove guarantees for practically successful algorithms (when will they succeed, how long will they take?)
     • develop new algorithms that provably meet desired criteria
     Interesting connections to other areas, including: Combinatorial Optimization, Algorithms, Game Theory, Probability & Statistics, Complexity Theory, Information Theory.

  4. Example: Supervised Classification
     Decide which emails are spam and which are important (supervised classification).
     Goal: use the emails seen so far to produce a good prediction rule for future data.

  5. Example: Supervised Classification
     Represent each message by features (e.g., keywords, spelling, etc.); each example comes with a label (spam / not spam).
     Reasonable rules:
     • Predict SPAM if unknown AND (money OR pills)
     • Predict SPAM if 2·money + 3·pills − 5·known > 0
     [Figure: positively (+) and negatively (−) labeled examples in feature space, linearly separable by the second rule.]
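To make the second rule concrete, here is a minimal sketch (not from the lecture; the keyword features, the sender check, and the weights 2, 3, -5 are simply the slide's illustrative example) of scoring a message and predicting SPAM when the weighted feature sum is positive:

# Minimal sketch of the slide's linear rule:
# predict SPAM if 2*money + 3*pills - 5*known > 0, with binary features.

def extract_features(message, sender, known_senders):
    """Map a raw message to the binary features used by the rule."""
    text = message.lower()
    return {
        "money": int("money" in text),
        "pills": int("pills" in text),
        "known": int(sender in known_senders),
    }

def predict_spam(features):
    """Apply the linear rule from the slide."""
    score = 2 * features["money"] + 3 * features["pills"] - 5 * features["known"]
    return score > 0

# Example with made-up data: unknown sender mentioning both keywords -> SPAM.
known = {"alice@example.com"}
msg = "Cheap pills, easy money!"
print(predict_spam(extract_features(msg, "stranger@example.net", known)))  # True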

  6. Two Main Aspects of Supervised Learning
     • Algorithm design: how to optimize? Automatically generate rules that do well on the observed data.
     • Confidence bounds, generalization guarantees, sample complexity: how confident can we be that a rule will do well on future data? Well understood for passive supervised learning.
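As one standard illustration of what such a guarantee looks like (textbook material, not a statement taken from this slide): for a finite hypothesis class H, Hoeffding's inequality plus a union bound implies that with probability at least 1 - δ over m i.i.d. training examples,

\[
  \operatorname{err}(h) \;\le\; \widehat{\operatorname{err}}(h)
    + \sqrt{\frac{\ln|H| + \ln(2/\delta)}{2m}}
  \qquad \text{simultaneously for all } h \in H,
\]

so once m is large relative to ln|H|, a rule with small training error is, with high confidence, also accurate on future data.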

  7. Other Protocols for Supervised Learning
     • Semi-Supervised Learning: using cheap unlabeled data in addition to labeled data.
     • Active Learning: the algorithm interactively asks for labels of informative examples. Theoretical understanding was severely lacking until a couple of years ago; lots of progress recently. We will cover some of these.
     • Learning with Membership Queries
     • Statistical Query Learning
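For intuition about the active-learning protocol, here is a generic pool-based "uncertainty sampling" loop, one common form of active learning (a sketch only, using scikit-learn's LogisticRegression and synthetic data; it is not necessarily one of the algorithms analyzed in this course):

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Synthetic pool: two Gaussian blobs; labels are "hidden" until queried.
X_pool = np.vstack([rng.normal(-1.0, 1.0, (200, 2)), rng.normal(1.0, 1.0, (200, 2))])
y_pool = np.array([0] * 200 + [1] * 200)

labeled = [0, 200]                        # one seed example from each class
unlabeled = [i for i in range(len(X_pool)) if i not in labeled]

model = LogisticRegression()
for _ in range(20):                       # budget of 20 label queries
    model.fit(X_pool[labeled], y_pool[labeled])
    probs = model.predict_proba(X_pool[unlabeled])[:, 1]
    # Query the example the current model is least certain about (prob near 0.5).
    idx = unlabeled[int(np.argmin(np.abs(probs - 0.5)))]
    labeled.append(idx)
    unlabeled.remove(idx)

print("accuracy on the full pool:", model.score(X_pool, y_pool))

The point of the protocol is that the learner chooses which labels to pay for, so informative examples near the current decision boundary get labeled first.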

  8. Structure of the Class
     Passive Supervised Learning
     • Basic models: PAC, SLT.
     • Simple algorithms and hardness results for supervised learning.
     • Standard sample complexity results (VC dimension).
     • Weak-learning vs. strong-learning.
     • Classic, state-of-the-art algorithms: AdaBoost and SVMs (kernel-based methods).
     • Modern sample complexity results: Rademacher complexity; margin analysis of Boosting and SVMs.
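Since AdaBoost is listed among the classic algorithms to be covered, here is a minimal generic sketch of AdaBoost with decision stumps (a textbook version under simplifying assumptions: labels in {-1, +1} and numeric features; it is not the course's own presentation or code):

import numpy as np

def train_stump(X, y, w):
    """Pick the threshold stump (feature, threshold, polarity) with the
    smallest weighted error under the current example weights w."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thresh in np.unique(X[:, j]):
            for polarity in (1, -1):
                pred = np.where(polarity * (X[:, j] - thresh) > 0, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, thresh, polarity)
    return best, best_err

def stump_predict(stump, X):
    j, thresh, polarity = stump
    return np.where(polarity * (X[:, j] - thresh) > 0, 1, -1)

def adaboost(X, y, rounds=20):
    """Return a list of (alpha, stump) pairs; y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                     # start with uniform weights
    ensemble = []
    for _ in range(rounds):
        stump, err = train_stump(X, y, w)
        err = min(max(err, 1e-12), 1 - 1e-12)   # keep alpha finite
        alpha = 0.5 * np.log((1 - err) / err)
        w = w * np.exp(-alpha * y * stump_predict(stump, X))  # up-weight mistakes
        w = w / w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def predict(ensemble, X):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, X) for alpha, stump in ensemble)
    return np.sign(score)

The key step is the reweighting: each example's weight is multiplied by exp(-alpha * y * h(x)), so examples the current weak learner gets wrong gain weight and dominate the next round.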

  9. Structure of the Class
     Other Learning Paradigms
     • Incorporating Unlabeled Data in the Learning Process
     • Incorporating Interaction in the Learning Process:
       • Active Learning
       • Learning with Membership Queries
     Other Topics
     • Classification noise and the Statistical-Query model
     • Learning Real-Valued Functions
     • Online Learning and Game Theory; connections to Boosting

  10. Admin
     • Course web page: http://www.cc.gatech.edu/~ninamf/ML11/
     • 4-5 homework assignments [50%]: exercises/problems (pencil-and-paper problem-solving variety).
     • Project [35%]: explore a theoretical question, try some experiments, or read a couple of papers and explain the idea. Short writeup and possibly a presentation. Small groups OK.
     • Take-home exam [15%].
