
Lecture 1: Introduction


Presentation Transcript


  1. Lecture 1: Introduction Machine Learning CUNY Graduate Center

  2. Today • Welcome • Overview of Machine Learning • Class Mechanics • Syllabus Review • Basic Classification Algorithm

  3. My research and background • Speech • Analysis of Intonation • Segmentation • Natural Language Processing • Computational Linguistics • Evaluation Measures • All of this research relies heavily on Machine Learning

  4. You • Why are you taking this class? • For Ph.D. students: • What is your dissertation on? • Do you expect it to require Machine Learning? • What is your background and comfort with • Calculus • Linear Algebra • Probability and Statistics • What is your programming language of preference? • C++, Java, or Python are preferred

  5. Machine Learning • Automatically identifying patterns in data • Automatically making decisions based on data • Hypothesis: (Data → Learning Algorithm → Behavior) ≥ (Data → Programmer → Behavior)

  6. Machine Learning in Computer Science Machine Learning sits at the center of: • Speech/Audio Processing • Planning • Robotics • Natural Language Processing • Locomotion • Vision/Image Processing • Biomedical/Chemical Informatics • Financial Modeling • Human Computer Interaction • Analytics

  7. Major Tasks • Regression • Predict a numerical value from “other information” • Classification • Predict a categorical value • Clustering • Identify groups of similar entities • Evaluation

  8. Feature Representations How do we view data? Entity in the World → Feature Extraction → Feature Representation → Machine Learning Algorithm (our focus). Example entities: web pages, user behavior, speech or audio data, vision, wine, people, etc.

  9. Feature Representations

  10. Classification Identify which of N classes a data point, x, belongs to. x is a column vector of features.
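  As a concrete illustration (my own, not from the slide), a data point x as a column vector in NumPy; the features and values are made up:

```python
import numpy as np

# A data point x as a column vector of features.
x = np.array([[68.0],    # height (illustrative)
              [155.0],   # weight (illustrative)
              [31.0]])   # age    (illustrative)
print(x.shape)  # (3, 1) -- a column vector
```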

  11. Target Values In supervised approaches, in addition to a data point, x, we will also have access to a target value, t. Goal of Classification: identify a function y such that y(x) = t.

  12. Feature Representations

  13. Graphical Example of Classification

  14. Graphical Example of Classification ?

  15. Graphical Example of Classification ?

  16. Graphical Example of Classification

  17. Graphical Example of Classification

  18. Graphical Example of Classification

  19. Decision Boundaries

  20. Regression Goal (shared with classification): identify a function y such that y(x) = t. • Regression is a supervised machine learning task. • So a target value, t, is given. • Classification: nominal t • Regression: continuous t

  21. Differences between Classification and Regression • Similar goals: Identify y(x) = t. • What are the differences? • The form of the function, y (naturally). • Evaluation • Root Mean Squared Error • Absolute Value Error • Classification Error • Maximum Likelihood • Evaluation drives the optimization operation that learns the function, y.
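  A minimal sketch of the evaluation measures named above, using their standard definitions (the slide gives no formulas, so the averaging conventions are my assumption):

```python
import math

def rmse(predictions, targets):
    # Root Mean Squared Error: a typical regression evaluation.
    n = len(targets)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n)

def mean_absolute_error(predictions, targets):
    # Absolute value error, averaged over the data.
    n = len(targets)
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / n

def classification_error(predictions, targets):
    # Fraction of points assigned the wrong class.
    n = len(targets)
    return sum(p != t for p, t in zip(predictions, targets)) / n
```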

  22. Graphical Example of Regression ?

  23. Graphical Example of Regression

  24. Graphical Example of Regression

  25. Clustering • Clustering is an unsupervised learning task. • There is no target value to shoot for. • Identify groups of “similar” data points that are “dissimilar” from others. • Partition the data into groups (clusters) that satisfy these constraints: • Points in the same cluster should be similar. • Points in different clusters should be dissimilar.
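  The slide names no particular algorithm; as one standard way to operationalize these constraints, here is a minimal k-means sketch on 1-D points (entirely my example):

```python
import random

def kmeans(points, k, iters=20):
    # Start from k randomly chosen points as cluster centers.
    centers = random.sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest center (similar points group together).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p - centers[i]) ** 2)
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans([1.0, 1.1, 0.9, 5.0, 5.2, 4.8], k=2)
```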

  26. Graphical Example of Clustering

  27. Graphical Example of Clustering

  28. Graphical Example of Clustering

  29. Mechanisms of Machine Learning • Statistical Estimation • Numerical Optimization • Theoretical Optimization • Feature Manipulation • Similarity Measures

  30. Mathematical Necessities • Probability • Statistics • Calculus • Vector Calculus • Linear Algebra • Is this a Math course in disguise?

  31. Why do we need so much math? • Probability Density Functions allow the evaluation of how likely a data point is under a model. • Want to identify good PDFs. (calculus) • Want to evaluate against a known PDF. (algebra)

  32. Gaussian Distributions We use Gaussian Distributions all over the place.

  33. Gaussian Distributions We use Gaussian Distributions all over the place.
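  These two slides show density plots. For reference, the univariate Gaussian PDF being plotted is the standard

  \[
  \mathcal{N}(x \mid \mu, \sigma^2) \;=\; \frac{1}{\sqrt{2\pi\sigma^2}}\,\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
  \]

  with mean \(\mu\) and variance \(\sigma^2\).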

  34. Class Structure and Policies • Course website: • http://eniac.cs.qc.cuny.edu/andrew/gcml-11/syllabus.html • Google Group for discussions and announcements: • http://groups.google.com/gcml-spring2011 • Please sign up for the group ASAP. • Or put your email address on the sign-up sheet, and you will be sent an invitation.

  35. Data Data Data • “There’s no data like more data” • All machine learning techniques rely on the availability of data to learn from. • There is an ever-increasing amount of data being generated, but it’s not always easy to process. • UCI Machine Learning Repository • http://archive.ics.uci.edu/ml/ • LDC (Linguistic Data Consortium) • http://www.ldc.upenn.edu/

  36. Half time. Get Coffee. Stretch.

  37. Decision Trees A classification technique. [Figure: a decision tree branching first on color (green / blue / brown), then on height (h) and weight (w) with thresholds such as <64, <66, <140, <145, <150, <170; leaves are labeled m or f.]

  38. Decision Trees [Same tree figure as the previous slide.] Very easy to evaluate: nested if statements.
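  A minimal sketch of what “nested if statements” means here. Only the first level of splits below color is loosely recoverable from the figure (green → h < 66, blue → w < 150, brown → w < 140); the deeper splits are omitted and the leaf labels are my assumption:

```python
def classify(color, height, weight):
    # Each internal node becomes one if statement.
    if color == "green":
        return "m" if height < 66 else "f"   # leaf labels assumed
    if color == "blue":
        return "f" if weight < 150 else "m"  # leaf labels assumed
    # brown branch tests weight < 140 in the figure
    return "f" if weight < 140 else "m"      # leaf labels assumed

print(classify("green", 64, 155))  # -> "m"
```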

  39. More Formal Definition of a Decision Tree • A tree data structure. • Each internal node corresponds to a feature. • Leaves are associated with target values. • Nodes with nominal features have N children, where N is the number of nominal values. • Nodes with continuous features have two children: one for values less than a break point, and one for values greater than or equal to it.
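  One way this definition could look as a data structure (the names and layout are my own, not the lecture's):

```python
class Node:
    def __init__(self, feature=None, children=None, break_point=None, label=None):
        self.feature = feature          # feature tested at this internal node
        self.children = children or {}  # nominal value (or "<" / ">=") -> child Node
        self.break_point = break_point  # threshold, if the feature is continuous
        self.label = label              # target value, if this node is a leaf

    def predict(self, x):
        if self.label is not None:                  # leaf: return the target value
            return self.label
        value = x[self.feature]
        if self.break_point is not None:            # continuous: two children
            child = self.children["<" if value < self.break_point else ">="]
            return child.predict(x)
        return self.children[value].predict(x)      # nominal: N children
```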

  40. Training a Decision Tree • How do you decide what feature to use? • For continuous features, how do you decide what break point to use? • Goal: optimize classification accuracy.

  41. Example Data Set

  42. Baseline Classification Accuracy • Select the majority class. • Here 6/12 Male, 6/12 Female. • Baseline Accuracy: 50% • How good is each branch? • Measured by its improvement to classification accuracy.

  43. Training Example Possible branch: color • green: 2M / 2F • blue: 2M / 2F • brown: 2M / 2F • 50% accuracy before branch • 50% accuracy after branch • 0% accuracy improvement

  44. Example Data Set

  45. Training Example Possible branch: height < 68 • one side: 5M / 1F • the other: 1M / 5F • 50% accuracy before branch • 83.3% accuracy after branch • 33.3% accuracy improvement

  46. Example Data Set

  47. Training Example Possible branch: weight < 165 • one side: 5M • the other: 1M / 6F • 50% accuracy before branch • 91.7% accuracy after branch • 41.7% accuracy improvement
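  A sketch of the split-scoring these slides walk through: accuracy improvement = accuracy after the branch minus accuracy before it. The counts below are from this slide; the helper function is my own:

```python
def branch_accuracy(groups, total):
    # Each child node predicts its majority class, so each group
    # contributes its majority count to the correct predictions.
    return sum(max(counts.values()) for counts in groups) / total

before = 6 / 12                                # majority-class baseline: 50%
groups = [{"m": 1, "f": 6}, {"m": 5, "f": 0}]  # the weight < 165 split
after = branch_accuracy(groups, 12)
print(f"{after:.1%} after, improvement {after - before:.1%}")
# -> 91.7% after, improvement 41.7%
```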

  48. Training Example Branch on weight < 165; the 5M side is pure, so recursively train the other child (1M / 6F) by branching on height < 68, giving 5F and 1M / 1F. Recursively train child nodes.

  49. Training Example Finished tree: branch on weight < 165 (pure 5M side), then height < 68 (pure 5F side), then weight < 155, separating the remaining 1M and 1F.
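  The finished tree as nested if statements. The thresholds and leaf counts come from slides 47-49, but the branch directions (which side of each threshold is which class) are my assumption, since the figure's layout is not recoverable:

```python
def classify(height, weight):
    # Root: weight < 165 (slide 47); the other side is the pure 5M leaf.
    if weight >= 165:
        return "m"
    # Next: height < 68 (slide 48); one side is the pure 5F leaf.
    if height < 68:
        return "f"
    # Last: weight < 155 (slide 49) separates the remaining 1M / 1F.
    return "m" if weight >= 155 else "f"
```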

  50. Generalization • What is the performance of the tree on the training data? • Is there any way we could get less than 100% accuracy? • What performance can we expect on unseen data?
