1 / 39

6.S093 Visual Recognition through Machine Learning Competition

6.S093 Visual Recognition through Machine Learning Competition. Joseph Lim and Aditya Khosla Acknowledgment: Many slides from David Sontag and Machine Learning Group of UT Austin. Image by kirkh.deviantart.com. What is Machine Learning???. What is Machine Learning???.

ataret
Download Presentation

6.S093 Visual Recognition through Machine Learning Competition

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 6.S093 Visual Recognition through Machine Learning Competition Joseph Lim and Aditya Khosla Acknowledgment: Many slides from David Sontag and Machine Learning Group of UT Austin Image by kirkh.deviantart.com

  2. What is Machine Learning???

  3. What is Machine Learning??? According to the Wikipedia, Machine learning concerns the construction and study of systems that can learn from data.

  4. What is ML? • Classification • From data to discrete classes

  5. What is ML? • Classification • From data to discrete classes • Regression • From data to a continuous value

  6. What is ML? • Classification • From data to discrete classes • Regression • From data to a continuous value • Ranking • Comparing items

  7. What is ML? • Classification • From data to discrete classes • Regression • From data to a continuous value • Ranking • Comparing items • Clustering • Discovering structure in data

  8. What is ML? • Classification • From data to discrete classes • Regression • From data to a continuous value • Ranking • Comparing items • Clustering • Discovering structure in data • Others

  9. Classification:data to discrete classes • Spam filtering Spam Regular Urgent

  10. Classification:data to discrete classes • Object classification Fries Hamburger None

  11. Regression: data to a value • Stock market

  12. Regression: data to a value • Weather prediction

  13. Clustering: Discovering structure Set of Images

  14. Clustering

  15. Clustering • Unsupervised learning • Requires data, but NO LABELS • Detect patterns • Group emails or search results • Regions of images • Useful when don’t know what you’re looking for

  16. Clustering • Basic idea: group together similar instances • Example: 2D point patterns

  17. Clustering • Basic idea: group together similar instances • Example: 2D point patterns

  18. Clustering • Basic idea: group together similar instances • Example: 2D point patterns

  19. Clustering • Basic idea: group together similar instances • Example: 2D point patterns • What could “similar” mean? • One option: small Euclidean distance • Clustering is crucially dependent on the measure of similarity (or distance) between “points”

  20. K-Means • An iterative clustering algorithm • Initialize: Pick K random points as cluster centers • Alternate: • Assign data points to closest cluster center • Change the cluster center to the average of its assigned points • Stop when no points’ assignments change Animation is from Andrey A. Shabalin’s website

  21. K-Means • An iterative clustering algorithm • Initialize: Pick K random points as cluster centers • Alternate: • Assign data points to closest cluster center • Change the cluster center to the average of its assigned points • Stop when no points’ assignments change

  22. Properties of K-means algorithm • Guaranteed to converge in a finite number of iterations • Running time per iteration: • Assign data points to closest cluster center O(KN) • Change the cluster center to the average of its assigned points O(N)

  23. Example: K-Means for Segmentation K=2 Original Goal of Segmentation is to partition an image into regions each of which has reasonably homogenous visual appearance.

  24. Example: K-Means for Segmentation K=2 K=10 K=3 Original

  25. Pitfalls of K-Means • K-means algorithm is heuristic • Requires initial means • It does matter what you pick! • K-means can get stuck K=1 should be better K=2 should be better

  26. Classification

  27. Typical Recognition System +1 bus classify features -1 not bus x F(x) y • Extract features from an image • A classifier will make a decision based on extracted features

  28. Typical Recognition System +1 bus classify features -1 not bus x F(x) y • Extract features from an image • A classifier will make a decision based on extracted features

  29. Classification • Supervised learning • Requires data, AND LABELS • Useful when you know what you’re looking for

  30. Linear classifier • Which of these linear separators is optimal?

  31. Maximum Margin Classification • Maximizing the margin is good according to intuition and PAC theory. • Implies that only support vectors matter; other training examples are ignorable.

  32. Classification Margin • Distance from example xi to the separator is • Examples closest to the hyperplane are support vectors. • Marginρ of the separator is the distance between support vectors. ρ r

  33. Linear SVM Mathematically • Let training set {(xi, yi)}i=1..n, xi  Rd, yi{-1, 1}be separated by a hyperplane withmargin ρ. Then for each training example (xi, yi): • For every support vector xs the above inequality is an equality. After rescaling w and b by ρ/2in the equality, we obtain that distance between each xs and the hyperplane is • Then the margin can be expressed through (rescaled) w and b as: wTxi+ b ≤ - ρ/2 if yi= -1 wTxi+ b ≥ ρ/2 if yi= 1 yi(wTxi+ b) ≥ ρ/2 

  34. Linear SVMs Mathematically (cont.) • Then we can formulate the quadratic optimization problem: Find w and b such that is maximized and for all (xi, yi), i=1..n : yi(wTxi+ b) ≥ 1 Find w and b such that Φ(w) = ||w||2=wTw is minimized and for all (xi, yi), i=1..n : yi (wTxi+ b) ≥ 1

  35. Is this perfect?

  36. Is this perfect?

  37. Soft Margin Classification • What if the training set is not linearly separable? • Slack variablesξi can be added to allow misclassification of difficult or noisy examples, resulting margin called soft. ξi ξi

  38. Soft Margin Classification Mathematically • The old formulation: • Modified formulation incorporates slack variables: • Parameter C can be viewed as a way to control overfitting: it “trades off” the relative importance of maximizing the margin and fitting the training data. Find w and b such that Φ(w) =wTw is minimized and for all (xi,yi),i=1..n: yi (wTxi+ b) ≥ 1 Find w and b such that Φ(w) =wTw + CΣξi is minimized and for all (xi,yi),i=1..n: yi (wTxi+ b) ≥ 1 – ξi, , ξi≥ 0

  39. Exercise • Exercise 1. Implement K-means • Exercise 2. Play with SVM’s C-parameter

More Related