
Support Vector Machines



  1. Support Vector Machines Andrew Horwitz, MUMT 621

  2. A brief history… Developed by Vladimir Vapnik in the 1990s at AT&T. The standard algorithm was developed by Vapnik and Corinna Cortes a few years later. Used for image analysis: identification of characters and human form recognition.

  3. How would you classify this data? Black dots = +1, white dots = 0/−1 (technically boolean class labels). (graph adapted from Moore 2003)

  4. Optimal Hyperplane – “Maximum Margin Linear Classifier” • The goal is to find the line farthest from the closest points of both classes • The line is placed equidistant from those points • The “margin” is the distance from the closest black point(s) to the closest white point(s) (graph taken from Moore 2003)
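
To make this concrete, here is a minimal sketch (not from the deck) that fits a maximum-margin linear classifier with scikit-learn; the toy points are invented for illustration, and the very large C approximates a hard margin:

    # Minimal maximum-margin classifier sketch (toy data, made up for illustration).
    import numpy as np
    from sklearn.svm import SVC

    # Two linearly separable clusters: class +1 ("black dots"), class -1 ("white dots").
    X = np.array([[2.0, 2.0], [3.0, 3.0], [2.5, 3.5],   # +1 points
                  [0.0, 0.0], [1.0, 0.5], [0.5, 1.0]])  # -1 points
    y = np.array([+1, +1, +1, -1, -1, -1])

    clf = SVC(kernel="linear", C=1e6)  # very large C approximates a hard margin
    clf.fit(X, y)

    w, b = clf.coef_[0], clf.intercept_[0]      # the hyperplane w*x + b = 0
    print("w =", w, "b =", b)
    print("margin width =", 2.0 / np.linalg.norm(w))
    print("support vectors:\n", clf.support_vectors_)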

  5. Plus Planes and Minus Planes (graphic taken from Moore 2003) • For the boolean interpretation: black-dot support vectors lie on the “plus plane”, white ones on the “minus plane” • Each is a line on a Cartesian plane of the form w*x + b = constant • The two lines are parallel… • Rescale the graph so the constants become +1 and −1…

  6. Plus Zones and Minus Zones (graphic taken from Moore 2003) • For a weight vector w and an offset b: • Plus-plane = { x : w*x + b = +1 }; points with w*x + b > 1 fall in the plus zone • Minus-plane = { x : w*x + b = −1 }; points with w*x + b < −1 fall in the minus zone
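
These definitions are easy to check numerically; a small sketch with an arbitrary, made-up hyperplane:

    import numpy as np

    w, b = np.array([1.0, 1.0]), -3.0          # arbitrary example hyperplane w*x + b = 0

    def zone(x):
        """Place a point relative to the plus/minus planes w*x + b = +/-1."""
        s = np.dot(w, x) + b
        if s >= 1:  return "plus zone (label +1)"
        if s <= -1: return "minus zone (label -1)"
        return "inside the margin"

    print(zone(np.array([3.0, 2.0])))   # w*x + b = +2 -> plus zone
    print(zone(np.array([0.5, 0.5])))   # w*x + b = -2 -> minus zone
    print(zone(np.array([1.5, 1.5])))   # w*x + b =  0 -> inside the margin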

  7. Calculating the Margin • Keep in mind, this set is linearly separable… • Take a point x− on the minus plane and the closest point x+ on the plus plane • w is perpendicular to these planes, so we can say that x+ = x− + λw for some value λ (graphic taken from Moore 2003)

  8. Calculating the Margin • We know: • w*x+ + b = +1 • w*x− + b = −1 • x+ = x− + λw • Substituting the third equation into the first: • w*(x− + λw) + b = 1 • w*x− + b + λ(w*w) = 1 • −1 + λ(w*w) = 1 • λ = 2/(w*w) (graphic taken from Moore 2003)

  9. Calculating the Margin • Given: λ = 2/(w*w) • Margin width = |λw| • |λw| = λ*sqrt(w*w) = (2/(w*w))*sqrt(w*w) = 2/sqrt(w*w) (graphic taken from Moore 2003)
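
A quick numeric sanity check of this result (the weight vector is an arbitrary choice, not from the slides):

    import numpy as np

    w = np.array([3.0, 4.0])                 # arbitrary weight vector, ||w|| = 5

    lam = 2.0 / np.dot(w, w)                 # lambda = 2/(w*w)
    margin = np.linalg.norm(lam * w)         # |lambda * w|
    print(margin, 2.0 / np.linalg.norm(w))   # both print 0.4 = 2/||w||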

  10. Conclusions about the 2D perfect case The width of the margin is 2/sqrt(w*w) = 2/||w||. This is great! But it is also reverse-engineering: it assumes we already have w and b. How do we get them from the data? And what happens if a point is misclassified?

  11. Perceptron algorithm Perceptron algorithm: cycle through the (x, y) pairs and adjust w and b whenever a point is misclassified. If the dataset is separable by a hyperplane, w will eventually converge, though this can take a long time; if it isn’t, w will never stabilize.
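
A minimal sketch of the perceptron update rule described above; the toy data and learning rate are assumptions for illustration:

    import numpy as np

    def perceptron(X, y, epochs=100, lr=1.0):
        """Cycle through (x, y) pairs, nudging w and b after each mistake.
        Converges only if the data are linearly separable."""
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                if yi * (np.dot(w, xi) + b) <= 0:   # misclassified (or on the boundary)
                    w += lr * yi * xi
                    b += lr * yi
                    mistakes += 1
            if mistakes == 0:                        # converged: every point is correct
                break
        return w, b

    # Toy separable data (made up for illustration).
    X = np.array([[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [1.0, 0.5]])
    y = np.array([+1, +1, -1, -1])
    print(perceptron(X, y))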

  12. Lagrange Multipliers • Looking at the 1D example on the right: it is not linearly divisible. • Add a new variable (the Lagrange multiplier), and go through every (x, y) datapoint, adjusting w accordingly. • Mapping into a higher dimension (2D → 3D) can result in a non-linear classifier, but it can turn out overly complicated. (above taken from Moore 2003) (above taken from Dogan 2008)
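
The 1D example is typically rescued by mapping the data into a higher-dimensional space where a linear separator exists; a minimal sketch of that idea using scikit-learn’s polynomial kernel (the data are made up, and this is the kernel trick rather than the deck’s Lagrange-multiplier derivation):

    import numpy as np
    from sklearn.svm import SVC

    # 1D data that no single threshold separates: class +1 sits between two -1 groups.
    X = np.array([[-3.0], [-2.0], [-1.0], [0.0], [1.0], [2.0], [3.0]])
    y = np.array([-1, -1, +1, +1, +1, -1, -1])

    # A degree-2 polynomial kernel implicitly maps x -> (x, x^2),
    # where a straight line is enough to separate the classes.
    clf = SVC(kernel="poly", degree=2, coef0=1.0)
    clf.fit(X, y)
    print(clf.predict(X))   # should recover labels no linear 1D classifier can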

  13. Soft Margin (Graphic taken from Zisserman 2014)
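
The figure illustrates the soft margin, which lets some points violate the margin at a penalty controlled by a parameter C (a standard fact about the Cortes–Vapnik formulation, not text from the slide). A minimal sketch of the trade-off, with invented data:

    import numpy as np
    from sklearn.svm import SVC

    # Mostly separable toy data with one outlier on the wrong side.
    X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],      # class -1
                  [3.0, 3.0], [4.0, 3.0], [3.0, 4.0],      # class +1
                  [2.8, 2.8]])                              # a -1 outlier near the +1 cluster
    y = np.array([-1, -1, -1, +1, +1, +1, -1])

    for C in (0.01, 100.0):
        clf = SVC(kernel="linear", C=C).fit(X, y)
        width = 2.0 / np.linalg.norm(clf.coef_[0])
        # Small C tolerates the outlier and keeps a wide margin;
        # large C shrinks the margin trying to fit it.
        print(f"C={C:>6}: margin width = {width:.3f}")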

  14. Uses in MIR • Playlist creation via relevance comparison (Mandel et al.) • The machine trains an SVM on user-labeled example songs • The SVM considers six characteristics of each song • Starting from a seed song in the dataset, it presents the user with the songs farthest from the “decision boundary” • The authors compared what users thought of style, artist, and mood similarities based on the ordering of the SVM’s results
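
A hedged sketch of how such a ranking might be computed; this is not Mandel et al.’s implementation, and the six-feature representation and random data are placeholders:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    songs = rng.normal(size=(50, 6))              # 50 songs, six features each (placeholder)
    labels = np.where(songs[:, 0] > 0, 1, -1)     # stand-in user labels: relevant or not

    clf = SVC(kernel="linear").fit(songs, labels)

    # Signed distance from the decision boundary: the most confidently
    # "relevant" songs are those farthest on the positive side.
    scores = clf.decision_function(songs)
    playlist = np.argsort(scores)[::-1][:10]      # indices of the top ten songs
    print(playlist)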

  15. Uses in MIR (cont.) • Emotion detection (Zhou et al.) • A user searches “I feel happy today and I need a song” – a “tough” query • Compares lyric-based, tag-based, and combined lyric-and-tag searches • Found that SVM models were the most effective, and that lyric-and-tag searches outperformed lyrics or tags alone
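
A generic sketch of SVM-based lyric classification (illustrative only: the tiny corpus and mood labels are invented, and this is not Zhou et al.’s system):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    lyrics = ["sunshine and dancing all night long",
              "tears falling in the cold gray rain",
              "smiling with my friends in summer",
              "alone again with my broken heart"]
    moods = ["happy", "sad", "happy", "sad"]

    # TF-IDF features feeding a linear SVM: a common text-classification baseline.
    model = make_pipeline(TfidfVectorizer(), LinearSVC())
    model.fit(lyrics, moods)
    print(model.predict(["rain and tears on a gray morning"]))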

  16. My use • Computer vision! • The examples are from human form recognition • Positive weights: it is likely that an image segment belongs to a human • Negative weights: the opposite • 16×8 sections of the image × 8 possible orientations = a feature vector in R^1024 • OpenCV (images from Ramanan through Zisserman)
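
OpenCV bundles a HOG descriptor with a pretrained linear SVM for people detection, which matches this setup; a minimal usage sketch (the image path is a placeholder):

    import cv2

    # HOG features + OpenCV's bundled linear SVM trained for people detection.
    hog = cv2.HOGDescriptor()
    hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

    img = cv2.imread("street.jpg")              # placeholder path
    rects, weights = hog.detectMultiScale(img, winStride=(8, 8))
    for (x, y, w, h) in rects:
        # Each rectangle is a window the SVM scored as "human".
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite("detections.jpg", img)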

  17. SVM Resources http://www.autonlab.org/tutorials/svm15.pdf http://docs.opencv.org/doc/tutorials/ml/introduction_to_svm/introduction_to_svm.html http://www.kernel-machines.org/ Thanks!

  18. Conclusions About the 2D Perfect Case • Keep in mind, this set is linearly separable… • Imagine two points on the hyperplane, xa and xb • w*xa + b = w*xb + b = 0 • Then w*(xa – xb) = 0, so w must be perpendicular to the optimal hyperplane • Let’s calculate the margin…
