COMP5331 Other Classification Models: Support Vector Machine (SVM) Prepared by Raymond Wong Presented by Raymond Wong raywong@cse
What we learnt for Classification • Decision Tree • Bayesian Classifier • Nearest Neighbor Classifier
Other Classification Models • Support Vector Machine (SVM) • Neural Network
Support Vector Machine • Support Vector Machine (SVM) • Linear Support Vector Machine • Non-linear Support Vector Machine
Support Vector Machine • Advantages: • Can be visualized • Accurate when the classes are well separated
Linear Support Vector Machine [Figure: a line separating two classes of points in the (x1, x2) plane] • The line w1x1 + w2x2 + b = 0 divides the plane into two parts: points with w1x1 + w2x2 + b > 0 on one side and points with w1x1 + w2x2 + b < 0 on the other
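A minimal sketch in Python of this sign test; the weight values (w1, w2) and bias b below are made-up illustrative numbers, not taken from the slides:

```python
import numpy as np

# Made-up weights and bias for a 2-D separating line (illustrative only)
w = np.array([1.0, -2.0])   # (w1, w2)
b = 0.5

def classify(x):
    """Label a point by the sign of w1*x1 + w2*x2 + b."""
    return 1 if np.dot(w, x) + b > 0 else -1

print(classify(np.array([3.0, 0.0])))   # 1*3 - 2*0 + 0.5 =  3.5 > 0 -> +1
print(classify(np.array([0.0, 2.0])))   # 1*0 - 2*2 + 0.5 = -3.5 < 0 -> -1
```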
Linear Support Vector Machine [Figure: support vectors and the margin around the separating line] • The support vectors are the training points closest to the separating line • The margin is the distance between the two parallel lines through the support vectors on either side • We want to maximize the margin. Why? A wider margin tolerates more noise in the data, so the classifier generalizes better to unseen points
Linear Support Vector Machine [Figure: two lines parallel to the separator, offset by D on each side] • The separating line: w1x1 + w2x2 + b = 0 • The two parallel boundary lines: w1x1 + w2x2 + b - D = 0 and w1x1 + w2x2 + b + D = 0
Linear Support Vector Machine Let y be the label of a point [Figure: +1 points on the side where w1x1 + w2x2 + b - 1 ≥ 0 and -1 points on the side where w1x1 + w2x2 + b + 1 ≤ 0, with the lines w1x1 + w2x2 + b - 1 = 0, w1x1 + w2x2 + b = 0 and w1x1 + w2x2 + b + 1 = 0] • Rescaling w and b lets us set D = 1 • Every point labeled +1 satisfies w1x1 + w2x2 + b - 1 ≥ 0 • Every point labeled -1 satisfies w1x1 + w2x2 + b + 1 ≤ 0
Linear Support Vector Machine Let y be the label of a point [Figure: the same picture, with the two inequalities combined] • For +1 points, w1x1 + w2x2 + b - 1 ≥ 0; for -1 points, w1x1 + w2x2 + b + 1 ≤ 0 • Both cases combine into a single constraint: y(w1x1 + w2x2 + b) ≥ 1
Linear Support Vector Machine Let y be the label of a point [Figure: the margin between the parallel lines w1x1 + w2x2 + b - 1 = 0 and w1x1 + w2x2 + b + 1 = 0] • Each point satisfies y(w1x1 + w2x2 + b) ≥ 1 • Margin = |(b+1) - (b-1)| / ||w|| = 2 / ||w||, where ||w|| is the length of w = (w1, w2) • We want to maximize the margin
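As a quick numeric check of the margin formula (the vector w below is a made-up example, not from the slides):

```python
import numpy as np

# Made-up example: w = (3, 4), so ||w|| = 5
w = np.array([3.0, 4.0])
margin = 2.0 / np.linalg.norm(w)
print(margin)   # 2 / 5 = 0.4
```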
Linear Support Vector Machine • Maximize Margin = 2 / ||w|| • Subject to y(w1x1 + w2x2 + b) ≥ 1 for each data point (x1, x2, y), where y is the label of the point (+1/-1)
Linear Support Vector Machine • Equivalently, minimize ||w||² / 2 • Subject to y(w1x1 + w2x2 + b) ≥ 1 for each data point (x1, x2, y), where y is the label of the point (+1/-1)
Linear Support Vector Machine • Minimize ||w||² / 2 (a quadratic objective) • Subject to y(w1x1 + w2x2 + b) ≥ 1 for each data point (x1, x2, y), where y is the label of the point (+1/-1) (linear constraints) • A quadratic objective with linear constraints makes this a quadratic programming problem
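A minimal sketch of solving this in practice with scikit-learn; the library choice, the toy data, and the large C are all assumptions, not from the slides (SVC actually solves a soft-margin variant of the QP above, and a very large C approximates the hard-margin case):

```python
import numpy as np
from sklearn.svm import SVC   # assumes scikit-learn is installed

# Made-up linearly separable toy data; labels y are +1 / -1
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]])
y = np.array([1, 1, -1, -1])

# Very large C approximates the hard-margin QP described above
clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)

w = clf.coef_[0]                  # learned (w1, w2)
b = clf.intercept_[0]             # learned b
print(w, b)
print(2.0 / np.linalg.norm(w))    # the maximized margin 2/||w||
print(clf.support_vectors_)       # the support vectors
```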
Linear Support Vector Machine • So far we have described the 2-dimensional case, where a line divides the space into two parts • For n-dimensional space with n ≥ 2, we use a hyperplane to divide the space into two parts
Support Vector Machine • Support Vector Machine (SVM) • Linear Support Vector Machine • Non-linear Support Vector Machine
Non-linear Support Vector Machine • Two Steps • Step 1: Transform the data into a higher-dimensional space using a "nonlinear" mapping • Step 2: Use the Linear Support Vector Machine in this higher-dimensional space
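A minimal sketch of the two steps using scikit-learn (an assumption, not from the slides); note that kernel SVMs such as the RBF kernel below perform the nonlinear mapping of Step 1 implicitly rather than computing the high-dimensional coordinates explicitly:

```python
import numpy as np
from sklearn.svm import SVC   # assumes scikit-learn is installed

# XOR-like toy data: no single line separates the two classes in 2-D
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([-1, -1, 1, 1])

# Step 1 (implicit nonlinear mapping via the RBF kernel)
# + Step 2 (linear SVM in that higher-dimensional space)
clf = SVC(kernel='rbf', gamma=2.0, C=1e3)
clf.fit(X, y)
print(clf.predict(X))   # expected: [-1 -1  1  1]
```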