COMP5331 Other Classification Models: Support Vector Machine (SVM) Prepared by Raymond Wong Presented by Raymond Wong raywong@cse
What we learnt for Classification • Decision Tree • Bayesian Classifier • Nearest Neighbor Classifier
Other Classification Models • Support Vector Machine (SVM) • Neural Network
Support Vector Machine • Support Vector Machine (SVM) • Linear Support Vector Machine • Non-linear Support Vector Machine
Support Vector Machine • Advantages: • Can be visualized • Accurate when the classes are well separated
Linear Support Vector Machine [Figure: a line separating two classes of points in the (x1, x2) plane] • The line w1x1 + w2x2 + b = 0 divides the plane into two parts: points with w1x1 + w2x2 + b > 0 on one side and points with w1x1 + w2x2 + b < 0 on the other
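A minimal sketch in Python of this sign test; the weight values (w1, w2) and bias b below are made-up illustrative numbers, not taken from the slides:

```python
import numpy as np

# Made-up weights and bias for a 2-D separating line (illustrative only)
w = np.array([1.0, -2.0])   # (w1, w2)
b = 0.5

def classify(x):
    """Label a point by the sign of w1*x1 + w2*x2 + b."""
    return 1 if np.dot(w, x) + b > 0 else -1

print(classify(np.array([3.0, 0.0])))   # 1*3 - 2*0 + 0.5 =  3.5 > 0 -> +1
print(classify(np.array([0.0, 2.0])))   # 1*0 - 2*2 + 0.5 = -3.5 < 0 -> -1
```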
Linear Support Vector Machine [Figure: support vectors and the margin around the separating line] • The support vectors are the training points closest to the separating line • The margin is the distance between the two parallel lines through the support vectors on either side • We want to maximize the margin. Why? A wider margin tolerates more noise in the data, so the classifier generalizes better to unseen points
Linear Support Vector Machine [Figure: two lines parallel to the separator, offset by D on each side] • The separating line: w1x1 + w2x2 + b = 0 • The two parallel boundary lines: w1x1 + w2x2 + b - D = 0 and w1x1 + w2x2 + b + D = 0
Linear Support Vector Machine Let y be the label of a point [Figure: +1 points on the side where w1x1 + w2x2 + b - 1 ≥ 0 and -1 points on the side where w1x1 + w2x2 + b + 1 ≤ 0, with the lines w1x1 + w2x2 + b - 1 = 0, w1x1 + w2x2 + b = 0 and w1x1 + w2x2 + b + 1 = 0] • Rescaling w and b lets us set D = 1 • Every point labeled +1 satisfies w1x1 + w2x2 + b - 1 ≥ 0 • Every point labeled -1 satisfies w1x1 + w2x2 + b + 1 ≤ 0
Linear Support Vector Machine Let y be the label of a point [Figure: the same picture, with the two inequalities combined] • For +1 points, w1x1 + w2x2 + b - 1 ≥ 0; for -1 points, w1x1 + w2x2 + b + 1 ≤ 0 • Both cases combine into a single constraint: y(w1x1 + w2x2 + b) ≥ 1
Linear Support Vector Machine Let y be the label of a point [Figure: the margin between the parallel lines w1x1 + w2x2 + b - 1 = 0 and w1x1 + w2x2 + b + 1 = 0] • Each point satisfies y(w1x1 + w2x2 + b) ≥ 1 • Margin = |(b+1) - (b-1)| / ||w|| = 2 / ||w||, where ||w|| is the length of w = (w1, w2) • We want to maximize the margin
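As a quick numeric check of the margin formula (the vector w below is a made-up example, not from the slides):

```python
import numpy as np

# Made-up example: w = (3, 4), so ||w|| = 5
w = np.array([3.0, 4.0])
margin = 2.0 / np.linalg.norm(w)
print(margin)   # 2 / 5 = 0.4
```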
Linear Support Vector Machine • Maximize Margin = 2 / ||w|| • Subject to y(w1x1 + w2x2 + b) ≥ 1 for each data point (x1, x2, y), where y is the label of the point (+1/-1)
Linear Support Vector Machine • Equivalently, minimize ||w||² / 2 • Subject to y(w1x1 + w2x2 + b) ≥ 1 for each data point (x1, x2, y), where y is the label of the point (+1/-1)
Linear Support Vector Machine • Minimize ||w||² / 2 (a quadratic objective) • Subject to y(w1x1 + w2x2 + b) ≥ 1 for each data point (x1, x2, y), where y is the label of the point (+1/-1) (linear constraints) • A quadratic objective with linear constraints makes this a quadratic programming problem
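A minimal sketch of solving this in practice with scikit-learn; the library choice, the toy data, and the large C are all assumptions, not from the slides (SVC actually solves a soft-margin variant of the QP above, and a very large C approximates the hard-margin case):

```python
import numpy as np
from sklearn.svm import SVC   # assumes scikit-learn is installed

# Made-up linearly separable toy data; labels y are +1 / -1
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]])
y = np.array([1, 1, -1, -1])

# Very large C approximates the hard-margin QP described above
clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)

w = clf.coef_[0]                  # learned (w1, w2)
b = clf.intercept_[0]             # learned b
print(w, b)
print(2.0 / np.linalg.norm(w))    # the maximized margin 2/||w||
print(clf.support_vectors_)       # the support vectors
```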
Linear Support Vector Machine • So far we have described the 2-dimensional case, where a line divides the space into two parts • For n-dimensional space with n ≥ 2, we use a hyperplane to divide the space into two parts
Support Vector Machine • Support Vector Machine (SVM) • Linear Support Vector Machine • Non-linear Support Vector Machine
Non-linear Support Vector Machine • Two Steps • Step 1: Transform the data into a higher-dimensional space using a "nonlinear" mapping • Step 2: Use the Linear Support Vector Machine in this higher-dimensional space
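A minimal sketch of the two steps using scikit-learn (an assumption, not from the slides); note that kernel SVMs such as the RBF kernel below perform the nonlinear mapping of Step 1 implicitly rather than computing the high-dimensional coordinates explicitly:

```python
import numpy as np
from sklearn.svm import SVC   # assumes scikit-learn is installed

# XOR-like toy data: no single line separates the two classes in 2-D
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([-1, -1, 1, 1])

# Step 1 (implicit nonlinear mapping via the RBF kernel)
# + Step 2 (linear SVM in that higher-dimensional space)
clf = SVC(kernel='rbf', gamma=2.0, C=1e3)
clf.fit(X, y)
print(clf.predict(X))   # expected: [-1 -1  1  1]
```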