300 likes | 535 Views
Support Vector Machine (SVM). Presented by Robert Chen. Introduction. High level explanation of SVM SVM is a way to classify data We are interested in text classification. What is a SVM.
E N D
Support Vector Machine(SVM) Presented by Robert Chen
Introduction • High level explanation of SVM • SVM is a way to classify data • We are interested in text classification
What is a SVM • “In essence, an SVM is a mathematical entity, an algorithm (or recipe) for maximizing a particular mathematical function with respect to a given collection of data.” William S Noble
What is a SVM • It is a computer algorithm that learns through the training data we provide in order to categorize new data in future cases. • SVM can’t cluster data, it can only classify data: we use SVD to cluster the data.
SVM hyperplanes • 1)Seperating hyperplane • 1d, 2d, 3d • 2)Maximum-margin hyperplane • Separates classes, while maintaining the maximal distance from any one of the given expression profiles • 3)Soft margin hyperplane • Generalized Optimal hyperplane (name used in Vapnik’s book)
Soft Margin Hyperplane • Allows some outlier data points to push their way through the margin of the separating hyperplane without affecting the final result • “Soft margin parameter specifies a trade-off between hyperplane violations and the size of the margin.” W. Noble
Soft Margin Hyper plane Suggested by Corinna Cortes and Vladimir Vapnik in 1995 Won the 2008 ACM Paris Kanellakis Award
Kernal function • Mathematical solution to determining the hyperplane when: • 1) No clear boundary • 2) Soft margin doesn’t help
Kernal Function • Projects data from a low dimensional state to a high dimensional state • We then project the SVM hyperplane in that state back to a lower drawable state such as 2-D. • Kernals that have a very high-dimension can result in the SVM overfitting the data.
Types of Kernels linear: K(xi, xj) = xiTxj . polynomial: K(xi, xj) = (γ xiT xj + r)d, γ > 0. radial basis function (RBF): K(xi, xj) = exp(−γ |xi − xj|^2), > 0 sigmoid: K(xi, xj) = tanh(γxiTxs + r).
Notes • radial basis function (RBF): K(xi, xj) = exp(−γ |xi − xj|^2), > 0 A radial basis function (rbf) is equivalent to mapping the data into an infinite dimensional Hilbert space
Example • Data Set: 1 dimensional set • Class, X1 • +1, 0 • -1, 1 • -1, 2 • +1, 3 • Φ(X1) = (X1, X1)
Support Vectors • <w · x> + b = +1 (positive labels) (1) • <w · x> + b = -1 (negative labels) (2) • <w · x> + b = 0 (hyperplane) (3) • Any vectors on expressions (1) or (2) are support vectors.
Importance of SVM in Support Vector Machines • Complexity of SVM depends on the number of support vectors rather that on the dimensionality of the feature space
Positive label • w1x1 + w2x2 + b = +1 • w10 + w20 + b = +1 • w13 + w29 + b = +1
Negative label • w11 + w21 + b = -1 • w12 + w24 + b = -1 • w1 = -3, w2 = 1, b = 1
Hyperplane • w1x1 + w2x2 + b = 0 • -3x1 + 1x2 + 1 = 0 • x2 = -1 + 3x1 • X1; X2 0, -1 1, 2 2, 5 3, 8
Maximum-Margin Hyperplane • 2/sqrt( w · w) • 2/sqrt(-32 + 12) margin = 0.632456
Recommended Article • What is a support vector machine? • By William S Noble
Recommended Article • Support Vector Machines for Text Categorization • A. Basu, C. Watters, and M. Shepherd • Faculty of Computer Science • Dalhousie University • Halifax, Nova Scotia, Canada B3H 1W5 • {basu | watters | shepherd@cs.dal.ca}
Recommended Book • The Nature of Statistical Learning Theory • By Vladimir N. Vapnik
Thank you • Questions? • Comments?
Multiclass SVM • Multiclass ranking SVMs, in which one SVM decision function attempts to classify all classes. • One-against-all classification, in which there is one binary SVM for each class to separate members of that class from members of other classes. • Pairwise classification, in which there is one binary SVM for each pair of classes to separate members of one class from members of the other.
Types of Kernels linear: K(xi, xj) = xiTxj . polynomial: K(xi, xj) = (γ xiT xj + r)d, γ > 0. radial basis function (RBF): K(xi, xj) = exp(−γ |xi − xj|^2), > 0 sigmoid: K(xi, xj) = tanh(γxiTxs + r).
Notes • radial basis function (RBF): K(xi, xj) = exp(−γ |xi − xj|^2), > 0 A radial basis function (rbf) is equivalent to mapping the data into an infinite dimensional Hilbert space