An Introduction to Support Vector Machines. Presenter: Celina Xia University of Nottingham. Outline. Maximizing the Margin Linear SVM and Linear Separable Case Primal Optimization Problem Dual Optimization Problem Non-Separable Case Non-Linear Case Kernel Functions Applications.
An Introduction to Support Vector Machines Presenter: Celina Xia University of Nottingham
Outline
• Maximizing the Margin
• Linear SVM and Linear Separable Case
• Primal Optimization Problem
• Dual Optimization Problem
• Non-Separable Case
• Non-Linear Case
• Kernel Functions
• Applications
Margin
Any of these separating lines would be fine... but which is best?
Margin
Margin: the width by which the decision boundary could be widened before hitting a data point.
[Figure: a decision boundary with a wide margin next to one with a narrow margin]
SVMs reckon…
The decision boundary with the maximal margin delivers the best generalization ability.
[Figure: decision boundary and its margin; the vector w gives the orientation of the decision boundary]
SVM: Linear Separable
• Objective: maximize the margin $\frac{2}{\|w\|}$, or equivalently minimize $\frac{\|w\|^2}{2}$
• Decision boundary: $w^T x + b = 0$; margin boundaries: $w^T x + b = 1$ and $w^T x + b = -1$
• Support vectors: the data points that lie on the margin boundaries
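A small numeric sketch may make the objective concrete. The toy points, the weight vector `w` and the offset `b` below are hypothetical values chosen for illustration, not data from the slides; the code only evaluates the margin width $2/\|w\|$ and checks which points sit exactly on the margin boundaries.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def margin_width(w):
    """Distance between the hyperplanes w.x+b=1 and w.x+b=-1, i.e. 2/||w||."""
    return 2.0 / math.sqrt(dot(w, w))

def constraint_values(w, b, X, y):
    """y_i * (w.x_i + b) per point; every value must be >= 1 for a valid separator."""
    return [yi * (dot(w, xi) + b) for xi, yi in zip(X, y)]

# Hypothetical toy data (assumption, not from the slides).
X = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
y = [1, 1, -1, -1]
w, b = (0.5, 0.5), -1.0

values = constraint_values(w, b, X, y)
feasible = all(v >= 1.0 - 1e-9 for v in values)
# Points whose constraint is tight (value == 1) lie on a margin boundary.
support_vectors = [i for i, v in enumerate(values) if abs(v - 1.0) < 1e-9]
width = margin_width(w)

print(feasible, support_vectors, round(width, 3))
```

For this toy set the tight constraints belong to the points (2, 2) and (0, 0), and the margin width equals the distance between them.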
The Lagrangian trick: moving the constraints into the objective function. Lagrangian:
The Lagrangian trick. Optimality conditions:
The Lagrangian trick Replace with Solving:
SVM: Linear Separable. Lagrangian: Optimality conditions:
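The Lagrangian steps can be written out as follows. This is a sketch of the standard hard-margin derivation, and its notation is an assumption rather than the slides' own: labels $y_i \in \{-1, +1\}$, training points $x_i$ for $i = 1, \dots, n$, and multipliers $\alpha_i \ge 0$.

```latex
\begin{align*}
\text{Primal:}\quad
  & \min_{w,\,b}\ \tfrac{1}{2}\|w\|^2
    \quad \text{s.t.}\quad y_i\,(w^T x_i + b) \ge 1,\ \ i = 1,\dots,n \\[4pt]
\text{Lagrangian:}\quad
  & L(w, b, \alpha) = \tfrac{1}{2}\|w\|^2
    - \sum_{i=1}^{n} \alpha_i \bigl[\, y_i\,(w^T x_i + b) - 1 \,\bigr],
    \qquad \alpha_i \ge 0 \\[4pt]
\text{Optimality:}\quad
  & \frac{\partial L}{\partial w} = 0 \ \Rightarrow\ w = \sum_{i=1}^{n} \alpha_i y_i x_i,
    \qquad
    \frac{\partial L}{\partial b} = 0 \ \Rightarrow\ \sum_{i=1}^{n} \alpha_i y_i = 0 \\[4pt]
\text{Dual:}\quad
  & \max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
    - \tfrac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, x_i^T x_j
    \quad \text{s.t.}\quad \alpha_i \ge 0,\ \ \sum_{i=1}^{n} \alpha_i y_i = 0
\end{align*}
```

Substituting the two optimality conditions back into the Lagrangian eliminates $w$ and $b$, which is how the dual arises. Only the points with $\alpha_i > 0$ contribute to $w$; these are the support vectors.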
Problems with linear SVM: what if the decision function is not linear?
Kernel Functions
• A kernel function K enables an implicit mapping of the input data into a feature space, without exact knowledge of the mapping itself
• The Gaussian radial basis function (RBF) is one of the most widely used kernel functions
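A quick check can show why the mapping never has to be computed. The sketch below uses a degree-2 polynomial kernel on 2-D inputs as an illustrative assumption: the function `phi` is one feature map that happens to match it, and both it and the sample vectors are hypothetical, not taken from the slides.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map for the kernel (x.z)^2 on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """Same inner product, computed in input space without calling phi."""
    return dot(x, z) ** 2

# Hypothetical sample vectors (assumption).
x, z = (1.0, 2.0), (3.0, -1.0)

explicit = dot(phi(x), phi(z))   # map first, then take the dot product
implicit = poly2_kernel(x, z)    # kernel evaluation only

print(explicit, implicit)
```

Both routes give the same number up to rounding, but the kernel route works in the 2-D input space, which matters when the feature space is very high-dimensional or, as with the RBF kernel, infinite-dimensional.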
Dual Optimization Problem: replace the dot product of the inputs with the kernel function.
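Written out, the substitution looks as follows. This sketch assumes the standard hard-margin dual with labels $y_i \in \{-1, +1\}$ and multipliers $\alpha_i \ge 0$; the notation is an assumption, since the slides' own formula is not shown.

```latex
\begin{align*}
\text{Kernelized dual:}\quad
  & \max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
    - \tfrac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, K(x_i, x_j)
    \quad \text{s.t.}\quad \alpha_i \ge 0,\ \ \sum_{i=1}^{n} \alpha_i y_i = 0 \\[4pt]
\text{Decision function:}\quad
  & f(x) = \operatorname{sign}\Bigl( \sum_{i=1}^{n} \alpha_i y_i\, K(x_i, x) + b \Bigr)
\end{align*}
```

The training inputs enter only through $K(x_i, x_j)$, so neither training nor prediction needs the feature map explicitly.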
Some kernel functions
• Polynomial type:
• Polynomial type:
• Gaussian radial basis function (RBF):
• Multi-Layer Perceptron:
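The kernel families listed above can be sketched in a few lines. The parameterisations below are common textbook forms and are assumptions here, since the slides' formulas are not shown; the parameter names `degree`, `c`, `sigma`, `kappa` and `theta` are likewise illustrative.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(x, z, degree=2, c=1.0):
    """(x.z + c)^degree; c = 0 gives the homogeneous polynomial kernel."""
    return (dot(x, z) + c) ** degree

def rbf_kernel(x, z, sigma=1.0):
    """Gaussian RBF: exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mlp_kernel(x, z, kappa=1.0, theta=-1.0):
    """Multi-layer perceptron (sigmoid) kernel: tanh(kappa * x.z + theta)."""
    return math.tanh(kappa * dot(x, z) + theta)

# Hypothetical sample vectors (assumption).
a, b = (1.0, 2.0), (3.0, 4.0)
print(polynomial_kernel(a, b))           # (11 + 1)^2
print(rbf_kernel(a, a))                  # identical inputs give 1.0
print(rbf_kernel((0.0, 0.0), (1.0, 1.0)))
print(mlp_kernel((1.0, 0.0), (1.0, 0.0)))
```

Unlike the polynomial and RBF kernels, the tanh form is not positive semidefinite for every choice of `kappa` and `theta`, so its parameters need more care.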
Two-Spiral Pattern
Given 194 training data points on the X-Y plane: 97 of class "red circle" and another 97 of class "blue cross". Question: how to distinguish between these two spirals?
What's the challenge?
• Proper learning of these 194 training data points: a piece of cake for a variety of methods. After all, it is only a limited set of 194 points.
• Correct assignment of an arbitrary data point on the XY plane to the right "spiral stripe": very challenging, since there are infinitely many points on the XY plane. This makes it a touchstone of the power of a classification algorithm.
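The first half of the challenge, fitting the 194 training points, can be tried with a short sketch. Several things here are assumptions: the spiral parameterisation is a commonly used one (the slides do not say how their points were generated), `gamma = 10.0` is a hypothetical RBF width, and the learner is a kernel perceptron, a simpler stand-in that uses the same "replace the dot product with a kernel" idea but does not maximize the margin as an SVM does.

```python
import math

def two_spirals():
    """97 points per class; class -1 is the point reflection of class +1."""
    points, labels = [], []
    for i in range(97):
        angle = i * math.pi / 16.0
        radius = 6.5 * (104 - i) / 104.0
        x, y = radius * math.sin(angle), radius * math.cos(angle)
        points.append((x, y))
        labels.append(+1)
        points.append((-x, -y))
        labels.append(-1)
    return points, labels

def rbf(p, q, gamma=10.0):
    """Gaussian RBF kernel on 2-D points; gamma is a hypothetical choice."""
    return math.exp(-gamma * ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2))

def train_kernel_perceptron(points, labels, max_epochs=200):
    """Kernel perceptron (not an SVM): bump alpha_i whenever point i is misclassified."""
    n = len(points)
    gram = [[rbf(points[i], points[j]) for j in range(n)] for i in range(n)]
    alpha = [0] * n
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * labels[j] * gram[i][j] for j in range(n))
            if labels[i] * score <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: training set is fitted
            break
    return alpha

def predict(p, points, labels, alpha):
    """Sign of the kernel expansion at an arbitrary point p."""
    score = sum(a * t * rbf(p, q) for a, t, q in zip(alpha, labels, points) if a)
    return 1 if score > 0 else -1

points, labels = two_spirals()
alpha = train_kernel_perceptron(points, labels)
correct = sum(predict(p, points, labels, alpha) == t for p, t in zip(points, labels))
train_accuracy = correct / len(points)
print(len(points), train_accuracy)
```

Fitting the training points says nothing yet about the second half of the challenge. With such a narrow kernel the classifier behaves much like a nearest-neighbour rule, so how it labels arbitrary points between the spiral arms would need to be checked separately.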
References
• N. Cristianini and J. Shawe-Taylor: An Introduction to Support Vector Machines (and other kernel-based learning methods). Cambridge University Press, 2000. ISBN: 0 521 78019 5
• http://www.kernel-machines.org/
• http://www.support-vector.net/
• Papers by Vapnik
• C.J.C. Burges: A tutorial on Support Vector Machines. Data Mining and Knowledge Discovery 2:121-167, 1998.