Ti5216100 MACHINE VISION • SUPPORT VECTOR MACHINES • Maxim Mikhnevich • Pavel Stepanov • Pankaj Sharma • Ivan Ryzhov • Sergey Vlasov • 2006-2007
Content • Where the Support Vector Machine comes from • Relationship between Machine Vision and Pattern Recognition (the place of SVM in the whole system) • Application areas of Support Vector Machines • The classification problem • Linear classifiers • The non-separable case • The kernel trick • Advantages and disadvantages
Out of the scope of this presentation • Lagrange's theorem • The Kuhn-Tucker theorem • Quadratic programming • We don't go too deep into the math
History • The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs
Relationship between Machine Vision and Pattern Recognition • Our task during this presentation is to show that SVM is one of the best classifiers
Application Areas (cont.) • Figure caption: geometrical interpretation of how the SVM separates the face and non-face classes. The patterns shown are real support vectors obtained after training the system. Notice the small total number of support vectors, and that a higher proportion of them correspond to non-faces.
Basic Definitions from a Technical Viewpoint • Feature • Feature space • Hyperplane • Margin
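As a brief sketch of how these terms are used later (the notation is assumed for this write-up, not taken verbatim from the slides):

```latex
% Feature: one coordinate of the vector x describing an object.
% Feature space: the space R^m in which those vectors live.
\[
  \text{hyperplane: } \{\, x \in \mathbb{R}^{m} : w \cdot x - b = 0 \,\},
  \qquad
  \text{margin: the gap between } H_1:\, w \cdot x - b = 1
  \ \text{and}\ H_2:\, w \cdot x - b = -1 .
\]
```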
Problem • Binary classification • Learning collection: vectors x1,…,xn – our documents (objects); labels y1,…,yn ∈ {-1, 1} • Our goal is to find the optimal hyperplane!
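In symbols (a sketch consistent with the notation above; the exact slide formulas are not preserved):

```latex
\[
  \text{training set } \{(x_i, y_i)\}_{i=1}^{n}, \quad
  x_i \in \mathbb{R}^{m},\ \ y_i \in \{-1, +1\},
  \qquad
  \text{classifier } f(x) = \operatorname{sign}(w \cdot x - b).
\]
```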
Linear classifiers • A linear classifier: w·xi > b ⇒ yi = 1, w·xi < b ⇒ yi = -1 • Maximum margin linear classifier: w·xi - b ≥ 1 ⇒ yi = 1, w·xi - b ≤ -1 ⇒ yi = -1
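A minimal Python/NumPy sketch of this decision rule (the language and the helper name `linear_classify` are our choice, not part of the slides):

```python
import numpy as np

def linear_classify(X, w, b):
    """Apply the linear rule sign(w·x - b) to each row of X.

    Points with w·x - b >= 1 lie on the +1 side of the margin,
    points with w·x - b <= -1 on the -1 side.
    """
    scores = X @ w - b
    return np.where(scores >= 0, 1, -1)

# Toy example: separate points by the sign of x1 + x2.
w = np.array([1.0, 1.0])
b = 0.0
X = np.array([[2.0, 1.0], [-1.5, -0.5]])
print(linear_classify(X, w, b))   # expected: [ 1 -1]
```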
Linear classifiers (cont.) • (a) - a separating hyperplane with a small margin • (b) - a separating hyperplane with a larger margin • A better generalization capability is expected from (b)!
Margin width • Let's take any two points, x+ on H1 and x- on H2.
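The margin-width argument the slide alludes to, reconstructed with the hyperplanes H1 and H2 defined above:

```latex
% x+ lies on H1 and x- lies on H2:
%   w·x+ - b = +1,   w·x- - b = -1
% Subtracting the two equations gives  w·(x+ - x-) = 2.
% Projecting (x+ - x-) onto the unit normal w/||w|| yields the margin width:
\[
  \text{margin} \;=\; \frac{w \cdot (x_{+} - x_{-})}{\lVert w \rVert}
  \;=\; \frac{2}{\lVert w \rVert}.
\]
```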
Formalization • Our aim is to find the widest margin! • Constraints: one per training pair (xi, yi), so the number of constraints = number of pairs (xi, yi) • Optimization criterion: maximize the margin width, i.e. minimize ||w|| (see the sketch below)
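A sketch of the standard hard-margin formulation this slide refers to (the original constraints and criterion are not preserved here verbatim):

```latex
\[
  \min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^{2}
  \qquad \text{subject to} \qquad
  y_i\,(w \cdot x_i - b) \;\ge\; 1, \qquad i = 1,\dots,n .
\]
% Minimizing ||w|| maximizes the margin 2/||w||; there is one constraint
% per training pair (x_i, y_i).
```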
Noise and Penalties • To tolerate noisy data, introduce error terms ei ≥ 0, one per training point • Constraints: number of constraints = 2 × number of pairs (xi, yi) • Optimization criterion: trade the margin width off against the total error (see the sketch below)
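A sketch of the corresponding soft-margin formulation, with the error terms ei and a penalty parameter C (the symbol C is our assumption; the slide's exact notation is not preserved):

```latex
\[
  \min_{w,\,b,\,e}\ \tfrac{1}{2}\lVert w \rVert^{2} \;+\; C \sum_{i=1}^{n} e_i
  \qquad \text{subject to} \qquad
  y_i\,(w \cdot x_i - b) \;\ge\; 1 - e_i, \quad e_i \ge 0, \quad i = 1,\dots,n .
\]
% Two constraints per training pair (x_i, y_i), as the slide notes.
```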
First great idea • This gives us a way to find a linear classifier: the wider the margin and the smaller the sum of errors, the better. • We have thus reduced the problem of finding a linear classifier to a quadratic programming problem (a practical sketch follows below).
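As a practical illustration (our own sketch, not from the slides): the quadratic program above is normally handed to an off-the-shelf solver; scikit-learn's SVC wraps one (libsvm).

```python
import numpy as np
from sklearn.svm import SVC

# Toy, almost linearly separable data: two Gaussian blobs.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(20, 2) + [2, 2],
               rng.randn(20, 2) - [2, 2]])
y = np.array([1] * 20 + [-1] * 20)

clf = SVC(kernel="linear", C=1.0)     # C penalizes the error terms e_i
clf.fit(X, y)

print("w =", clf.coef_[0])            # normal vector of the separating hyperplane
print("b =", -clf.intercept_[0])      # sklearn uses w·x + intercept, so b = -intercept
print("number of support vectors:", len(clf.support_vectors_))
```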
How to solve our problem • Construct the Lagrangian • Use the Kuhn-Tucker theorem
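For completeness, a standard-form sketch of the Lagrangian for the hard-margin problem above:

```latex
\[
  L(w, b, \alpha) \;=\; \tfrac{1}{2}\lVert w \rVert^{2}
  \;-\; \sum_{i=1}^{n} \alpha_i \Bigl[\, y_i\,(w \cdot x_i - b) - 1 \,\Bigr],
  \qquad \alpha_i \ge 0 .
\]
```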
How to solve our problem • Our solution is sketched below:
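A reconstruction of the standard result obtained by applying the Kuhn-Tucker conditions to the Lagrangian above (the exact slide formulas are not preserved):

```latex
% Setting the derivatives of L with respect to w and b to zero:
\[
  w \;=\; \sum_{i=1}^{n} \alpha_i\, y_i\, x_i ,
  \qquad
  \sum_{i=1}^{n} \alpha_i\, y_i \;=\; 0 .
\]
% Substituting back gives the dual quadratic program in the multipliers alone:
\[
  \max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
  \;-\; \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j\, y_i y_j\, (x_i \cdot x_j),
  \qquad \alpha_i \ge 0, \quad \sum_{i} \alpha_i y_i = 0 .
\]
% Only the support vectors get nonzero alpha_i; b follows from any support
% vector via y_i (w · x_i - b) = 1.
```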
Second great idea • Choose a mapping into an extended space • After that we can work with a new function, called the kernel (defined below) • Find the linear margin w, b in the extended space • Now we have our hyperplane (a nonlinear decision boundary) in the initial space
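In formulas (a sketch; φ denotes the assumed mapping into the extended space): since the dual problem uses the training points only through dot products, replacing xi·xj by a kernel trains the linear classifier in the extended space without ever computing φ explicitly.

```latex
\[
  K(a, b) \;=\; \varphi(a) \cdot \varphi(b),
  \qquad
  f(x) \;=\; \operatorname{sign}\!\Bigl( \sum_{i=1}^{n} \alpha_i\, y_i\, K(x_i, x) \;-\; b \Bigr).
\]
```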
Second great idea - Extend our space • Solution of the XOR problem with the help of Support Vector Machines (by increasing the dimension of our space); a code sketch follows below.
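A hedged Python sketch of the XOR example using scikit-learn (our choice of library; the slides only show the idea graphically):

```python
import numpy as np
from sklearn.svm import SVC

# XOR on {-1, +1}^2 is not linearly separable in the plane ...
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
y = np.array([-1, 1, 1, -1])

# ... but the degree-2 polynomial kernel K(a, b) = (a·b + 1)^2 maps it into a
# space where the product feature x1*x2 separates the classes perfectly.
clf = SVC(kernel="poly", degree=2, gamma=1, coef0=1, C=1e6)
clf.fit(X, y)
print(clf.predict(X))   # expected: [-1  1  1 -1]
```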
SVM Kernel Functions • K(a, b) = (a·b + 1)^d is an example of an SVM kernel function • Beyond polynomials there are other very high-dimensional basis functions that can be made practical by finding the right kernel function • Radial-basis-style kernel function • Neural-net-style kernel function • s, k and d are magic parameters that must be chosen by a model-selection method such as cross-validation (CV) or VC-dimension-based structural risk minimization (VCSRM); standard forms are given below
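The radial-basis and neural-net kernels referred to above have the following standard forms (σ, κ and δ correspond to the slide's magic parameters s, k and d):

```latex
\[
  K_{\text{poly}}(a, b) = (a \cdot b + 1)^{d}, \qquad
  K_{\text{RBF}}(a, b) = \exp\!\Bigl( -\tfrac{\lVert a - b \rVert^{2}}{2\sigma^{2}} \Bigr), \qquad
  K_{\text{NN}}(a, b) = \tanh\bigl( \kappa\,(a \cdot b) - \delta \bigr).
\]
```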
References • V. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995. • http://www.support-vector-machines.org/ • http://en.wikipedia.org/wiki/Support_vector_machine • R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification, 2nd ed., Wiley-Interscience, New York, 2001. • A.W. Moore, Support Vector Machines, tutorial slides, http://www.autonlab.org/tutorials/svm.html • B. Schölkopf, C.J.C. Burges, and A.J. Smola (eds.), Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, Mass., 1998.