Proximal Plane Classification. KDD 2001, San Francisco, August 26-29, 2001. Glenn Fung & Olvi Mangasarian, Data Mining Institute, University of Wisconsin - Madison. Second Annual Review, June 1, 2001.
Key Contributions • Fast new support vector machine classifier • An order of magnitude faster than standard classifiers • Extremely simple to implement • 4 lines of MATLAB code • NO optimization packages (LP,QP) needed
Outline of Talk • (Standard) Support vector machine (SVM) classifiers • Proximal support vector machines (PSVM) classifiers • Geometric motivation • Linear PSVM classifier • Nonlinear PSVM classifier • Full and reduced kernels • Numerical results • Correctness comparable to standard SVM • Much faster classification! • 2-million points in 10-space in 21 seconds • Compared to over 10 minutes for standard SVM
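Support Vector Machines: Maximizing the Margin between Bounding Planes [Figure: point sets A+ and A- with their bounding planes]
Proximal Vector Machines: Fitting the Data using two parallel Bounding Planes [Figure: point sets A+ and A- with two parallel planes fitted to them]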
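SVM as an Unconstrained Minimization Problem • Notation: $A$ is the $m \times n$ matrix whose rows are the data points, $D$ is the diagonal matrix of $\pm 1$ class labels, $e$ is a vector of ones, and $\nu > 0$ is the error weight • Changing to 2-norm and measuring margin in $(w,\gamma)$ space: (QP) $\min_{w,\gamma,y}\ \frac{\nu}{2}\|y\|_2^2 + \frac{1}{2}\|(w,\gamma)\|_2^2$ s.t. $D(Aw - e\gamma) + y \ge e$ • At the solution of (QP): $y = (e - D(Aw - e\gamma))_+$, where $(\cdot)_+ = \max\{\cdot,\,0\}$ • Hence (QP) is equivalent to: $\min_{w,\gamma}\ \frac{\nu}{2}\|(e - D(Aw - e\gamma))_+\|_2^2 + \frac{1}{2}\|(w,\gamma)\|_2^2$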
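PSVM Formulation • We have from the QP SVM formulation, with the inequality constraint replaced by an equality: (QP) $\min_{w,\gamma,y}\ \frac{\nu}{2}\|y\|_2^2 + \frac{1}{2}\|(w,\gamma)\|_2^2$ s.t. $D(Aw - e\gamma) + y = e$ • This simple but critical modification changes the nature of the optimization problem tremendously! • Solving for $y$ in terms of $w$ and $\gamma$ gives: $\min_{w,\gamma}\ \frac{\nu}{2}\|e - D(Aw - e\gamma)\|_2^2 + \frac{1}{2}\|(w,\gamma)\|_2^2$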
Advantages of New Formulation • Objective function remains strongly convex • An explicit exact solution can be written in terms of the problem data • PSVM classifier is obtained by solving a single system of linear equations in the usually small-dimensional input space • Exact leave-one-out correctness can be obtained in terms of problem data
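Linear PSVM • We want to solve: $\min_{w,\gamma}\ \frac{\nu}{2}\|e - D(Aw - e\gamma)\|_2^2 + \frac{1}{2}\|(w,\gamma)\|_2^2$ • Setting the gradient with respect to $(w,\gamma)$ equal to zero gives a nonsingular system of linear equations. • Solution of the system gives the desired PSVM classifier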
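A short derivation of that linear system may help. It is a sketch under the notation of the MATLAB code slide ($A$ is the $m \times n$ data matrix, $D$ the diagonal matrix of $\pm 1$ labels, $e$ a vector of ones, $H = [A\ \ -e]$); the stacked variable $z$ is introduced here only for the derivation.

```latex
\begin{aligned}
z &:= \begin{bmatrix} w \\ \gamma \end{bmatrix}, \qquad Hz = Aw - e\gamma, \\
f(z) &= \frac{\nu}{2}\,\|e - DHz\|_2^2 + \frac{1}{2}\,z^\top z, \\
\nabla f(z) &= -\nu\,H^\top D\,(e - DHz) + z
             = (I + \nu H^\top H)\,z - \nu H^\top D e
             \qquad (\text{using } D^\top D = I), \\
\nabla f(z) = 0 \;&\Longleftrightarrow\;
  \Bigl(\tfrac{I}{\nu} + H^\top H\Bigr)\,z = H^\top D e .
\end{aligned}
```

For $\nu > 0$ the matrix $I/\nu + H^\top H$ is positive definite, which is why the system is nonsingular and the minimizer is unique.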
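Linear PSVM Solution • $\begin{bmatrix} w \\ \gamma \end{bmatrix} = \left(\frac{I}{\nu} + H^\top H\right)^{-1} H^\top D e$ • Here, $H = [A\ \ -e]$ • The linear system to solve depends on $H^\top H$, which is of size $(n+1) \times (n+1)$ • $n$ is usually much smaller than $m$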
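Linear Proximal SVM Algorithm • Input: $A$, $D$, $\nu$ • Define: $H = [A\ \ -e]$ • Calculate: $v = H^\top D e$ • Solve: $\left(\frac{I}{\nu} + H^\top H\right) \begin{bmatrix} w \\ \gamma \end{bmatrix} = v$ • Classifier: $\operatorname{sign}(x^\top w - \gamma)$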
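Nonlinear PSVM Formulation • Linear PSVM (linear separating surface: $x^\top w = \gamma$): (QP) $\min_{w,\gamma,y}\ \frac{\nu}{2}\|y\|_2^2 + \frac{1}{2}\|(w,\gamma)\|_2^2$ s.t. $D(Aw - e\gamma) + y = e$ • By QP "duality", $w = A^\top D u$. Maximizing the margin in the "dual space" $(u,\gamma)$ gives: $\min_{u,\gamma,y}\ \frac{\nu}{2}\|y\|_2^2 + \frac{1}{2}\|(u,\gamma)\|_2^2$ s.t. $D(AA^\top D u - e\gamma) + y = e$ • Replace $AA^\top$ by a nonlinear kernel $K(A, A^\top)$: $\min_{u,\gamma,y}\ \frac{\nu}{2}\|y\|_2^2 + \frac{1}{2}\|(u,\gamma)\|_2^2$ s.t. $D(K(A, A^\top) D u - e\gamma) + y = e$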
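The Nonlinear Classifier • The nonlinear classifier (separating surface): $K(x^\top, A^\top) D u = \gamma$ • Where $K$ is a nonlinear kernel, e.g., with $A_i$ denoting row $i$ of $A$: • Polynomial kernel: $K(A, A^\top)_{ij} = (A_i A_j^\top + 1)^d$ • Gaussian (radial basis) kernel: $K(A, A^\top)_{ij} = \exp(-\mu \|A_i - A_j\|_2^2)$, $i, j = 1, \dots, m$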
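A Gaussian kernel matrix of this kind can be computed in a few lines of MATLAB. The function below is a minimal sketch, not code from the talk: the name `gaussian_kernel`, its argument order, and the convention that both inputs hold points as rows are assumptions made for illustration.

```matlab
function K = gaussian_kernel(A, B, mu)
% GAUSSIAN_KERNEL  K(i,j) = exp(-mu*||A(i,:) - B(j,:)||^2).
% A is m-by-n and B is k-by-n (rows are points); K is m-by-k.
% Hypothetical helper: name and signature are illustrative only.
sqA = sum(A.^2, 2);                          % m-by-1 squared row norms
sqB = sum(B.^2, 2)';                         % 1-by-k squared row norms
dist2 = bsxfun(@plus, sqA, sqB) - 2*(A*B');  % pairwise squared distances
K = exp(-mu * max(dist2, 0));                % clamp tiny negative round-off
end
```

Under these assumptions, `gaussian_kernel(A, A, mu)` gives the full $m \times m$ kernel, and `gaussian_kernel(x', A, mu)` gives the kernel row needed to classify a new column vector `x`.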
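Nonlinear PSVM • Similar to the linear case, setting the gradient equal to zero, we obtain: $\begin{bmatrix} Du \\ \gamma \end{bmatrix} = \left(\frac{I}{\nu} + H^\top H\right)^{-1} H^\top D e$ • Defining $H$ slightly differently: $H = [K(A, A^\top)\ \ -e]$ • Here, the linear system to solve is of size $(m+1) \times (m+1)$ • However, reduced kernel techniques (RSVM) can be used to reduce dimensionality.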
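The size of that system, and the effect of a reduced kernel, can be checked with a short derivation. This is a sketch under assumed notation: $K = K(A, A^\top)$ is the $m \times m$ kernel matrix, $\bar u = Du$ is a substitution introduced here for convenience, and $\bar A$ denotes a subset of $\bar m$ rows of $A$.

```latex
\begin{aligned}
&\bar u := Du,\quad D^2 = I \ \Rightarrow\ u^\top u = \bar u^\top \bar u,
  \qquad H := [\,K \;\; -e\,],\quad
  z := \begin{bmatrix} \bar u \\ \gamma \end{bmatrix}, \\
&f(z) = \frac{\nu}{2}\,\|e - D(K\bar u - e\gamma)\|_2^2
        + \frac{1}{2}\,(\bar u^\top \bar u + \gamma^2)
      = \frac{\nu}{2}\,\|e - DHz\|_2^2 + \frac{1}{2}\,z^\top z, \\
&\nabla f(z) = 0 \iff \Bigl(\tfrac{I}{\nu} + H^\top H\Bigr) z = H^\top D e,
  \qquad \tfrac{I}{\nu} + H^\top H \in \mathbb{R}^{(m+1)\times(m+1)}, \\
&\text{reduced kernel } K(A, \bar A^\top) \in \mathbb{R}^{m \times \bar m}
  \ \Rightarrow\ \tfrac{I}{\nu} + H^\top H \in \mathbb{R}^{(\bar m+1)\times(\bar m+1)} .
\end{aligned}
```

For the rectangular kernel of size 8124 x 215 quoted in the nonlinear comparisons, this means a 216 x 216 system instead of an 8125 x 8125 one.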
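Nonlinear Proximal SVM Algorithm • Input: $A$, $D$, $\nu$ • Define: $K = K(A, A^\top)$, $H = [K\ \ -e]$ • Calculate: $v = H^\top D e$ • Solve: $\left(\frac{I}{\nu} + H^\top H\right) \begin{bmatrix} Du \\ \gamma \end{bmatrix} = v$ • Classifier: $\operatorname{sign}(K(x^\top, A^\top) D u - \gamma)$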
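The nonlinear algorithm can be tried on a small problem that no single plane separates. The script below is a sketch only: the XOR-like data, the parameter values `nu` and `mu`, the Gaussian kernel, and the random choice of `mbar` kernel columns are all assumptions for illustration, and `ubar` stands for the product D*u. Setting `mbar = m` gives the full square kernel.

```matlab
% Hypothetical XOR-like data in the plane.
rng(1);                                   % fixed seed for repeatability
m = 200;
A = 2*rand(m,2) - 1;                      % points in [-1,1]^2, one per row
d = sign(A(:,1).*A(:,2));                 % +1 in quadrants I/III, -1 in II/IV
d(d == 0) = 1;                            % guard against exact zeros
nu = 100; mu = 5; mbar = 40;              % assumed parameter values

% Rectangular Gaussian kernel K(A, Abar') on a random subset of rows.
idx  = randperm(m, mbar);
Abar = A(idx,:);
dist2 = bsxfun(@plus, sum(A.^2,2), sum(Abar.^2,2)') - 2*(A*Abar');
K = exp(-mu * max(dist2, 0));             % m-by-mbar

% Same steps as the linear algorithm, with H = [K -e].
e = ones(m,1); H = [K -e];
v = (d'*H)';                              % v = H'*D*e
r = (eye(mbar+1)/nu + H'*H) \ v;          % (mbar+1)-by-(mbar+1) system
ubar = r(1:mbar); gamma = r(mbar+1);      % ubar plays the role of D*u

% Classify the training points by the sign of K*ubar - gamma.
pred = sign(K*ubar - gamma);
fprintf('training correctness: %.1f%%\n', 100*mean(pred == d));
```

The only change from the linear case is that the kernel matrix takes the place of `A`, so the cost of the solve is governed by the number of kernel columns rather than by the input dimension.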
PSVM MATLAB Code
function [w, gamma] = psvm(A,d,nu)
% PSVM: linear and nonlinear classification
% INPUT: A, d=diag(D), nu. OUTPUT: w, gamma
% [w, gamma] = psvm(A,d,nu);
[m,n]=size(A); e=ones(m,1); H=[A -e];
v=(d'*H)';                     % v=H'*D*e;
r=(speye(n+1)/nu+H'*H)\v;      % solve (I/nu+H'*H)r=v
w=r(1:n); gamma=r(n+1);        % getting w,gamma from r
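A usage sketch for the code above, on hypothetical toy data: two Gaussian clouds in the plane, with an assumed value `nu = 1`. To keep the script self-contained it repeats the statements of `psvm` inline instead of calling the function.

```matlab
% Hypothetical toy data: two Gaussian clouds, one per class.
rng(0);                                     % fixed seed for repeatability
m = 200; n = 2;
A = [randn(m/2,n) + 2; randn(m/2,n) - 2];   % rows are data points
d = [ones(m/2,1); -ones(m/2,1)];            % labels, d = diag(D)
nu = 1;                                     % assumed trade-off parameter

% Same computation as psvm(A,d,nu).
e = ones(m,1); H = [A -e];
v = (d'*H)';                                % v = H'*D*e
r = (speye(n+1)/nu + H'*H) \ v;             % solve (I/nu + H'*H) r = v
w = r(1:n); gamma = r(n+1);

% Classify each point by the sign of x'*w - gamma.
pred = sign(A*w - gamma);
fprintf('training correctness: %.1f%%\n', 100*mean(pred == d));
```

Whatever the number of points `m`, the system solved here is only 3-by-3, since its size depends on the input dimension `n` alone.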
Linear PSVM Comparisons with Other SVMs: Much Faster, Comparable Correctness
Linear PSVM Comparisons on Larger Adult Dataset: Much Faster & Comparable Correctness
Linear PSVM vs LSVM, 2-Million Dataset: Over 30 Times Faster
Nonlinear PSVM Comparisons (* a rectangular kernel of size 8124 x 215 was used)
Conclusion • PSVM is an extremely simple procedure for generating linear and nonlinear classifiers • For a linear classifier, the PSVM classifier is obtained by solving a single system of linear equations in the usually small-dimensional input space • Comparable test set correctness to standard SVM • Much faster than standard SVMs: typically an order of magnitude less computing time.
Future Work • Extension of PSVM to multicategory classification • Massive data classification using an incremental PSVM • Parallel extension and implementation of PSVM