Concave Minimization for Support Vector Machine Classifiers: Unlabeled Data Classification & Data Selection. Glenn Fung, O. L. Mangasarian
Part 1: Unlabeled Data Classification • Given a large unlabeled dataset: • A k-Median clustering algorithm selects a small (5% to 10%) representative sample. • The representative sample is labeled by an expert or oracle. • The combined labeled-unlabeled dataset is classified by a Semi-supervised Support Vector Machine (S3VM). • Test set correctness is within 5.2% of a linear support vector machine trained on the entire dataset labeled by an expert.
Part 2: Data Selection for Support Vector Machine Classifiers • Extract a minimal set of data points from a given dataset. • The minimal set is used to generate a Minimal Support Vector Machine (MSVM) classifier. • The MSVM classifier is as good as or better than one obtained by training on the entire dataset. • Feature selection is incorporated into the procedure to obtain a minimal set of input features. • Data reduction was as high as 81% and averaged 66% over seven public datasets.
Unlabeled Data Classification • Given a completely unlabeled, large dataset. • It is costly to have points labeled by an expert or an oracle. • Two questions arise: • How to choose a small subset for labeling? • How to combine labeled and unlabeled data? • Answers: • Use k-median clustering to select "representative" points to be labeled. • Use a semi-supervised SVM to obtain a classifier based on both labeled and unlabeled data.
Unlabeled Data Classification (pipeline): Unlabeled Data Set → k-Median Clustering → Chosen Data (labeled by Expert) + Remaining Data → Semi-supervised SVM → Separating Plane
K-Median Clustering Algorithm • Given m data points, find k cluster centers such that the sum of the 1-norm distances from each point to its closest cluster center is minimized.
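In symbols, with data points $x^1,\dots,x^m$ and cluster centers $c^1,\dots,c^k$ (a standard way of writing the stated criterion):

$$\min_{c^1,\dots,c^k}\ \sum_{i=1}^{m}\ \min_{\ell=1,\dots,k}\ \|x^i-c^\ell\|_1$$

Using the 1-norm, rather than the squared 2-norm of k-means, makes the optimal center of each cluster its coordinate-wise median, which is less sensitive to outliers.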
[Figure: K-Median Clustering Algorithm; iterations of the algorithm, with cluster centers marked *]
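A minimal NumPy sketch of the alternating assign-then-recenter iteration behind the figure; the function name and the random initialization are my own choices, not from the slides:

```python
import numpy as np

def k_median(X, k, max_iter=100, seed=0):
    """Cluster the rows of X into k clusters, reducing the sum of
    1-norm distances from each point to its nearest cluster center.
    Each center is updated as the coordinate-wise median of its
    cluster, the exact 1-norm minimizer within that cluster."""
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Assign each point to the center nearest in the 1-norm.
        dists = np.abs(X[:, None, :] - centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the median of its assigned points.
        new_centers = np.array([
            np.median(X[labels == j], axis=0) if np.any(labels == j)
            else centers[j]  # leave empty clusters where they are
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

To pick the 5% to 10% representative sample, set k to the desired sample size and hand the data point nearest each returned center to the expert for labeling.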
Unlabeled Data Classification (pipeline, recap): Unlabeled Data Set → k-Median Clustering → Chosen Data (labeled by Expert) + Remaining Data → Semi-supervised SVM → Separating Plane
Semi-supervised SVM (S3VM) • Given a dataset consisting of: • labeled points, each tagged +1 or −1, and • unlabeled points. • Classify the data into two classes as follows: • Assign each unlabeled point to a class (+1 or −1) so as to maximize the distance between the bounding planes obtained by a 1-norm linear SVM applied to the entire dataset.
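The formulation itself appeared as images on the original slide. As a hedged reconstruction in the notation common to this line of work (the m labeled points are the rows of a matrix $A$, $D$ is the diagonal matrix of their $\pm 1$ labels, the p unlabeled points are the rows of a matrix $B$, and $e$ is a vector of ones):

$$\min_{w,\gamma,y,r,s}\ \nu\left(e^\top y + \sum_{j=1}^{p}\min(r_j,s_j)\right) + \|w\|_1$$
$$\text{s.t.}\quad D(Aw-e\gamma)+y\ge e,\quad Bw-e\gamma+r\ge e,\quad -(Bw-e\gamma)+s\ge e,\quad y,r,s\ge 0.$$

Here $r_j$ is the error incurred if unlabeled point $j$ is assigned to class $+1$ and $s_j$ the error if it is assigned to $-1$; the $\min(r_j,s_j)$ term charges each unlabeled point the cost of its better assignment.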
S3VM: A Concave Approach • The error term contributed by each unlabeled point is concave in the objective function, because it is the minimum of two linear functions. • A local solution to this problem is obtained by solving a succession of linear programs (typically 4 to 7), as sketched below.
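Because each $\min(r_j,s_j)$ term is concave, linearizing it at the current point amounts to committing each unlabeled point to the class with the smaller current error and re-solving an ordinary 1-norm SVM linear program. A minimal sketch of that loop, assuming SciPy's linprog; the helper names svm_1norm and s3vm are mine, not from the paper:

```python
import numpy as np
from scipy.optimize import linprog

def svm_1norm(A, d, nu=1.0):
    """Solve the 1-norm linear SVM as an LP.
    Variables: w (n), gamma (1), y (m slacks), s (n, bounds |w|).
    Objective: nu * sum(y) + sum(s); at the optimum sum(s) = ||w||_1.
    Constraints: D(Aw - e*gamma) + y >= e, |w| <= s."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n + 1), nu * np.ones(m), np.ones(n)])
    # -DA w + d gamma - y <= -e   (i.e. D(Aw - e*gamma) + y >= e)
    A1 = np.hstack([-d[:, None] * A, d[:, None], -np.eye(m), np.zeros((m, n))])
    #  w - s <= 0  and  -w - s <= 0   (i.e. |w| <= s)
    A2 = np.hstack([np.eye(n), np.zeros((n, 1)), np.zeros((n, m)), -np.eye(n)])
    A3 = np.hstack([-np.eye(n), np.zeros((n, 1)), np.zeros((n, m)), -np.eye(n)])
    A_ub = np.vstack([A1, A2, A3])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:n], res.x[n]  # w, gamma

def s3vm(A_lab, d_lab, B_unlab, nu=1.0, max_iter=10):
    """Successive linearization for S3VM: each pass commits every
    unlabeled point to the side of the current plane it falls on
    (the class with the smaller misclassification error), then
    retrains a 1-norm SVM on the combined data."""
    d_u = np.ones(len(B_unlab))  # arbitrary initial labels
    for _ in range(max_iter):
        A = np.vstack([A_lab, B_unlab])
        d = np.concatenate([d_lab, d_u])
        w, gamma = svm_1norm(A, d, nu)
        d_new = np.where(B_unlab @ w - gamma >= 0, 1.0, -1.0)
        if np.array_equal(d_new, d_u):  # labels stabilized: local solution
            break
        d_u = d_new
    return w, gamma, d_u
```

Each pass can only decrease the concave objective and the label assignments take finitely many values, so the loop terminates finitely; in line with the slide, a handful of LPs typically suffices.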
S3VM: Graphical Example. Separating triangles and circles: hollow shapes represent labeled data, solid shapes represent unlabeled data. [Figure: side-by-side comparison of the separating planes found by SVM and by S3VM]
Part 2: Data Selection for Support Vector Machine Classifiers (pipeline): Labeled Dataset → 1-norm SVM Feature Selection → Smaller-Dimension Dataset → Support Vector Suppression (MSVM) → Separating Surface
Feature Selection using the 1-norm Linear SVM • The 1-norm linear SVM solves the linear program
$$\min_{w,\gamma,y}\ \nu\, e^\top y + \|w\|_1 \quad \text{s.t.}\quad D(Aw - e\gamma) + y \ge e,\quad y \ge 0.$$
• For $\nu$ small, the 1-norm regularization term dominates and drives many components of $w$ to zero; features whose weight $w_j$ is zero do not affect the classifier and can be dropped from the dataset.
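Note that the svm_1norm sketch given earlier solves exactly this linear program (its variable s bounds |w|, so that sum(s) equals the 1-norm of w at the optimum); after solving, the selected features are the indices j with |w_j| above a small tolerance, and the remaining columns can be dropped before the MSVM stage.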
Motivation for the Minimal Support Vector Machine (MSVM) • Suppression of the error term y: • Minimizes the number of misclassified points. • Works remarkably well computationally. • Reduces the number of positive components of the multiplier u, and hence the number of support vectors.
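The MSVM formulation was shown as an image on the original slide. As a sketch consistent with the concave-minimization theme of the talk: counting misclassified points means penalizing the step vector $y_*$ (with $(y_*)_i = 1$ if $y_i > 0$, else $0$), and the discontinuous step is replaced by a smooth concave exponential approximation that is again minimized by a finite succession of linear programs:

$$\min_{w,\gamma,y}\ \nu\, e^\top y_* + \|w\|_1 \quad\text{s.t.}\quad D(Aw-e\gamma)+y\ge e,\quad y\ge 0,$$
$$e^\top y_* \approx e^\top\!\left(e-\varepsilon^{-\alpha y}\right),\quad \alpha>0,$$

where $\varepsilon$ denotes the base of natural logarithms (since $e$ is reserved for the ones vector), following the concave-exponential convention of this line of work.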
Conclusions • Unlabeled data classification: • A fast, finite, linear-programming-based approach to Semi-supervised Support Vector Machines was proposed for classifying large datasets that are mostly unlabeled. • Totally unlabeled datasets were classified by: • labeling a small, cluster-chosen percentage of the data by an expert, then • classifying with a semi-supervised SVM. • Test set correctness was within 5.2% of a linear SVM trained on the entire dataset labeled by an expert.
Conclusions • Data selection for SVM classifiers: • Minimal SVM (MSVM) extracts a minimal subset used to classify the entire dataset. • MSVM maintains or improves generalization over other classifiers that use the entire dataset. • Data reduction as high as 81%, and averaged 66% over seven public datasets. • Future work • MSVM: Promising tool for incremental algorithms. • Improve chunking algorithms with MSVM. • Nonlinear MSVM: strong potential for time & storage reduction.