SVM Implementation Zhenshan, Wen wenzs@pku.edu.cn
Support Vector Machine (SVM)
• Name: Support Vector Machine (SVM)
• Links: SVMs at www.kernel-machines.org, SVMs at Bell Labs, SVMs at Microsoft Research, SVMs at Royal Holloway College, Learning Systems Group ANU, SVMs at the MIT, SVMs at the Bristol CI-Group
• Publications:
  • Burges, C. (1998): A Tutorial on Support Vector Machines for Pattern Recognition
  • Vapnik, Vladimir (1998): Statistical Learning Theory
• Systems: SVM light, mySVM
• Hypothesis Language: Functions
• Tasks: Concept Learning, Function Approximation
LibSVM
• LIBSVM is an integrated software package for:
  • Support vector classification: C-SVC, nu-SVC
  • Support vector regression: epsilon-SVR, nu-SVR
  • Distribution estimation: one-class SVM
• It also supports multi-class classification
• The basic algorithm is a simplification of:
  • SMO, by Platt
  • SVMLight, by Joachims
  • Modification 2 of SMO, by Keerthi et al.
LibSVM
• Main features of LIBSVM include:
  • Different SVM formulations
  • Efficient multi-class classification
  • Cross validation for model selection
  • Weighted SVM for unbalanced data
  • Both C++ and Java sources
  • GUI demonstrating SVM classification and regression
  • Python, R (also Splus), Matlab, Perl, and Ruby interfaces
  • Automatic model selection, which can generate a contour of cross-validation accuracy
svm_train Usage
• Usage: svm_train [options] training_set_file [model_file]
• Options:
  • -s svm_type : set type of SVM (default 0)
    • 0 -- C-SVC
    • 1 -- nu-SVC
    • 2 -- one-class SVM
    • 3 -- epsilon-SVR
    • 4 -- nu-SVR
  • -t kernel_type : set type of kernel function (default 2)
    • 0 -- linear: u'*v
    • 1 -- polynomial: (gamma*u'*v + coef0)^degree
    • 2 -- radial basis function: exp(-gamma*|u-v|^2)
    • 3 -- sigmoid: tanh(gamma*u'*v + coef0)
  • -d degree : set degree in kernel function (default 3)
  • -g gamma : set gamma in kernel function (default 1/k, where k is the number of attributes in the input data)
  • -r coef0 : set coef0 in kernel function (default 0)
  • -c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
  • -n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
  • -p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
  • -m cachesize : set cache memory size in MB (default 40)
  • -e epsilon : set tolerance of termination criterion (default 0.001)
  • -h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
  • -wi weight : set the parameter C of class i to weight*C in C-SVC (default 1)
  • -v n : n-fold cross validation mode
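The four kernel formulas listed under -t can be written out directly. The following is a minimal pure-Python sketch for dense vectors, not LIBSVM's own implementation; the function names are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Inner product u'*v of two equal-length dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear(u, v):
    """-t 0: u'*v"""
    return dot(u, v)


def polynomial(u, v, gamma, coef0, degree):
    """-t 1: (gamma*u'*v + coef0)^degree"""
    return (gamma * dot(u, v) + coef0) ** degree


def rbf(u, v, gamma):
    """-t 2: exp(-gamma*|u-v|^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid(u, v, gamma, coef0):
    """-t 3: tanh(gamma*u'*v + coef0)"""
    return math.tanh(gamma * dot(u, v) + coef0)


u, v = [1.0, 2.0], [3.0, 4.0]
default_gamma = 1.0 / len(u)  # the 1/k default, with k = number of attributes

print(linear(u, v))                                      # 11.0
print(polynomial(u, v, gamma=1.0, coef0=1.0, degree=3))  # (11 + 1)^3 = 1728.0
print(rbf(u, v, gamma=default_gamma))                    # exp(-0.5 * 8)
print(sigmoid(u, v, gamma=default_gamma, coef0=0.0))     # tanh(5.5)
```

With gamma = 1, coef0 = 1 and degree = 3, the polynomial kernel reduces to (u'v+1)^3.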
svm_predict Usage
• Usage: svm-predict test_file model_file output_file
  • model_file is the model file generated by svm-train.
  • test_file is the test data you want to predict.
  • svm-predict writes its output to output_file.
• Tips on practical use
  • Scale your data. For example, scale each attribute to [0,1] or [-1,+1].
  • For C-SVC, consider using the model selection tool in the python directory.
  • nu in nu-SVC/one-class-SVM/nu-SVR approximates the fraction of training errors and support vectors.
  • If data for classification are unbalanced (e.g. many positive and few negative), try different penalty parameters C by -wi (see examples below).
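The scaling tip can be illustrated with a small min-max sketch. This is an assumption-laden illustration in plain Python, not LIBSVM code: the helper names fit_ranges and scale_row are hypothetical, the data are made up, and mapping a constant attribute to the lower bound is an arbitrary choice.

```python
def fit_ranges(rows):
    """Return the (min, max) of each attribute over the training rows."""
    columns = list(zip(*rows))
    return [(min(column), max(column)) for column in columns]


def scale_row(row, ranges, lower=-1.0, upper=1.0):
    """Linearly map each attribute from [min, max] to [lower, upper]."""
    scaled = []
    for x, (lo, hi) in zip(row, ranges):
        if hi == lo:
            # Constant attribute: no spread to scale (arbitrary choice here).
            scaled.append(lower)
        else:
            scaled.append(lower + (upper - lower) * (x - lo) / (hi - lo))
    return scaled


# Made-up training data: two attributes on very different scales.
train = [[1.0, 200.0], [3.0, 400.0], [5.0, 1000.0]]
ranges = fit_ranges(train)
scaled_train = [scale_row(row, ranges) for row in train]
print(scaled_train)  # [[-1.0, -1.0], [0.0, -0.5], [1.0, 1.0]]

# Test data should reuse the ranges fitted on the training data.
print(scale_row([2.0, 600.0], ranges))  # [-0.5, 0.0]
```

Reusing the training ranges for the test file keeps both sets on the same scale; test values outside the training range will then fall outside [-1,+1].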
Example (c version)
• > svm-train -s 0 -c 1000 -t 1 -g 1 -r 1 -d 3 data_file
  Train a classifier with polynomial kernel (u'v+1)^3 and C = 1000.
• > svm-train -s 1 -n 0.1 -t 2 -g 0.5 -e 0.00001 data_file
  Train a classifier by nu-SVM (nu = 0.1) with RBF kernel exp(-0.5|u-v|^2) and stopping tolerance 0.00001.
• > svm-train -s 3 -p 0.1 -t 0 -c 10 data_file
  Solve SVM regression with linear kernel u'v, C = 10, and epsilon = 0.1 in the loss function.
• > svm-train -s 0 -c 10 -w1 1 -w-1 5 data_file
  Train a classifier with penalty 10 for class 1 and penalty 50 for class -1.
• > svm-train -s 0 -c 500 -g 0.1 -v 5 data_file
  Do five-fold cross validation for the classifier using the parameters C = 500 and gamma = 0.1.
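The weighted example follows from the -wi rule that the penalty of class i is weight*C. The sketch below only reproduces that arithmetic; parse_weights and class_penalties are hypothetical helpers, and the parser assumes every option is a flag followed by exactly one value.

```python
def parse_weights(args):
    """Collect -wi options from a flat list of flag/value tokens."""
    weights = {}
    for flag, value in zip(args[::2], args[1::2]):
        if flag.startswith("-w"):
            # "-w1" -> class 1, "-w-1" -> class -1
            weights[int(flag[2:])] = float(value)
    return weights


def class_penalties(cost, weights, labels):
    """Effective penalty per class: weight * C, where weight defaults to 1."""
    return {label: weights.get(label, 1.0) * cost for label in labels}


options = "-s 0 -c 10 -w1 1 -w-1 5".split()
weights = parse_weights(options)
penalties = class_penalties(cost=10.0, weights=weights, labels=[1, -1])
print(penalties)  # {1: 10.0, -1: 50.0}
```

A larger weight on the rarer class makes errors on that class more expensive, which is the point of the tip on unbalanced data.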
svm Java Class
• Class svm provides some important methods:
  • svm_train
  • svm_predict
  • svm_save_model
  • svm_load_model
  • svm_check_parameter
Data Representation
• Uses the same representation of training data as SVM_light.
• BNF-like representation:
  <class> .=. +1 | -1
  <feature> .=. integer (>=1)
  <value> .=. real
  <line> .=. <class> <feature>:<value> <feature>:<value> ... <feature>:<value>
• For regression (SVR), a real-valued target takes the place of <class>.
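As an illustration of the grammar above, a dense feature vector could be written out as one line of this format roughly as follows. The function format_line is a hypothetical helper, and skipping zero-valued attributes is an assumption based on the sparse look of the format; the %g-style formatting keeps only six significant digits.

```python
def format_line(label, dense):
    """Render one example as '<class> <feature>:<value> ...'.

    Feature indices start at 1; zero-valued attributes are skipped
    (an assumption, since the format is sparse).
    """
    if isinstance(label, int):
        label_text = f"{label:+d}"  # class label: +1 or -1
    else:
        label_text = f"{label:g}"   # real-valued regression target
    pairs = [f"{index}:{value:g}"
             for index, value in enumerate(dense, start=1)
             if value != 0]
    return " ".join([label_text] + pairs)


print(format_line(1, [1.2, 0, 0, 1.8]))  # +1 1:1.2 4:1.8
print(format_line(-1, [0, 0.3, 1]))      # -1 2:0.3 3:1
print(format_line(0.23, [0, 0.5]))       # 0.23 2:0.5
```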
Data Representation Example
• Example (SVM)
  +1 201:1.2 3148:1.8 3983:1 4882:1
  -1 874:0.3 3652:1.1 3963:1 6179:1
• Example (SVR)
  0.23 201:1.2 3148:1.8 3983:1 4882:1
  0.33 874:0.3 3652:1.1 3963:1 6179:1
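Lines like the ones in these examples can be read back into a label and a sparse feature map. This is a minimal sketch with no error handling; parse_line is a hypothetical helper, not part of LIBSVM or SVM_light.

```python
def parse_line(line):
    """Split '<label> <feature>:<value> ...' into (label, {feature: value})."""
    tokens = line.split()
    label = float(tokens[0])  # handles "+1", "-1" and real-valued targets
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":")
        features[int(index)] = float(value)
    return label, features


label, features = parse_line("+1 201:1.2 3148:1.8 3983:1 4882:1")
print(label)     # 1.0
print(features)  # {201: 1.2, 3148: 1.8, 3983: 1.0, 4882: 1.0}

target, _ = parse_line("0.23 201:1.2 3148:1.8 3983:1 4882:1")
print(target)    # 0.23
```

Storing only the listed indices keeps memory proportional to the number of non-zero attributes, which matters when indices run into the thousands as they do here.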