SUPPORT VECTOR MACHINE BY PARIN SHAH 007332832
SVM FOR LINEARLY SEPARABLE DATA
• Plot the points.
• Find the margin and the support vectors.
• Find the hyperplane having the maximum margin.
• Based on the computed margin value, classify new input data points into their categories.
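A minimal runnable sketch of these steps, assuming scikit-learn (the slides do not name a library); the toy points and the query point (3, 3) are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable 2-D points (invented for illustration).
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],   # negative class
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])  # positive class
y = np.array([-1, -1, -1, 1, 1, 1])

# A large C approximates a hard-margin (maximum-margin) separator.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

print("support vectors:\n", clf.support_vectors_)
print("class of new point (3, 3):", clf.predict([[3.0, 3.0]]))
```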
FIGURE REPRESENTING LINEARLY SEPARABLE DATA
[Figure: the support vectors and the maximum-margin hyperplane.]
(w · x) + b = +1 (positive labels)
(w · x) + b = -1 (negative labels)
(w · x) + b = 0 (hyperplane)
Margin = 2 / ||w||
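A small numeric check of the three equations and the margin formula; the values of w and b, and the two boundary points, are assumed purely for illustration.

```python
import numpy as np

# Assumed illustrative hyperplane: w = (1, 1), b = -5.
w = np.array([1.0, 1.0])
b = -5.0

x_pos = np.array([4.0, 2.0])  # lies on (w · x) + b = +1
x_neg = np.array([2.0, 2.0])  # lies on (w · x) + b = -1

print(w @ x_pos + b)           # +1.0  (positive margin boundary)
print(w @ x_neg + b)           # -1.0  (negative margin boundary)
print(2 / np.linalg.norm(w))   # margin = 2 / ||w|| ≈ 1.414
```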
STEPS FOR NON-LINEARLY SEPARABLE DATA
1.) Map the points into a feature space.
2.) Use a feature map such as Φ(x) = (x, x²), the kind of mapping a polynomial kernel induces, to map the points.
3.) Compute the positive (+1), negative (-1), and zero (decision) hyperplanes in the feature space.
4.) Obtain the support vectors and the margin value from them.
5.) Classify new input values using the margin value, as sketched below.
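A sketch of these steps, assuming scikit-learn and invented 1-D data: the two classes cannot be separated by any single threshold on the line, but they become linearly separable after the map Φ(x) = (x, x²).

```python
import numpy as np
from sklearn.svm import SVC

# 1-D points: the inner points (-0.5, 0, 0.5) cannot be separated
# from the outer points (±2, ±3) by one threshold on the line.
x = np.array([-3.0, -2.0, 2.0, 3.0, -0.5, 0.0, 0.5])
y = np.array([1, 1, 1, 1, -1, -1, -1])

# Feature map Φ(x) = (x, x²): in this 2-D feature space the classes
# are linearly separable (the x² coordinate splits them).
phi = np.column_stack([x, x ** 2])

clf = SVC(kernel="linear").fit(phi, y)
print("support vectors:\n", clf.support_vectors_)
print("class of x = 1.5:", clf.predict([[1.5, 1.5 ** 2]]))
```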
KERNEL AND ITS TYPES.
• Computing the coordinates of points in the feature space can be very costly, because the feature space is typically very high- or even infinite-dimensional.
• The kernel function is used to reduce this cost: the data points appear only inside dot products, and the kernel function can compute those inner products directly.
• With a kernel function we can compute inner products of data points without explicitly mapping them into the feature space.
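A numeric sanity check of this claim, using an invented pair of points: for the homogeneous degree-2 kernel K(x, z) = (x' * z)², the explicit feature map is Φ(x) = (x₁², √2·x₁x₂, x₂²), and both routes give the same inner product.

```python
import numpy as np

x = np.array([1.0, 2.0])
z = np.array([3.0, 1.0])

def phi(v):
    # Explicit feature map for the homogeneous degree-2 polynomial kernel.
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

explicit = phi(x) @ phi(z)   # map first, then take the dot product
via_kernel = (x @ z) ** 2    # kernel: no explicit mapping needed

print(explicit, via_kernel)  # both print 25.0
```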
KERNEL AND ITS TYPES.
1.) Polynomial kernel with degree d: K(x,y) = (x' * y + 1)^d
2.) Radial basis function (RBF) kernel with width s: K(x,y) = exp(-||x - y||^2 / (2s^2))
3.) Sigmoid kernel with parameters k and q: K(x,y) = tanh(k * (x' * y) + q)
4.) Linear kernel: K(x,y) = x' * y
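The four kernels written out directly as a sketch; the parameter defaults and the test points are assumptions for illustration, not values from the slides.

```python
import numpy as np

def linear(x, y):
    return x @ y                                          # K(x,y) = x' * y

def polynomial(x, y, d=2):
    return (x @ y + 1) ** d                               # degree-d polynomial

def rbf(x, y, s=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * s ** 2))   # width s

def sigmoid(x, y, k=1.0, q=0.0):
    return np.tanh(k * (x @ y) + q)                       # parameters k and q

x = np.array([1.0, 2.0]); y = np.array([0.5, -1.0])
print(linear(x, y), polynomial(x, y), rbf(x, y), sigmoid(x, y))
```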
SPARSE MATRIX AND SPARSE DATA
• A simple data structure over a 2-dimensional array that stores only the non-zero values.
• Iterating over sparse data visits the non-zero values only.
• Stores the value, row number, and column number of each non-zero entry of the matrix.
• Inner products are cheap to compute because terms involving zeros are skipped.
• The speed of SVM algorithms increases with the use of sparse data.
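A sketch with SciPy (an assumption; the slides do not name a library) showing that only the non-zero values, together with their positions, are stored, and that matrix products are computed from the non-zeros only.

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[0, 0, 3],
                  [4, 0, 0],
                  [0, 5, 0]])
sp = csr_matrix(dense)  # only the 3 non-zeros are stored

print(sp.data)      # [3 4 5]     -> the non-zero values
print(sp.indices)   # [2 0 1]     -> their column numbers
print(sp.indptr)    # [0 1 2 3]   -> row offsets into data
print((sp @ sp.T).toarray())  # inner products skip the zero entries
```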
STORING SPARSE DATA
• Dictionary of keys (DOK): represents the non-zero values as a dictionary mapping (row, column) tuples to values.
• List of lists (LIL): stores one list per row, where each entry holds a column index and a value. Entries are typically kept sorted by column index for faster lookup.
• Coordinate list (COO): stores a list of (row, column, value) tuples. Entries are typically sorted by row index and then by column index to improve random-access times.
• Yale format (compressed sparse row, CSR): stores the non-zero values, their column indices, and per-row offsets in three arrays.
Each format is sketched below.
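The same small matrix in each of these formats, sketched with SciPy (the library choice is an assumption; SciPy's CSR type is the Yale format).

```python
import numpy as np
from scipy.sparse import dok_matrix, lil_matrix, coo_matrix, csr_matrix

dense = np.array([[0, 0, 3],
                  [4, 0, 0],
                  [0, 5, 0]])

dok = dok_matrix(dense)           # dict mapping (row, col) -> value
print(dict(dok.items()))

lil = lil_matrix(dense)           # per-row lists of columns and values
print(lil.rows, lil.data)

coo = coo_matrix(dense)           # parallel (row, col, value) arrays
print(coo.row, coo.col, coo.data)

csr = csr_matrix(dense)           # Yale/CSR: values, columns, row offsets
print(csr.data, csr.indices, csr.indptr)
```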
ADVANTAGES OF SVM
• Support Vector Machines are very effective in high-dimensional spaces.
• They remain effective even when the number of dimensions is greater than the number of samples.
• Memory efficient: only a subset of the training points (the support vectors) enters the decision function.
• Versatile: different kernels can be used for different decision functions, as long as they give correct results; depending on our requirements we can even define our own kernel, as sketched below.
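A sketch of the "define our own kernel" point, assuming scikit-learn: SVC accepts a callable that returns the Gram matrix between two sets of points. The quadratic kernel and the toy data here are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

def my_kernel(A, B):
    # Custom kernel: Gram matrix of the quadratic kernel (x' * y)^2.
    return (A @ B.T) ** 2

X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 2.0]])
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel=my_kernel).fit(X, y)
print(clf.predict([[2.5, 2.0]]))
```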
DISADVANTAGES OF SVM
• If the number of features is much greater than the number of samples, the method is likely to give poor performance; it works best with relatively small training sets.
• SVMs do not directly provide probability estimates; these must be calculated using indirect techniques, as sketched below.
• Non-traditional data such as strings and trees cannot be fed to an SVM directly: it must first be converted to feature vectors, or handled with a specialized kernel.
• An appropriate kernel must be selected for each project according to its requirements.
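A sketch of the probability point, assuming scikit-learn: SVC(probability=True) enables Platt scaling, one such indirect technique, which fits a logistic model to the raw SVM scores via internal cross-validation. The toy data is invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [2.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0], [5.0, 5.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# probability=True fits Platt scaling on top of the raw SVM scores.
clf = SVC(kernel="linear", probability=True, random_state=0).fit(X, y)

print(clf.decision_function([[3.0, 3.0]]))  # raw signed margin distance
print(clf.predict_proba([[3.0, 3.0]]))      # calibrated probabilities
```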