Sparse Modeling for Finding Representative Objects: See All by Looking at a Few. Ehsan Elhamifar (Johns Hopkins University), Guillermo Sapiro (University of Minnesota), René Vidal (Johns Hopkins University)
Outline • Introduction • Problem Formulation • Geometry of Representatives • Representatives of Subspaces • Practical Considerations and Extensions • Experimental Results
Introduction • Two important problems related to large databases • Reducing the feature-space dimension • Reducing the object-space dimension (finding representative objects)
Introduction • K-means (sketched below) • Step 1: k initial "means" are randomly selected from the data set • Step 2: k clusters are created by assigning every observation to the nearest mean • Step 3: the centroid of each of the k clusters becomes the new mean • Step 4: Steps 2 and 3 are repeated until convergence
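A minimal NumPy sketch of these four steps (an illustration, not a library implementation; the function name `kmeans`, the iteration cap, and the convergence check are choices made here):

```python
import numpy as np

def kmeans(Y, k, n_iter=100, seed=0):
    """Basic k-means on a (D, N) matrix Y with one data point per column."""
    rng = np.random.default_rng(seed)
    # Step 1: k initial "means" are randomly selected from the data set.
    means = Y[:, rng.choice(Y.shape[1], size=k, replace=False)]
    for _ in range(n_iter):
        # Step 2: assign every observation to the nearest mean.
        d2 = ((Y[:, :, None] - means[:, None, :]) ** 2).sum(axis=0)
        labels = d2.argmin(axis=1)
        # Step 3: the centroid of each cluster becomes the new mean
        # (this sketch assumes no cluster becomes empty).
        new_means = np.stack([Y[:, labels == j].mean(axis=1) for j in range(k)], axis=1)
        # Step 4: repeat Steps 2 and 3 until the means stop moving.
        if np.allclose(new_means, means):
            break
        means = new_means
    return means, labels
```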
Introduction • K-medoids • A variant of K-means in which each cluster center must be an actual data point (a medoid) • Differs in Step 3: the new center of a cluster is the member that minimizes the total distance to the other members, rather than the mean
Introduction • Rank Revealing QR (RRQR) • A matrix decomposition algorithm based on the QR factorization with column pivoting (sketched below) • T. Chan. Rank revealing QR factorizations. Lin. Alg. and its Appl., 1987.
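For reference, column-pivoted QR (the basic rank-revealing variant) is available in SciPy; using the first k pivot columns as representatives is a common reading of this baseline, sketched here:

```python
import numpy as np
from scipy.linalg import qr

def rrqr_representatives(Y, k):
    """Select k representative columns of Y via column-pivoted QR."""
    # With pivoting=True, SciPy also returns the permutation `piv`;
    # the first k pivots point at the most linearly independent columns.
    _, _, piv = qr(Y, mode='economic', pivoting=True)
    return piv[:k]
```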
Introduction • Consider the optimization problem: min_C ‖Y − YC‖_F² s.t. ‖C‖_{row,0} ≤ k • Wish to find at most k ≪ N representatives that best reconstruct the data collection • C ∈ R^{N×N} is the coefficient matrix • ‖C‖_{row,0} counts the number of nonzero rows of C
Introduction • Applications to video summarization
Problem Formulation • Finding compact dictionaries to represent data • Minimize the objective function ‖Y − DX‖_F² over both unknowns, subject to a sparsity constraint on the columns of X (see the sketch below) • D: the dictionary • X: the coefficient matrix
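For context, this generic dictionary learning framework can be tried with scikit-learn's DictionaryLearning; this sketch illustrates the framework being modified, not the method proposed in the paper:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

Y = np.random.rand(100, 20)                 # scikit-learn wants one point per row
dl = DictionaryLearning(n_components=10,    # dictionary size (chosen here)
                        transform_algorithm='lasso_lars')
X = dl.fit_transform(Y)                     # sparse coefficient matrix
D = dl.components_                          # learned dictionary atoms
```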
Problem Formulation • Finding Representative Data • Consider a modification to the dictionary learning framework: take the data matrix Y itself as the dictionary and enforce row sparsity of C: min_C Σ_i I(‖c^i‖_p) s.t. ‖Y − YC‖_F ≤ ε • Y: the matrix of data points • C: the coefficient matrix • c^i: the i-th row of C • I(·): the indicator function • Σ_i I(‖c^i‖_p) counts the number of nonzero rows of C
Problem Formulation • This is an NP-hard problem • A standard convex relaxation of this optimization is obtained as: min_C ‖C‖_{1,q} ≜ Σ_i ‖c^i‖_q s.t. ‖Y − YC‖_F ≤ ε, 1ᵀC = 1ᵀ (q > 1) • ε: an appropriately chosen parameter • The affine constraint 1ᵀC = 1ᵀ makes the selection invariant to translations of the data
Problem Formulation • The nonzero rows of the solution C indicate the representatives (see the sketch below)
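A minimal sketch of this relaxed program with CVXPY, written in the Lagrangian form used later in the experiments (q = 2 here; the function name `smrs` and the threshold for calling a row nonzero are choices made for illustration, not the authors' solver):

```python
import numpy as np
import cvxpy as cp

def smrs(Y, lam=2.0):
    """Sparse Modeling Representative Selection (sketch).
    Y: (D, N) data matrix, one point per column."""
    N = Y.shape[1]
    C = cp.Variable((N, N))
    # Row-sparsity-inducing norm: sum of the l2 norms of the rows of C.
    row_norms = cp.sum(cp.norm(C, 2, axis=1))
    objective = cp.Minimize(lam * row_norms + 0.5 * cp.sum_squares(Y - Y @ C))
    # Affine constraint 1^T C = 1^T (invariance to translations).
    problem = cp.Problem(objective, [cp.sum(C, axis=0) == np.ones(N)])
    problem.solve()
    # Representatives = indices of rows with non-negligible norm.
    norms = np.linalg.norm(C.value, axis=1)
    return np.flatnonzero(norms > 1e-3 * norms.max()), C.value
```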
Geometry of Representatives • The program minimizes the number of representatives that can reconstruct the collection of data points up to an ε error • Consider the case ε = 0 (exact reconstruction)
Geometry of Representatives • Theorem • Let H be the convex hull of the columns of Y, and let k be the number of vertices of H • For ε = 0, the optimal solution, after reordering the data by a permutation matrix P, has the block form C* = [ I_k  Δ ; 0  0 ] • P: a permutation matrix • I_k: the k-dimensional identity matrix • Δ: the elements of Δ lie in [0, 1) • Hence the representatives are exactly the k vertices of H (illustrated below)
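A quick way to see what the theorem predicts on toy 2-D data, using SciPy's ConvexHull (an illustrative check; with ε = 0, i.e., small λ in the `smrs` sketch above, the nonzero rows of the optimal C should coincide with these vertex indices):

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(0)
Y = rng.standard_normal((2, 50))   # 50 points in the plane, one per column
hull = ConvexHull(Y.T)             # SciPy expects one point per row
print(sorted(hull.vertices))       # indices of the convex-hull vertices
# The theorem says these indices are exactly the representatives
# selected by the program when exact reconstruction is enforced.
```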
Representatives of Subspaces • Assume that the data lie in a union of affine subspaces of R^D • Desired: the number of representatives from each subspace is greater than or equal to its dimension • For data from two independent subspaces, the coefficient matrix has a block-diagonal structure: C = [ C₁  0 ; 0  C₂ ]
Representatives of Subspaces • Theorem • If the data points are drawn from a union of independent subspaces S₁, …, S_L • then the solution of the proposed program finds at least dim(Sᵢ) representatives from each subspace Sᵢ (see the sketch below)
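A synthetic sanity check one could run, reusing the `smrs` sketch above (the random bases, subspace dimensions, and λ value are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
# 30 points from a 2-dimensional subspace of R^10 and 30 points from an
# (almost surely) independent 3-dimensional one.
B1 = rng.standard_normal((10, 2))
B2 = rng.standard_normal((10, 3))
Y = np.hstack([B1 @ rng.standard_normal((2, 30)),
               B2 @ rng.standard_normal((3, 30))])
reps, C = smrs(Y, lam=1.0)
# Per the theorem, at least 2 representatives should have index < 30
# (first subspace) and at least 3 should have index >= 30 (second).
print((reps < 30).sum(), (reps >= 30).sum())
```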
Practical Considerations and Extensions • Dealing with Outliers • We denote the inliers by Y and the outliers by Y_o, and run the program on the augmented data [Y  Y_o] • The solution has the block structure C = [ C_Y  0 ; 0  I ]: an outlier cannot be reconstructed by the other points, so it can only represent itself
Practical Considerations and Extensions • Dealing with Outliers • Among the rows of the coefficient matrix, rows corresponding to true data points have many nonzero elements, while the row of an outlier has just one nonzero element
Practical Considerations and Extensions • Define the row-sparsity index (rsi) of each candidate representative, measuring how concentrated the corresponding row of C is • Outliers: the rsi value is close to 1 • True representatives: the rsi value is close to 0 • Thresholding the rsi rejects the outliers (see the sketch below)
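A sketch of this rejection step; the concrete formula rsi(j) = ‖c^j‖_∞ / ‖c^j‖_1 is an assumption chosen to match the stated behavior (it equals 1 for a row with a single nonzero entry and approaches 0 for a row whose weight is spread over many entries), not necessarily the paper's exact definition:

```python
import numpy as np

def reject_outliers(C, tau=0.9):
    """Split candidate representatives into inliers and outliers via rsi.
    C: (N, N) coefficient matrix from the sparse program."""
    rows = np.abs(C)
    l1 = rows.sum(axis=1)
    linf = rows.max(axis=1)
    # Assumed concentration measure: 1 iff a row has a single nonzero entry.
    rsi = np.divide(linf, l1, out=np.zeros_like(l1), where=l1 > 0)
    candidates = np.flatnonzero(l1 > 1e-8)          # nonzero rows only
    outliers = candidates[rsi[candidates] > tau]    # rsi close to 1
    inlier_reps = candidates[rsi[candidates] <= tau]  # rsi close to 0
    return inlier_reps, outliers
```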
Practical Considerations and Extensions • Dealing with New Observations • Let Y be the collection of points already in the dataset • Let Y_new be the new points added to the dataset
Practical Considerations and Extensions • Dealing with New Observations • Let Y_rep be the representatives of Y • Instead of re-solving the program over the entire dataset [Y  Y_new], solve it over the much smaller collection [Y_rep  Y_new] (sketched below)
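A sketch of this incremental update, again reusing the `smrs` sketch above (`Y_rep` and `Y_new` hold one point per column; the index bookkeeping is illustrative):

```python
import numpy as np

def update_representatives(Y_rep, Y_new, lam=2.0):
    """Re-run selection on the old representatives plus the new points only."""
    Z = np.hstack([Y_rep, Y_new])   # much smaller than the full dataset
    reps, _ = smrs(Z, lam=lam)
    return Z[:, reps]               # updated set of representatives
```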
Experimental Results • Video Summarization • Using Lagrange multipliers, the relaxed program is solved in the form: min_C λ‖C‖_{1,q} + ½‖Y − YC‖_F² s.t. 1ᵀC = 1ᵀ
Experimental Results • Investigate the effect of changing the parameter λ
Experimental Results • Classification Using Representatives • Evaluate the performance of: • Sparse Modeling Representative Selection (SMRS), the proposed algorithm • K-medoids • Rank Revealing QR (RRQR) • Simple random selection of training data (Rand)
Experimental Results • Several standard classification algorithms • Nearest Neighbor (NN) • Nearest Subspace (NS) • Sparse Representation-based Classification (SRC) • Linear Support Vector Machine (SVM)
Experimental Results • Run on the USPS digits database and the Extended YaleB face database • SRC and NS work well when the data in each class lie in a union of low-dimensional subspaces • NN needs enough training samples near each test sample to perform well
Experimental Results • Outlier Rejection • A dataset of N = 1,024 images • A (1 − ρ) fraction of the images are randomly selected from the Extended YaleB face database • A ρ fraction are random images downloaded from the internet