
Learning the Kernel Matrix in Discriminant Analysis via QCQP


Presentation Transcript


  1. Learning the Kernel Matrix in Discriminant Analysis via QCQP Jieping Ye Arizona State University Joint work with Shuiwang Ji and Jianhui Chen

  2. Kernel Discriminant Analysis • Regularized kernel discriminant analysis (RKDA) performs linear discriminant analysis (LDA) in the feature space. • The classification performance of RKDA is comparable to that of support vector machines (SVM). • The performance of RKDA depends on the selection (learning) of the kernel. • Cross-validation is commonly applied for kernel selection. • Recent trend: multiple kernel learning (MKL).

  3. Outline • Overview of kernel methods and multiple kernel learning • Binary-class multiple kernel learning • Multi-class multiple kernel learning • Experiments • Conclusions

  4. Kernel-based Learning: data → embed the data → run a linear algorithm (SVM, PCA, CCA, LDA, …).

  5. Kernel-based Learning: the same pipeline, now split into kernel design (how the data are embedded) and the kernel algorithm (the linear method run on the embedded data).

  6. Kernel-based Learning: the data are embedded IMPLICITLY; the kernel matrix K stores, at entry (i, j), the inner product between the embedded samples x_i and x_j, which measures their similarity. Domain-specific knowledge can be added to the similarity measure.
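
To make the implicit embedding concrete, here is a minimal numpy sketch (not from the slides) that builds a Gaussian (RBF) kernel matrix; the data and the bandwidth sigma are illustrative assumptions.

    import numpy as np

    def gaussian_kernel_matrix(X, sigma=1.0):
        """K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)).

        Each entry is an inner product of the samples' implicit
        feature-space embeddings, i.e., a pairwise similarity.
        """
        sq_norms = np.sum(X ** 2, axis=1)
        sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
        return np.exp(-sq_dists / (2.0 * sigma ** 2))

    rng = np.random.default_rng(0)
    X = rng.standard_normal((6, 3))           # 6 samples with 3 features
    K = gaussian_kernel_matrix(X, sigma=2.0)
    print(K.shape)                            # (6, 6): symmetric, entries in (0, 1]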

  7. Learning with Multiple Kernels: which kernel matrix K should be used, and how can several candidate kernels be combined?

  8. Multiple Kernel Learning • Given a set of p kernel matrices K_1, …, K_p, the optimal kernel matrix G is restricted to be a convex linear combination of these kernel matrices: G = θ_1 K_1 + … + θ_p K_p, with θ_i ≥ 0 and θ_1 + … + θ_p = 1. • Learning criterion: the margin between two classes in SVM (Lanckriet et al., 2004), or class discrimination in discriminant analysis.
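
As a small illustration (not from the slides), the snippet below forms a candidate optimal kernel G as a convex combination of three Gaussian kernels with different bandwidths; the bandwidths and the weights θ are arbitrary assumptions.

    import numpy as np

    def rbf(X, sigma):
        sq = np.sum(X ** 2, axis=1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
        return np.exp(-d2 / (2.0 * sigma ** 2))

    rng = np.random.default_rng(0)
    X = rng.standard_normal((10, 3))

    # p = 3 candidate kernel matrices (Gaussian kernels, varying bandwidth).
    kernels = [rbf(X, s) for s in (0.5, 1.0, 2.0)]

    # Convex combination: theta_i >= 0 and sum_i theta_i = 1.
    theta = np.array([0.2, 0.5, 0.3])
    assert np.all(theta >= 0) and np.isclose(theta.sum(), 1.0)
    G = sum(t * K for t, K in zip(theta, kernels))   # still symmetric PSD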

  9. Binary-Class Kernel Discriminant Analysis • RKDA finds the optimal linear hyperplane by maximizing the Fisher discriminant ratio (FDR): F(w) = (w^T(μ+ − μ−))^2 / (w^T(Σ+ + Σ−)w), where μ+ and μ− are the centroids of the positive and negative classes, and Σ+ and Σ− are the covariance matrices of the positive and negative classes.
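
For concreteness, a small numpy sketch (in input space rather than the feature space, and not from the slides) that evaluates the FDR and its closed-form maximizer on synthetic two-class data; the data are an assumption.

    import numpy as np

    rng = np.random.default_rng(1)
    Xp = rng.standard_normal((50, 3)) + 1.0   # positive class
    Xn = rng.standard_normal((50, 3)) - 1.0   # negative class

    mu_p, mu_n = Xp.mean(axis=0), Xn.mean(axis=0)
    cov_p = np.cov(Xp, rowvar=False)
    cov_n = np.cov(Xn, rowvar=False)

    def fdr(w):
        """Fisher discriminant ratio of the projection direction w."""
        between = (w @ (mu_p - mu_n)) ** 2
        within = w @ (cov_p + cov_n) @ w
        return between / within

    # The maximizer has the closed form w* ∝ (cov_p + cov_n)^{-1}(mu_p - mu_n).
    w_star = np.linalg.solve(cov_p + cov_n, mu_p - mu_n)
    print(fdr(w_star))    # larger than the FDR of, e.g., a random direction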

  10. SDP Formulation for Binary-Class Kernel Learning • Kim et al. (ICML 2006) formulated MKL for RKDA as maximizing the optimal value of the regularized Fisher discriminant ratio over the kernel combination coefficients θ. • This leads to a semidefinite program (SDP).

  11. Proposed Criterion for Binary-Class Kernel Learning • Consider the maximization of the following objective function, in which the class covariances in the FDR are replaced by the regularized total scatter: F_1(w) = (w^T(μ+ − μ−))^2 / (w^T(S_t + λI)w). • The so-called total scatter matrix S_t in the feature space is defined as follows: S_t = Σ_i (φ(x_i) − μ)(φ(x_i) − μ)^T, where φ is the feature map, μ is the centroid of all training samples, and the sum runs over all n samples. • We show that this criterion leads to an efficient quadratically constrained quadratic programming (QCQP) formulation for multiple kernel learning in the binary-class case. • Most multiple kernel learning algorithms work for binary-class problems only. We show that this QCQP formulation can be naturally extended to the multi-class case.
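
A matching sketch (again in input space for illustration, with an assumed regularization value λ = 0.1) that forms the total scatter matrix and evaluates the proposed criterion at its maximizer:

    import numpy as np

    rng = np.random.default_rng(2)
    Xp = rng.standard_normal((50, 3)) + 1.0
    Xn = rng.standard_normal((50, 3)) - 1.0
    X = np.vstack([Xp, Xn])

    mu = X.mean(axis=0)                     # centroid of all samples
    S_t = (X - mu).T @ (X - mu)             # total scatter matrix
    lam = 0.1                               # regularization parameter (assumed)

    A = S_t + lam * np.eye(X.shape[1])      # regularized total scatter
    diff = Xp.mean(axis=0) - Xn.mean(axis=0)
    w = np.linalg.solve(A, diff)            # maximizer of the proposed criterion
    score = (w @ diff) ** 2 / (w @ A @ w)   # value of the objective at w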

  12. Least Squares Formulation • Consider the regularized least squares problem, which minimizes the following objective function: L(w) = Σ_{i=1..n} (w^T φ(x_i) − y_i)^2 + λ‖w‖^2, where y_i is a class-indicator target for sample x_i. • We have the following result: the minimizer of this regularized least squares problem coincides, up to a scaling factor, with the maximizer of the proposed criterion, so the proposed RKDA criterion can be computed by solving a regularized least squares problem.
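
A minimal kernel ridge sketch (not from the slides) of this least squares view, assuming a ±1 class-indicator coding for y (the exact coding used in the paper may differ) and a Gaussian kernel:

    import numpy as np

    def rbf(A, B, sigma=1.0):
        d2 = (np.sum(A ** 2, axis=1)[:, None]
              + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    rng = np.random.default_rng(3)
    Xp = rng.standard_normal((50, 3)) + 1.0
    Xn = rng.standard_normal((50, 3)) - 1.0
    X = np.vstack([Xp, Xn])
    y = np.array([1.0] * 50 + [-1.0] * 50)   # +/-1 indicators (assumed coding)

    lam = 0.1
    K = rbf(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)  # regularized LS in the RKHS

    X_test = rng.standard_normal((5, 3))
    pred = np.sign(rbf(X_test, X) @ alpha)   # the sign gives the predicted class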

  13. QCQP Formulation for Binary-Class Kernel Learning • Substituting the convex combination G = θ_1 K_1 + … + θ_p K_p into the least squares formulation and dualizing yields a convex problem with a quadratic objective and one quadratic constraint per candidate kernel matrix, i.e., a quadratically constrained quadratic program. • The kernel weights θ are recovered from the optimal dual variables of the quadratic constraints.
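
The slide's exact program is not reproduced in this transcript, so the cvxpy sketch below shows only the generic structure such a problem takes: a quadratic objective in the dual variables β with one quadratic constraint per candidate kernel. The least-squares-style objective, λ, and the random Gram matrices are assumptions for illustration, not the paper's formulation.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(4)
    n, p, lam = 30, 3, 0.1
    y = np.sign(rng.standard_normal(n))      # +/-1 labels

    # Candidate PSD kernel matrices (random Gram matrices for illustration).
    kernels = []
    for _ in range(p):
        A = rng.standard_normal((n, n))
        kernels.append(A @ A.T / n + 1e-8 * np.eye(n))

    beta = cp.Variable(n)
    t = cp.Variable()

    # One convex quadratic constraint per candidate kernel -> a QCQP.
    constraints = [cp.quad_form(beta, K) <= t for K in kernels]
    objective = cp.Maximize(2 * y @ beta - lam * cp.sum_squares(beta) - t)
    problem = cp.Problem(objective, constraints)
    problem.solve()                          # or problem.solve(solver=cp.MOSEK)

    # Kernel weights can be read off the duals of the quadratic constraints.
    theta = np.array([c.dual_value for c in constraints])
    print(theta / theta.sum())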

  14. Benefits of Our QCQP Formulation • QCQP can be solved more efficiently than SDP, and it is therefore more scalable to large-scale problems. • Similar ideas were used by Lanckriet et al. (JMLR 2004), who formulated SVM kernel learning over a nonnegative linear combination of given kernel matrices as a QCQP. • Most kernel learning formulations are restricted to binary-class problems; our formulation extends naturally to multi-class problems.

  15. Multi-Class Kernel Learning in Discriminant Analysis • The binary-class criterion is generalized to k classes; the following objective function is maximized: trace((S_t + λI)^{-1} S_b), where S_b is the between-class scatter matrix in the feature space.

  16. Least Squares Formulation for the Multi-Class Case • Equivalence relationship (Ye et al., 2007): with an appropriate class-indicator target matrix, the multi-class discriminant analysis solution can be obtained by solving a multivariate regularized least squares problem, one least squares fit per class.
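
A sketch of the multi-class least squares view (not from the slides), using a plain one-hot indicator matrix Y; Ye et al. (2007) prove the equivalence for a specific centered class-indicator encoding, so the coding here is a simplifying assumption.

    import numpy as np

    def rbf(A, B, sigma=1.0):
        d2 = (np.sum(A ** 2, axis=1)[:, None]
              + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T)
        return np.exp(-d2 / (2.0 * sigma ** 2))

    rng = np.random.default_rng(5)
    n, k, lam = 60, 3, 0.1
    X = rng.standard_normal((n, 2))
    labels = rng.integers(0, k, size=n)

    Y = np.zeros((n, k))                     # one column per class
    Y[np.arange(n), labels] = 1.0            # one-hot targets (assumed coding)

    K = rbf(X, X)
    Alpha = np.linalg.solve(K + lam * np.eye(n), Y)   # one ridge fit per class

    pred = np.argmax(K @ Alpha, axis=1)      # predicted class indices
    print(np.mean(pred == labels))           # training accuracy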

  17. QCQP Formulation for Multi-Class Kernel Learning • The multi-class least squares formulation leads to a QCQP with the same structure as in the binary case: one group of dual variables per class and, as before, one quadratic constraint per candidate kernel matrix.

  18. Experiments • We use MOSEK (http://www.mosek.com) as the QCQP solver. • The reported experimental results are averaged over 30 random partitions of the data into training and test sets.

  19. Competing Algorithms • The proposed QCQP formulation is compared with cross-validated RKDA, cross-validated SVM, and the SDP formulation of Kim et al. (ICML 2006).

  20. Experimental Results on Sonar We observe that cross-validated RKDA achieves the best performance on the kernels corresponding to θ6 and θ7, while cross-validated SVM achieves the highest accuracy on θ6, θ7, and θ8. In contrast, the methods that learn a linear combination of kernels tend to favor the kernels corresponding to θ5, θ6, and θ7.

  21. More Results on Binary-Class Data Sets

  22. Running Time Comparison We observe that the proposed QCQP formulation is much more efficient than the SDP formulation. The results also show that it is much more efficient than doubly cross-validated RKDA.

  23. Experiments Our QCQP formulation is very competitive with the other two methods, which are based on cross-validation. Unlike them, the proposed method learns a convex linear combination of kernels directly, avoiding cross-validation altogether.

  24. Conclusions • We propose a QCQP formulation for RKDA kernel learning in the binary-class case, which extends naturally to the multi-class case. • The multi-class QCQP formulation is still expensive for problems with a large sample size and a large number of classes. • We are currently investigating semi-infinite linear programming to improve the efficiency. • Future work includes applications to biological image analysis.

  25. Acknowledgments • This research is supported in part by: • Arizona State University • National Science Foundation Grant IIS-0612069
