A Comparative Study of Kernel Methods for Classification Applications

Yan Liu, Sep 23, 2003

Introduction
• Support Vector Machines
  • Text classification
  • Protein classification
• Various kernels
  • Standard kernels
    • Linear kernels, polynomial kernels, RBF kernels
  • Other application-oriented kernels
    • Fisher kernels, string kernels, etc.
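As a point of reference, the three standard kernels named above can be sketched in a few lines of plain Python. The parameter names and defaults (`degree`, `coef0`, `gamma`) are illustrative assumptions, not values taken from the slides.

```python
import math

def linear_kernel(x, y):
    """Inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(<x, y> + coef0) ** degree; degree and coef0 are assumed defaults."""
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); gamma is an assumed default."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = [1.0, 2.0], [3.0, 0.5]
print(linear_kernel(x, y))
print(polynomial_kernel(x, y))
print(rbf_kernel(x, y))
```

The linear kernel is the special case of the polynomial kernel with `degree=1` and `coef0=0`, which is one reason it serves as the usual baseline in kernel comparisons.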

Problem Definition
• Few studies have examined how different kernels behave on:
  • Rare-class problem (unbalanced data)
  • Noisy data problem
  • Multi-label problem
• These problems are common in real applications:
  • Text classification
  • Protein family classification

Text Classification
• Kernel selection
  • Linear kernels
  • String kernels
• Problem Focus
  • Rare-class problem
  • Multi-class problem
• Dataset
  • Reuters-21578 dataset
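The slides do not say which string kernel is meant. As one simple member of that family, a spectrum (k-mer) kernel counts the contiguous substrings of length k that two strings share. The sketch below assumes that choice; the function names and `k=3` are hypothetical.

```python
from collections import Counter

def kmer_counts(s, k):
    """Count every contiguous substring of length k in s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def spectrum_kernel(s, t, k=3):
    """Dot product of the k-mer count vectors of s and t."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    # Counter returns 0 for k-mers that never occur in t.
    return sum(count * ct[kmer] for kmer, count in cs.items())

print(spectrum_kernel("kernel", "kernels", k=3))
```

The same function applies unchanged to protein sequences, since it only assumes a string over some alphabet.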

Protein Family Classification
• Kernel selection
  • Linear kernels
  • String kernels
  • Fisher kernels
• Problem Focus
  • Rare-class problem
  • Noisy data problem
• Dataset
  • GPCR classification dataset

Methodology and Schedule
• Propose conjectures about the likely kernel behaviors, based on analysis
  • Sep 12th ~ Sep 28th
• Work on synthetic datasets to test the hypotheses
  • Sep 28th ~ Oct 20th
• Map from synthetic data to real application data
  • Oct 20th ~ Sep 18th
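The slides do not describe how the synthetic datasets are built. One possible setup, sketched here under assumed parameters, draws two Gaussian classes with a controllable rare-class fraction and flips a controllable share of labels to simulate noise. The function name, class centers, and default rates are all hypothetical.

```python
import random

def make_synthetic(n=1000, rare_fraction=0.05, noise_rate=0.1, seed=0):
    """Return (points, clean_labels, noisy_labels) for a 2-D two-class problem.

    rare_fraction: probability that a sample belongs to the positive (rare) class.
    noise_rate: probability that an observed label is flipped.
    """
    rng = random.Random(seed)
    points, clean, noisy = [], [], []
    for _ in range(n):
        label = 1 if rng.random() < rare_fraction else -1
        center = 1.5 if label == 1 else -1.5  # assumed class centers
        points.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        clean.append(label)
        noisy.append(-label if rng.random() < noise_rate else label)
    return points, clean, noisy

points, clean, noisy = make_synthetic()
rare_share = clean.count(1) / len(clean)
flipped_share = sum(c != o for c, o in zip(clean, noisy)) / len(clean)
print(f"rare-class share: {rare_share:.3f}, flipped labels: {flipped_share:.3f}")
```

Keeping the clean labels alongside the noisy ones makes it possible to train on the noisy labels while scoring against the clean ones, which separates the effect of noise from the effect of class imbalance.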

Mid-course Deliverables
• Analysis of the dataset
  • Class distribution (rare-class and multi-class)
  • Noise level
• Conjectures for possible behaviors
• Results on synthetic datasets
• Explanation and interesting observations from the results

Multi-label Problem for Text Classification
• Related work
  • Binary classification (one-vs-all) (by Yang; Joachims)
  • Mixture model by EM (by McCallum)
  • Rank-based approach
    • Boosting (by Schapire & Singer)
    • Rank-based kernels (by Elisseeff & Weston)
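The one-vs-all decomposition listed above trains one binary classifier per label and assigns a document every label whose classifier fires. The sketch below illustrates the idea with a small kernel perceptron standing in for an SVM. The perceptron, the toy feature vectors, and the label names are assumptions made for illustration, not the setup used in the cited work.

```python
def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def train_kernel_perceptron(X, y, kernel, epochs=10):
    """Return dual coefficients for a binary kernel perceptron (labels +1/-1)."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        for i, xi in enumerate(X):
            score = sum(a * yj * kernel(xj, xi) for a, yj, xj in zip(alpha, y, X))
            if y[i] * score <= 0:  # misclassified (or on the boundary): update
                alpha[i] += 1
    return alpha

def train_one_vs_all(X, label_sets, kernel):
    """Train one binary classifier per label; a document may carry several labels."""
    models = {}
    for label in sorted(set().union(*label_sets)):
        y = [1 if label in s else -1 for s in label_sets]
        models[label] = (train_kernel_perceptron(X, y, kernel), y)
    return models

def predict_labels(models, X, x, kernel):
    """Return every label whose binary classifier scores x positively."""
    predicted = set()
    for label, (alpha, y) in models.items():
        score = sum(a * yj * kernel(xj, x) for a, yj, xj in zip(alpha, y, X))
        if score > 0:
            predicted.add(label)
    return predicted

# Toy documents: (has_term_1, has_term_2, bias); labels are hypothetical.
X = [(1, 0, 1), (0, 1, 1), (1, 1, 1), (0, 0, 1)]
label_sets = [{"grain"}, {"trade"}, {"grain", "trade"}, set()]
models = train_one_vs_all(X, label_sets, linear_kernel)
print(sorted(predict_labels(models, X, (1, 1, 1), linear_kernel)))
```

Because each label is handled independently, a rare label yields a heavily unbalanced binary problem, which is where the rare-class and multi-label questions meet.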

Multi-label Problem for Text Classification
• Possible Solutions
  • Combine the mixture model and a kernel-based approach using Fisher kernels
    • Similar in spirit to combining HMMs and SVMs for protein classification
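For reference, the standard Fisher kernel construction derives a kernel from a generative model. The block below states that definition and, as an assumption about how a mixture model would plug in, the gradient with respect to the parameters of one mixture component. The symbols \(\pi_k\), \(\theta_k\), and \(\gamma_k\) are notation introduced here, not taken from the slides.

```latex
% Fisher score and Fisher kernel for a generative model p(x | \theta)
U_x = \nabla_\theta \log p(x \mid \theta), \qquad
I = \mathbb{E}_x\!\left[ U_x U_x^{\top} \right], \qquad
K(x, x') = U_x^{\top} I^{-1} U_{x'}

% Mixture model: p(x | \theta) = \sum_k \pi_k \, p(x | \theta_k)
\nabla_{\theta_k} \log p(x \mid \theta)
  = \gamma_k(x) \, \nabla_{\theta_k} \log p(x \mid \theta_k), \qquad
\gamma_k(x) = \frac{\pi_k \, p(x \mid \theta_k)}{\sum_j \pi_j \, p(x \mid \theta_j)}
```

Under this reading, the posterior responsibilities \(\gamma_k(x)\) that EM already computes for the mixture model become the weights in the feature vector handed to the kernel classifier.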
