Cascade-based Classification Approach to problems with different complexities
Eunelson José da Silva Júnior; Alceu S. Britto Jr., Ph.D.; Luiz Eduardo S. Oliveira, Ph.D.
Graduate Program in Informatics (PPGIa), Pontifical Catholic University of Paraná (PUCPR)
INTRODUCTION
• Ensembles have been used as an alternative to the difficult task of building a monolithic classifier capable of absorbing the whole variability of a classification problem.
• With this in mind, our search for high classification accuracy may frequently lead us to more complex systems.
• Research question: How to improve classification accuracy without increasing the system complexity?
INTRODUCTION
• Alternative: a cascade-based classifier.
• Motivation: a better compromise between classification accuracy and the complexity of the classification method.
  • Improve the classification accuracy
    • In many classification problems, instances should be rejected when the confidence in their classification is too low, in order to minimize the error rate.
  • Reduce the complexity
    • The majority of patterns can be explained by a simple rule. They can therefore be classified by a single classifier, while more sophisticated classifiers are needed only for a few hard cases.
INTRODUCTION
• In this study we propose a two-level cascade classification method combining a monolithic classifier in the first step and an ensemble of classifiers in the second step.
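The two-level idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `cascade_predict`, the `(label, confidence)` interface of the first level, and the toy classifiers are all hypothetical, and a rejection option at the second level is left out for brevity.

```python
def cascade_predict(x, first_level, second_level, threshold):
    """Classify x with the first level; use the second level if x is rejected.

    Assumed (hypothetical) interfaces:
      first_level(x)  -> (label, confidence)  e.g. a monolithic classifier
      second_level(x) -> label                e.g. an ensemble decision
    Returns (label, level_used).
    """
    label, confidence = first_level(x)
    if confidence >= threshold:
        return label, 1          # easy pattern: accepted by the first level
    return second_level(x), 2    # hard pattern: rejected, sent to the ensemble


# Toy stand-ins for real classifiers (illustration only).
def toy_monolithic(x):
    # Confident far from the decision boundary at 0, unsure close to it.
    label = 1 if x >= 0 else 0
    return label, min(1.0, abs(x))


def toy_ensemble(x):
    # Pretend the ensemble has learned a slightly different boundary.
    return 1 if x >= -0.1 else 0


easy = cascade_predict(2.0, toy_monolithic, toy_ensemble, threshold=0.5)
hard = cascade_predict(-0.05, toy_monolithic, toy_ensemble, threshold=0.5)
```

Here `easy` is resolved by the first level alone, while `hard` falls below the threshold and is passed on, which is where the potential savings in computational cost come from.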
INTRODUCTION
• As specific objectives, we want to evaluate:
  • Different monolithic classifiers
  • Different methods for generating pools of classifiers
  • Different methods for classifier selection
  • The performance of a two-level cascade classification method for problems representing different levels of difficulty
• Hypothesis
  • A two-level cascade classification method may improve accuracy by properly treating easy and hard patterns.
    • Higher accuracy considering the error tolerance
    • Lower computational cost
PROPOSED METHOD
• Monolithic classifiers to be evaluated (first step)
  • K-nearest neighbors (KNN)
  • Multilayer perceptron (MLP)
  • Decision tree (J48)
  • Naive Bayes
  • Support vector machine (SVM)
PROPOSED METHOD
• Multiple classifiers
  • Ensemble generation techniques
    • Bagging; Boosting; Random Subspaces
    • Pools with 10 classifiers for each database
  • Combination of all classifiers
    • Majority vote
  • Dynamic selection of classifiers
    • DS-LA (LCA and OLA)
    • KNORA (Eliminate and Union)
• Cascade-based classifier
  • 1st level: best monolithic classifier
  • 2nd level: different methods based on multiple classifiers
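To make the difference between combining all classifiers and dynamically selecting them concrete, here is a minimal pure-Python sketch of majority voting and of a simplified KNORA-Eliminate. It assumes classifiers are plain callables, samples are tuples of floats, and distances are Euclidean; the fallback to the whole pool when no classifier is selected is a simplification chosen for this sketch, and all names are hypothetical.

```python
import math
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def knora_eliminate(x, pool, val_X, val_y, k=3):
    """Simplified KNORA-Eliminate.

    Keep only the classifiers that are correct on all k nearest validation
    neighbours of x; shrink the neighbourhood until at least one classifier
    qualifies, then combine the selected classifiers by majority vote.
    """
    # Validation indices ordered by distance to the query sample.
    order = sorted(range(len(val_X)), key=lambda i: math.dist(x, val_X[i]))
    for size in range(min(k, len(order)), 0, -1):
        region = order[:size]
        selected = [clf for clf in pool
                    if all(clf(val_X[i]) == val_y[i] for i in region)]
        if selected:
            return majority_vote([clf(x) for clf in selected])
    # Simplification for this sketch: no competent classifier found.
    return majority_vote([clf(x) for clf in pool])


# Toy pool: one competent classifier and two that always answer 1.
pool = [lambda p: 0 if p[0] < 5 else 1, lambda p: 1, lambda p: 1]
val_X = [(0.0,), (1.0,), (2.0,), (10.0,)]
val_y = [0, 0, 0, 1]
query = (0.5,)

static_label = majority_vote([clf(query) for clf in pool])
dynamic_label = knora_eliminate(query, pool, val_X, val_y, k=3)
```

In this toy setup the plain majority vote is dominated by the two weak classifiers, whereas the dynamic selection keeps only the classifier that is correct in the neighbourhood of the query.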
PROPOSED METHOD
• Error tolerance
  • Define the rejection threshold for each classification method on a validation set so that the error is <= 1%.
• Rejection
  • Samples classified with confidence below the threshold are rejected.
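One possible way to pick such a threshold is sketched below. The slides do not say how the error is computed, so this sketch assumes it is the fraction of misclassified samples among those accepted on the validation set; the function name and its inputs are hypothetical.

```python
def pick_rejection_threshold(confidences, correct, max_error=0.01):
    """Return the lowest threshold t whose accepted set meets the error target.

    confidences: confidence of each validation prediction
    correct:     True where the corresponding prediction was right
    Assumption: error = wrong accepted samples / accepted samples,
    where a sample is accepted when its confidence is >= t.
    """
    for t in sorted(set(confidences)):
        accepted = [ok for c, ok in zip(confidences, correct) if c >= t]
        errors = sum(1 for ok in accepted if not ok)
        if errors / len(accepted) <= max_error:
            return t
    return float("inf")  # no threshold meets the target: reject everything


val_conf = [0.95, 0.90, 0.80, 0.60, 0.55]
val_ok = [True, True, True, False, True]
threshold = pick_rejection_threshold(val_conf, val_ok, max_error=0.01)
```

With these toy values, the misclassified sample at confidence 0.60 forces the threshold up to 0.80, so the two lowest-confidence samples would be rejected and passed to the next level.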
PROPOSED METHOD • Overview
PRELIMINARY RESULTS • 12 databases from the UCI repository
CONCLUSION
• The best monolithic classifier in the experiments was the SVM
• Of the 12 classification problems (datasets):
  • Seven had more than 50% of samples rejected in the first level
  • Three datasets had no rejection in the first level
  • One dataset had 100% of samples correctly classified in the first level
• Pool generation
  • Most of the time, the Boosting method achieved the best results for the cascade approach
CONCLUSION
• Two-level cascade classification method with rejection threshold
  • Improved the accuracy rate by up to 48% compared with the best monolithic classifier
  • The second level recovered up to 82% of the samples rejected by the first level
  • On average, 49% of instances were correctly classified using only the monolithic classifier of the first level
THANK YOU! QUESTIONS?