Redundant Feature Elimination for Multi-Class Problems

Annalisa Appice, Michelangelo Ceci (Dipartimento di Informatica, Università degli Studi di Bari, Italy)
Simon Rawles, Peter Flach (Department of Computer Science, University of Bristol, UK)

REFER is an efficient, scalable, logic-based algorithm for eliminating Boolean features which are redundant for multi-class classifier learning.

Feature redundancy based on logical coverage
In classification rule learning, good features discriminate one class from another. Redundant features are those which are covered by another, better discriminating, feature. Example (discriminating class a from class b): f2 covers both f1 and f3, so f2 is the only non-redundant feature for discriminating a from b.
[Figure: coverage table for features f1, f2, f3]

Extending to the multi-class setting with neighbourhoods
Each class is partitioned into subsets of similar examples, called neighbourhoods. Neighbourhood pairs are compared in turn to collect non-redundant features, and this decomposition leads to the discovery of a smaller feature set.

Main theoretical results
REFER preserves the learnability of a complete and consistent hypothesis in the reduced data. REFER's time complexity is linear in the number of examples and quadratic in the number of features.
[Figure: construction of the first neighbourhood]

Comparison with feature selection methods
REFER was compared with LVF, CFS and Relief over 13 preprocessed UCI datasets. REFER was generally more conservative of the original features, but faster on most datasets: it was fastest on 8 of the 13 and close on 3 others. Predictive accuracy on the reduced data was competitive.

Results on propositionalised mutagenesis datasets
Up to 99.5% of features were eliminated, while showing competitive accuracy on the reduced data (85%).
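The logical-coverage idea can be made concrete in a short Python sketch. The representation here is an assumption: each Boolean feature is a vector of values over the examples of each class, a feature "discriminates" an (a-example, b-example) pair when it takes different values on them, and a feature is redundant when its set of discriminated pairs is contained in another feature's set. Function names and the tie-break for equally discriminating features are illustrative, not the paper's actual API.

```python
from itertools import product

def discriminates(vals_a, vals_b):
    """Pairs (i, j) where a Boolean feature separates example i of class a
    from example j of class b, i.e. takes different values on the two."""
    return {(i, j)
            for (i, va), (j, vb) in product(enumerate(vals_a),
                                            enumerate(vals_b))
            if va != vb}

def non_redundant(features_a, features_b):
    """Keep only features not covered by a better (or equal, with a
    lexicographic tie-break) discriminating feature.
    features_a/features_b: dict name -> Boolean values on each class."""
    disc = {f: discriminates(features_a[f], features_b[f]) for f in features_a}
    kept = []
    for f in disc:
        covered = any(disc[f] < disc[g] or (disc[f] == disc[g] and g < f)
                      for g in disc if g != f)
        if not covered:
            kept.append(f)
    return sorted(kept)

# Poster's example shape: f2 covers both f1 and f3.
a = {'f1': [1, 0], 'f2': [1, 1], 'f3': [1, 1]}
b = {'f1': [0, 0], 'f2': [0, 0], 'f3': [0, 1]}
print(non_redundant(a, b))  # → ['f2']
```

The subset test `disc[f] < disc[g]` is what "f is covered by g" means under this coverage reading: everything f can tell apart, g can tell apart too.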
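The multi-class extension can likewise be sketched, with one caveat: the poster does not spell out how neighbourhoods are constructed, so the grouping criterion below (greedy assignment by Hamming distance to a group's first example) is a hypothetical stand-in, shown only to illustrate the pairwise comparison scheme in which non-redundant features are collected over all neighbourhood pairs.

```python
from itertools import product

def neighbourhoods(examples, max_hamming=1):
    """Greedily partition one class's Boolean examples into groups of
    similar examples. The similarity criterion (Hamming distance to the
    group's seed) is an illustrative assumption, not REFER's construction."""
    groups = []
    for ex in examples:
        for g in groups:
            if sum(x != y for x, y in zip(ex, g[0])) <= max_hamming:
                g.append(ex)
                break
        else:
            groups.append([ex])
    return groups

def pairwise_feature_union(class_a, class_b, per_pair_selector):
    """Compare each pair of neighbourhoods from the two classes in turn,
    taking the union of the features kept as non-redundant per pair."""
    kept = set()
    for na, nb in product(neighbourhoods(class_a), neighbourhoods(class_b)):
        kept |= per_pair_selector(na, nb)
    return kept
```

Because each per-pair comparison sees only a small slice of the data, fewer features are needed to discriminate each pair, which is how the decomposition yields a smaller overall feature set.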