Fuzzy-Rough Instance Selection
Outline • The importance of instance selection • Rough set theory • Fuzzy-rough sets • Fuzzy-rough instance selection • Experimentation • Conclusion
Instance selection • Knowledge discovery • The problem of too much data • Requires storage • Intractable for data mining algorithms • Removing data that is noisy or irrelevant
Rough set theory • Figure labels: Upper Approximation, Set A, Lower Approximation, Equivalence class Rx • Rx is the set of all points that are indiscernible with point x
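To make the lower and upper approximations concrete, here is a minimal Python sketch of crisp rough set approximations. The toy objects, the feature names and the helper names `equivalence_classes` and `approximations` are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict


def equivalence_classes(objects, features):
    """Group object ids that share the same values on `features`."""
    classes = defaultdict(set)
    for obj_id, values in objects.items():
        key = tuple(values[f] for f in features)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(objects, features, target):
    """Return (lower, upper) approximations of the set `target`."""
    lower, upper = set(), set()
    for cls in equivalence_classes(objects, features):
        if cls <= target:   # class lies entirely inside A
            lower |= cls
        if cls & target:    # class overlaps A
            upper |= cls
    return lower, upper


# Hypothetical toy data: x1 and x2 are indiscernible on both features.
objects = {
    "x1": {"colour": "red", "size": "small"},
    "x2": {"colour": "red", "size": "small"},
    "x3": {"colour": "blue", "size": "large"},
    "x4": {"colour": "blue", "size": "small"},
}
A = {"x1", "x3"}
lower, upper = approximations(objects, ["colour", "size"], A)
print(sorted(lower), sorted(upper))
```

Here x1 cannot be placed in the lower approximation because it is indiscernible from x2, which lies outside A; both end up in the upper approximation only.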
Fuzzy-rough sets • Approximate equality • Handle real-valued features via fuzzy tolerance relations instead of crisp equivalence • Better noise and uncertainty handling • Focus has been on feature selection, not instance selection
Fuzzy-rough sets • Parameterized relation • Fuzzy-rough definitions:
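The formulas from this slide did not survive in the transcript. As a stand-in, the sketch below uses one commonly used formulation: the lower approximation as an infimum over an implicator and the upper approximation as a supremum over a t-norm. The Łukasiewicz connectives, the particular form of the `alpha`-parameterized relation and all function names are assumptions and may differ from the definitions the authors presented.

```python
def similarity(x, y, ranges, alpha=1.0):
    """Assumed parameterized tolerance relation: per-feature similarity
    max(0, 1 - alpha * |a(x) - a(y)| / range(a)), combined with min."""
    return min(
        max(0.0, 1.0 - alpha * abs(a - b) / r)
        for a, b, r in zip(x, y, ranges)
    )


def implicator(a, b):
    """Lukasiewicz implicator (an assumed choice)."""
    return min(1.0, 1.0 - a + b)


def tnorm(a, b):
    """Lukasiewicz t-norm (an assumed choice)."""
    return max(0.0, a + b - 1.0)


def lower_approx(x, data, membership, ranges, alpha=1.0):
    """Degree to which x certainly belongs to the fuzzy set A."""
    return min(
        implicator(similarity(x, y, ranges, alpha), m)
        for y, m in zip(data, membership)
    )


def upper_approx(x, data, membership, ranges, alpha=1.0):
    """Degree to which x possibly belongs to the fuzzy set A."""
    return max(
        tnorm(similarity(x, y, ranges, alpha), m)
        for y, m in zip(data, membership)
    )


# Hypothetical one-feature data set; membership gives A(y) for each object.
data = [(0.0,), (0.25,), (1.0,)]
membership = [1.0, 1.0, 0.0]
ranges = (1.0,)

low = lower_approx(data[1], data, membership, ranges)
up = upper_approx(data[1], data, membership, ranges)
low_strict = lower_approx(data[1], data, membership, ranges, alpha=2.0)
print(low, up, low_strict)
```

Raising `alpha` makes the relation stricter, so fewer objects count as similar to x and the lower approximation membership of the middle object rises.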
Instance selection: basic idea • Remove objects while keeping the underlying approximations unchanged • Figure label: Not needed (objects that can be removed)
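One way to picture the basic idea is a threshold rule on each object's positive region membership: an object is kept only if it belongs to the lower approximation of its own class to degree at least `tau`. The slides do not spell out the algorithm, so the rule below, the Łukasiewicz implicator, the similarity relation and the names `positive_region` and `select_instances` are all assumptions, written in the spirit of FRIS-I rather than as the authors' method.

```python
def similarity(x, y, ranges, alpha):
    """Assumed relation: min over features of
    max(0, 1 - alpha * |a(x) - a(y)| / range(a))."""
    return min(
        max(0.0, 1.0 - alpha * abs(a - b) / r)
        for a, b, r in zip(x, y, ranges)
    )


def positive_region(i, data, labels, ranges, alpha):
    """Membership of object i in the lower approximation of its own class.

    With the Lukasiewicz implicator and crisp class labels this reduces to
    1 minus the largest similarity to any object of a different class.
    """
    degree = 1.0
    for j, y in enumerate(data):
        if labels[j] != labels[i]:
            degree = min(degree, 1.0 - similarity(data[i], y, ranges, alpha))
    return degree


def select_instances(data, labels, ranges, alpha=1.0, tau=1.0):
    """Return indices of objects whose positive region membership >= tau."""
    return [
        i for i in range(len(data))
        if positive_region(i, data, labels, ranges, alpha) >= tau
    ]


# Hypothetical one-feature data: the two middle objects sit near the
# class boundary and are similar to objects of the other class.
data = [(0.0,), (0.25,), (0.5,), (1.0,)]
labels = ["a", "a", "b", "b"]
ranges = (1.0,)

kept_strict = select_instances(data, labels, ranges, alpha=2.0, tau=1.0)
kept_loose = select_instances(data, labels, ranges, alpha=2.0, tau=0.5)
print(kept_strict, kept_loose)
```

In this sketch `alpha` controls how quickly similarity drops with distance and `tau` controls how strict the selection is; lowering `tau` keeps more boundary objects.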
Results: FRIS-I (heart) • (214 objects, 9 features)
Conclusion • Proposed new techniques for instance selection based on fuzzy-rough sets • Reduced the number of instances significantly while retaining classification accuracy • Future work • Many possibilities for novel fuzzy-rough instance selection methods • Comparisons with non-rough techniques • Improving the complexity of FRIS-III • Combined instance/feature selection
WEKA implementations of all fuzzy-rough methods can be downloaded from: • http://users.aber.ac.uk/rkj/book/weka.zip