An Ensemble of Classifiers Approach for the Missing Feature Problem Using Learn++. IEEE Region 2 Student Paper Contest, University of Maryland Eastern Shore, April 5th, 2003. Stefan Krause, Rowan University. Project Advisor: Dr. Robi Polikar. Branch Counselor: Dr. Shreekanth Mandayam.
An Ensemble of Classifiers Approach for the Missing Feature Problem Using Learn++
IEEE Region 2 Student Paper Contest
University of Maryland Eastern Shore
April 5th, 2003
Stefan Krause, Rowan University
Project Advisor: Dr. Robi Polikar
Branch Counselor: Dr. Shreekanth Mandayam
This material is based upon work supported by the National Science Foundation under Grant No. ECS-0239090. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Overview
• Background
• Problem Definition
• Motivation
• Approach and Theory
• Databases and Results
• Conclusions
• References
• Questions
Background
Pattern recognition
• Recognizing and classifying a previously seen or familiar pattern
[Figure: sample patterns to be assigned to the classes 0 1 2 3 4 5 6 7 8 9]
A classifier is necessary for automated machine recognition of patterns
Background
Artificial neural network
• An artificial neural network (ANN) is an algorithmic model of the brain, albeit a very crude one, that allows a computer to emulate the brain’s decision-making capability
[Figure: network with input features f1, f2, f3, …, f63, f64 and output classes C0, C1, C2, …, C8, C9]
Problem Definition
The missing feature problem
• The missing feature problem occurs when instances from a data set have features that are missing or corrupted
[Figure: network with input features f1, …, f64 and output classes C0, …, C9; the classification result is marked "?"]
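One way to make the definition concrete is to mark a missing or corrupted feature with NaN and look up which feature positions are affected. This is only a sketch: the NaN encoding, the 4-feature instance, and the helper name `missing_feature_indices` are illustrative assumptions, not part of the presentation.

```python
import math

def missing_feature_indices(instance):
    """Return the 0-based positions of features encoded as NaN (assumed encoding)."""
    return {i for i, value in enumerate(instance) if math.isnan(value)}

# Hypothetical 4-feature instance whose second feature (f2) was lost.
instance = [0.7, float("nan"), 0.1, 0.9]
missing = missing_feature_indices(instance)
print(sorted(missing))
```

A classifier that expects all four inputs cannot be evaluated on this instance as it stands, which is the situation the slide describes.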
Motivation
The missing feature problem is a significant issue in computational and machine learning because:
• Neural networks can produce a valid classification only when all features used to create the network are available.
• Sensor failure, malfunction, or corrupt data is very common in sensor-based applications where multiple sensors observe an event.
• Solving the missing feature problem adds considerable robustness to a data classification algorithm.
Approach and Theory
• Learn++ automated classification algorithm
• Ensemble-based incremental learning
• Modified for the missing feature problem
Approach and Theory
[Figure: two-class data (O and X) with a complex decision boundary to be learned; labels: classifier 1, classifier 2, classifier 3, classifier 4]
Approach and Theory
Traditional ensemble of classifiers approach
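In an ensemble, each classifier casts a vote for a class and the votes are combined into a single decision. The sketch below shows a weighted majority vote, the combination rule described in the Learn++ references; the function name and the weights are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Three hypothetical classifiers vote on one handwritten digit.
predictions = [2, 7, 7]
weights = [1.0, 0.6, 0.6]  # illustrative voting weights, not taken from the paper
winner = weighted_majority_vote(predictions, weights)
print(winner)
```

Here the two lower-weight classifiers together outweigh the single stronger one, so the ensemble decision is 7.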
Approach and Theory
Creating networks in the ensemble with only some features
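The idea of training each ensemble member on only some of the features can be sketched as follows. Several choices here are assumptions made for illustration: the features are picked at random, a simple nearest-centroid classifier stands in for the neural networks, and the names `train_centroid_classifier`, `predict`, and `build_ensemble` are hypothetical. The slides do not specify how the feature subsets are selected.

```python
import random

def train_centroid_classifier(X, y, feature_idx):
    """Fit a nearest-centroid classifier that sees only the features in feature_idx."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feature_idx))
        for k, j in enumerate(feature_idx):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    centroids = {label: [s / counts[label] for s in acc] for label, acc in sums.items()}
    return {"features": feature_idx, "centroids": centroids}

def predict(classifier, instance):
    """Return the label of the closest centroid, using only this classifier's features."""
    def squared_distance(centroid):
        return sum((instance[j] - c) ** 2
                   for j, c in zip(classifier["features"], centroid))
    centroids = classifier["centroids"]
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

def build_ensemble(X, y, n_classifiers, n_features_used, seed=0):
    """Train each classifier on its own random subset of the features (assumed strategy)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_classifiers):
        feature_idx = sorted(rng.sample(range(n_features), n_features_used))
        ensemble.append(train_centroid_classifier(X, y, feature_idx))
    return ensemble

# Tiny made-up data set: 4 features, 2 classes.
X = [[0.0, 0.1, 0.0, 0.2], [0.1, 0.0, 0.2, 0.1],
     [1.0, 0.9, 1.0, 0.8], [0.9, 1.0, 0.8, 1.0]]
y = [0, 0, 1, 1]
ensemble = build_ensemble(X, y, n_classifiers=5, n_features_used=2)
for member in ensemble:
    print(member["features"])
```

Because every member records which features it was trained on, it is possible later to tell which members can still be evaluated when a given feature is unavailable.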
Approach and Theory
Classifying an instance that is missing f2
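When an instance is missing f2, only the classifiers that were not trained on f2 can be evaluated, and their votes decide the class. The sketch below illustrates this under stated assumptions: missing values are encoded as NaN, a toy threshold rule stands in for each trained network, votes are unweighted, and all function names are hypothetical.

```python
import math
from collections import Counter

def make_threshold_classifier(features):
    """Toy stand-in for a trained network: class 1 if the mean of its features exceeds 0.5."""
    def predict(instance):
        mean = sum(instance[j] for j in features) / len(features)
        return 1 if mean > 0.5 else 0
    return {"features": features, "predict": predict}

def usable(classifier, instance):
    """A classifier is usable only if none of the features it needs is missing (NaN)."""
    return all(not math.isnan(instance[j]) for j in classifier["features"])

def classify(ensemble, instance):
    """Majority vote among the usable classifiers; None if no classifier is usable."""
    votes = [c["predict"](instance) for c in ensemble if usable(c, instance)]
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]

ensemble = [
    make_threshold_classifier([0, 1]),  # trained on f1, f2
    make_threshold_classifier([0, 2]),  # trained on f1, f3
    make_threshold_classifier([2, 3]),  # trained on f3, f4
    make_threshold_classifier([1, 3]),  # trained on f2, f4
]
instance = [0.9, float("nan"), 0.8, 0.7]  # f2 is missing
label = classify(ensemble, instance)
print(label)
```

In this example the first and last classifiers are skipped because they depend on f2, and the remaining two both vote for class 1.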
Databases and Results
Gas Identification Database
Identification of 5 volatile organic compounds using 6 quartz crystal microbalance sensors.
Databases and Results
Gas Identification Database
Databases and Results
Optical Character Recognition Database
Identification of handwritten digits 0 through 9.
Databases and Results
Optical Character Recognition Database
Databases and Results
Ionosphere Radar Return Database
This system consists of a phased array of 16 high-frequency antennas with a total transmitted power on the order of 6.4 kilowatts. The targets were free electrons in the ionosphere.
Databases and Results
Ionosphere Radar Return Database
Conclusions
• Initial results indicate that the algorithm can classify data with up to 10% of the features missing, with virtually no drop-off in performance.
• The mathematical equations for the algorithm, as well as a flow chart describing it, can be found in the paper.
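To test robustness at a given percentage of missing features, one can mask features in the test instances before classification. The sketch below is one possible interpretation, not the procedure used in the paper: it assumes the percentage applies per instance, that the missing features are chosen uniformly at random, and that NaN marks a missing value. The helper name `mask_features` is hypothetical.

```python
import math
import random

def mask_features(instance, fraction, rng):
    """Return a copy of instance with the given fraction of features replaced by NaN."""
    n_missing = round(fraction * len(instance))
    missing = set(rng.sample(range(len(instance)), n_missing))
    return [float("nan") if j in missing else v for j, v in enumerate(instance)]

rng = random.Random(0)
instance = [0.5] * 64                       # placeholder values for a 64-feature instance
masked = mask_features(instance, 0.10, rng)
n_masked = sum(math.isnan(v) for v in masked)
print(n_masked)
```

With 64 features, a 10% rate rounds to 6 masked features per instance.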
References
• R. Polikar, L. Udpa, S. Udpa, and V. Honavar, “Learn++: an incremental learning algorithm for supervised neural networks,” IEEE Trans. Systems, Man and Cybernetics, C, vol. 31, no. 4, pp. 497-508, 2001.
• R. Polikar, J. Byorick, S. Krause, A. Marino, and M. Moreton, “Learn++: a classifier independent incremental learning algorithm for supervised neural networks,” Proc. Int. Joint Conf. Neural Networks (IJCNN2002), vol. 2, pp. 1742-1747, Honolulu, HI, 2002.
• L.K. Hansen and P. Salamon, “Neural network ensembles,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 10, pp. 993-1001, 1990.
• Y. Freund and R. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119-139, 1997.
• C.L. Blake and C.J. Merz, UCI Repository of machine learning databases at http://www.ics.uci.edu/~mlearn/MLRepository.html. Irvine, CA: University of California, Dept. of Information and Computer Science, 1998.
• R. Polikar, R. Shinar, L. Udpa, and M. Porter, “Artificial intelligence methods for selection of an optimized sensor array for identification of volatile organic compounds,” Sensors and Actuators B: Chemical, vol. 80, no. 3, pp. 243-254, December 2001.
Questions
This presentation and the paper are available online at: http://engineering.rowan.edu/~polikar/RESEARCH/PUBLICATIONS/publications.html