Hamming Clustering: A New Approach to Rule Extraction. Marco Muselli and Diego Liberati. In Proceedings of the Third International ICSC Symposia on Soft Computing-SOCO '99, ICSC Academic Press, 499-504. Summarized by HaYoung Jang.
Introduction
• Goal: infer the rules underlying a classification problem, starting from a given set of examples.
  • e.g. character recognition, speech understanding.
• Neural networks
  • They learn how to perform the task, but cannot explain how they do it.
Introduction
• Hamming Clustering (HC)
  • Solves classification problems with binary inputs.
  • Directly infers explicit rules in if-then form.
  • Achieves performance comparable to that of artificial neural networks, in terms of both efficiency and efficacy.
  • Allows a direct implementation on a physical support, since it does not require variable weights with floating-point precision.
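To make the "if-then form" concrete, here is a minimal sketch of how a rule over binary inputs could be stored and evaluated using only comparisons, with no floating-point weights. The encoding (a tuple with `None` for inputs the rule ignores) and the names `cube_matches`, `classify`, and `rule_to_text` are illustrative assumptions, not notation from the paper.

```python
# Assumed encoding (not from the paper): a rule condition is a tuple with one
# entry per input: +1 or -1 where the rule fixes the input, None where the
# rule does not depend on it.

def cube_matches(cube, x):
    """True if pattern x satisfies every fixed entry of the cube (logical AND)."""
    return all(c is None or c == xi for c, xi in zip(cube, x))

def classify(rules, x, default=-1):
    """Return +1 if any rule matches x (logical OR of ANDs), else the default class."""
    return 1 if any(cube_matches(cube, x) for cube in rules) else default

def rule_to_text(cube):
    """Render one cube as a readable if-then rule for the +1 class."""
    terms = [f"x{i + 1} = {v:+d}" for i, v in enumerate(cube) if v is not None]
    condition = " and ".join(terms) if terms else "true"
    return f"if {condition} then y = +1"

rules = [(1, None, -1)]            # hypothetical rule set with a single rule
print(rule_to_text(rules[0]))      # if x1 = +1 and x3 = -1 then y = +1
print(classify(rules, (1, 1, -1))) # 1
print(classify(rules, (1, 1, 1)))  # -1
```

Because matching needs only equality tests on binary values, a rule set of this kind maps naturally onto simple logic gates, which is consistent with the point about implementation on a physical support.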
The procedure of Hamming Clustering
• $S$: a training set containing $s$ samples $(x_j, y_j)$, $j = 1, \dots, s$.
• $x_j$: input patterns with $n$ Boolean components $x_{ji}$, $i = 1, \dots, n$.
• $x \in \{-1, +1\}^n$
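The notation above can be sketched in code as follows. The block assumes, as an illustration only, that the outputs are also coded as -1/+1, and it shows the Hamming distance between two patterns, which for this coding can also be obtained from their dot product as (n - x·z)/2. The helper names are hypothetical.

```python
def is_valid_pattern(x, n):
    """Check that x has n components, each equal to -1 or +1."""
    return len(x) == n and all(v in (-1, 1) for v in x)

def hamming_distance(x, z):
    """Number of components in which the two patterns differ."""
    return sum(a != b for a, b in zip(x, z))

def hamming_from_dot(x, z):
    """Same distance via the dot product: agreeing components contribute +1,
    differing ones -1, so x.z = n - 2d and d = (n - x.z) / 2."""
    dot = sum(a * b for a, b in zip(x, z))
    return (len(x) - dot) // 2

# Hypothetical training set S with s = 3 samples and n = 3 inputs.
S = [
    ((+1, -1, +1), +1),
    ((+1, +1, +1), +1),
    ((-1, -1, -1), -1),
]

assert all(is_valid_pattern(x, 3) for x, _ in S)
print(hamming_distance(S[0][0], S[2][0]))  # 2
print(hamming_from_dot(S[0][0], S[2][0]))  # 2
```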
The procedure of Hamming Clustering
• Hypercube $\{-1, +1\}^n$
• Training set $S$
The procedure of Hamming Clustering
• Minimal pruning
  • Finds a minimum subset of cubes that correctly classifies all the input patterns in $S$.
  • Extracts the clusters with maximum covering one at a time.
• Threshold pruning
  • Keeps in $C^+$ and $C^-$ only the cubes whose covering exceeds a fixed threshold $\tau$.
  • The value of $\tau$ is proportional to the maximum covering $q_{\max}$ found in the set of clusters to be pruned.
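The two pruning strategies can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: a cube is a tuple with `None` for unconstrained inputs, the covering of a cube is taken to be the number of patterns of its class that it matches, minimal pruning is approximated by a greedy choice that counts covering on the patterns not yet covered, and the proportionality constant `alpha` (so that τ = alpha · q_max) is a hypothetical parameter.

```python
def matches(cube, x):
    """True if pattern x lies inside the cube (None means 'any value')."""
    return all(c is None or c == xi for c, xi in zip(cube, x))

def covering(cube, patterns):
    """Assumed definition: number of patterns the cube matches."""
    return sum(matches(cube, x) for x in patterns)

def minimal_pruning(cubes, patterns):
    """Greedy sketch: repeatedly extract the cube with maximum covering
    over the patterns that are still uncovered."""
    remaining = list(patterns)
    candidates = list(cubes)
    kept = []
    while remaining and candidates:
        best = max(candidates, key=lambda c: covering(c, remaining))
        if covering(best, remaining) == 0:
            break  # no candidate covers any remaining pattern
        kept.append(best)
        candidates.remove(best)
        remaining = [x for x in remaining if not matches(best, x)]
    return kept

def threshold_pruning(cubes, patterns, alpha=0.5):
    """Keep only cubes whose covering exceeds tau = alpha * q_max."""
    if not cubes:
        return []
    q = [covering(c, patterns) for c in cubes]
    tau = alpha * max(q)
    return [c for c, qc in zip(cubes, q) if qc > tau]

# Hypothetical positive-class patterns and candidate cubes for C+.
positives = [(1, 1, 1), (1, 1, -1), (1, -1, 1), (-1, -1, -1)]
cubes = [(1, None, None), (1, 1, None), (-1, -1, -1)]

print(minimal_pruning(cubes, positives))    # [(1, None, None), (-1, -1, -1)]
print(threshold_pruning(cubes, positives))  # [(1, None, None), (1, 1, None)]
```

In this toy example minimal pruning keeps every pattern covered with two cubes, while threshold pruning drops the cube that covers a single pattern, which is the kind of cube most likely to reflect noise.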
Tests and Results
• Monk's Problems
• Wisconsin Breast Cancer Database
Conclusions
• In the reconstruction of Boolean functions, HC achieved a simplified and-or expression, partially recovering from the effect of noise that may affect the acquisition of the input-output pairs.
• HC is capable of building the and-or expression of any Boolean function.
• As an important byproduct, HC identifies inputs that do not influence the final output, thus automatically reducing the complexity of the given classification problem.
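As an illustration of the last two points, the sketch below turns a set of cubes into an and-or expression and lists the inputs that appear in no cube. It assumes a cube encoding that is not from the paper (tuples with `None` for unconstrained inputs), that +1 stands for "true", and that the cubes describe the positive class only; the function names are hypothetical.

```python
def and_or_expression(cubes):
    """Render cubes as an OR of AND terms; +1 is read as 'true' (assumption)."""
    terms = []
    for cube in cubes:
        literals = [
            f"x{i + 1}" if v == 1 else f"not x{i + 1}"
            for i, v in enumerate(cube)
            if v is not None
        ]
        terms.append("(" + " and ".join(literals) + ")" if literals else "true")
    return " or ".join(terms) if terms else "false"

def irrelevant_inputs(cubes, n):
    """1-based indices of inputs that no cube constrains, so that changing
    them cannot change the output of the expression."""
    used = {i for cube in cubes for i, v in enumerate(cube) if v is not None}
    return [i + 1 for i in range(n) if i not in used]

# Hypothetical result of clustering and pruning on a problem with n = 4 inputs.
cubes = [(1, None, -1, None), (None, 1, None, None)]

print(and_or_expression(cubes))     # (x1 and not x3) or (x2)
print(irrelevant_inputs(cubes, 4))  # [4]
```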