Combining Multiple Modes of Information using Unsupervised Neural Classifiers http://www.computing.surrey.ac.uk/ncg/ Neural Computing Group, Department of Computing, School of Electronics and Physical Sciences, University of Surrey Khurshid Ahmad, Bogdan Vrusias, Matthew Casey, Panagiotis Saragiotis
Content • Report on preliminary experiments to: • Attempt to improve classification through combining modalities of information • Use a modular co-operative neural network system combining unsupervised learning techniques • Tested using: • Scene-of-crime images and collateral text • Number magnitude and articulation
Background • Consider how we may improve classification through combination: • Combining like classifiers (e.g. ensemble systems) • Combining expert classifiers (e.g. modular systems) • Concentrate on a modular approach to combining modalities of information • For example, Kittler et al (1998): • Personal identity verification using frontal face, face profile and voice inputs
Multi-net Systems • The concept of combining neural network systems has been discussed for a number of years • Both ensemble and modular systems, with ensembles more prevalent • The term 'multi-net systems' has been promoted by Sharkey (1999, 2002), who has recently advocated the use of modular systems • For example, the mixture-of-experts architecture of Jacobs et al. (1991)
Multi-net Systems • Neural network techniques for classification tend to subscribe to the supervised learning paradigm • Ensemble methods • Mixture-of-experts • Exceptions include Lawrence et al (1997) and Ahmad et al (2002) • Unsupervised techniques give rise to problems of interpretation
Self-organised Combinations • Our approach is based upon the combination of different Hebbian-like learning systems • Hebb’s neurophysiological postulate (1949) • Strength of connection is increased when both sides of the connection are active
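A minimal sketch of this postulate, assuming a simple rate-based Hebbian rule with learning rate eta and weight normalisation (the slides state that the Hebbian connection weights are normalised; the exact update used in the system is not given, so this is illustrative only):

```python
import numpy as np

def hebbian_update(w, pre, post, eta=0.1):
    """Strengthen connections where pre- and post-synaptic activity co-occur:
    the increment is the outer product of the two activity vectors."""
    w = w + eta * np.outer(post, pre)
    # Normalise each post-synaptic neuron's incoming weights to keep them bounded
    norms = np.linalg.norm(w, axis=1, keepdims=True)
    return w / np.maximum(norms, 1e-12)
```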
Self-organised Combinations • Willshaw & von der Malsburg (1976) • Used Hebbian learning to associate patterns of activity in a 2-d pre-synaptic (input) layer and a 2-d post-synaptic (output) layer • Pre-synaptic neurons become associated with post-synaptic neurons • Kohonen (1997) extended this in his Self-organising Map (SOM) • Statistical approximation of the input space • Topological map showing relatedness of input patterns • Clusters used to show classes
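A minimal sketch of a single SOM training step (illustrative helper; it assumes a 2-d grid, Euclidean winner selection and a Gaussian neighbourhood, as described on the training slides later):

```python
import numpy as np

def som_step(weights, grid, x, lr, radius):
    """One SOM update: find the best-matching unit (winner) for input x,
    then pull the weights of nearby units towards x.
    weights: (n_units, dim) codebook vectors; grid: (n_units, 2) map coordinates."""
    bmu = int(np.argmin(np.linalg.norm(weights - x, axis=1)))
    dist_sq = np.sum((grid - grid[bmu]) ** 2, axis=1)
    h = np.exp(-dist_sq / (2.0 * radius ** 2))   # Gaussian neighbourhood function
    weights = weights + lr * h[:, None] * (x - weights)
    return bmu, weights
```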
Self-organised Combinations • Our architecture builds further on this using the multi-net paradigm • Can be compared to Hebb’s superordinate combination of cell assemblies • Two SOMs linked by Hebbian connections • One SOM learns to classify a primary modality of information • One SOM learns to classify a collateral modality of information • Hebbian connections associate patterns of activity in each SOM
Self-organised Combinations • [Architecture diagram: Primary Vector → Primary SOM ↔ Bi-directional Hebbian Network ↔ Collateral SOM ← Collateral Vector] • SOMs and Hebbian connections trained synchronously
Self-organised Combinations • Hebbian connections associate neighbourhoods of activity • Not just a one-to-one linear association • Each SOM’s output is formed by a pattern of activity centred on the winning neuron for the primary and collateral input • Training complete when both SOM classifiers have learned to classify their respective inputs
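An illustrative reconstruction of the synchronous training loop (not the authors' code; it reuses the hypothetical som_step and hebbian_update helpers sketched above and assumes the Hebbian layer associates the Gaussian activity patterns centred on each map's winner):

```python
import numpy as np

def train_pair(w_img, g_img, w_txt, g_txt, hebb, pairs, lr, radius, epochs=1000):
    """Synchronously train a primary (image) SOM, a collateral (text) SOM,
    and the Hebbian weights associating their activity patterns.
    In practice lr and radius would also decay over the epochs."""
    for _ in range(epochs):
        for x_img, x_txt in pairs:
            bmu_i, w_img = som_step(w_img, g_img, x_img, lr, radius)
            bmu_t, w_txt = som_step(w_txt, g_txt, x_txt, lr, radius)
            # Activity pattern centred on each winning neuron (not a one-to-one link)
            a_img = np.exp(-np.sum((g_img - g_img[bmu_i]) ** 2, axis=1) / (2 * radius ** 2))
            a_txt = np.exp(-np.sum((g_txt - g_txt[bmu_t]) ** 2, axis=1) / (2 * radius ** 2))
            hebb = hebbian_update(hebb, pre=a_img, post=a_txt, eta=lr)
    return w_img, w_txt, hebb
```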
Classifying Images and Text • Example training pairs (primary images not reproduced here): • Class: Body; collateral text: "Full length shot of body" • Class: Single objects (close-up); collateral text: "Nine millimetre browning high power self-loading pistol"
Classifying Images and Text • Classify images based upon images and texts • Primary modality of information: • 66 images from the scene-of-crime domain • 112-d vector based upon colour, edges and texture • Collateral modality of information: • 66 texts describing image content • 50-d binary vector derived from term-frequency analysis • 8 expert-defined classes • 58 vector pairs used for training, 8 for testing
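The slide does not give the term-selection details, so the following is only a sketch of how a 50-d binary term vector could be built, assuming a fixed vocabulary of 50 salient domain terms and simple term presence:

```python
import numpy as np

def text_to_binary_vector(text, vocabulary):
    """Binary term vector: component i is 1 if vocabulary term i occurs in the text."""
    tokens = set(text.lower().split())
    return np.array([1.0 if term in tokens else 0.0 for term in vocabulary])

# Hypothetical usage with an assumed 50-term scene-of-crime vocabulary:
# vocabulary = ["pistol", "body", "blood", ...]  # 50 terms in total
# v = text_to_binary_vector("Nine millimetre browning high power self-loading pistol", vocabulary)
```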
Training • Image SOM: 15 by 15 neurons • Text SOM: 15 by 15 neurons • Initial random weights • Gaussian neighbourhood function with initial radius 8 neurons, reducing to 1 neuron • Exponentially decreasing learning rate, initially 0.9, reducing to 0.1 • Hebbian connection weights normalised • Trained for 1000 epochs
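The exact decay functions are not stated on the slide, so the schedule below assumes exponential interpolation between the quoted start and end values over the 1000 training epochs:

```python
def exp_schedule(start, end, epoch, epochs=1000):
    """Exponentially interpolate from `start` at epoch 0 to `end` at the final epoch."""
    return start * (end / start) ** (epoch / (epochs - 1))

# Image/text experiment: learning rate 0.9 -> 0.1, neighbourhood radius 8 -> 1
learning_rates = [exp_schedule(0.9, 0.1, e) for e in range(1000)]
radii          = [exp_schedule(8.0, 1.0, e) for e in range(1000)]
```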
Testing • Tested with 8 image and text vectors • Successful classification if test vector’s winner corresponds with identified cluster for class • Image SOM: • Correctly classified 4 images • Text SOM: • Correctly classified 5 texts
Testing • For misclassified images • Text classification was determined • Translated into image classification via Hebbian activation • Similarly for misclassified texts • Image SOM: • Further 3 images classified out of 4 (total 7 out of 8) • Text SOM: • Further 2 texts classified out of 3 (total 7 out of 8)
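A sketch of this cross-modal recovery step (illustrative names; it assumes the Hebbian matrix was trained with image activity as the pre-synaptic side and text activity as the post-synaptic side, as in the training sketch above, so the transpose projects text activity onto the image map):

```python
import numpy as np

def classify_via_collateral(x_txt, w_txt, g_txt, hebb, radius=1.0):
    """Recover an image-map classification from the collateral text:
    find the text SOM winner, form its Gaussian activity pattern, project it
    through the Hebbian weights and take the most active image-map neuron."""
    bmu_t = int(np.argmin(np.linalg.norm(w_txt - x_txt, axis=1)))
    a_txt = np.exp(-np.sum((g_txt - g_txt[bmu_t]) ** 2, axis=1) / (2 * radius ** 2))
    a_img = hebb.T @ a_txt            # (n_img, n_txt) x (n_txt,) -> image-map activity
    return int(np.argmax(a_img))      # index of the recovered image-map winner
```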
Comparison • Contrast with single-modality classification in the image or text SOM alone • Compared with a single SOM classifier • 15 by 15 neurons • Trained on concatenated image and text vectors (162-d vectors) • 3 out of 8 test vectors correctly classified
Classifying Number • Classify numbers based upon (normalised) image or articulation? • Primary modality of information: • Magnitude representation of the numbers 1 to 22 • 66-d binary vector with 3 bits per magnitude • Collateral modality of information: • Articulation representation of the numbers 1 to 22 • 16-d vector representing phonemes • 22 different numbers to classify • 16 vector pairs used for training, 6 for testing
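The slide does not spell out the magnitude encoding; one plausible reading of "66-d binary vector with 3 bits per magnitude" is a banded code in which each of the 22 numbers switches on its own 3-bit slot, which the hypothetical helper below assumes:

```python
import numpy as np

def encode_magnitude(n, numbers=22, bits_per_number=3):
    """Hypothetical banded encoding: number n (1..22) activates the 3 bits
    of its own slot in a 66-d binary vector."""
    v = np.zeros(numbers * bits_per_number)
    v[(n - 1) * bits_per_number : n * bits_per_number] = 1.0
    return v
```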
Training • Magnitude SOM: 66 by 1 neurons • Articulation SOM: 16 by 16 neurons • Initial random weights • Gaussian neighbourhood function with initial radius 33 (primary) and 8 (collateral) neurons, reducing to 1 neuron • Exponentially decreasing learning rate, initially 0.5 • Hebbian connection weights normalised • Trained for 1000 epochs
Testing • Tested with 6 magnitude and articulation vectors • Successful classification if test vector’s winner corresponds with identified cluster for class • Magnitude SOM: • Correctly classified 6 magnitudes • Magnitudes arranged in a ‘number line’ • Articulation SOM: • Similar phonetic responses, but essentially misclassified all 6 articulations
Testing • For misclassified articulation vectors • Magnitude classification was determined • Translated into articulation classification via Hebbian activation • Articulation SOM: • 3 articulation vectors classified out of 6 • Remaining 3 demonstrate that Hebbian association alone is not sufficient to improve classification
Comparison • Contrast with single-modality classification in the magnitude or articulation SOM alone • Compared with a single SOM classifier • 16 by 16 neurons • Trained on concatenated magnitude and articulation vectors (82-d vectors) • Misclassified all 6 articulation vectors • SOM shows test numbers are similar in 'sound' to numbers in the training set • The combined SOM does not form a 'number line' and so cannot capitalise upon it
Summary • Preliminary results show that: • Modular co-operative multi-net system using unsupervised learning techniques can improve classification with multiple modalities • Hebb’s superordinate combination of cell assemblies? • Future work: • Evaluate against larger sets of data • Further understanding of clustering and classification in SOMs • Further explore linkage of neighbourhoods, more than just a one-to-one mapping, and theory underlying model
Acknowledgements • Supported by the EPSRC Scene of Crime Information System project (Grant No. GR/M89041) • University of Sheffield • University of Surrey • Five UK police forces • Images supplied by the UK Police Training College at Hendon, with text transcribed by Chris Handy
References
Ahmad, K., Casey, M.C. & Bale, T. (2002). Connectionist Simulation of Quantification Skills. Connection Science, vol. 14(3), pp. 165-201.
Hebb, D.O. (1949). The Organization of Behavior: A Neuropsychological Theory. New York: John Wiley & Sons.
Jacobs, R.A., Jordan, M.I. & Barto, A.G. (1991). Task Decomposition through Competition in a Modular Connectionist Architecture: The What and Where Vision Tasks. Cognitive Science, vol. 15, pp. 219-250.
Kittler, J., Hatef, M., Duin, R.P.W. & Matas, J. (1998). On Combining Classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20(3), pp. 226-239.
Kohonen, T. (1997). Self-Organizing Maps, 2nd Ed. Berlin, Heidelberg, New York: Springer-Verlag.
Lawrence, S., Giles, C.L., Ah Chung Tsoi & Back, A.D. (1997). Face Recognition: A Convolutional Neural Network Approach. IEEE Transactions on Neural Networks, vol. 8(1), pp. 98-113.
Sharkey, A.J.C. (1999). Combining Artificial Neural Nets: Ensemble and Modular Multi-Net Systems. Berlin, Heidelberg, New York: Springer-Verlag.
Sharkey, A.J.C. (2002). Types of Multinet System. In Roli, F. & Kittler, J. (Eds.), Proceedings of the Third International Workshop on Multiple Classifier Systems (MCS 2002), pp. 108-117. Berlin, Heidelberg, New York: Springer-Verlag.
Willshaw, D.J. & von der Malsburg, C. (1976). How Patterned Neural Connections can be set up by Self-Organization. Proceedings of the Royal Society, Series B, vol. 194, pp. 431-445.
Multi-net Systems • [Figure: Sharkey (2002), Types of Multi-net System]