AI: Neural Networks Lecture 4
Tony Allen
School of Computing & Informatics
Nottingham Trent University
MLP speech-enabled systems applications • M. Nakamura, K. Maruyama, T. Kawabata & K. Shikano; “Neural Network Approach to Word Category Prediction for English Texts”; Proceedings of the International Conference on Computational Linguistics (COLING), Helsinki, 1990, pp. 213–218. • H. Schmid; “Part-of-Speech Tagging with Neural Networks”; Proceedings of the International Conference on Computational Linguistics (COLING), 1994, pp. 172–176. • N. Poh & J. Korczak; “Hybrid Biometric Person Authentication Using Face and Voice Features”; Proceedings of the 3rd International Conference on Audio- and Video-Based Biometric Person Authentication (AVBPA 2001), Sweden, June 2001, pp. 348–353.
Part-of-Speech Tag Prediction • In the bigram neural network predictor (NETgram), the input vector is a 1-of-89-bit encoding of the POS tag of the previous word; the output vector is a 1-of-89-bit encoding of the predicted POS tag of the next word.
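A minimal sketch of this bigram prediction step, keeping only the 1-of-89 tag encoding from the slide; the hidden-layer size, the random weights and the softmax output are illustrative assumptions, not the trained NETgram itself.

```python
import numpy as np

NUM_TAGS = 89  # 1-of-89-bit POS tag encoding, as on the slide

def one_hot(tag_index, size=NUM_TAGS):
    """Encode a POS tag index as a 1-of-N bit vector."""
    v = np.zeros(size)
    v[tag_index] = 1.0
    return v

def predict_next_tag_probs(prev_tag_index, W_in, W_out):
    """Forward pass of a small MLP bigram predictor:
    previous tag (1-of-89) -> hidden layer -> distribution over next tags."""
    x = one_hot(prev_tag_index)
    h = np.tanh(W_in @ x)            # hidden layer (size is an assumption)
    logits = W_out @ h
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()           # softmax over the 89 possible next tags

# Example with random, untrained weights; hidden size 16 is purely illustrative.
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(16, NUM_TAGS))
W_out = rng.normal(scale=0.1, size=(NUM_TAGS, 16))
probs = predict_next_tag_probs(prev_tag_index=5, W_in=W_in, W_out=W_out)
print(probs.argmax(), probs.max())
```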
Part-of-Speech Tag Prediction: Results • Prediction accuracy increases as more of the top-ranked output neurons are included in the output classification, i.e. when a prediction counts as correct if the true tag appears among the top-N output activations.
Part-of-Speech Tag Prediction: Application • The neural network predictor was used to improve speech recognition results by 6%. • The HMM recogniser produces 10 potential word/tag candidates for each recognition event; the NETgram predictor output is used to select the best word/tag candidate.
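A sketch of how the predictor output might be used to re-rank the recogniser's candidate list. The combination rule shown here (scaling each HMM candidate score by the predicted probability of its tag) is an assumption for illustration; the slide does not specify the exact selection rule.

```python
import numpy as np

def select_best_candidate(candidates, tag_probs):
    """Pick the best word/tag candidate from an HMM N-best list.

    candidates: list of (word, tag_index, hmm_score) from the recogniser
    tag_probs:  length-89 array of next-tag probabilities from the predictor
    Combination rule (an assumption): rescore = hmm_score * predicted tag probability.
    """
    return max(candidates, key=lambda c: c[2] * tag_probs[c[1]])

# Placeholder predictor output and a hypothetical (shortened) 10-best list.
tag_probs = np.full(89, 1.0 / 89)
tag_probs[21] = 0.2                 # pretend the predictor strongly favours tag 21
n_best = [("record", 21, 0.31), ("record", 34, 0.29), ("wreck", 12, 0.05)]
print(select_best_candidate(n_best, tag_probs))
```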
Part-of-Speech Tag Disambiguation • Each output node corresponds to one of the tags in the tagset. • The input vector comprises the POS probabilities of the current word and the 2 following words, plus the disambiguated tags of the 3 preceding words. • The POS probabilities are obtained from a look-up table.
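A minimal sketch of how such an input vector could be assembled. The tagset size, the contents of the look-up table and the one-hot encoding of the preceding tags are placeholder assumptions; only the overall layout (current word + 2 future words + 3 preceding tags) comes from the slide.

```python
import numpy as np

N_TAGS = 45  # tagset size (placeholder value; one output node per tag)

# Hypothetical look-up table: word -> prior POS probability vector (length N_TAGS)
lookup = {
    "the":  np.eye(N_TAGS)[0],                                   # e.g. determiner
    "runs": 0.5 * np.eye(N_TAGS)[1] + 0.5 * np.eye(N_TAGS)[2],   # noun or verb
}
uniform = np.full(N_TAGS, 1.0 / N_TAGS)   # fallback for unknown words

def one_hot(tag_index):
    v = np.zeros(N_TAGS)
    v[tag_index] = 1.0
    return v

def build_input(current_word, future_words, previous_tags):
    """Input vector = POS probabilities of the current word and 2 future words
    + one-hot encodings of the disambiguated tags of the 3 preceding words."""
    parts = [lookup.get(current_word, uniform)]
    parts += [lookup.get(w, uniform) for w in future_words[:2]]
    parts += [one_hot(t) for t in previous_tags[-3:]]
    return np.concatenate(parts)          # length 6 * N_TAGS

x = build_input("runs", ["fast", "today"], [0, 3, 1])
print(x.shape)   # (270,) with N_TAGS = 45
```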
Part-of-Speech Tag Disambiguation: Results • The network was trained on a 2-million-word subpart of the Penn Treebank corpus over 4 million epochs. • The network was tested on a 100,000-word subpart that was not part of the training set.
Biometric user authentication • 10 moments are extracted from HSI colour eye images; the eyes are automatically located in the face images using histogram analysis, round-mask convolution and a peak-searching algorithm. • 64 Morlet wavelet coefficients are extracted from 3 seconds' worth of speech input.
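A sketch of one way to obtain 10 moments from a located eye region: raw geometric moments up to third order happen to give exactly 10 values. That specific choice, and the use of a single intensity channel, are assumptions; the slide does not state which moments the paper uses.

```python
import numpy as np

def raw_moments_up_to_order3(region):
    """Compute the 10 raw geometric moments m_pq (p + q <= 3) of a 2-D
    intensity image (e.g. one channel of an HSI eye region).
    Using raw moments up to third order is an assumption, chosen here
    simply because it yields exactly 10 values."""
    h, w = region.shape
    y, x = np.mgrid[0:h, 0:w]
    moments = []
    for p in range(4):
        for q in range(4 - p):
            moments.append(np.sum((x ** p) * (y ** q) * region))
    return np.array(moments)   # [m00, m01, m02, m03, m10, m11, m12, m20, m21, m30]

eye_region = np.random.default_rng(0).random((32, 48))   # placeholder eye patch
print(raw_moments_up_to_order3(eye_region).shape)        # (10,)
```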
Biometric user authentication: Results • 2 MLPs (one for face and one for voice) are trained for each of 30 people. • The outputs of each pair of MLPs are ANDed together to give the verification output for each person.
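A sketch of the AND fusion step, assuming each MLP emits a score in [0, 1] that is thresholded before the logical AND; the threshold values are placeholders, not those used in the paper.

```python
def verify(face_score, voice_score, face_threshold=0.5, voice_threshold=0.5):
    """Accept the claimed identity only if BOTH the face MLP and the voice MLP
    accept it (logical AND of the two thresholded outputs).
    Thresholds are placeholder values, not taken from the paper."""
    return (face_score >= face_threshold) and (voice_score >= voice_threshold)

# One (face, voice) MLP pair exists per enrolled person; e.g. for one person:
print(verify(face_score=0.81, voice_score=0.64))   # True  -> identity verified
print(verify(face_score=0.92, voice_score=0.30))   # False -> rejected
```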