Neural Net Algorithms for SC Vowel Recognition
Presentation for EE645 Neural Networks and Learning Algorithms, Spring 2003
Diana Stojanovic
Summary
• Neural net algorithms applied to recognition of Serbo-Croatian (SC) vowels
• Follows the Thubthong & Kijsirikul (2001) paper on Thai phoneme recognition
• Light background will be provided
Introduction
• Speech recognition has many applications (PCs, cell phones, home appliance activation à la Dilbert, etc.)
Introduction 2
• There are various algorithms for recognizing speech, some of which rely on recognizing individual phonemes (sounds)
Block diagram of speech recognition system
[Figure: block diagram with two blocks, Signal Processing and Speech Recognition]
For this project
• Signal Processing: segmentation, spectral analysis
• Speech Recognition: individual vowel recognition
Previous work
• Thubthong & Kijsirikul (2001) tested a multi-class Support Vector Machine (SVM) vs. a Multilayer Perceptron (MLP) for recognition of Thai vowels and tones
• They claim the SVM is superior, although the recognition rates differ by only 2-3% for systems of comparable complexity
About speech sounds
• A speech sound is an acoustic wave
• The speaker's vocal tract shapes the spectrum of each sound
• The spectrum depends on the speaker and on the properties of the particular sound (for instance /u/), so recognition in the spectral domain is possible
Vowel Formants
• Vowels can be recognized in the spectral domain by characteristic "lines" that correspond to their properties (backness, height, lip rounding, etc.)
• These "lines" (formants) occur at the resonant frequencies of the vocal tract
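To make "resonant frequencies of the vocal tract" concrete, here is a minimal sketch using the standard textbook idealization of the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances are F_k = (2k - 1) c / (4 L). The tube model, the 17.5 cm length, and the 350 m/s speed of sound are assumptions made for illustration; they are not taken from the slides, and real vowels shift these values by changing the tract shape.

```python
def tube_resonances(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength formula F_k = (2k - 1) * c / (4 * L).
    """
    return [
        (2 * k - 1) * speed_of_sound / (4.0 * length_m)
        for k in range(1, n_formants + 1)
    ]


# 17.5 cm is an assumed, textbook-style vocal tract length (not project data).
for k, freq in enumerate(tube_resonances(0.175), start=1):
    print(f"F{k} is roughly {freq:.0f} Hz")
```

Under these assumptions the first three resonances come out near 500, 1500, and 2500 Hz, which is the usual reference point for a neutral, schwa-like vowel.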
Data Used in the Project
Data collection and properties
• Type of speech: speaker dependent, accented syllables
• 480 isolated words were recorded and digitized at 11 kHz
• Vowels in accented position were segmented manually
• Vowel formants were measured with PCQuirer
Sound Features Measured
• Only the first two formants (F1, F2) were used for training the nets, in order to reduce complexity
• Given the properties of the SC sounds, performance should not suffer from this low dimensionality
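As a rough illustration of why a two-dimensional (F1, F2) feature vector can be enough, the sketch below classifies a vowel by its nearest class centroid in the F1/F2 plane. This is a deliberately simple stand-in, not one of the nets used in the project, and every formant value in it is invented for illustration; none of them are measurements from the recorded data.

```python
import math

# Hypothetical (F1, F2) values in Hz, invented for illustration only.
# They are NOT the formants measured in this project.
TRAIN = {
    "i": [(280, 2250), (310, 2300), (300, 2200)],
    "e": [(480, 1900), (500, 1850), (460, 1950)],
    "a": [(750, 1300), (720, 1250), (780, 1350)],
    "o": [(500, 900), (520, 950), (480, 880)],
    "u": [(320, 750), (300, 700), (340, 800)],
}


def centroids(train):
    """Return the mean (F1, F2) point of each vowel class."""
    result = {}
    for vowel, points in train.items():
        n = len(points)
        mean_f1 = sum(p[0] for p in points) / n
        mean_f2 = sum(p[1] for p in points) / n
        result[vowel] = (mean_f1, mean_f2)
    return result


def classify(f1, f2, cents):
    """Pick the vowel whose centroid is closest in Euclidean distance."""
    return min(
        cents,
        key=lambda v: math.hypot(f1 - cents[v][0], f2 - cents[v][1]),
    )


CENTS = centroids(TRAIN)
print(classify(290, 2280, CENTS))  # a token near the assumed /i/ cluster
print(classify(510, 920, CENTS))   # a token near the assumed /o/ cluster
```

With well-separated clusters like these made-up ones, two formants are sufficient; whether that holds for real recordings depends on how much the measured vowel clusters overlap.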
Perceptron, Backprop and Support Vector Machine
• We learned about these throughout the semester. For details, please refer to the paper
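As a reminder of the simplest of the three, here is a minimal sketch of the classic perceptron learning rule applied to a two-class problem on (F1, F2) features. It is not the project's actual setup: the two-vowel task, the function names, and the formant values (given in kHz and invented for illustration) are all assumptions.

```python
def train_perceptron(samples, labels, epochs=100, lr=1.0):
    """Train a single perceptron; labels must be +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified (or on the boundary)
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:  # every training sample is classified correctly
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the decision line."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


# Hypothetical (F1, F2) values in kHz, invented for illustration only.
I_TOKENS = [(0.28, 2.25), (0.31, 2.30), (0.30, 2.20)]  # assumed /i/, label +1
U_TOKENS = [(0.32, 0.75), (0.30, 0.70), (0.34, 0.80)]  # assumed /u/, label -1

X = I_TOKENS + U_TOKENS
Y = [1] * len(I_TOKENS) + [-1] * len(U_TOKENS)

W, B = train_perceptron(X, Y)
print(predict(W, B, (0.29, 2.28)))  # unseen token near the assumed /i/ cluster
print(predict(W, B, (0.31, 0.72)))  # unseen token near the assumed /u/ cluster
```

Because the two made-up clusters are linearly separable, the rule converges; a single perceptron cannot handle classes that are not linearly separable, which is one motivation for the MLP and SVM.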
What is next?
• First, finish the SVM results
• Examine fast, connected speech
• Speaker-independent recognition