140 likes | 349 Views
Acoustic Modeling. Jacob Zurasky ECE5526 – Spring 2011. Project Goals. Generate an Acoustic Model for digit recognition Create a MATLAB tool to transcribe training data to be used in model generation
E N D
Acoustic Modeling Jacob Zurasky ECE5526 – Spring 2011
Project Goals • Generate an Acoustic Model for digit recognition • Create a MATLAB tool to transcribe training data to be used in model generation • Create a MATLAB tool to generate an Acoustic Model based on the training data and transcriptions
What is an Acoustic Model? • A statistical representation of sounds that make up a word. • Created from feature vectors that are extracted from the training data
Feature Vectors • For each window of speech (~10ms), compress speech data to 39 MFCC’s • Impossible to directly compare speech samples to identify characteristics
MFCC’s • Mel Frequency Cepstral Coefficients • Models the sound samples in a way that is similar to the way the human ear perceives sound • Utterance is broken into windows, ~10mS • Apply a hamming window to the section
MFCC’s • FFT to obtain the log frequency spectrum • Mel Filters applied to log FFT output • DCT applied to Mel Filter output • Result is 12 MFCC’s • Compute energy of the signal
Training Data – ‘mark_words.m’ • Obtain data containing the words you wish to build an acoustic model for • Create a transcription file for each utterance
Training Data – ‘model.m’ • For each part of a word to be analyzed, collect all feature vectors associated • Repeat for each utterance • Calculate the means and variances for each dimension of the feature vector for a given part of the word
Training Data – ‘evaluate.m’ • Implements classification of sound • Compares current feature vector to the acoustic model • Determines if probability is high enough to transition states
Issues Encountered • Small training data set • Some digits need more than two sections • Time constraints of making all transcriptions • 1,3,5,7 models fairly reliable • 2,4,6 models need refinement
Future Goals • Implement true HMM system for classification • Add automated training and re-evaluation of the HMM • Apply techniques to analysis of breathing sounds for sleep disorder recognition