Keyword Spotting Dynamic Time Warping

Keyword Spotting: Dynamic Time Warping • Ali Akbar Jabini • Alexandre Mercier-Dalphond • Spring 2006

Introduction • Speech recognition: • Computers can interpret speech • Need an input device to digitize sound • Microphone • People can speak faster than they can type • Commercial systems available since the 1990s • People prefer physical interactions • Keyboard/mouse, on/off switch • Low accuracy for large vocabularies with noise (50%)

Introduction • Speech recognition is increasingly used with smaller vocabulary banks • Credit card systems • Simple switching commands • Directory assistance • Cheap to implement • High accuracy • Can verify their interpretation • Idea: speech recognition for household appliances

OUTLINE • Area of investigation • Concrete task/Goal • Schematic • Feature extraction • DTW • Training • Evaluation metrics • Conclusion

Area of Investigation • Keyword spotting: • Subfield of speech recognition • Grammar constrained • Keyword spotting in isolated word recognition • Keyword utterances • Keywords separated by silence • Main technique is DTW

Concrete task/Goal • Goal: develop a robust, speaker-independent keyword spotting scheme to operate household appliances • Concrete tasks • Digitize the sound inputs • Implementation in MATLAB • Train the model with the grammar • Analyze the performance of our scheme

Schematic • Microphone -> A/D -> Feature extraction -> DTW -> Output • Grammar

Feature extraction • Pre-emphasis • Flattening the spectrum of the signal • Blocking into frames • Length of the Fourier transform • Windowing • Sample window (maybe Hamming) • Mel-frequency cepstral coefficients • More reliable than LPC coefficients • These will be input to the DTW algorithm
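The first three steps on this slide (pre-emphasis, blocking into frames, windowing) can be sketched in plain Python. This is only an illustration under stated assumptions: the pre-emphasis coefficient 0.97, the frame length, and the hop size are common placeholder values that do not come from the slides, and the function names are hypothetical. The mel-frequency cepstral step is left out to keep the sketch short.

```python
import math


def pre_emphasize(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is an assumed, commonly used value; it flattens the
    spectrum by boosting high frequencies.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def block_into_frames(signal, frame_len, hop):
    """Split the signal into overlapping frames; a trailing partial frame is dropped."""
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop
    return frames


def hamming(n):
    """Hamming window of length n: 0.54 - 0.46 * cos(2*pi*i / (n - 1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def window_frames(frames):
    """Multiply every frame, sample by sample, by a Hamming window."""
    if not frames:
        return []
    w = hamming(len(frames[0]))
    return [[s * c for s, c in zip(frame, w)] for frame in frames]


# Usage on a toy signal (not real speech).
toy_signal = [float(i % 4) for i in range(16)]
frames = window_frames(block_into_frames(pre_emphasize(toy_signal), frame_len=8, hop=4))
```

Each windowed frame would then go through a Fourier transform and mel filterbank to produce the cepstral coefficients that feed DTW.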

DTW • Idea: find the smallest distance between an input and the training bank • Cepstrum features • Dynamic programming: the time axis is not linear, to account for timing differences between utterances • t0 -> t0+5 • t1 -> t1-2
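The dynamic-programming idea can be made concrete with a minimal DTW sketch. It assumes each utterance is a list of feature vectors (for example, cepstral coefficients per frame), a Euclidean local distance, and the basic step pattern (match, insertion, deletion) with no slope constraint or path normalization. Those choices are assumptions for illustration, not details given on the slide.

```python
import math


def euclidean(a, b):
    """Local distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, dist=euclidean):
    """Accumulated cost of the best non-linear time alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # seq_a advances, seq_b frame is repeated
                cost[i][j - 1],      # seq_b advances, seq_a frame is repeated
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# A stretched copy of the same pattern aligns at zero cost.
short = [[0.0], [1.0], [2.0]]
stretched = [[0.0], [1.0], [1.0], [2.0]]
print(dtw_distance(short, stretched))  # 0.0
```

Because frames may be repeated on either side, a slow "Onnn" and a quick "On" can still match closely, which is the point of warping the time axis.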

DTW

Training • Need to create our own grammar • On: Onnn, Honnn, open, opeeenn • Off: Hooofff, Hoff, offfff, close • As many potential utterances as possible • Use this data with DTW
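One way to use such a training bank with DTW is nearest-template matching: compare the input against every stored utterance and return the label of the closest one. The sketch below assumes one scalar feature per frame to stay short, and the `reject_above` threshold for non-keywords is a hypothetical addition. The template values are toy numbers, not recorded speech.

```python
def dtw(a, b):
    """DTW cost between two scalar sequences, keeping only one previous row."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            d = abs(x - y)
            curr.append(d + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[-1]


def classify(utterance, bank, reject_above=None):
    """Return (label, distance) of the closest template in the bank.

    bank maps a label such as "on" or "off" to a list of template sequences.
    If reject_above is set and the best distance exceeds it, the label is
    None, meaning no keyword was spotted (an assumed rejection rule).
    """
    best_label, best_dist = None, float("inf")
    for label, templates in bank.items():
        for template in templates:
            d = dtw(utterance, template)
            if d < best_dist:
                best_label, best_dist = label, d
    if reject_above is not None and best_dist > reject_above:
        return None, best_dist
    return best_label, best_dist


# Toy bank: several variants per keyword, as the slide suggests.
bank = {
    "on": [[0, 1, 1, 0], [0, 1, 1, 1, 0]],
    "off": [[0, -1, -1, 0], [0, -1, -1, -1, 0]],
}
label, distance = classify([0, 1, 1, 1, 1, 0], bank)
```

Adding more variants per keyword increases the chance that some template lies close to a new speaker's utterance, at the cost of more DTW comparisons per input.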

Evaluation metrics • Accuracy • High noise • Low noise • Independent speaker • Training data speaker • Would like to obtain 80% or more
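Accuracy per test condition could be tallied as in the sketch below. The condition names follow the slide; the trial records are placeholder values invented for illustration and are not measurements, and the function names are hypothetical. The 0.80 target comes from the slide.

```python
from collections import defaultdict


def accuracy_by_condition(results):
    """Fraction of correctly recognised keywords for each test condition.

    results is an iterable of (condition, expected_label, predicted_label).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for condition, expected, predicted in results:
        total[condition] += 1
        if expected == predicted:
            correct[condition] += 1
    return {c: correct[c] / total[c] for c in total}


def meets_target(accuracies, target=0.80):
    """Flag, per condition, whether accuracy reaches the target."""
    return {c: acc >= target for c, acc in accuracies.items()}


# Placeholder trials only; real values would come from running the recogniser.
trials = [
    ("low noise", "on", "on"),
    ("low noise", "off", "off"),
    ("high noise", "on", "off"),
    ("high noise", "off", "off"),
]
acc = accuracy_by_condition(trials)
flags = meets_target(acc)
```

The same tally works for the speaker split by using "independent speaker" and "training data speaker" as condition names.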

Conclusion • Early stage • No code implemented yet • Many challenges ahead • Our methodology may change slightly • There is a big potential market for such a technique -> influence on everyday life.
