Midterm Review Spoken Language Processing Prof. Andrew Rosenberg
Lecture 1 - Overview • Applications • speech recognition • speech synthesis • other applications: indexing, language id, etc. • Information in speech • words • speaker identity • speaker state • discourse acts
Lecture 2 – From Sounds to Language • Differences between orthography and sounds • Phonetic symbol sets • e.g., IPA, ARPAbet • Vocal organs • articulators • Classes of sounds • Coarticulation
Lecture 3 – Spoken Dialog Systems • Maxims of Conversational Implicature • Dialog System Architecture • Speech Recognition • Dialog Management • Response Generation • Speech Synthesis • Dialog Strategies
Lecture 4 – Acoustics of Speech • Phone Recognition • Prosody • Speech Waveforms • Analog to Digital Conversion • Nyquist Rate • Pitch Doubling and Halving
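As a quick reminder of what the Nyquist rate implies, here is a minimal sketch, assuming a pure tone and ideal sampling. The function names `nyquist_frequency` and `aliased_frequency` are illustrative, not from the lecture.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency representable without aliasing at this sample rate."""
    return sample_rate_hz / 2.0


def aliased_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling (assumes ideal sampling).

    Tones above the Nyquist frequency fold back into the range
    [0, sample_rate_hz / 2].
    """
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    # A 5 kHz tone sampled at 8 kHz lies above the 4 kHz Nyquist frequency.
    print(nyquist_frequency(8000))
    print(aliased_frequency(5000, 8000))
```

A tone below the Nyquist frequency is returned unchanged, which is why telephone-band speech sampled at 8 kHz keeps content only up to 4 kHz.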
Lecture 5 – Speech Recognition Overview • History of Speech Recognition • Rule based recognition • Dynamic Time Warping • Statistical Modeling • What are qualities that make speech recognition difficult? • Noisy Channel Model • Training and Test Corpora • Word Error Rate
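Word Error Rate is commonly computed as the word-level edit distance (substitutions + deletions + insertions) divided by the number of reference words. The sketch below assumes whitespace tokenization and equal cost for all three error types; the function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # d[i][j] = minimum edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)

    return d[-1][-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.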
Lecture 6 – Fast Fourier Transform • Multiplying Polynomials • Divide-and-Conquer for multiplying polynomials • Relationship between multiplying polynomials and cosine transform • Complex roots of unity
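The divide-and-conquer idea can be sketched as follows: evaluate both polynomials at the complex roots of unity with a recursive FFT, multiply the values pointwise, then interpolate back. This is a minimal illustration assuming non-empty integer coefficient lists (lowest degree first); `fft` and `multiply_polynomials` are illustrative names, and production code would normally call a numerical library instead.

```python
import cmath


def fft(coeffs, invert=False):
    """Recursive radix-2 FFT; len(coeffs) must be a power of two."""
    n = len(coeffs)
    if n == 1:
        return list(coeffs)
    even = fft(coeffs[0::2], invert)  # even-indexed coefficients
    odd = fft(coeffs[1::2], invert)   # odd-indexed coefficients
    sign = -1 if invert else 1
    out = [0j] * n
    for k in range(n // 2):
        # w is the k-th power of a primitive n-th root of unity
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def multiply_polynomials(a, b):
    """Multiply two integer polynomials given as coefficient lists."""
    result_len = len(a) + len(b) - 1
    size = 1
    while size < result_len:
        size *= 2  # pad to a power of two
    fa = fft(list(a) + [0] * (size - len(a)))
    fb = fft(list(b) + [0] * (size - len(b)))
    product = fft([x * y for x, y in zip(fa, fb)], invert=True)
    # The inverse transform needs a 1/size scale; round away float error.
    return [round((v / size).real) for v in product[:result_len]]


if __name__ == "__main__":
    # (1 + 2x)(3 + 4x)
    print(multiply_polynomials([1, 2], [3, 4]))
```

The naive product takes O(n^2) coefficient multiplications, while this route takes O(n log n), which is the point of the divide-and-conquer step.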
Lecture 7 - MFCC • What is the MFCC used for? • Overlapping Windows • Mel Frequency • Spectrogram
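Two of the MFCC steps can be sketched briefly: slicing a signal into overlapping windows and warping frequency onto the mel scale. The mel formula below (2595 * log10(1 + f / 700)) is one widely used variant and is an assumption here; the lecture may have used a different one. All function names are illustrative.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mel using one common formula (an assumption, see above)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; trailing partial frames are dropped.

    Frames overlap whenever hop < frame_len.
    """
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


if __name__ == "__main__":
    print(hz_to_mel(1000.0))
    print(frame_signal(list(range(10)), frame_len=4, hop=2))
```

In a full MFCC pipeline each frame would then be windowed, transformed to a spectrum, passed through mel-spaced filters, and decorrelated; those steps are omitted here.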
Lecture 8 – Statistical Modeling • Probabilities • Bayes Rule • Bayesians vs. Frequentists • Maximum Likelihood Estimation • Multinomial Distribution • Bernoulli Distribution • Gaussian Distribution • Multidimensional Gaussian • Difference between Classification, Clustering, Regression • Black Swans and the Long Tail
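For review, Maximum Likelihood Estimation for the Bernoulli and one-dimensional Gaussian distributions reduces to simple sample statistics. A minimal sketch, with illustrative function names; note that the Gaussian MLE variance divides by N rather than N - 1.

```python
def bernoulli_mle(outcomes):
    """MLE of the success probability from a list of 0/1 outcomes."""
    return sum(outcomes) / len(outcomes)


def gaussian_mle(samples):
    """MLE of (mean, variance) for a 1-D Gaussian.

    The variance uses the biased estimator (divide by N), which is what
    maximizing the likelihood yields.
    """
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance


if __name__ == "__main__":
    print(bernoulli_mle([1, 0, 1, 1]))
    print(gaussian_mle([1.0, 2.0, 3.0]))
```

With very little data these estimates assign zero probability to unseen events, which connects to the "Black Swans and the Long Tail" bullet.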
Lecture 9 – Acoustic Modeling • What does an Acoustic Model do? • Gaussian Mixture Model • Potential Problems • Inconsistent Numbers of Gaussians • Singularities • Training Acoustic Models.
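A one-dimensional Gaussian Mixture Model density can be sketched as a weighted sum of Gaussians. The variance floor shown here is one common guard against the singularity problem (a component collapsing onto a single point); treating it as the remedy discussed in lecture is an assumption, and the names and the `var_floor` default are illustrative.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def gmm_pdf(x, weights, means, variances, var_floor=1e-3):
    """Density of a 1-D GMM at x; weights are assumed to sum to 1.

    Each variance is clamped to at least var_floor so that no component
    can shrink to an infinitely narrow spike.
    """
    return sum(
        w * gaussian_pdf(x, m, max(v, var_floor))
        for w, m, v in zip(weights, means, variances)
    )


if __name__ == "__main__":
    print(gmm_pdf(0.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]))
```

An acoustic model would evaluate a density like this (in many dimensions, over MFCC vectors) for each HMM state.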
Lecture 10 – Hidden Markov Model • The Markov Assumption • Difference between states and observations • Finite State Automata • Decoding using Viterbi • Forced Alignment • Flat Start • Silence
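Viterbi decoding finds the single most likely state sequence for a series of observations. The sketch below assumes discrete observations and dictionary-based parameters, and multiplies raw probabilities for readability; practical decoders work with log probabilities to avoid underflow. The function name and the toy parameters are illustrative.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best state path, probability of that path)."""
    # best[t][s] = probability of the best path ending in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s] = predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score * emit_p[s][observations[t]]
            back[t][s] = prev

    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


if __name__ == "__main__":
    states = ["A", "B"]
    start_p = {"A": 0.6, "B": 0.4}
    trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
    emit_p = {
        "A": {"x": 0.5, "y": 0.4, "z": 0.1},
        "B": {"x": 0.1, "y": 0.3, "z": 0.6},
    }
    print(viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p))
```

Forced alignment uses the same recursion, but with the state sequence constrained to the known transcript so that only the timing is decoded.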
Lecture 11 - Pronunciation Modeling • Dictionary • Finite State Automata • Use in speech recognition • Using morphology for pronunciation modeling • Grapheme to Phoneme Conversion • Letter to Sound rules • Machine Learning for G-to-P
Lecture 12 – Language Modeling • Using a Context Free Grammar to define a set of recognized sequences of words • Terminals, non-terminals, start symbol • N-Gram models • Mathematical underpinnings • Theoretical background • How a “word” is defined • Learning n-gram statistics • Terminology
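Learning n-gram statistics by maximum likelihood amounts to counting. A minimal bigram sketch, assuming whitespace tokenization, `<s>` and `</s>` sentence boundary markers, and no smoothing; the function names are illustrative.

```python
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left-context unigrams over a list of sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        # Every token except the final </s> serves as a left context.
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts


def bigram_prob(prev, word, context_counts, bigram_counts):
    """Unsmoothed MLE estimate: P(word | prev) = count(prev, word) / count(prev)."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]


if __name__ == "__main__":
    contexts, bigrams = train_bigram(["i am sam", "sam i am"])
    print(bigram_prob("<s>", "i", contexts, bigrams))
    print(bigram_prob("i", "am", contexts, bigrams))
```

Unseen bigrams get probability zero under this estimate, which is the motivation for the smoothing methods covered in J&M Chapter 4.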
Next Class • Midterm Exam • Reading: J&M Chapter 4