Speech recognition 2
DAY 15 – Sept 30, 2013
Brain & Language
LING 4110-4890-5110-7960
NSCI 4110-4891-6110
Harry Howard
Tulane University

Brain & Language, Harry Howard, Tulane University

Course organization
• The syllabus, these slides and my recordings are available at http://www.tulane.edu/~howard/LING4110/.
• If you want to learn more about EEG and neurolinguistics, you are welcome to participate in my lab. This is also a good way to get started on an honors thesis.
• The grades are posted to Blackboard.

Review
Pitch shows fundamental frequency (F0)
Spectrogram shows formants (F1-3)
Sound wave
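
To make the link between the sound wave and the pitch track concrete, here is a minimal sketch of one common way to estimate F0: pick the time lag at which the waveform best matches a shifted copy of itself (autocorrelation). This is not necessarily how Praat computes pitch. The signal is synthetic, and the function names `synth_voiced` and `estimate_f0`, the 8000 Hz sample rate, and the 75-400 Hz search range are all assumptions made for illustration.

```python
import math


def synth_voiced(f0, sample_rate=8000, duration=0.06, n_harmonics=5):
    """Synthetic 'voiced' signal: a sum of harmonics of f0.

    A hypothetical stand-in for a recorded vowel, not real speech.
    """
    n_samples = int(round(sample_rate * duration))
    return [
        sum(math.sin(2 * math.pi * k * f0 * t / sample_rate) / k
            for k in range(1, n_harmonics + 1))
        for t in range(n_samples)
    ]


def estimate_f0(samples, sample_rate, f0_min=75.0, f0_max=400.0):
    """Estimate F0 in Hz as sample_rate / (lag with the highest autocorrelation)."""
    lag_min = int(sample_rate / f0_max)  # shortest period considered, in samples
    lag_max = int(sample_rate / f0_min)  # longest period considered, in samples
    window = len(samples) - lag_max      # same number of products for every lag
    if window <= 0:
        raise ValueError("signal too short for the requested f0_min")

    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Correlate the signal with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag] for i in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


signal = synth_voiced(f0=125.0)
print(round(estimate_f0(signal, 8000), 1))  # should come out close to 125 Hz
```

The estimate is only as fine as one sample of lag, and a search range wide enough to contain two periods of the signal can return half the true F0 (an octave error), so the range should be chosen to fit the speaker.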

Speech recognition
Ingram §5
• use Praat in class

Vowel articulation
• Tongue height: high, (mid), low
  • put your hand under your jaw and say the vowel of:
    • mat, met, mate, mitt, meat
    • meat, mitt, mate, met, mat
• Tongue advancement: front, central, back
• Lip configuration: rounded, neutral, retracted

Vowel description

Sample vowel spectrograms
Wide-band spectrograms of the vowels of American English in a /b__d/ context. Top row, left to right: [i, ɪ, eɪ, ɛ, æ]. Bottom row, left to right: [ɑ, ɔ, o, ʊ, u].

Acoustic cues and distinctive features
• Three problems
  a) Input signal
  b) Internal representation
  c) Interface between (a) and (b)
• Lexical information retrieval
  • but we only need the phonological form of a lexical item

Why speech recognition is difficult
• The segmentation problem
• The variability problem
  • coarticulation
• The speaking environment
• Speakers’ vocal tracts
• Speech rate and style
• Rate of information transmission

Lexical retrieval
• Speech perception involves phonological parsing prior to lexical access
• It is not enough to know the lexicon beforehand.
• Phonetic forms and phonological representations
• Speech/speaker normalization
• Distinctive features and acoustic cues
• Underspecified vs. fully specified
• Discrete vs. continuous
• Hierarchical organization vs. entrainment
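
As a rough illustration of what speaker normalization can mean in practice, the sketch below applies one widely used procedure, Lobanov-style z-scoring: each speaker's formant values are re-expressed relative to that speaker's own mean and spread. This is only one of several normalization methods, and the slides do not say which, if any, is intended here. The function name `lobanov_normalize` and all formant values are made up for illustration; they are not measurements.

```python
from statistics import mean, pstdev


def lobanov_normalize(vowels):
    """Z-score one speaker's formants.

    `vowels` maps a vowel label to an (F1, F2) pair in Hz for ONE speaker.
    Returns the same labels mapped to (z_F1, z_F2).
    """
    f1_values = [f1 for f1, _ in vowels.values()]
    f2_values = [f2 for _, f2 in vowels.values()]
    f1_mean, f1_sd = mean(f1_values), pstdev(f1_values)
    f2_mean, f2_sd = mean(f2_values), pstdev(f2_values)
    return {
        label: ((f1 - f1_mean) / f1_sd, (f2 - f2_mean) / f2_sd)
        for label, (f1, f2) in vowels.items()
    }


# Hypothetical (F1, F2) values in Hz, invented for this example.
speaker_a = {"i": (300.0, 2300.0), "a": (750.0, 1200.0), "u": (320.0, 900.0)}
# A second hypothetical speaker whose formants are all 20% higher,
# a crude stand-in for a shorter vocal tract.
speaker_b = {v: (1.2 * f1, 1.2 * f2) for v, (f1, f2) in speaker_a.items()}

for label in speaker_a:
    za = lobanov_normalize(speaker_a)[label]
    zb = lobanov_normalize(speaker_b)[label]
    print(label, [round(z, 3) for z in za], [round(z, 3) for z in zb])
```

The two speakers differ in raw Hz, but after normalization each vowel lands at the same place in both speakers' vowel spaces, which is the kind of speaker-independent representation that lexical access would need.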

NEXT TIME
Finish Ingram §6.
☞ Go over questions at end of chapter.