Ways to generate computer speech
• Record a human speaking every sentence HAL will ever speak (not likely)
• Make a mathematical model of the human vocal tract (synthesis)
• Record a human speaking a lot of sentences, and come up with some way of making new sentences out of the recorded ones (concatenation)
What goes into synthesizing speech?
• Have some idea of what human speech actually looks/sounds like
• Modeling the shape of a speaker’s mouth
• Fricative noises and noises from stops
• Pitch changes
• Produce sounds that resemble speech sounds
Synthesis: Putting it all together
• Shape of mouth: 1, 2, 3, all 3 (audio clips)
• Fricative and burst noises (audio clip)
• Shape of mouth and fricative noises (audio clip)
• Shape of mouth, fricative noises, & pitch (audio clip)
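The ingredients above can be sketched in code. This is a minimal illustration that assumes a classic source-filter arrangement: a pulse train stands in for pitch, white noise stands in for fricative and burst noise, and a chain of two-pole resonators stands in for the shape of the mouth. The sample rate, formant frequencies, and bandwidths are illustrative assumptions, not values from the slides.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)

def pulse_train(pitch_hz, n_samples, sample_rate=SAMPLE_RATE):
    """Voiced source: one unit impulse per pitch period."""
    period = round(sample_rate / pitch_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def noise(n_samples, seed=0):
    """Unvoiced source: white noise, as a stand-in for fricatives and bursts."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def resonator(signal, freq_hz, bandwidth_hz, sample_rate=SAMPLE_RATE):
    """Two-pole resonator modelling a single resonance of the mouth."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius (< 1, stable)
    theta = 2 * math.pi * freq_hz / sample_rate          # pole angle
    a1, a2 = 2 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize_vowel(pitch_hz, formants, duration_s=0.25):
    """Pitch source passed through one resonator per formant, peak-normalized."""
    n = int(SAMPLE_RATE * duration_s)
    signal = pulse_train(pitch_hz, n)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]

# Rough, illustrative resonances for an "ah"-like vowel (assumed values).
AH_FORMANTS = [(700, 80), (1200, 90), (2600, 120)]

vowel = synthesize_vowel(120, AH_FORMANTS)       # mouth shape + pitch
hiss = resonator(noise(2000), 3000, 500)         # fricative-like noise band
```

Dropping resonators from `AH_FORMANTS` one at a time is a quick way to hear how much each resonance contributes, in the spirit of the "1, 2, 3, all 3" comparison.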
Speech synthesis
• (1980): The Speak & Spell toy used a synthesis process called Linear Predictive Coding (LPC).
• Basically, LPC is a way for a computer to extract the different components of a speech signal (the shape of the vocal tract, the pitch, and the noise) and re-create them using a mathematical model of the vocal tract.
• Here’s a better example of LPC (1982): (audio clip)
• LPC is still used today in GSM phone systems.
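To make the "extract, then re-create" idea concrete, here is a minimal sketch of the textbook autocorrelation method for LPC, solved with the Levinson-Durbin recursion. It is not a description of what the Speak & Spell hardware did. The test signal, the filter order, and all function names are assumptions chosen for illustration.

```python
def autocorrelation(frame, max_lag):
    """r[lag] = sum of frame[i] * frame[i + lag] over the frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Find predictor coefficients a[1..order] with s[n] ~ sum_k a[k] * s[n-k].

    Returns (coefficients, residual_energy).
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error

def all_pole_filter(excitation, coeffs):
    """Vocal-tract model: y[n] = x[n] + sum_k coeffs[k-1] * y[n-k]."""
    out = []
    for n, x in enumerate(excitation):
        y = x + sum(a * out[n - k]
                    for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        out.append(y)
    return out

# Analysis: start from a signal produced by a known (assumed) 2-pole filter.
TRUE_COEFFS = [1.5, -0.8]
impulse = [1.0] + [0.0] * 399
frame = all_pole_filter(impulse, TRUE_COEFFS)

estimated, residual = levinson_durbin(autocorrelation(frame, 2), 2)

# Synthesis: re-create the signal from the estimated model alone.
resynth = all_pole_filter(impulse, estimated)
max_error = max(abs(a - b) for a, b in zip(frame, resynth))
```

Here `estimated` should come out very close to `TRUE_COEFFS`, because the test signal really is the output of an all-pole filter. Real speech only approximately fits the model, so a practical coder works frame by frame and also transmits pitch and voicing information for the excitation.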
Text-to-Speech (TTS) systems
• Concatenative synthesis
  • Record natural speech
  • Chop speech up into units
  • Recombine units according to the phonetic transcription to be pronounced
• Steps for a TTS system:
  • Start w/ written text
  • Convert text to phonetic characters
  • Find segments of speech in database
  • Calculate intonation of sentence
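The first three steps can be sketched as a toy pipeline. The slides do not say which unit size is used, so this sketch assumes diphones (pairs of adjacent sounds). The tiny lexicon, the phone symbols, and the placeholder "recordings" are all hypothetical, and the intonation step is left out.

```python
# Hypothetical pronunciation lexicon (ARPAbet-like symbols, for illustration).
LEXICON = {
    "the": ["DH", "AH"],
    "north": ["N", "AO", "R", "TH"],
    "wind": ["W", "IH", "N", "D"],
}
UNIT_LENGTH = 4  # samples per placeholder unit (real units vary in length)

def text_to_phones(text):
    """Convert written text to phonetic symbols; "_" marks silence."""
    phones = ["_"]
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word not in LEXICON:
            raise KeyError(f"no pronunciation for {word!r}")
        phones.extend(LEXICON[word])
    phones.append("_")
    return phones

def phones_to_diphones(phones):
    """Name each unit by the pair of adjacent phones it spans."""
    return [f"{a}-{b}" for a, b in zip(phones, phones[1:])]

def build_database(recorded_sentences):
    """Chop 'recordings' into units. Samples here are silent placeholders."""
    database = {}
    for sentence in recorded_sentences:
        for unit in phones_to_diphones(text_to_phones(sentence)):
            database.setdefault(unit, [0.0] * UNIT_LENGTH)
    return database

def synthesize(text, database):
    """Look up each needed unit and concatenate the stored samples."""
    units = phones_to_diphones(text_to_phones(text))
    missing = [u for u in units if u not in database]
    if missing:
        raise LookupError(f"units not in database: {missing}")
    waveform = []
    for unit in units:
        waveform.extend(database[unit])
    return units, waveform

# Two 'recorded' sentences supply enough units for a sentence never recorded.
database = build_database(["the north wind", "wind the north"])
units, waveform = synthesize("the north", database)
```

A real system stores several candidates per unit and smooths the joins between them; the dictionary lookup here only shows how new sentences get assembled from pieces of recorded ones.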
Text-to-Speech (TTS) systems
Examples of text from The North Wind and the Sun (Aesop), circa 2005:
• Mike (AT&T)
• Crystal (AT&T)
• British English (Rhetorical Systems)
• Scottish English (Rhetorical Systems)