ENEE408G: Capstone Design Project: Multimedia Signal Processing
Design Project 1: Digital Speech Processing
ENEE408G Fall 2005 Multimedia Signal Processing
Outline of Design Project 1
• Part I: Speech Analysis
• Part II: Speech Coding: Linear Predictive Vocoder
• Part III: Speech Recognition by IBM ViaVoice
• Part IV: Speech Synthesis
• Part V: Human Computer Interface
• Part VI: Mobile Computing and Pocket PC Programming
Adjust the Microphone Device
• Use Sound Recorder (Accessories → Entertainment → Sound Recorder)
• Select Line-In 2/Mic 2 (Edit → Audio Properties → Sound Recording → Volume)
Part I. Speech Analysis (1)
• Human Vocal Apparatus
Part I. Speech Analysis (2)
• Vocal Tract Model
Part I. Speech Analysis (3)
• COLEA toolbox:
  • Time-domain waveform
  • Spectrogram
  • Pitch and formant tracking
  • LPC spectra
• Record your own voice and analyze its pitch and formants.
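To give a feel for what pitch tracking involves, here is a minimal Python sketch of autocorrelation-based pitch estimation on a single frame. It is not COLEA's implementation; the function name `estimate_pitch`, the 60 to 400 Hz search range, and the synthetic test tone are all assumptions made for illustration.

```python
import numpy as np

def estimate_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame.

    Hypothetical helper: picks the autocorrelation peak inside the
    lag range that corresponds to [fmin, fmax].
    """
    frame = frame - np.mean(frame)
    # Full autocorrelation; keep the non-negative lags only.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return fs / lag

# Synthetic "voiced" frame: a 125 Hz fundamental plus its second harmonic.
fs = 8000
t = np.arange(int(0.04 * fs)) / fs
tone = np.sin(2 * np.pi * 125 * t) + 0.5 * np.sin(2 * np.pi * 250 * t)
f0 = estimate_pitch(tone, fs)
print(f"estimated pitch: {f0:.1f} Hz")
```

On real recordings the frame would typically be 20 to 40 ms of voiced speech, and the estimate would be repeated frame by frame to form a pitch track.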
Part I. Speech Analysis (4)
Part I. Speech Analysis (5)
• Gender Identification:
  • Use the Auditory Toolbox to obtain linear predictive (LP) coefficients.
  • Design your own algorithm to identify the gender of the samples in the training set.
  • Test your algorithm on 9/26 with new samples.
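For reference, linear predictive coefficients can be computed with the autocorrelation method and the Levinson-Durbin recursion. The Python sketch below is a stand-in to show the idea, not the Auditory Toolbox routine; the function name `lpc`, the Hamming window, and the synthetic AR(2) test signal are assumptions.

```python
import numpy as np

def lpc(frame, order):
    """Return (a, err) for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.

    Hypothetical helper using the autocorrelation method with a
    Hamming window; err is the residual energy of the windowed frame.
    """
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation values r[0] .. r[order].
    r = np.correlate(windowed, windowed, mode="full")[n - 1:n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

# Sanity check on a synthetic AR(2) process with A(z) = 1 - 1.3 z^-1 + 0.7 z^-2.
rng = np.random.default_rng(0)
noise = rng.standard_normal(4000)
x = np.zeros(4000)
x[0] = noise[0]
x[1] = noise[1] + 1.3 * x[0]
for n in range(2, 4000):
    x[n] = noise[n] + 1.3 * x[n - 1] - 0.7 * x[n - 2]

a, err = lpc(x, 2)
print(a)  # should land near [1, -1.3, 0.7]
```

How these coefficients (or quantities derived from them, such as formant positions) feed a gender decision is left to your own design.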
Part II. Linear Predictive Vocoder: Encoder
Part II. Linear Predictive Vocoder: Decoder
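The encoder and decoder diagrams are not reproduced in this text, so the following Python sketch assumes a common textbook LPC vocoder structure rather than the exact design on the slides. The encoder sends LP coefficients, a gain, a voiced/unvoiced flag, and a pitch period per frame; the decoder drives the all-pole filter 1/A(z) with an impulse train (voiced) or white noise (unvoiced). All function names, the 0.3 voicing threshold, and the test frame are hypothetical.

```python
import numpy as np

def all_pole_filter(excitation, a):
    """Filter by 1/A(z): y[n] = e[n] - sum_k a[k] * y[n-k]."""
    p = len(a) - 1
    y = np.zeros(len(excitation) + p)  # p leading zeros act as initial state
    for n in range(len(excitation)):
        recent = y[n + p - 1::-1][:p]  # y[n-1], y[n-2], ..., y[n-p]
        y[n + p] = excitation[n] - np.dot(a[1:], recent)
    return y[p:]

def encode_frame(frame, fs, order=10, fmin=60.0, fmax=400.0):
    """Analyze one frame into LP coefficients, gain, voicing, and pitch period."""
    w = frame * np.hamming(len(frame))
    r = np.correlate(w, w, mode="full")[len(w) - 1:]
    # Solve the Toeplitz normal equations R a = -r for a[1..order].
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    a_tail = np.linalg.solve(r[idx], -r[1:order + 1])
    a = np.concatenate(([1.0], a_tail))
    residual_energy = max(r[0] + np.dot(a_tail, r[1:order + 1]), 0.0)
    gain = np.sqrt(residual_energy / len(frame))
    # Pitch period: strongest autocorrelation peak in the allowed lag range.
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(r[lo:hi + 1]))
    voiced = bool(r[lag] > 0.3 * r[0])  # assumed threshold
    return {"a": a, "gain": gain, "voiced": voiced, "period": lag}

def decode_frame(params, n_samples, rng):
    """Resynthesize a frame from the transmitted parameters."""
    if params["voiced"]:
        exc = np.zeros(n_samples)
        exc[::params["period"]] = np.sqrt(params["period"])  # unit average power
    else:
        exc = rng.standard_normal(n_samples)
    return all_pole_filter(params["gain"] * exc, params["a"])

# Test frame: impulse train (period 64 samples) through a known resonance.
fs = 8000
pulses = np.zeros(240)
pulses[::64] = 1.0
frame = all_pole_filter(pulses, np.array([1.0, -1.3, 0.7]))

params = encode_frame(frame, fs, order=4)
decoded = decode_frame(params, 240, np.random.default_rng(0))
print(params["voiced"], params["period"])
```

Only a handful of numbers per frame cross the channel, which is where the bit-rate saving of a vocoder comes from; the decoded waveform resembles the original in spectrum and pitch but is not a sample-by-sample copy.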
Part III. Speech Recognition
• IBM ViaVoice
• ViaVoice Training
• Operate the PC with ViaVoice
Part III. IBM ViaVoice Training
• Start from the BLUE word.
• Keep speaking; the recognized words turn GRAY.
• If you hear a sound or the BLUE marker stops at a specific word, return to the blue word and read the BLACK sentence again.
Part III. IBM ViaVoice Dictation
• Menu Bar:
  1. Menu Button
  2. Microphone State
  3. Status Area
  4. ViaCenter Help
  5. Current User
• Speak Pad
Part IV. Speech Synthesis
• Text-To-Speech and Talking Head
• Demo
• Vowel Synthesis
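One simple way to approach vowel synthesis is a source-filter model: a periodic impulse train passed through a cascade of second-order resonators, one per formant. The Python sketch below illustrates this under stated assumptions; it is not necessarily the method used in the course. The formant frequencies (730, 1090, 2440 Hz) are typical textbook values for an adult male /a/, and the bandwidths, pitch, and function names are assumed.

```python
import numpy as np

def resonator(x, freq, bw, fs):
    """Second-order all-pole resonator centered at freq (Hz) with bandwidth bw (Hz)."""
    r = np.exp(-np.pi * bw / fs)       # pole radius from bandwidth
    theta = 2.0 * np.pi * freq / fs    # pole angle from center frequency
    b1, b2 = 2.0 * r * np.cos(theta), -r * r
    y = np.zeros(len(x))
    for n in range(len(x)):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y[n] = x[n] + b1 * y1 + b2 * y2
    return y

def synthesize_vowel(formants, bandwidths, f0=120.0, fs=8000, duration=0.5):
    """Impulse-train source shaped by a cascade of formant resonators."""
    n = int(duration * fs)
    source = np.zeros(n)
    source[::int(round(fs / f0))] = 1.0  # glottal pulses approximated by impulses
    out = source
    for f, b in zip(formants, bandwidths):
        out = resonator(out, f, b, fs)
    return out / np.max(np.abs(out))     # normalize to the range [-1, 1]

# Assumed formant values for /a/; bandwidths are illustrative.
vowel = synthesize_vowel([730, 1090, 2440], [80, 90, 120])
print(len(vowel))
```

Changing the formant list changes the perceived vowel, while changing `f0` changes the pitch without moving the formants.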
Part V. Human Computer Interface
• CSLU Human Computer Interface
  • Rapid Application Developer (RAD)
  • Start → Speech Toolkit → RAD
• MIT Galaxy System
  • JUPITER: Weather Information System
    • http://www.sls.lcs.mit.edu/sls/applications/jupiter.shtml
    • TEL: 1-888-573-8255
  • PEGASUS: Airline Flight Planning System
    • http://www.sls.lcs.mit.edu/sls/applications/pegasus.shtml
    • TEL: 1-877-527-8255
Part VI. Pocket PC Programming
• Apply what you learned in the previous parts to design a simple digital speech processing application using Microsoft eMbedded Tools for Pocket PC.
Announcement
• Matlab task: Part II
• C++ task: Part VI
• Check out Pocket PC