1. Audio signal processing ver1d
Introduction to audio signal processing, part 1
KH WONG, Rm 907 Engineering Building, CSE Dept., CUHK
Email: khwong@cse.cuhk.edu.hk
http://www.cse.cuhk.edu.hk/~khwong/cmsc5707
Reference books:
Lawrence Rabiner and Ronald Schafer, Theory and Applications of Digital Speech Processing, Pearson, 2011
Becchetti and Ricotti, Speech Recognition: Theory and C++ Implementation, Wiley, 1999
Lawrence Rabiner and Biing-Hwang Juang, Fundamentals of Speech Recognition, Prentice Hall, 1993
2. Overall View
Chapter 1 : Introduction
Chapter 2 : Signals in time & frequency domain
Chapter 3 : Audio feature extraction techniques
Chapter 4 : Recognition Procedures
3. Chapter 1: Part 1
Chapter 1 : Introduction
Chapter 2 : Signals in time & frequency domain
4. Chapter 1: Introduction
Content
Components of a speech recognition system
Types of speech recognition systems
Speech recognition hardware
A speech production model
Phonetics: English and Cantonese
5. Components of a speech recognition system
Pre-processor
Feature extraction
Training of the system
Recognition
6. Types of speech recognition technology
Isolated speech recognition - the speaker has to speak word-by-word into the system.
Connected speech recognition - the speaker can speak a number of words without stopping.
Continuous speech recognition - the speaker talks naturally, as in human conversation.
Current product: Voice Actions for Android
http://googlemobile.blogspot.com/2010/08/just-speak-it-introducing-voice-actions.html
7. Types depending on speakers
Speaker dependent recognition - designed for one speaker who has trained the system.
Speaker independent recognition - designed for all users without prior training.
8. Exercise 1
Discuss the features of the speech recognition module in the following systems:
Mobile phone, speech command dialing system
PC based Chinese voice input system
9. Conversion time and sampling time
The human hearing range is about 20 Hz to 20 kHz. By the sampling theorem, the sampling rate must be at least twice the highest frequency, so Hi-Fi music needs a sampling rate above 40 kHz.
A 74-minute music CD with 44.1 kHz, 16-bit stereo sampling holds 44.1 kHz * 2 bytes * 2 channels * 60 seconds * 74 min = 783,216,000 bytes (about 747 MB) (see http://en.wikipedia.org/wiki/CD-ROM).
Compromise: telephone-quality sound uses 8 kHz, 8-bit sampling.
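
The sampling and storage figures on this slide can be checked with a short calculation. The sketch below is only an illustration: the function names `min_sampling_rate` and `pcm_bytes` are made up for this example and do not come from the course material. It assumes uncompressed PCM audio and takes "MB" to mean 2^20 bytes.

```python
def min_sampling_rate(highest_freq_hz):
    """Lowest sampling rate (Hz) allowed by the sampling theorem."""
    return 2 * highest_freq_hz


def pcm_bytes(sample_rate_hz, bytes_per_sample, channels, seconds):
    """Size in bytes of uncompressed PCM audio (hypothetical helper)."""
    return sample_rate_hz * bytes_per_sample * channels * seconds


# Hi-Fi music: highest audible frequency is about 20 kHz.
hifi_min_rate = min_sampling_rate(20_000)

# 74-minute CD: 44.1 kHz, 16-bit (2 bytes), 2 channels.
cd_bytes = pcm_bytes(44_100, 2, 2, 74 * 60)

# Telephone quality: 8 kHz, 8-bit (1 byte), 1 channel, for one second.
phone_bytes_per_sec = pcm_bytes(8_000, 1, 1, 1)

print(hifi_min_rate)               # 40000
print(cd_bytes)                    # 783216000
print(round(cd_bytes / 2**20, 1))  # 746.9
print(phone_bytes_per_sec)         # 8000
```

The CD stream needs 176,400 bytes per second against 8,000 bytes per second for telephone quality, which is why the lower rate is a useful compromise when only speech has to be carried.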