SPEECH CODING

1. SPEECH CODING & APPLICATION

2. Introduction to Speech & Waveform coder By: Ahmed Mohamed Elshaer

3. What is Speech? Speech is the primary method of human communication. Speech coding aims to transmit or store a speech waveform using as few bits as possible while retaining high quality.

4. Why Speech Coding? The goal of a speech coding system is to transmit speech with the highest possible quality using the least possible channel capacity. This saves bandwidth in telecom applications and reduces memory storage requirements.

5. Speech Process: 1- Production:

6. Speech Process: 2- Propagation: the sound waves propagate through the air at a speed of about 340 m/s, reaching the listener's ears.

7. Speech Process: 3- Perception: the incoming sounds are deciphered by the listener into a received message, thereby completing the chain of events that culminated in the transfer of information from the speaker to the listener.

8. The Vocal Tract:

9. The Vocal Tract:

10. Sources of Sound Energy: 1- Turbulence: air moving quickly through a small hole (e.g. /s/ in "size"). 2- Explosion: pressure built up behind a blockage is suddenly released (e.g. /p/ in "pop"). 3- Vocal Fold Vibration: like the neck of a balloon (e.g. /a/ in "hard").

11. Speech Sound Categories: 1- Voiced: speech sounds where the vocal folds vibrate. 2- Vowels: no blockage of the vocal tract and no turbulence.

12. Speech Sound Categories: 3- Consonants: non-vowels. 4- Plosives: consonants involving an explosion.

13. The Vocal Tract Filter

14. Speech Spectrogram: Example: my speech

15. Speech Coding Hierarchy:

16. Characteristics of Speech Signals: 1- Probability Density Function (PDF): the PDF of a speech signal is in general characterized by a very high probability of near-zero amplitudes and a significant probability of very high amplitudes.

17. Characteristics of Speech Signals: 2- Autocorrelation Function (ACF): the ACF gives a quantitative measure of the closeness or similarity between samples of a speech signal as a function of their time separation.

18. Characteristics of Speech Signals: 3- Power Spectral Density (PSD): the nonflat characteristic of the power spectral density of speech makes it possible to obtain significant compression by coding speech in the frequency domain. The spectral flatness measure (SFM) is defined as the ratio of the arithmetic mean to the geometric mean of the samples of the PSD taken at uniform intervals in frequency.
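The ACF described on this slide can be estimated directly from samples. The following is a minimal sketch, assuming the common normalized estimate r[k] = sum(x[n] x[n+k]) / sum(x[n]^2); the function name `autocorrelation` and the 100 Hz test tone are illustrative assumptions, not part of the original slides.

```python
import math

def autocorrelation(x, max_lag):
    """Normalized ACF estimate: r[k] = sum(x[n] * x[n+k]) / sum(x[n]^2)."""
    energy = sum(v * v for v in x)
    return [
        sum(x[n] * x[n + k] for n in range(len(x) - k)) / energy
        for k in range(max_lag + 1)
    ]

# Hypothetical test signal: a 100 Hz tone sampled at 8 kHz (period = 80 samples).
fs = 8000
x = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
acf = autocorrelation(x, 80)

# Lag 0 is always 1; adjacent samples (lag 1) are highly similar, and the
# similarity rises again at a lag of one full period (lag 80).
print(acf[0], acf[1], acf[80])
```

A high value at lag 1 is what predictive coders such as DPCM and ADPCM exploit: the next sample is largely predictable from the previous one.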
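The SFM definition above translates directly into a few lines of code. This is a sketch under the assumption that the PSD samples are already available as positive numbers; the two example spectra are made up for illustration.

```python
import math

def spectral_flatness_measure(psd):
    """Ratio of arithmetic mean to geometric mean of positive PSD samples."""
    arithmetic = sum(psd) / len(psd)
    geometric = math.exp(sum(math.log(p) for p in psd) / len(psd))
    return arithmetic / geometric

# Hypothetical PSD samples taken at uniform frequency intervals.
flat_psd = [1.0] * 8                                       # white-noise-like
peaky_psd = [8.0, 4.0, 2.0, 1.0, 0.5, 0.25, 0.125, 0.125]  # speech-like tilt

print(spectral_flatness_measure(flat_psd))
print(spectral_flatness_measure(peaky_psd))
```

With this definition a perfectly flat spectrum gives a ratio of 1, and the ratio grows as the spectrum becomes less flat, which is the case where frequency-domain coding has the most to gain.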

19. Quantization Techniques

20. 1- Uniform Quantization: Quantization is the process of mapping a continuous range of signal amplitudes into a finite set of discrete amplitudes. Quantizers can be thought of as devices that remove the irrelevancies in the signal; a quantizer that uses n bits can have M = 2^n levels. The SQNR of a PCM encoder

21. 2- Nonuniform Quantization: nonuniform quantizers have a step size that increases with distance from the origin of the input-output amplitude characteristic.

22. Nonuniform Quantization: Compression law: in the US, μ-law; in Europe, A-law.

23. Nonuniform Quantization: Compander: Compressor + Expander = Compander
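As a rough illustration of uniform quantization with M = 2^n levels, the sketch below quantizes a full-scale sine wave and measures the resulting SQNR empirically. The mid-rise quantizer layout, the test signal, and the function names are assumptions made for this example. For a full-scale sinusoid the textbook approximation is about 6.02n + 1.76 dB, so each extra bit should add roughly 6 dB.

```python
import math

def uniform_quantize(x, n_bits, x_max=1.0):
    """Mid-rise uniform quantizer with M = 2**n_bits levels on [-x_max, x_max]."""
    levels = 2 ** n_bits
    step = 2 * x_max / levels
    index = math.floor(x / step)
    # Clamp to the available levels so out-of-range inputs saturate.
    index = max(-(levels // 2), min(levels // 2 - 1, index))
    return (index + 0.5) * step

def sqnr_db(signal, n_bits):
    """Measured signal-to-quantization-noise ratio in dB."""
    noise = sum((s - uniform_quantize(s, n_bits)) ** 2 for s in signal)
    power = sum(s * s for s in signal)
    return 10 * math.log10(power / noise)

# Hypothetical full-scale test tone (frequency given in cycles per sample).
signal = [math.sin(2 * math.pi * 0.01237 * n) for n in range(5000)]
for n_bits in (6, 8, 10):
    print(n_bits, "bits:", round(sqnr_db(signal, n_bits), 1), "dB")
```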
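The compressor and expander pair can be sketched for the μ-law case. This example assumes the continuous μ-law characteristic F(x) = sgn(x) ln(1 + μ|x|) / ln(1 + μ) with μ = 255 and inputs normalized to [-1, 1]; it does not reproduce the segmented 8-bit encoding used in practice.

```python
import math

MU = 255.0  # value commonly used for North American mu-law

def mu_compress(x, mu=MU):
    """Compressor: boosts small amplitudes before uniform quantization."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Expander: exact inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

for x in (0.01, 0.1, 0.5, 1.0):
    y = mu_compress(x)
    print(x, "->", round(y, 4), "->", round(mu_expand(y), 4))
```

Small inputs are mapped to a much larger share of the output range, so a uniform quantizer placed between the compressor and the expander behaves like a nonuniform quantizer with fine steps near zero.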

24. 3-Adaptive Quantization: Adaptive quantization with forward estimation (AQF). Adaptive quantization with backward estimation (AQB)
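Backward estimation (AQB) means the step size is updated from previously transmitted codes only, so the decoder can follow the same updates without side information. The sketch below assumes a simple multiplier rule in the style of a Jayant quantizer; the multipliers 1.5 and 0.9, the step limits, and all function names are hypothetical choices for illustration.

```python
import math

def _adapt(step, code, half, expand=1.5, shrink=0.9, lo=1e-4, hi=1.0):
    """Grow the step after large codes, shrink it after small ones."""
    magnitude = code if code >= 0 else -code - 1  # 0 .. half - 1
    factor = expand if magnitude >= half // 2 else shrink
    return min(hi, max(lo, step * factor))

def aqb_encode(samples, n_bits=4, step=0.01, **kw):
    half = 2 ** (n_bits - 1)
    codes = []
    for s in samples:
        code = max(-half, min(half - 1, math.floor(s / step)))
        codes.append(code)
        step = _adapt(step, code, half, **kw)  # uses only the emitted code
    return codes

def aqb_decode(codes, n_bits=4, step=0.01, **kw):
    half = 2 ** (n_bits - 1)
    out = []
    for code in codes:
        out.append((code + 0.5) * step)
        step = _adapt(step, code, half, **kw)  # mirrors the encoder update
    return out

def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

# Hypothetical input: 100 Hz tone at 8 kHz, far larger than the initial range.
x = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(800)]
adaptive = aqb_decode(aqb_encode(x))
fixed = aqb_decode(aqb_encode(x, expand=1.0, shrink=1.0), expand=1.0, shrink=1.0)
print("adaptive MSE:", mse(x, adaptive), "fixed-step MSE:", mse(x, fixed))
```

With forward estimation (AQF) the encoder would instead measure the level of a block of input samples and send the chosen step size alongside the codes.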

25. 4- Vector Quantization: a vector quantizer uses blocks of consecutive samples of the source output to form vectors. Each vector is encoded by comparing it with a codebook consisting of a set of stored reference vectors known as code vectors or patterns. The coded transmission rate in bits per sample

26. ADPCM: PCM: speech is encoded at a bit rate of 64 kbps. ADPCM: speech is encoded at a bit rate of 32 kbps. Examples: G.721, CT2 and DECT.
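The codebook comparison can be sketched as a nearest-neighbor search. The example assumes squared Euclidean distance as the matching criterion and uses the standard rate expression log2(N) / L bits per sample for a codebook of N code vectors of dimension L; the tiny 4-entry codebook is made up for illustration, whereas a real codebook would be trained on speech data.

```python
import math

def vq_encode(samples, codebook):
    """Map each block of L consecutive samples to its nearest code vector index."""
    dim = len(codebook[0])
    indices = []
    for start in range(0, len(samples) - dim + 1, dim):
        block = samples[start:start + dim]
        distances = [
            sum((b - c) ** 2 for b, c in zip(block, code_vector))
            for code_vector in codebook
        ]
        indices.append(distances.index(min(distances)))
    return indices

def vq_decode(indices, codebook):
    """Replace each index by its stored code vector."""
    return [value for i in indices for value in codebook[i]]

def rate_bits_per_sample(codebook):
    """log2(N) / L for N code vectors of dimension L."""
    return math.log2(len(codebook)) / len(codebook[0])

# Hypothetical codebook: N = 4 code vectors of dimension L = 2.
codebook = [(0.0, 0.0), (0.5, 0.5), (-0.5, -0.5), (0.5, -0.5)]
samples = [0.4, 0.6, -0.1, 0.05, 0.6, -0.4]
indices = vq_encode(samples, codebook)
print(indices, vq_decode(indices, codebook), rate_bits_per_sample(codebook))
```

Only the indices are transmitted, which is why the rate depends on the codebook size and the vector dimension rather than on the precision of the individual samples.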
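The two bit rates follow from telephone-band sampling at 8 kHz: 8 bits per sample for PCM and 4 bits per sample for ADPCM. The sketch below checks that arithmetic and adds a much simplified differential coder to show the idea of quantizing the prediction error instead of the sample itself. It assumes a previous-sample predictor and a fixed step size, so it is plain DPCM and not the adaptive G.721 algorithm; all names and parameters are illustrative.

```python
import math

def bit_rate_kbps(sample_rate_hz, bits_per_sample):
    return sample_rate_hz * bits_per_sample / 1000

def dpcm_encode(samples, n_bits=4, step=0.02):
    """Quantize the difference between each sample and the previous reconstruction."""
    half = 2 ** (n_bits - 1)
    prediction = 0.0
    codes = []
    for s in samples:
        code = max(-half, min(half - 1, round((s - prediction) / step)))
        codes.append(code)
        prediction += code * step  # same reconstruction the decoder will form
    return codes

def dpcm_decode(codes, step=0.02):
    out, prediction = [], 0.0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out

print(bit_rate_kbps(8000, 8), "kbps PCM;", bit_rate_kbps(8000, 4), "kbps ADPCM")

# Hypothetical input: 100 Hz tone sampled at 8 kHz, coded with 4-bit differences.
x = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(800)]
y = dpcm_decode(dpcm_encode(x))
print("max reconstruction error:", max(abs(a - b) for a, b in zip(x, y)))
```

Because neighboring samples are strongly correlated, the differences span a much smaller range than the samples, which is what lets 4 bits do the work that would otherwise need 8.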

27. ADPCM

28. Frequency Domain Coding: 1- Sub-Band Coding: divide the entire speech band into unequal sub-bands that contribute equally to the articulation index.
Sub-band number   Frequency range
1                 200-700 Hz
2                 700-1310 Hz
3                 1310-2020 Hz
4                 2020-3200 Hz
Sub-band coding can be used for coding speech at bit rates in the range 9.6 kbps to 32 kbps.

29. Adaptive Transform Coding: involves block transformations of windowed input segments of the speech waveform, and encodes speech at bit rates in the range 9.6 kbps to 20 kbps.
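To see how a sub-band split leads to a total bit rate, the sketch below uses the four frequency ranges from the slide. It assumes each sub-band can be sampled at twice its bandwidth (the idealized bandpass Nyquist rate) and uses a hypothetical allocation of 4, 3, 2 and 2 bits per sample; neither assumption comes from the original slides.

```python
# Sub-band edges in Hz, taken from the slide.
SUB_BANDS_HZ = [(200, 700), (700, 1310), (1310, 2020), (2020, 3200)]

# Hypothetical bit allocation: more bits for the lower bands.
BITS_PER_SAMPLE = [4, 3, 2, 2]

def sub_band_bit_rate(bands_hz, bits):
    """Total bit rate in bits per second across all sub-bands."""
    total = 0
    for (low, high), b in zip(bands_hz, bits):
        sample_rate = 2 * (high - low)  # idealized bandpass Nyquist rate
        total += sample_rate * b
    return total

total_bps = sub_band_bit_rate(SUB_BANDS_HZ, BITS_PER_SAMPLE)
print(total_bps / 1000, "kbps")
```

Under these assumptions the total falls inside the 9.6 kbps to 32 kbps range quoted for sub-band coding; changing the per-band bit allocation moves it up or down within that range.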
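The block-transform step can be illustrated with a discrete cosine transform, a common choice for transform coders. This sketch assumes an orthonormal DCT-II on a 16-sample block and keeps only the largest coefficients; the block contents, the function names, and the "keep the largest four" rule are illustrative assumptions, and a real adaptive transform coder would instead allocate quantizer bits per coefficient and adapt that allocation from block to block.

```python
import math

def _scale(k, n):
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

def dct(block):
    """Orthonormal DCT-II of one block."""
    n = len(block)
    return [
        _scale(k, n) * sum(
            block[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        )
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n)
        )
        for i in range(n)
    ]

def keep_largest(coeffs, count):
    """Zero all but the `count` largest-magnitude coefficients."""
    keep = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)[:count]
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]

# Hypothetical smooth block of 16 samples.
block = [i / 16 for i in range(16)]
coeffs = dct(block)
approx = idct(keep_largest(coeffs, 4))
print("max error with 4 of 16 coefficients:",
      max(abs(a - b) for a, b in zip(block, approx)))
```

For smooth segments most of the energy lands in a few coefficients, so bits can be concentrated there and the remaining coefficients coded coarsely or dropped.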
