
Chapter 18 Speech Encoding and Decoding


derron


Presentation Transcript


  1. Chapter 18 Speech Encoding and Decoding • 18.1 Introduction to Speech Encoding and Decoding • 18.2 Speech Signal Encoding Systems • 18.3 Main Performance Indices of Speech Encoding • 18.4 The development of Speech Encoding • 18.5 The Ways to Enhance the Quality of Speech Encoding

  2. 18.1 Introduction to Speech Encoding and Decoding (1) • PCM is the origin of waveform encoding. • Waveform encoding tries to reconstruct the speech waveform so that it matches the original as closely as possible. • It treats speech as a general signal. The advantages are strong adaptability and good speech quality; the drawback is a high encoding rate. • It includes various encoders: PCM, ADM, ADPCM, APC, ASBC and ATC. • These encoders give high quality at 16-64 kb/s. When the rate is decreased further, performance degrades quickly.

  3. Introduction to Speech Encoding and Decoding (2) • The vocoder is the origin of parametric encoding. • Parametric encoding extracts and encodes the feature parameters of the speech signal. It aims to make the reconstructed speech as intelligible as possible, although its waveform may differ greatly from the original. The advantage is a low encoding rate of 2.4 kb/s or less; the problems are poor quality of the decoded speech and sensitivity to the environment. • The channel vocoder, the formant vocoder and the widely used LP vocoder are typical parametric encoders.

  4. Introduction to Speech Encoding and Decoding (3) • Since the 1970s, and in particular the 1980s, speech encoding techniques have made breakthroughs. A class of very effective algorithms has been proposed: a new generation of parametric encoding algorithms that use mixed (hybrid) encoding. • These algorithms overcome the weaknesses of both waveform encoding and parametric encoding and combine the advantages of both. They achieve high-quality speech at 4-16 kb/s. • These encoders include MPLPC, RPELPC and CELP.

  5. 18.2 Speech Signal Encoding Systems (1) • There are two categories of speech signal encoding systems: • Encoding-storage-replay, called a digital speech recording and playback system. • Encoding-transmission-decoding, called a digital telephone communication system. • Digital systems perform much better than analog systems.

  6. 18.3 Main Performance Indices of Speech Encoding (1) • Four factors characterize the performance of an encoder: • encoding quality, encoding rate, complexity of encoding and decoding, and delay of encoding and decoding. • (1) Encoding Quality • There are subjective and objective evaluation approaches. • Objective evaluation is based on objective measurement. Commonly used measures are the signal-to-noise ratio (SNR), the weighted SNR and the average segmental SNR. • The advantage is simple calculation, but these measures do not reflect how humans perceive speech quality.
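
The two simplest objective measures named on this slide can be sketched as follows. This is a minimal illustration, not a standardized implementation: the function names, the 160-sample frame length (20 ms at an assumed 8 kHz sampling rate), the synthetic test signal and the crude rounding "codec" are all assumptions made for the example.

```python
import math


def _energy_ratio_db(original, decoded):
    """Return 10*log10(signal energy / error energy), or None if undefined."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if signal == 0 or noise == 0:
        return None
    return 10 * math.log10(signal / noise)


def snr_db(original, decoded):
    """Overall SNR in dB over the whole utterance."""
    value = _energy_ratio_db(original, decoded)
    return math.inf if value is None else value


def segmental_snr_db(original, decoded, frame_len=160):
    """Average of per-frame SNRs in dB (frame_len is an assumed value).

    Frames that are silent or error-free are skipped so the logarithm
    stays defined.
    """
    values = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        value = _energy_ratio_db(original[start:start + frame_len],
                                 decoded[start:start + frame_len])
        if value is not None:
            values.append(value)
    return sum(values) / len(values) if values else math.inf


# Hypothetical test signal: one loud frame followed by one quiet frame.
original = [1000 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]
original += [10 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(160)]
# Crude stand-in for a codec: round every sample to a multiple of 8.
decoded = [8 * round(x / 8) for x in original]

print("overall SNR   (dB):", snr_db(original, decoded))
print("segmental SNR (dB):", segmental_snr_db(original, decoded))
```

The overall SNR is dominated by the loud frame, while the segmental SNR gives the badly coded quiet frame equal weight. Neither measure models human perception, which is the limitation the slide points out.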

  7. Main Performance Indices of Speech Encoding (2) • Objective evaluation is therefore applied only to higher-rate waveform encoding (16 kb/s or more). • Subjective evaluation is based on listening tests, in which the scores given by human listeners are averaged. • Commonly used approaches are MOS (Mean Opinion Score), • DRT (Diagnostic Rhyme Test) and • DAM (Diagnostic Acceptability Measure).

  8. Main Performance Indices of Speech Encoding (3) • (2) Encoding Rate • It is expressed as I (b/s). Sometimes it is expressed as R (b/p), the number of bits per sample. • I = R × fs, where fs is the sampling frequency, which is determined by the signal bandwidth according to the Shannon sampling theorem. • The higher R is, the higher the speech quality, and the higher the requirements for transmission or storage. Generally R ≥ 2 for waveform encoding, but R = 0.25 or less for parametric encoding.
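
As a worked example of I = R × fs, the sketch below converts between the two rate measures. The 8 kHz default sampling frequency is an assumption (the usual choice for telephone-bandwidth speech, which satisfies the sampling theorem for a band of under 4 kHz); the function names are illustrative only.

```python
def bit_rate(bits_per_sample, sampling_hz=8000):
    """I = R * fs: encoding rate in bits per second."""
    return bits_per_sample * sampling_hz


def bits_per_sample(bit_rate_bps, sampling_hz=8000):
    """R = I / fs: average number of bits spent on each sample."""
    return bit_rate_bps / sampling_hz


# 8 bits per sample at the assumed 8 kHz gives the 64 kb/s PCM rate.
print(bit_rate(8))
# R = 2, the lower bound quoted for waveform encoding, gives 16 kb/s.
print(bit_rate(2))
# R = 0.25, the figure quoted for parametric encoding, gives 2 kb/s.
print(bit_rate(0.25))
# A 2.4 kb/s vocoder spends well under one bit per sample on average.
print(bits_per_sample(2400))
```

The R = 2 result lines up with the 16 kb/s lower end of the range quoted earlier for waveform encoders.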

  9. Main Performance Indices of Speech Encoding (4) • (3) Complexity of encoding and decoding • Complexity is closely related to speech quality and encoding rate. At a fixed rate, a more complex algorithm gives better quality; for the same quality, a more complex algorithm can reduce the rate. • Complex algorithms are implemented on dedicated DSP hardware. • (4) Delay of encoding and decoding • Delay is related to the complexity of the algorithm. It affects real-time speech communication systems, where it can cause perceptible echoes. The requirement for a real-time system is a delay of less than 5-10 ms.

  10. 18.4 The development of Speech Encoding • International Standards of speech encoding for telephone bandwidth • The relation between the quality and rate • The development progress of speech encoding

  11. 18.5 The Ways to Enhance the Quality of Speech Encoding (1) • There are only two ways: exploiting the redundancy of the speech signal and exploiting the auditory characteristics of the human ear. • The redundancy of the speech signal has two sources: the non-uniform distribution of speech signal amplitude and the correlation between sample points. • Models of the speech amplitude distribution include the Gaussian, Gamma and Laplacian distributions. Their common feature is that small amplitudes have a higher probability density and large amplitudes a lower one. Non-uniform quantization exploits this feature to achieve higher quality.
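
The second source of redundancy, correlation between neighbouring samples, can be illustrated with a short sketch. Everything here is an assumption made for the example: the test signal is a synthetic sum of two low-frequency sinusoids at an assumed 8 kHz sampling rate (not real speech), and the one-tap predictor is only the simplest way to show that a correlated signal leaves a much smaller residual to encode.

```python
import math


def lag1_correlation(x):
    """Normalized correlation between each sample and its predecessor."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    num = sum(a * b for a, b in zip(centered, centered[1:]))
    den = sum(v * v for v in centered)
    return num / den


def prediction_gain_db(x, a):
    """Energy of x relative to the residual x[n] - a*x[n-1], in dB."""
    residual = [x[n] - a * x[n - 1] for n in range(1, len(x))]
    signal_energy = sum(v * v for v in x[1:])
    residual_energy = sum(r * r for r in residual)
    return 10 * math.log10(signal_energy / residual_energy)


# Synthetic low-frequency signal standing in for voiced speech (assumption).
fs = 8000
signal = [math.sin(2 * math.pi * 300 * n / fs)
          + 0.5 * math.sin(2 * math.pi * 700 * n / fs)
          for n in range(800)]

rho = lag1_correlation(signal)
print("lag-1 correlation:", rho)
print("prediction gain (dB):", prediction_gain_db(signal, rho))
```

A high lag-1 correlation means the previous sample already predicts most of the current one, so the residual carries far less energy than the signal itself.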

  12. The Ways to Enhance the Quality of Speech Encoding (2) • A-law and μ-law PCM use 8-bit non-uniform quantization to achieve the effect of 12-13 bit uniform quantization. • For how to exploit the other features, please see the book (p. 179).
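
The idea behind μ-law PCM can be sketched with the continuous companding formula F(x) = sgn(x) · ln(1 + μ|x|) / ln(1 + μ) with μ = 255, followed by an 8-bit uniform quantizer. This is a simplified illustration under stated assumptions: practical telephone codecs use a segmented (piecewise-linear) approximation of this curve, and the function names and the mid-rise quantizer are choices made for this example, with samples assumed to be normalized to [-1, 1].

```python
import math

MU = 255.0


def mu_compress(x, mu=MU):
    """Continuous mu-law compression of a sample in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantize_uniform(x, bits=8):
    """Mid-rise uniform quantizer on [-1, 1]; returns the reconstructed value."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, max(0, math.floor((x + 1.0) / step)))
    return (index + 0.5) * step - 1.0


def mu_law_codec(x, bits=8):
    """Compress, quantize uniformly in the compressed domain, then expand."""
    return mu_expand(quantize_uniform(mu_compress(x), bits))


# Compare worst-case errors for small amplitudes, the most probable ones.
small = [k / 10000 for k in range(1, 101)]  # 0.0001 .. 0.01
uniform_err = max(abs(quantize_uniform(x) - x) for x in small)
mu_law_err = max(abs(mu_law_codec(x) - x) for x in small)
print("max error, 8-bit uniform:", uniform_err)
print("max error, 8-bit mu-law :", mu_law_err)
```

With the same 8 bits, the companded quantizer spends its levels where small amplitudes fall, so the error there is much lower than with uniform steps; the price is coarser steps for large, less probable amplitudes.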
