Speech Coding Using LPC

What is Speech Coding? • Speech coding is the process of transforming a speech signal into a more compact form, motivated by: • Transmission • Limited available bandwidth • Encryption

Uncompressed Speech Signal • Analog speech is a bandpassed signal between 200 and 3400 Hz. • Uncompressed digital speech is a bit stream at 64 kbit/s. • Transmission technology must transmit the signals from point A to point B: • with minimum degradation • using minimum bandwidth
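
The 64 k figure quoted above corresponds to 64,000 bits per second. A minimal sketch of where it comes from, assuming the usual telephone-band setup of 8000 samples per second and 8 bits per sample (both values are assumptions, not stated on the slide):

```python
# Assumed telephone-band PCM parameters (not given in the original slide).
SAMPLE_RATE_HZ = 8000   # samples per second, enough for a 3400 Hz band
BITS_PER_SAMPLE = 8     # bits used to quantize each sample


def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bit rate of uncompressed PCM speech in bits per second."""
    return sample_rate_hz * bits_per_sample


print(pcm_bit_rate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE))
```

Any speech coder is then judged by how far below this rate it can go while keeping degradation acceptable.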

Speech coding • By coding we mean an efficient representation of the signal, i.e. COMPRESSION • The main approaches: • waveform coding • transform coding • parametric / hybrid coding • (slide annotation: smart quantizers)

How each of these works: • Waveform coders: try to find an efficient representation of the waveform directly. • Transform coders: try to find an efficient representation in the frequency domain (FFT, etc.). • Parametric coders: try to find a small set of parameters that efficiently represents the signal (speech excitation, etc.).

Comparison of speech coders

LPC (Linear Predictive Coding) • LPC is a model of signal production: it assumes that the speech signal is generated by a very specific mechanism.

Speech Production in Humans • The speech signal is created by: • A pressure source (lungs), exciting ... • A Filter (Vocal tract: pharynx - mouth [soft palate, tongue] - nasal cavity)

For the DSP Engineer • An excitation source • A time-varying filter • Signal flow: excitation → filter H(t, ω) → speech

The model and its representation • The LPC model looks at speech as: • Excitation: • periodic (voiced), originating in the larynx • noise (unvoiced), fricative, produced in the mouth • An all-pole filter H(ω) representing the vocal tract
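
The source-filter model above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the two filter coefficients, the pitch period of 80 samples and the frame length of 160 samples are all hypothetical, and the sign convention s(n) = gain·e(n) + Σ a(i)·s(n−i) is one common choice.

```python
import random


def synthesize(a, excitation, gain=1.0):
    """Run an excitation through an all-pole filter.

    Implements s(n) = gain * e(n) + sum_{i=1..p} a[i-1] * s(n-i).
    """
    p = len(a)
    s = []
    for n, e in enumerate(excitation):
        acc = gain * e
        for i in range(1, p + 1):
            if n - i >= 0:
                acc += a[i - 1] * s[n - i]
        s.append(acc)
    return s


def impulse_train(length, period):
    """Periodic (voiced) excitation: one unit pulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def white_noise(length, seed=0):
    """Noise (unvoiced) excitation drawn from a seeded Gaussian generator."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


# Hypothetical stable 2-pole filter (pole radius about 0.89).
a = [1.3, -0.8]
voiced = synthesize(a, impulse_train(160, 80))
unvoiced = synthesize(a, white_noise(160))
```

The same filter produces a buzz-like output from the impulse train and a hiss-like output from the noise, which is the voiced/unvoiced distinction the model relies on.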

Block Diagram

Why the name “Linear Predictive Coding”? • It is assumed that each new sample can be approximated by a weighted linear combination of previous samples
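
In symbols, this assumption can be written as follows, with p the number of previous samples used and a(i) the weights. The hat on ŝ(n), marking the predicted value of s(n), is notation introduced here for illustration.

$$\hat{s}(n) = \sum_{i=1}^{p} a(i)\, s(n-i)$$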

Z-Plane Representation • In the z-plane we can write the model as a transfer function: • Clearly this transfer function has only poles - which is why it represents an all pole filter.
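
A standard way to write this transfer function, assuming a prediction order p, coefficients a(i) and a gain G (G is a symbol introduced here, not taken from the slide), is:

$$H(z) = \frac{G}{1 - \sum_{i=1}^{p} a(i)\, z^{-i}}$$

The numerator is a constant, so the only non-trivial singularities are the p roots of the denominator, i.e. the poles.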

Mathematical analysis • Reminder: our problem is to find the LPC parameters, for a given speech signal. This is called the Inverse Problem. • How do we find the set of parameters that gives the best match to the signal?

What are these Parameters • The Coefficients of the All Pole Filter • Pitch of the speech

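How do we find the Coefficients: • Least squares • Formulation: • Given a signal s(n); • Define the prediction error e(n) as the difference between s(n) and its linear prediction from the previous p samples • Find the set of coefficients a(i) that will minimize the mean square error E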
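
Written out, assuming p coefficients a(i) and a sum over the samples n of the analysis window, the error and the quantity to minimize are:

$$e(n) = s(n) - \sum_{i=1}^{p} a(i)\, s(n-i), \qquad E = \sum_{n} e^{2}(n)$$

Dividing E by the number of samples gives the mean square error proper; the scaling does not change which coefficients minimize it.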

Solution: • Simply set the partial derivative of E with respect to each coefficient a(i) to zero • Which gives us the Normal Equations • These are no more than p linear equations in p unknowns...
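
A sketch of this step, taking e(n) = s(n) − Σ a(i) s(n−i) and E = Σ e²(n) over the analysis window, and introducing the shorthand φ(i,k) for the correlation sums (the symbol φ is an assumption of this sketch):

$$\frac{\partial E}{\partial a(k)} = -2 \sum_{n} e(n)\, s(n-k) = 0 \quad\Longrightarrow\quad \sum_{i=1}^{p} a(i)\, \phi(i,k) = \phi(0,k), \qquad k = 1, \dots, p$$

$$\phi(i,k) = \sum_{n} s(n-i)\, s(n-k)$$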

Or in matrix form:
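
With φ(i,k) = Σ s(n−i) s(n−k) denoting the correlation sums over the analysis window (φ is shorthand assumed here), the p equations can be stacked as:

$$\begin{bmatrix} \phi(1,1) & \phi(1,2) & \cdots & \phi(1,p) \\ \phi(2,1) & \phi(2,2) & \cdots & \phi(2,p) \\ \vdots & \vdots & \ddots & \vdots \\ \phi(p,1) & \phi(p,2) & \cdots & \phi(p,p) \end{bmatrix} \begin{bmatrix} a(1) \\ a(2) \\ \vdots \\ a(p) \end{bmatrix} = \begin{bmatrix} \phi(0,1) \\ \phi(0,2) \\ \vdots \\ \phi(0,p) \end{bmatrix}$$

The matrix is symmetric, since φ(i,k) = φ(k,i).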

What is each element of the matrix? • A correlation; in other words: • take the signal, multiply it by a shifted version of itself, and sum. • Since our signal is long and time-varying, we computed it on short windows • Two variants: • autocorrelation method • covariance method
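
The "multiply by a shifted version and sum" step on a short window can be sketched as below for the autocorrelation variant. The Hamming window and the function names are assumptions of this sketch; the slides do not say which window was used.

```python
import math


def hamming(n_samples):
    """Hamming window of the given length (assumed window choice)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]


def autocorrelation(frame, max_lag):
    """r(lag) = sum_t frame[t] * frame[t + lag], for lag = 0..max_lag.

    Samples outside the frame are treated as zero, which is what
    distinguishes the autocorrelation method from the covariance method.
    """
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


# Usage: window a short frame, then correlate it with itself.
signal = [math.sin(2 * math.pi * 0.05 * n) for n in range(160)]
window = hamming(len(signal))
frame = [x * w for x, w in zip(signal, window)]
r = autocorrelation(frame, max_lag=10)
```

Because r depends only on the lag, the resulting matrix has constant diagonals, which is what makes a fast recursive solver applicable.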

Solving the Matrix Equation • The coefficients a(i) were found using the Levinson-Durbin recursion, which exploits the Toeplitz structure of the autocorrelation matrix
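
A minimal sketch of the Levinson-Durbin recursion, assuming the autocorrelation method (so the input is a list r of autocorrelation values r(0)..r(p)) and the convention ŝ(n) = Σ a(i) s(n−i). The function name and return format are choices made for this example, not the authors' implementation.

```python
def levinson_durbin(r, order):
    """Solve sum_i a(i) * r(|i - k|) = r(k), k = 1..order.

    Returns (a, err): a[0] holds a(1), a[1] holds a(2), and so on;
    err is the remaining prediction error energy.
    """
    a = [0.0] * order
    err = r[0]
    for m in range(order):
        if err <= 0.0:
            break  # degenerate frame (e.g. silence): stop early
        # Reflection coefficient for the step from order m to m + 1.
        acc = r[m + 1] - sum(a[j] * r[m - j] for j in range(m))
        k = acc / err
        prev = a[:]
        a[m] = k
        for j in range(m):
            a[j] = prev[j] - k * prev[m - 1 - j]
        err *= (1.0 - k * k)
    return a, err


# Usage: autocorrelation of a first-order process, r(lag) = 0.5 ** lag.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, residual)
```

The recursion needs on the order of p² operations instead of the p³ of a general linear solver, and each intermediate k is a reflection coefficient, which some LPC variants transmit instead of a(i).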

Second Parameter • Pitch was found by computing the correlation of the signal window with itself (its autocorrelation) • These parameters were then transmitted
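
A rough sketch of autocorrelation-based pitch estimation: pick the lag with the largest correlation inside a plausible pitch range. The search range of 50 to 400 Hz, the voicing threshold of 0.3 and the function name are assumptions of this example, not values from the slides.

```python
import math


def estimate_pitch(frame, sample_rate, f_min=50.0, f_max=400.0,
                   voicing_threshold=0.3):
    """Return the estimated pitch in Hz, or None if the frame looks unvoiced."""
    energy = sum(x * x for x in frame)
    if energy == 0.0:
        return None  # silent frame
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag`.
        val = sum(frame[t] * frame[t + lag] for t in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    # A weak peak relative to the frame energy suggests noise-like speech.
    if best_lag == 0 or best_val < voicing_threshold * energy:
        return None
    return sample_rate / best_lag


# Usage: a 50 ms frame of a 100 Hz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 100.0 * n / 8000.0) for n in range(400)]
print(estimate_pitch(frame, 8000.0))
```

Returning None for weak peaks doubles as a crude voiced/unvoiced decision, which the decoder needs in order to choose between the periodic and the noise excitation.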

Bit rate for plain LPC vocoder
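
As an illustration of how a vocoder bit rate is computed (bits per frame times frames per second), the sketch below uses the published LPC-10 (FS-1015) figures of 54 bits per 180-sample frame at 8 kHz. These are not the figures from this slide's own table, which is not reproduced here.

```python
def vocoder_bit_rate(bits_per_frame, samples_per_frame, sample_rate_hz):
    """Bit rate in bits per second for a frame-based coder."""
    return bits_per_frame * sample_rate_hz / samples_per_frame


# LPC-10 (FS-1015): 54 bits per 22.5 ms frame (180 samples at 8 kHz).
lpc10_rate = vocoder_bit_rate(54, 180, 8000)

# Compared with 64,000 bit/s uncompressed speech.
compression_ratio = 64000 / lpc10_rate
print(lpc10_rate, round(compression_ratio, 1))
```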

Bit rate for voice-excited LPC vocoder with DCT

Conclusion • Speech reproduced with the LPC method does not sound exactly like the original, but it remains intelligible • LPC can be used in speech recognition systems • LPC was widely used in military communications because of its low transmission bit rate • There are many variants of the basic scheme: LPC-10, CELP, MELP, RELP, VSELP, ACELP, LD-CELP...
