Explore analysis and synthesis of pole-zero speech models using linear prediction to derive vocal tract models of different sound sources. Learn about all-pole modeling, estimation methods, and error minimization in speech signal processing. Dive into examples and techniques for efficient modeling.
Microcomputer Systems 2 Analysis and Synthesis of Pole-Zero Speech Models
Introduction • Deterministic: • Speech sounds with periodic or impulse sources • Stochastic: • Speech sounds with noise sources • The goal is to derive a vocal tract model for each class of sound source. • It will be shown that the solution equations for the two classes are similar in structure. • The solution approach is referred to as linear prediction analysis. • Linear prediction analysis leads to a method of speech synthesis based on the all-pole model. • Note that the all-pole model is intimately associated with the concatenated lossless tube model of the previous chapter (i.e., Chapter 4). Veton Këpuska
All-Pole Modeling of Deterministic Signals • Consider the vocal tract transfer function during a voiced source: a glottal impulse train ug[n] with period T (the pitch period) is scaled by a gain A and passed through the glottal model G(z), the vocal tract model V(z), and the radiation model R(z) to produce the speech s[n].
All-Pole Modeling of Deterministic Signals • What about the fact that R(z) is a zero model? • A single zero function can be expressed as an infinite set of poles. Note that, for |b| < 1: 1 − b z^−1 = 1 / ( Σk=0..∞ b^k z^−k ) • From the above expression one can derive an all-pole form of the full transfer function.
All-Pole Modeling of Deterministic Signals • In practice the infinite number of poles is approximated with a finite set of poles, since ak → 0 as k → ∞. • H(z) can then be considered an all-pole representation: H(z) = A / (1 − Σk=1..p ak z^−k) • Representing a zero with a large number of poles is inefficient ⇒ • Estimating zeros directly is a more efficient approach (covered later in this chapter).
Model Estimation • Goal – estimate: • the filter coefficients {a1, a2, …, ap}, for a particular order p, and • the gain A, over a short time span of the speech signal (typically 20 ms) for which the signal is considered quasi-stationary. • Use the linear prediction method: • Each speech sample is approximated as a linear combination of past speech samples ⇒ • A set of analysis techniques for estimating the parameters of the all-pole model.
Model Estimation • Consider the z-transform of the vocal tract model: H(z) = S(z)/Ug(z) = A / (1 − Σk=1..p ak z^−k) • Which can be transformed into: S(z) = Σk=1..p ak z^−k S(z) + A Ug(z) • In the time domain it can be written as: s[n] = Σk=1..p ak s[n−k] + A ug[n] where s[n] is the current sample, the s[n−k] are past samples, the ak are the linear prediction coefficients, A is a scaling factor, and ug[n] is the input. • Referred to as an autoregressive (AR) model.
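The AR difference equation above can be sketched in code. This is a minimal illustration (the helper name `ar_synthesize` and its arguments are our own, not from the text): driving a single-pole model with a unit impulse reproduces h[n] = a^n.

```python
import numpy as np

def ar_synthesize(a, excitation, A=1.0):
    # s[n] = sum_{k=1..p} a[k-1] * s[n-k] + A * ug[n]
    p = len(a)
    s = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = A * excitation[n]
        for k in range(1, p + 1):
            if n - k >= 0:
                acc += a[k - 1] * s[n - k]
        s[n] = acc
    return s

# A single pole a1 = 0.9 driven by a unit impulse yields h[n] = 0.9**n
ug = np.zeros(8)
ug[0] = 1.0
s = ar_synthesize([0.9], ug)
```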
Model Estimation • The method used to predict the current sample from a linear combination of past samples is called linear prediction analysis. • LPC – quantization of the linear prediction coefficients, or of a transformed version of these coefficients, is called linear prediction coding. • For ug[n] = 0: s[n] = Σk=1..p ak s[n−k], i.e., each sample is exactly a linear combination of the past p samples. • This observation motivates the analysis technique of linear prediction.
Model Estimation: Definitions • A linear predictor of order p is defined by: s̃[n] = Σk=1..p αk s[n−k] where s̃[n] is the estimate of s[n] and αk is the estimate of ak. • In the z-domain the predictor is P(z) = Σk=1..p αk z^−k.
Model Estimation: Definitions • The prediction error sequence is given as the difference of the original sequence and its prediction: e[n] = s[n] − s̃[n] = s[n] − Σk=1..p αk s[n−k] • The associated prediction error filter is defined as: A(z) = 1 − P(z) = 1 − Σk=1..p αk z^−k • If {αk} = {ak}, then e[n] = A ug[n].
Model Estimation: Definitions • Note 1: • Recovery of s[n]: since E(z) = A(z) S(z), the signal can be recovered from the error by inverse filtering: S(z) = E(z) / A(z); with {αk} = {ak}, e[n] = A ug[n] and s[n] is reconstructed exactly.
Model Estimation: Definitions • Note 2: If • the vocal tract contains a finite number of poles and no zeros, and • the prediction order is correct, then • {αk} = {ak}, and • e[n] is an impulse train for voiced speech; for an impulse source, e[n] is just a single impulse.
Example 5.1 • Consider an exponentially decaying impulse response of the form h[n] = a^n u[n], where u[n] is the unit step. The response to the scaled unit sample A δ[n] is: s[n] = A a^n u[n] • Consider the prediction of s[n] using a linear predictor of order p = 1: s̃[n] = α1 s[n−1]. • It is a good fit since: s[n] = a s[n−1] for n ≥ 1 • The prediction error sequence with α1 = a is: e[n] = s[n] − a s[n−1] = A δ[n] • The prediction of the signal is exact except at the time origin.
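A quick numeric check of Example 5.1 (the values A = 2, a = 0.8 are illustrative choices of ours): with α1 = a, the prediction error collapses to a single impulse of height A at the origin.

```python
import numpy as np

A, a = 2.0, 0.8
s = A * a ** np.arange(12)     # s[n] = A a^n u[n]

# First-order predictor with alpha_1 = a: e[n] = s[n] - a*s[n-1]
e = np.empty_like(s)
e[0] = s[0]                    # no past sample at n = 0
e[1:] = s[1:] - a * s[:-1]     # exact cancellation for n >= 1
```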
Error Minimization • An important question is how to derive an estimate of the prediction coefficients αk, for a particular order p, that is optimal in some sense. • Optimality is measured by a criterion; an appropriate measure of optimality is the mean-squared error (MSE). • The goal is to minimize the mean-squared prediction error E, defined as: E = Σn e^2[n] • In reality, a model must be valid over some short-time interval, say M samples on either side of n.
Error Minimization • Thus in practice the MSE is time-dependent and is formed over a finite interval: En = Σ(m=n−M..n+M) e^2n[m] • [n−M, n+M] – the prediction error interval. • Alternatively: En = Σm e^2n[m], where en[m] = sn[m] − Σk=1..p αk sn[m−k] is the prediction error of the short-time sequence sn[m].
Error Minimization • Determine the {αk} for which En is minimal: ∂En/∂αi = 0, for i = 1, 2, …, p • Which results in: Σm sn[m−i] ( sn[m] − Σk=1..p αk sn[m−k] ) = 0, for i = 1, 2, …, p.
Error Minimization • The last equation can be rewritten by multiplying through: Σm sn[m−i] sn[m] = Σk=1..p αk Σm sn[m−i] sn[m−k] • Define the function: φn[i, k] = Σm sn[m−i] sn[m−k] • Which gives the following: Σk=1..p αk φn[i, k] = φn[i, 0], for i = 1, 2, …, p • Referred to as the normal equations, which can be collected into matrix form.
Error Minimization • The minimum error for the optimal solution can be derived as follows: En = Σm e^2n[m] = Σm en[m] sn[m] − Σk=1..p αk Σm en[m] sn[m−k] • The last term in the equation above vanishes by the normal equations (the error is orthogonal to the past samples), leaving En = Σm en[m] sn[m].
Error Minimization • Thus the minimum error can be expressed as: En = φn[0, 0] − Σk=1..p αk φn[0, k].
Error Minimization • Remarks: • The order p of the actual underlying all-pole transfer function is not known. • The order can be estimated by observing that the minimum error of a pth-order predictor in theory equals that of a (p+1)th-order predictor once p reaches the true order. • Likewise, the predictor coefficients for k > p equal zero (or, in practice, are close to zero and model only noise/random effects). • The prediction error en[m] is nonzero only “in the vicinity” of time n: [n−M, n+M]. • In predicting values of the short-time sequence sn[m], p values outside of the prediction error interval [n−M, n+M] are required. • Covariance method – uses values outside the interval to predict values inside the interval. • Autocorrelation method – assumes that speech samples are zero outside the interval.
Error Minimization • Matrix formulation • Projection Theorem: • The columns of Sn (whose rows hold the past samples sn[m−1], …, sn[m−p]) are the basis vectors. • The error vector en = sn − Sn α is orthogonal to each basis vector: SnT en = 0 • Orthogonality leads to the normal equations: SnT Sn α = SnT sn.
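The covariance-method normal equations SnT Sn α = SnT sn can be sketched directly (the signal, interval bounds, and variable names here are illustrative). Because samples from before the prediction interval are used rather than zeroed, a noiseless all-pole signal is recovered exactly:

```python
import numpy as np

# Synthesize a 2nd-order all-pole signal: s[n] = a1 s[n-1] + a2 s[n-2] + ug[n]
a_true = np.array([1.8, -0.9])
N = 120
ug = np.zeros(N); ug[0] = 1.0
s = np.zeros(N)
for n in range(N):
    s[n] = ug[n]
    if n >= 1:
        s[n] += a_true[0] * s[n - 1]
    if n >= 2:
        s[n] += a_true[1] * s[n - 2]

p, lo, hi = 2, 10, N                  # prediction-error interval [lo, hi)
# Columns of S_n hold the past samples s[m-1], ..., s[m-p]; values from
# before the interval (m < lo) are used, not assumed zero.
S = np.column_stack([s[lo - k:hi - k] for k in range(1, p + 1)])
alpha = np.linalg.solve(S.T @ S, S.T @ s[lo:hi])   # S^T S a = S^T s
```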
Autocorrelation Method • In the previous section we described a general method of linear prediction that uses samples outside the prediction error interval, referred to as the covariance method. • An alternative approach that does not consider samples outside the analysis interval, referred to as the autocorrelation method, is presented next. • This method is: • suboptimal; however, it • leads to an efficient and stable solution of the normal equations.
Autocorrelation Method • Assumes that the samples outside the time interval [n−M, n+M] are all zero, and • extends the prediction error interval, i.e., the range over which we minimize the mean-squared error, to ±∞. • Conventions: • Short-time interval: [n, n+Nw−1], where Nw = 2M+1 (note: it is not centered around sample n as in the previous derivation). • The segment is shifted to the left by n samples so that the first nonzero sample falls at m = 0. This operation is equivalent to: • shifting the speech sequence s[m] by n samples to the left, and • windowing by an Nw-point rectangular window: w[m] = 1 for 0 ≤ m ≤ Nw−1, and 0 otherwise.
Autocorrelation Method • The windowed sequence can be expressed as: sn[m] = s[n+m] w[m] • This operation is depicted in the figure on the right.
Autocorrelation Method • Important observations that are a consequence of zeroing the signal outside the interval: • The prediction error is nonzero only in the interval [0, Nw+p−1], where • Nw is the window length and • p is the predictor order. • The prediction error is largest at the left and right ends of the segment. This is due to edge effects caused by the way the prediction is done: • from zeros – at the left end of the window, • to zeros – at the right end of the window.
Autocorrelation Method • To compensate for edge effects, a tapered window (e.g., Hamming) is typically used. • It removes the possibility that the mean-squared error is dominated by end (edge) effects. • The data becomes distorted, however, biasing the estimates αk. • Let the mean-squared prediction error be given by: En = Σ(m=0..Nw+p−1) e^2n[m] • The limits of summation refer to the new time origin, and • the prediction error outside this interval is zero.
Autocorrelation Method • The normal equations take the following form (Exercise 5.1): Σk=1..p αk φn[i, k] = φn[i, 0], for i = 1, 2, …, p • where φn[i, k] = Σ(m=0..Nw+p−1) sn[m−i] sn[m−k].
Autocorrelation Method • Due to the summation limits (depicted in the figure on the right), the function φn[i, k] can be written as: φn[i, k] = Σ(m=i..k+Nw−1) sn[m−i] sn[m−k], for i ≥ k • Recognizing that only samples in the interval [i, k+Nw−1] contribute to the sum, and • changing the variable m ⇒ m+i: φn[i, k] = Σ(m=0..Nw−1−(i−k)) sn[m] sn[m+i−k].
Autocorrelation Method • Since the above expression is only a function of the difference i−k, we denote it as: φn[i, k] = rn[i−k] • Letting τ = i−k, referred to as the correlation “lag”, leads to the short-time autocorrelation function: rn[τ] = Σ(m=0..Nw−1−τ) sn[m] sn[m+τ].
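The short-time autocorrelation can be sketched as follows (the helper name is ours); for non-negative lags it matches `np.correlate` of the windowed segment with itself:

```python
import numpy as np

def short_time_autocorr(sn, max_lag):
    # r[tau] = sum_{m=0}^{Nw-1-tau} sn[m] * sn[m+tau]
    Nw = len(sn)
    return np.array([np.dot(sn[:Nw - t], sn[t:]) for t in range(max_lag + 1)])

sn = 0.9 ** np.arange(50)          # a windowed decaying exponential
r = short_time_autocorr(sn, 3)

# The same values appear at the center of the full correlation sn * sn[-m]
full = np.correlate(sn, sn, mode="full")
```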
Autocorrelation Method • rn[τ] = sn[τ] * sn[−τ] • The autocorrelation method leads to computation of the short-time sequence sn[m] convolved with itself flipped in time. • The autocorrelation function is a measure of the “self-similarity” of the signal at different lags τ. • When rn[τ] is large, signal samples spaced by τ are said to be highly correlated.
Autocorrelation Method • Properties of rn[τ]: • For an N-point sequence, rn[τ] is zero outside the interval [−(N−1), N−1]. • rn[τ] is an even function of τ. • rn[0] ≥ rn[τ] • rn[0] – energy of sn[m] ⇒ rn[0] = Σm s^2n[m] • If sn[m] is a segment of a periodic sequence, then rn[τ] is periodic-like with the same period. • Because sn[m] is short-time, the overlapping data in the correlation decreases as τ increases ⇒ • the amplitude of rn[τ] decreases as τ increases; • with a rectangular window the envelope of rn[τ] decreases linearly. • If sn[m] is a random white noise sequence, then rn[τ] is impulse-like, reflecting self-similarity only within a small neighbourhood.
Autocorrelation Method • Letting φn[i, k] = rn[i−k], the normal equations take the form: Σk=1..p αk rn[i−k] = rn[i], for i = 1, 2, …, p • The expression represents p linear equations with p unknowns αk, 1 ≤ k ≤ p. • Using the normal-equation solution, it can be shown that the corresponding minimum mean-squared prediction error is given by: En = rn[0] − Σk=1..p αk rn[k] • Matrix-form representation of the normal equations: Rn α = rn.
Autocorrelation Method • Expanded form: the (i, k) element of Rn is rn[|i−k|], and the right-hand side is rn = (rn[1], rn[2], …, rn[p])T. • The Rn matrix is Toeplitz: • symmetric about the diagonal, • all elements along each diagonal are equal. • The matrix is invertible, and its structure • implies an efficient solution.
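Solving Rn α = rn can be sketched with a direct solver (the Levinson recursion is the efficient route; here, for illustration, we build Rn explicitly and use `np.linalg.solve`; the signal and order are our own choices):

```python
import numpy as np

# Autocorrelation of s[n] = 0.9^n over a long rectangular window
s = 0.9 ** np.arange(400)
p = 2
r = np.array([np.dot(s[:len(s) - t], s[t:]) for t in range(p + 1)])

# Build the symmetric Toeplitz matrix R_n ((i,k) element r[|i-k|]) and solve
R = np.array([[r[abs(i - k)] for k in range(p)] for i in range(p)])
alpha = np.linalg.solve(R, r[1:])   # a true single pole gives alpha ~ [0.9, 0]
```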
Example 5.3 • Consider a system with an exponentially decaying impulse response of the form h[n] = a^n u[n], with u[n] being the unit step function, excited by the scaled unit sample A δ[n], so that s[n] = A a^n u[n]. • Estimate a using the autocorrelation method of linear prediction.
Example 5.3 • Apply an N-point rectangular window [0, N−1] at n = 0. • Compute r0[0] and r0[1]: r0[0] = A^2 (1 − a^(2N)) / (1 − a^2), r0[1] = A^2 a (1 − a^(2(N−1))) / (1 − a^2) • Using the normal equations (p = 1): α1 = r0[1] / r0[0] = a (1 − a^(2(N−1))) / (1 − a^(2N)) → a as N → ∞.
Example 5.3 • The minimum squared error (from the expression En = rn[0] − Σk αk rn[k] above) is thus (Exercise 5.5): E0 = r0[0] − α1 r0[1] • For a 1st-order predictor, as in this example, the prediction error sequence for the true predictor (i.e., α1 = a) is given by: • e[n] = s[n] − a s[n−1] = A δ[n] (see Example 5.1 presented earlier). Thus the prediction of the signal is exact except at the time origin. • This example illustrates that, with enough data, the autocorrelation method yields a solution close to the true single-pole model for an impulse input.
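The convergence claim can be checked numerically (the window lengths 4 and 100 are arbitrary choices of ours): the first-order estimate α1 = r0[1]/r0[0] is biased for a short window and approaches the true pole a = 0.9 as N grows.

```python
import numpy as np

a = 0.9

def estimate_pole(N):
    sn = a ** np.arange(N)             # N-point rectangular window of a^n
    r0 = np.dot(sn, sn)                # r_0[0]
    r1 = np.dot(sn[:-1], sn[1:])       # r_0[1]
    return r1 / r0                     # first-order autocorrelation solution

alpha_short = estimate_pole(4)
alpha_long = estimate_pole(100)
```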
Limitations of the linear prediction model • When the underlying measured sequence is the impulse response of an arbitrary all-pole system, the autocorrelation method yields the correct result. • There are, however, a number of speech sounds for which, even with an arbitrarily long data sequence, a true solution cannot be obtained. • Consider a periodic sequence simulating a steady voiced sound, formed by convolving a periodic impulse train p[n] with an all-pole impulse response h[n]. • The z-transform of h[n] is given by: H(z) = A / (1 − Σk=1..p ak z^−k).
Limitations of the linear prediction model • Thus s[n] = p[n] * h[n]. • The normal equations of this system are given by (see Exercise 5.7): Σk=1..p αk rh[i−k] = rh[i], for i = 1, 2, …, p • where the autocorrelation of h[n] is denoted by rh[τ] = h[τ] * h[−τ]. • Suppose now that the system is excited with an impulse train of period P: p[n] = Σk δ[n − kP].
Limitations of the linear prediction model • The normal equations associated with s[n] (windowed over multiple pitch periods) for an order-p predictor are given by: Σk=1..p αk rn[i−k] = rn[i], for i = 1, 2, …, p • It can be shown that rn[τ] equals periodically repeated replicas of rh[τ], spaced P samples apart, but with decreasing amplitude due to the windowing (Exercise 5.7).
Limitations of the linear prediction model • The autocorrelation function rn[τ] of the windowed signal s[n] can therefore be thought of as an “aliased” version of rh[τ]: the overlap of neighbouring replicas introduces distortion. • When the aliasing is minor, the two solutions are approximately equal. • The accuracy of this approximation decreases as the pitch period decreases (e.g., for high-pitched speakers), due to the increased overlap of the autocorrelation replicas repeated every P samples.
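The pitch dependence of this aliasing can be demonstrated numerically (the pole value, window length, and pitch periods are illustrative choices of ours): with period P = 50 the replicas of rh barely overlap and the single-pole estimate is nearly exact, while P = 5 biases it noticeably.

```python
import numpy as np

a, N = 0.9, 200

def estimate_pole(P):
    # s[n]: impulse train of period P convolved with h[n] = a^n u[n], windowed
    n = np.arange(N)
    s = np.zeros(N)
    for m in range(0, N, P):
        s[m:] += a ** (n[m:] - m)      # a replica of h starting at each pulse
    r0 = np.dot(s, s)
    r1 = np.dot(s[:-1], s[1:])
    return r1 / r0                     # first-order autocorrelation estimate

est_high_pitch = estimate_pole(5)      # heavy replica overlap ("aliasing")
est_low_pitch = estimate_pole(50)      # replicas nearly disjoint
```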
Limitations of the linear prediction model • Sources of error: • Aliasing increases with high-pitched speakers (smaller pitch period P). • The signal is not truly periodic. • Speech is not always all-pole. • The autocorrelation method is a suboptimal solution. • The covariance method is capable of giving the optimal solution; however, it is not guaranteed to converge when the underlying signal does not follow an all-pole model.
The Levinson Recursion of the Autocorrelation Method • Direct inversion method (Gaussian elimination): requires on the order of p^3 multiplies and additions. • Levinson recursion (1947): • requires on the order of p^2 multiplies and additions, and • links directly to the concatenated lossless tube model (Chapter 4), and thus provides a mechanism for estimating the vocal tract area function from an all-pole-model estimate.
The Levinson Recursion of the Autocorrelation Method • Step 1 (initialize): E0 = rn[0] • Step 2: for i = 1, 2, …, p: ki = ( rn[i] − Σ(j=1..i−1) αj^(i−1) rn[i−j] ) / Ei−1; αi^(i) = ki; αj^(i) = αj^(i−1) − ki α(i−j)^(i−1) for j = 1, …, i−1; Ei = (1 − ki^2) Ei−1 • Step 3: αk = αk^(p), for k = 1, …, p • Step 4: end • ki – partial correlation coefficients (PARCOR).
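The steps above can be sketched in code (array-indexing conventions are ours: `alpha[j-1]` holds αj):

```python
import numpy as np

def levinson(r, p):
    """Solve sum_k alpha_k r[|i-k|] = r[i] via the Levinson recursion.

    r: autocorrelation values r[0..p]. Returns (alpha, k, E): predictor
    coefficients, PARCOR coefficients, and minimum prediction error E_p.
    """
    alpha = np.zeros(p)
    k = np.zeros(p)
    E = r[0]                                       # Step 1: E_0 = r[0]
    for i in range(1, p + 1):                      # Step 2
        ki = (r[i] - np.dot(alpha[:i - 1], r[i - 1:0:-1])) / E
        k[i - 1] = ki
        new = alpha.copy()
        new[i - 1] = ki                            # alpha_i^(i) = k_i
        if i > 1:                                  # alpha_j^(i) update, j < i
            new[:i - 1] = alpha[:i - 1] - ki * alpha[i - 2::-1]
        alpha = new
        E *= (1.0 - ki * ki)                       # E_i = (1 - k_i^2) E_{i-1}
    return alpha, k, E                             # Step 3: alpha_k = alpha_k^(p)

# Exact AR(1) autocorrelation r[tau] = 0.9^tau: expect alpha = [0.9, 0]
alpha, k, E = levinson(np.array([1.0, 0.9, 0.81]), 2)
```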
The Levinson Recursion of the Autocorrelation Method • It can be shown that on each iteration the predictor coefficients αk can be written solely as functions of the autocorrelation coefficients (Exercise 5.11). • The desired transfer function is given by: H(z) = A / (1 − Σk=1..p αk z^−k) • The gain A has yet to be determined.
Properties of the Levinson Recursion of the Autocorrelation Method • The magnitude of the partial correlation coefficients is less than 1: |ki| < 1 for all i. • This condition is sufficient for stability: if all |ki| < 1, then all roots of A(z) are inside the unit circle. • The autocorrelation method gives a minimum-phase solution even when the actual system is mixed-phase.
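The minimum-phase property can be observed empirically (the signal, seed, and order are arbitrary choices of ours): solve the autocorrelation normal equations for a random segment and check that the roots of A(z) lie inside the unit circle.

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.standard_normal(400)            # any windowed segment
p = 6
r = np.array([np.dot(s[:len(s) - t], s[t:]) for t in range(p + 1)])
R = np.array([[r[abs(i - k)] for k in range(p)] for i in range(p)])
alpha = np.linalg.solve(R, r[1:])       # autocorrelation-method solution

# A(z) = 1 - sum_k alpha_k z^-k; its roots should lie inside the unit circle
roots = np.roots(np.concatenate(([1.0], -alpha)))
max_modulus = float(np.max(np.abs(roots)))
```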
Properties of the Levinson Recursion of the Autocorrelation Method • Reverse Levinson recursion: how to obtain a lower-order model from a higher-order one. • Autocorrelation matching: let rn[τ] be the autocorrelation of the speech signal s[n+m]w[m] and rh[τ] the autocorrelation of h[n] = Z^−1{H(z)}; then: rn[τ] = rh[τ] for |τ| ≤ p.
Autocorrelation Method • Gain computation: A^2 = En = rn[0] − Σk=1..p αk rn[k], where En is the minimum prediction error for the pth-order predictor. • If the energy in the all-pole impulse response h[m] equals the energy in the measurement sn[m] ⇒ • the squared gain is equal to the minimum prediction error.
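Continuing the running single-pole example (the numeric values are illustrative), the gain follows directly from the minimum prediction error:

```python
import numpy as np

# Exact AR(1) autocorrelation, a = 0.9, so the optimal alpha is [0.9, 0]
r = np.array([1.0, 0.9, 0.81])
p = 2
R = np.array([[r[abs(i - k)] for k in range(p)] for i in range(p)])
alpha = np.linalg.solve(R, r[1:])

E = r[0] - np.dot(alpha, r[1:])   # minimum prediction error E_p
A = np.sqrt(E)                    # squared gain: A^2 = E_p
```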