PATTERN COMPARISON TECHNIQUES. Test Pattern:. Reference Pattern:. 4.2 SPEECH (ENDPOINT) DETECTION. 4.3 DISTORTION MEASURES-MATHEMATICAL CONSIDERATIONS. x and y: two feature vectors defined on a vector space X. The properties of metric or distance function d:.
PATTERN COMPARISON TECHNIQUES Test Pattern: Reference Pattern:
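4.3 DISTORTION MEASURES-MATHEMATICAL CONSIDERATIONS x and y: two feature vectors defined on a vector space X. The properties of a metric or distance function d: (a) 0 ≤ d(x, y) < ∞ for all x, y in X, with d(x, y) = 0 if and only if x = y; (b) d(x, y) = d(y, x); (c) d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z in X. A distance function is called invariant if d(x + z, y + z) = d(x, y).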
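As a rough numerical illustration, the sketch below checks the usual metric properties (nonnegativity with zero self-distance, symmetry, triangle inequality) and translation invariance for the Euclidean distance. The function name `euclidean` and the random 12-dimensional vectors are assumptions made only for this example.

```python
import numpy as np

def euclidean(x, y):
    """Euclidean (L2) distance between two feature vectors."""
    return float(np.linalg.norm(np.asarray(x, float) - np.asarray(y, float)))

rng = np.random.default_rng(0)
x, y, z = rng.normal(size=(3, 12))  # three hypothetical feature vectors

# (a) nonnegativity, and zero distance from a vector to itself
nonneg = euclidean(x, y) >= 0.0 and euclidean(x, x) == 0.0
# (b) symmetry
symmetric = bool(np.isclose(euclidean(x, y), euclidean(y, x)))
# (c) triangle inequality (small slack for floating-point rounding)
triangle = euclidean(x, y) <= euclidean(x, z) + euclidean(z, y) + 1e-12
# invariance: shifting both vectors by z leaves the distance unchanged
invariant = bool(np.isclose(euclidean(x + z, y + z), euclidean(x, y)))
```

A check on random vectors is only a sanity test, not a proof; it is useful mainly for spotting a candidate distortion measure that clearly fails one of the properties.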
PERCEPTUAL CONSIDERATIONS Spectral changes that do not fundamentally change the perceived sound include:
PERCEPTUAL CONSIDERATIONS Spectral changes that lead to phonetically different sounds include:
PERCEPTUAL CONSIDERATIONS Just-discriminable change: known as JND (just-noticeable difference), DL (difference limen), or differential threshold
Spectral Distortion Measures Spectral Density Fourier Coefficients of Spectral Density Autocorrelation Function
Spectral Distortion Measures Short-term autocorrelation Then is an energy spectral density
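A minimal sketch of the relationship between the short-term autocorrelation and the energy spectral density, assuming the autocorrelation of a finite frame is r(m) = Σ x(n) x(n + m). The helper name `short_term_autocorr` and the synthetic windowed sinusoid are hypothetical choices for illustration.

```python
import numpy as np

def short_term_autocorr(frame, max_lag):
    """r(m) = sum_n x(n) x(n + m) over a finite frame, for m = 0..max_lag."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array([np.dot(frame[:n - m], frame[m:]) for m in range(max_lag + 1)])

# Synthetic frame: a Hamming-windowed sinusoid (illustrative data only).
frame = np.hamming(64) * np.sin(2 * np.pi * 0.1 * np.arange(64))
r = short_term_autocorr(frame, 10)

# Energy spectral density |X|^2 from a zero-padded FFT. Padding to twice the
# frame length keeps the circular autocorrelation free of wrap-around.
X = np.fft.fft(frame, 2 * len(frame))
esd = np.abs(X) ** 2

# The inverse transform of the energy spectral density gives the same lags.
r_from_esd = np.fft.ifft(esd).real[:11]
```

Both routes give the same autocorrelation lags up to rounding, which is why spectral distortion measures can be evaluated either in the frequency domain or from autocorrelation terms.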
Spectral Distortion Measures Autocorrelation matrices
Spectral Distortion Measures If σ/A(z) is the all-pole model for the speech spectrum, the residual energy resulting from “inverse filtering” the input signal with an all-zero filter A(z) is:
Spectral Distortion Measures Important properties of all-pole modeling: The recursive minimization relationship:
CEPSTRAL DISTANCES The complex cepstrum of a signal is defined as the Fourier transform of the logarithm of the signal spectrum.
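The sketch below computes cepstral coefficients of a power spectrum as the inverse FFT of its logarithm, which is the discrete counterpart of the definition above. The function name `power_cepstrum`, the FFT size, and the two-tap test signal are assumptions for illustration. For the signal [1, -0.5], the coefficients of the log power spectrum are known in closed form, c(n) = -(0.5^n)/n for n ≥ 1, which gives a convenient check.

```python
import numpy as np

def power_cepstrum(x, nfft=1024):
    """Cepstral coefficients of log |X(w)|^2, computed with the FFT."""
    spectrum = np.abs(np.fft.fft(x, nfft)) ** 2
    return np.fft.ifft(np.log(spectrum)).real

# Two-tap minimum-phase signal; its spectrum never reaches zero,
# so the logarithm is well defined at every frequency bin.
c = power_cepstrum([1.0, -0.5])

# Closed-form coefficients for comparison: c(n) = -(0.5 ** n) / n, n >= 1.
expected = np.array([-(0.5 ** n) / n for n in range(1, 6)])
```

Because the log power spectrum is real and even, the resulting coefficients are real and symmetric, c(n) = c(-n).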
CEPSTRAL DISTANCES Truncated cepstral distance
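A minimal sketch of a truncated cepstral distance, assuming the squared form d²(L) = Σ_{n=1..L} (c_n − c'_n)², with c0 excluded. The function name and the toy coefficient vectors are hypothetical.

```python
import numpy as np

def truncated_cepstral_distance(c, c_prime, L):
    """Squared cepstral distance over coefficients 1..L (c[0] is ignored)."""
    c = np.asarray(c, dtype=float)
    c_prime = np.asarray(c_prime, dtype=float)
    diff = c[1:L + 1] - c_prime[1:L + 1]
    return float(np.sum(diff ** 2))

c = [0.0, 1.0, 2.0, 3.0]
c_prime = [5.0, 1.0, 0.0, 4.0]

d2_L2 = truncated_cepstral_distance(c, c_prime, 2)  # 0 + 4 = 4.0
d2_L3 = truncated_cepstral_distance(c, c_prime, 3)  # 0 + 4 + 1 = 5.0
```

Note that the large difference in c0 does not contribute, which makes the measure insensitive to overall gain.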
Weighted Cepstral Distances and Liftering • It can be shown that under certain regular conditions, the cepstral coefficients, except c0, have: • Zero means • Variances essentially inversely proportional to the square of the coefficient index. If we normalize the cepstral distance by the inverse variance:
Weighted Cepstral Distances and Liftering Differentiating both sides of the Fourier series equation of spectrum: This is an L2 distance based upon the differences between the spectral slopes
Cepstral Weighting or Liftering Procedure Raised-sine lifter: w(m) = 1 + h sin(mπ/L) for m = 1, 2, ..., L, and w(m) = 0 otherwise. h is usually chosen as L/2 and L is typically 10 to 16
A useful form of weighted cepstral distance:
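A sketch of liftering followed by a weighted cepstral distance. It assumes the raised-sine lifter w(m) = 1 + h sin(mπ/L) for m = 1..L with h = L/2, and the squared distance Σ [w(m)(c_m − c'_m)]². The lifter formula and both function names are assumptions here, since the slide text does not show the equation.

```python
import numpy as np

def raised_sine_lifter(L, h=None):
    """Lifter weights w(1..L) = 1 + h * sin(pi * m / L); h defaults to L / 2."""
    if h is None:
        h = L / 2.0
    m = np.arange(1, L + 1)
    return 1.0 + h * np.sin(np.pi * m / L)

def liftered_distance(c, c_prime, L):
    """Squared distance between liftered cepstra over coefficients 1..L."""
    c = np.asarray(c, dtype=float)
    c_prime = np.asarray(c_prime, dtype=float)
    w = raised_sine_lifter(L)
    diff = w * (c[1:L + 1] - c_prime[1:L + 1])
    return float(np.sum(diff ** 2))

L = 12
w = raised_sine_lifter(L)  # peaks at m = L / 2 with value 1 + h = 7

# Two toy cepstra that differ by 1 only in coefficient 6.
c = np.zeros(L + 1)
c_prime = np.zeros(L + 1)
c_prime[6] = 1.0
d2 = liftered_distance(c, c_prime, L)  # (7 * 1)^2 = 49
```

The window de-emphasizes both the lowest and the highest coefficients relative to the middle of the range, so a difference at m = 6 counts far more than the same difference at m = 1 or m = 12.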
Likelihood Distortions Previously defined: Itakura-Saito distortion measure, where σ∞² and σ'∞² are the one-step prediction errors of S(ω) and S'(ω), as defined:
Likelihood Distortions The residual energy can be easily evaluated by:
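A minimal sketch of evaluating the residual energy as a quadratic form, α = aᵀ R a, where R is the Toeplitz autocorrelation matrix of the frame and a = [1, a1, ..., ap] is the inverse-filter coefficient vector. The autocorrelation-method LPC solver, the helper names, and the synthetic frame are assumptions for illustration.

```python
import numpy as np

def autocorr(x, p):
    """Short-term autocorrelation r(0..p) of a finite frame."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.array([np.dot(x[:n - m], x[m:]) for m in range(p + 1)])

def toeplitz_matrix(r, size):
    """Symmetric Toeplitz matrix with entries r(|i - j|)."""
    idx = np.arange(size)
    return r[np.abs(idx[:, None] - idx[None, :])]

def lpc(r):
    """Inverse-filter coefficients [1, a1..ap] from the normal equations."""
    p = len(r) - 1
    predictor = np.linalg.solve(toeplitz_matrix(r, p), -r[1:])
    return np.concatenate(([1.0], predictor))

def residual_energy(a, r):
    """Residual energy a^T R a after inverse filtering with A(z)."""
    return float(a @ toeplitz_matrix(r, len(a)) @ a)

rng = np.random.default_rng(1)
n = np.arange(200)
frame = np.hamming(200) * (np.sin(0.3 * n) + 0.1 * rng.normal(size=200))

r = autocorr(frame, 4)
a_opt = lpc(r)
alpha = residual_energy(a_opt, r)

# Direct check: filter the frame with A(z) and sum the squared output.
alpha_direct = float(np.sum(np.convolve(frame, a_opt) ** 2))

# Any other coefficient vector with a leading 1 leaves more residual energy.
a_other = a_opt + np.array([0.0, 0.05, 0.0, 0.0, 0.0])
```

The quadratic form needs only p + 1 autocorrelation values, so the residual energy for any candidate filter can be evaluated without touching the waveform again.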
Likelihood Distortions By replacing by its optimal p-th order LPC model spectrum: If we set σ² to match the residual energy α: which is often referred to as the Itakura distortion measure
Likelihood Distortions Another way to write the Itakura distortion measure is: Another gain-independent distortion measure is called the Likelihood Ratio distortion:
4.5.4 Likelihood Distortions That is, when the distortion is small, the Itakura distortion measure is not very different from the LR distortion measure
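The sketch below compares the two gain-independent measures numerically. It assumes the forms d_LR = (aᵀ R_p a) / σ_p² − 1 and d_I = log[(aᵀ R_p a) / σ_p²], where R_p and σ_p² come from the reference frame and a is the LPC vector of the test frame. The helper names and the synthetic frames are hypothetical.

```python
import numpy as np

def autocorr(x, p):
    """Short-term autocorrelation r(0..p) of a finite frame."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.array([np.dot(x[:n - m], x[m:]) for m in range(p + 1)])

def toeplitz_matrix(r, size):
    """Symmetric Toeplitz matrix with entries r(|i - j|)."""
    idx = np.arange(size)
    return r[np.abs(idx[:, None] - idx[None, :])]

def lpc(r):
    """Inverse-filter coefficients [1, a1..ap] from the normal equations."""
    p = len(r) - 1
    return np.concatenate(([1.0], np.linalg.solve(toeplitz_matrix(r, p), -r[1:])))

def energy_ratio(a, a_ref, r_ref):
    """(a^T R a) / (a_ref^T R a_ref), with R built from the reference frame."""
    R = toeplitz_matrix(r_ref, len(a))
    return float(a @ R @ a) / float(a_ref @ R @ a_ref)

def lr_distortion(a, a_ref, r_ref):
    return energy_ratio(a, a_ref, r_ref) - 1.0

def itakura_distortion(a, a_ref, r_ref):
    return float(np.log(energy_ratio(a, a_ref, r_ref)))

rng = np.random.default_rng(2)
n = np.arange(240)
window = np.hamming(240)
ref = window * (np.sin(0.30 * n) + 0.3 * rng.normal(size=240))
test = window * (np.sin(0.33 * n) + 0.3 * rng.normal(size=240))

p = 8
r_ref = autocorr(ref, p)
a_ref = lpc(r_ref)
a_test = lpc(autocorr(test, p))

d_lr = lr_distortion(a_test, a_ref, r_ref)
d_i = itakura_distortion(a_test, a_ref, r_ref)
```

Since d_I = log(1 + d_LR) under these definitions, d_I never exceeds d_LR, and the two agree to first order when the distortion is small.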
4.5.4 Likelihood Distortions Consider the Itakura-Saito distortion between the input and output of a linear system H(z)
4.5.5 Variations of Likelihood Distortions Symmetric distortion measures:
4.5.5 Variations of Likelihood Distortions COSH distortion
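A numerical sketch of the COSH distortion on sampled spectra. It assumes the Itakura-Saito form d_IS(S, S') = mean[S/S' − log(S/S') − 1] and the COSH form mean[cosh(log(S/S')) − 1], with the frequency integral approximated by an average over a uniform grid. The two smooth positive spectra are made up for illustration.

```python
import numpy as np

def itakura_saito(S, S_prime):
    """Itakura-Saito distortion, integral approximated by a grid average."""
    ratio = S / S_prime
    return float(np.mean(ratio - np.log(ratio) - 1.0))

def cosh_distortion(S, S_prime):
    """COSH distortion: average of cosh(log spectral ratio) - 1."""
    V = np.log(S / S_prime)
    return float(np.mean(np.cosh(V) - 1.0))

# Two hypothetical positive spectra sampled on [0, pi).
omega = np.linspace(0.0, np.pi, 256, endpoint=False)
S = 1.0 + 0.5 * np.cos(omega)
S_prime = 1.0 + 0.3 * np.cos(2.0 * omega)

d_is_forward = itakura_saito(S, S_prime)
d_is_backward = itakura_saito(S_prime, S)
d_cosh = cosh_distortion(S, S_prime)

# Symmetrized Itakura-Saito distortion, for comparison with d_cosh.
d_sym = 0.5 * (d_is_forward + d_is_backward)
```

The Itakura-Saito distortion is not symmetric in its arguments, while the COSH distortion is; averaging the two directions of the Itakura-Saito distortion reproduces it exactly.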
4.5.6 Spectral Distortion Using a Warped Frequency Scale Psychophysical studies have shown that human perception of the frequency content of sounds does not follow a linear scale. This research has led to the idea of defining subjective pitch of pure tones. For each tone with an actual frequency, f, measured in Hz, a subjective pitch is measured on a scale called the “mel” scale. As a reference point, the pitch of a 1 kHz tone, 40 dB above the perceptual hearing threshold, is defined as 1000 mels.
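The slides do not give a formula for the mel scale. As an illustration only, the sketch below uses one widely used analytic approximation, mel = 2595 log10(1 + f/700), which is an assumption not taken from this material. It maps 1 kHz to very nearly 1000 mels, consistent with the reference point above.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Approximate mel value for a frequency in Hz (common analytic fit)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

mel_at_1khz = float(hz_to_mel(1000.0))  # close to 1000 mels

# The mapping is roughly linear at low frequencies and compressive above.
freqs = np.array([100.0, 500.0, 1000.0, 2000.0, 4000.0, 8000.0])
mels = hz_to_mel(freqs)
```

Doubling the frequency from 4 kHz to 8 kHz adds far fewer mels than doubling from 500 Hz to 1 kHz, which is the nonlinearity that warped-frequency distortion measures try to capture.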
Examples of Critical bandwidth
Warped cepstral distance b is the frequency in Barks, S(θ(b)) is the spectrum on a Bark scale, and B is the Nyquist frequency in Barks.
4.5.6 Spectral Distortion Using a Warped Frequency Scale Where the warping function is defined by
4.5.6 Spectral Distortion Using a Warped Frequency Scale Mel-frequency cepstrum: is the output power of the triangular filters Mel-frequency cepstral distance
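A minimal sketch of the mel-frequency cepstrum and the corresponding distance, assuming the cosine-transform form c_n = Σ_{k=1..K} log(S_k) cos[n(k − 1/2)π/K] for n = 1..L, where S_k is the output power of the k-th triangular filter. The filter outputs here are random positive numbers standing in for real filterbank data, and the function names are hypothetical.

```python
import numpy as np

def mel_cepstrum(filter_powers, L):
    """Mel-frequency cepstral coefficients c_1..c_L from K filter powers."""
    S = np.asarray(filter_powers, dtype=float)
    K = len(S)
    k = np.arange(1, K + 1)
    n = np.arange(1, L + 1)[:, None]
    basis = np.cos(n * (k - 0.5) * np.pi / K)  # shape (L, K)
    return basis @ np.log(S)

def mel_cepstral_distance(S, S_prime, L):
    """Squared distance between two mel-frequency cepstra."""
    diff = mel_cepstrum(S, L) - mel_cepstrum(S_prime, L)
    return float(np.sum(diff ** 2))

rng = np.random.default_rng(3)
K, L = 20, 12
S = rng.uniform(0.5, 2.0, size=K)        # stand-in filterbank powers
S_prime = rng.uniform(0.5, 2.0, size=K)

c = mel_cepstrum(S, L)
c_scaled = mel_cepstrum(10.0 * S, L)     # same spectrum with a gain change
c_flat = mel_cepstrum(np.full(K, 3.0), L)
d2 = mel_cepstral_distance(S, S_prime, L)
```

Because the sum starts at n = 1, a constant gain applied to all filter outputs leaves the coefficients unchanged, and a flat filterbank output maps to all-zero coefficients.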
4.5.7 Alternative Spectral Representations and Distortion Measures