PATTERN COMPARISON TECHNIQUES

Presentation Transcript


  1. PATTERN COMPARISON TECHNIQUES Test Pattern: Reference Pattern:

  2. 4.2 SPEECH (ENDPOINT) DETECTION

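  3. 4.3 DISTORTION MEASURES-MATHEMATICAL CONSIDERATIONS x and y: two feature vectors defined on a vector space X. The properties of a metric or distance function d: (a) 0 ≤ d(x, y) < ∞, with d(x, y) = 0 if and only if x = y; (b) d(x, y) = d(y, x); (c) d(x, y) ≤ d(x, z) + d(z, y). A distance function is called invariant if d(x + z, y + z) = d(x, y).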
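The metric properties (positivity, symmetry, triangle inequality) and translation invariance can be spot-checked numerically. The sketch below is only an illustration: the function names and the sample points are made up for this example, and passing the check on a few points does not prove that a function is a metric.

```python
import math
from itertools import product


def euclidean(x, y):
    """L2 distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def squared_euclidean(x, y):
    """Squared L2 distance: usable as a distortion, but not a true metric."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def is_metric_on(d, points, tol=1e-12):
    """Check positivity, symmetry and the triangle inequality on sample points."""
    for x, y in product(points, repeat=2):
        # Positivity: d >= 0, and d == 0 exactly when x == y.
        if d(x, y) < 0 or (d(x, y) <= tol) != (x == y):
            return False
        # Symmetry.
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    for x, y, z in product(points, repeat=3):
        # Triangle inequality.
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False
    return True


def is_translation_invariant(d, points, shift, tol=1e-12):
    """Check d(x + z, y + z) == d(x, y) for one shift vector z."""
    def moved(v):
        return tuple(a + s for a, s in zip(v, shift))
    return all(abs(d(moved(x), moved(y)) - d(x, y)) <= tol
               for x, y in product(points, repeat=2))


# Hypothetical sample points, chosen only for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
```

On these points the Euclidean distance passes all checks, while the squared Euclidean distance fails the triangle inequality, which is one reason many speech distortion measures are called distortions rather than distances.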

  4. PERCEPTUAL CONSIDERATIONS Spectral changes that do not fundamentally change the perceived sound include:

  5. PERCEPTUAL CONSIDERATIONS Spectral changes that lead to phonetically different sounds include:

  6. PERCEPTUAL CONSIDERATIONS Just-discriminable change: known as JND (just-noticeable difference), DL (difference limen), or differential threshold

  7. 4.4 DISTORTION MEASURES-PERCEPTUAL CONSIDERATIONS

  8. 4.4 DISTORTION MEASURES-PERCEPTUAL CONSIDERATIONS

  9. Spectral Distortion Measures Spectral Density Fourier Coefficients of Spectral Density Autocorrelation Function

  10. Spectral Distortion Measures Short-term autocorrelation Then is an energy spectral density
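A minimal sketch of a short-term autocorrelation, assuming the common definition r(i) = sum over n of x(n) x(n + i) on one finite frame. The slide's own equation is not available in this transcript, so the exact form (windowing, normalization) is an assumption.

```python
def short_term_autocorrelation(frame, max_lag):
    """Return [r(0), ..., r(max_lag)] for one frame of samples.

    Assumed definition: r(i) = sum_{n=0}^{N-1-i} x(n) * x(n + i).
    """
    n_samples = len(frame)
    return [
        sum(frame[n] * frame[n + lag] for n in range(n_samples - lag))
        for lag in range(max_lag + 1)
    ]
```

For the frame [1, 2, 3], r(0) is the frame energy 14, and the values fall off with lag because fewer sample pairs overlap.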

  11. Spectral Distortion Measures Autocorrelation matrices

  12. Spectral Distortion Measures If σ/A(z) is the all-pole model for the speech spectrum, The residual energy resulting from “inverse filtering” the input signal with an all-zero filter A(z) is:

  13. Spectral Distortion Measures Important properties of all-pole modeling: The recursive minimization relationship:
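One standard way to obtain the all-pole model coefficients from autocorrelation values is the Levinson-Durbin recursion. The slides do not name this algorithm here, so treat the sketch below as an illustration under the assumed convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p; the residual energy shrinks (or stays equal) at every order.

```python
def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations for an all-pole model.

    r     : autocorrelation values [r(0), ..., r(order)]
    return: (a, err) with a = [1, a_1, ..., a_order] and err the
            minimum residual (prediction error) energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Correlation of the current predictor's error with lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        # Residual energy never increases as the order grows.
        err *= (1.0 - k * k)
    return a, err
```

With r = [1, 0.5, 0.25], which matches a first-order process, the second coefficient comes out as zero and the residual energy stops decreasing after order 1.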

  14. LOG SPECTRAL DISTANCE

  15. LOG SPECTRAL DISTANCE
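A rough numerical sketch of an L_p log spectral distance, assuming both power spectra are sampled on the same uniform frequency grid so that the integral over frequency becomes a mean. The function name and the sampling assumption are illustrative, not taken from the slides.

```python
import math


def log_spectral_distance(s, s_ref, p=2):
    """L_p norm of the log-spectral difference, in natural-log units.

    s, s_ref: strictly positive power spectrum samples on a shared
    uniform frequency grid (an assumption of this sketch).
    """
    diffs = [abs(math.log(a) - math.log(b)) for a, b in zip(s, s_ref)]
    return (sum(v ** p for v in diffs) / len(diffs)) ** (1.0 / p)
```

Two spectra that differ by a constant factor of e everywhere are at distance 1 for any p; multiplying the result by 10 / ln(10) converts it to dB.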

  16. CEPSTRAL DISTANCES The complex cepstrum of a signal is defined as the Fourier transform of the log of the signal spectrum.

  17. CEPSTRAL DISTANCES Truncated cepstral distance
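A minimal sketch of a truncated cepstral distance, assuming the usual form: the sum of squared differences of cepstral coefficients 1 through L, with c_0 (the gain term) left out. The list layout, with index n holding c_n, is a convention chosen for this example.

```python
def truncated_cepstral_distance(c, c_ref, num_terms):
    """Sum of (c_n - c'_n)^2 for n = 1..num_terms.

    c[0] and c_ref[0] hold c_0 and are ignored on purpose.
    """
    return sum((c[n] - c_ref[n]) ** 2 for n in range(1, num_terms + 1))
```

Because c_0 is skipped, two frames that differ only in overall level have zero distance under this sketch.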

  18. CEPSTRAL DISTANCES

  19. CEPSTRAL DISTANCES

  20. Weighted Cepstral Distances and Liftering • It can be shown that, under certain regularity conditions, the cepstral coefficients, except c0, have: • zero means • variances essentially inversely proportional to the square of the coefficient index. If we normalize the cepstral distance by the inverse of the variance:

  21. Weighted Cepstral Distances and Liftering Differentiating both sides of the Fourier series equation of spectrum: This is an L2 distance based upon the differences between the spectral slopes
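Weighting each squared cepstral difference by the square of its index n is one way to realize both ideas above: it normalizes by a variance that falls off as 1/n², and it corresponds to comparing spectral slopes. The sketch assumes a truncated sum over n = 1..L; the function name is illustrative.

```python
def index_weighted_cepstral_distance(c, c_ref, num_terms):
    """Sum of n^2 * (c_n - c'_n)^2 for n = 1..num_terms.

    Index n of each list holds c_n; c_0 is ignored.
    """
    return sum(
        (n * (c[n] - c_ref[n])) ** 2 for n in range(1, num_terms + 1)
    )
```

Compared with the unweighted distance, a mismatch in a high-index coefficient counts for much more here, since those coefficients normally vary less.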

  22. Cepstral Weighting or Liftering Procedure h is usually chosen as L/2 and L is typically 10 to 16
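The liftering window itself is not shown in this transcript. A commonly used choice that fits the stated parameters is the raised-sine lifter w(n) = 1 + h sin(nπ/L) for n = 1..L; the sketch below assumes that form, with h defaulting to L/2.

```python
import math


def raised_sine_lifter(length, h=None):
    """Weights w(1)..w(length) of an assumed raised-sine lifter.

    w(n) = 1 + h * sin(n * pi / length); h defaults to length / 2.
    """
    if h is None:
        h = length / 2.0
    return [1.0 + h * math.sin(n * math.pi / length)
            for n in range(1, length + 1)]


def liftered_cepstral_distance(c, c_ref, length, h=None):
    """Sum of (w(n) * (c_n - c'_n))^2 for n = 1..length (c_0 ignored)."""
    w = raised_sine_lifter(length, h)
    return sum(
        (w[n - 1] * (c[n] - c_ref[n])) ** 2 for n in range(1, length + 1)
    )
```

With L = 12 the weights peak at n = 6 (value 7) and return to about 1 at n = 12, so the lowest and highest coefficients are de-emphasized relative to the middle ones.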

  23. A useful form of weighted cepstral distance:

  24. Likelihood Distortions Previously defined: Itakura-Saito distortion measure, where σ∞² and σ′∞² are the one-step prediction errors of S(ω) and S′(ω), as defined:

  25. Likelihood Distortions The residual energy can be easily evaluated by:

  26. Likelihood Distortions By replacing by its optimal p-th order LPC model spectrum: If we set σ² to match the residual energy α: which is often referred to as the Itakura distortion measure

  27. Likelihood Distortions Another way to write the Itakura distortion measure is: Another gain-independent distortion measure is called the Likelihood Ratio distortion:
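Both gain-independent measures can be sketched as functions of a ratio of residual energies, each evaluated as the quadratic form aᵀRa with a Toeplitz autocorrelation matrix R. The argument order and names below are assumptions for illustration: `a_opt` is the LPC vector fitted to the signal whose autocorrelation is `r`, and `a_ref` is the inverse filter being compared against it.

```python
import math


def residual_energy(a, r):
    """Quadratic form a^T R a, where R[i][j] = r[|i - j|] (Toeplitz)."""
    size = len(a)
    return sum(a[i] * a[j] * r[abs(i - j)]
               for i in range(size) for j in range(size))


def likelihood_ratio_distortion(a_ref, a_opt, r):
    """Residual energy ratio minus one; zero when a_ref equals a_opt."""
    return residual_energy(a_ref, r) / residual_energy(a_opt, r) - 1.0


def itakura_distortion(a_ref, a_opt, r):
    """Log of the same residual energy ratio."""
    return math.log(residual_energy(a_ref, r) / residual_energy(a_opt, r))
```

Since log(1 + x) ≤ x and the two are nearly equal for small x, the Itakura value never exceeds the likelihood ratio value and the two agree closely when the distortion is small.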

  28. 4.5.4 Likelihood Distortions

  29. 4.5.4 Likelihood Distortions That is, when the distortion is small, the Itakura distortion measure is not very different from the LR distortion measure

  30. 4.5.4 Likelihood Distortions

  31. 4.5.4 Likelihood Distortions Consider the Itakura-Saito distortion between the input and output of a linear system H(z)

  32. 4.5.4 Likelihood Distortions

  33. 4.5.4 Likelihood Distortions

  34. 4.5.5 Variations of Likelihood Distortions Symmetric distortion measures:

  35. 4.5.5 Variations of Likelihood Distortions COSH distortion
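A numerical sketch of how the COSH distortion relates to the Itakura-Saito measure, assuming both spectra are sampled on a shared uniform frequency grid and using the integrand S/S′ − log(S/S′) − 1 for Itakura-Saito. Averaging the two directions cancels the log terms and leaves cosh(log(S/S′)) − 1.

```python
import math


def itakura_saito(s, s_ref):
    """Mean over frequency samples of S/S' - log(S/S') - 1."""
    ratios = [a / b for a, b in zip(s, s_ref)]
    return sum(q - math.log(q) - 1.0 for q in ratios) / len(ratios)


def cosh_distortion(s, s_ref):
    """Mean over frequency samples of cosh(log(S/S')) - 1."""
    ratios = [a / b for a, b in zip(s, s_ref)]
    return sum(math.cosh(math.log(q)) - 1.0 for q in ratios) / len(ratios)


def symmetrized_itakura_saito(s, s_ref):
    """Average of the two directions of the Itakura-Saito measure."""
    return 0.5 * (itakura_saito(s, s_ref) + itakura_saito(s_ref, s))
```

The Itakura-Saito measure is not symmetric in general, while the COSH form is symmetric by construction and equals the symmetrized version.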

  36. 4.5.5 Variations of Likelihood Distortions

  37. 4.5.6 Spectral Distortion Using a Warped Frequency Scale Psychophysical studies have shown that human perception of the frequency content of sounds does not follow a linear scale. This research has led to the idea of defining the subjective pitch of pure tones. For each tone with an actual frequency, f, measured in Hz, a subjective pitch is measured on a scale called the “mel” scale. As a reference point, the pitch of a 1 kHz tone, 40 dB above the perceptual hearing threshold, is defined as 1000 mels.
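The mel scale is defined by listening experiments, not by a formula, but analytic approximations are widely used. The sketch below uses one common approximation, mel = 2595 log10(1 + f/700); this particular formula is an assumption here and does not come from the slides. It maps 1 kHz to very nearly 1000 mels, consistent with the reference point above.

```python
import math


def hz_to_mel(freq_hz):
    """Approximate mel value of a frequency in Hz (one common formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel under the same approximation."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
```

Under this approximation the scale is close to linear below roughly 1 kHz and close to logarithmic above it.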

  38. 4.5.6 Spectral Distortion Using a Warped Frequency Scale

  39. 4.5.6 Spectral Distortion Using a Warped Frequency Scale

  40. 4.5.6 Spectral Distortion Using a Warped Frequency Scale

  41. Examples of Critical bandwidth

  42. Warped cepstral distance b is the frequency in Barks, S(θ(b)) is the spectrum on a Bark scale, and B is the Nyquist frequency in Barks.

  43. 4.5.6 Spectral Distortion Using a Warped Frequency Scale Where the warping function is defined by

  44. 4.5.6 Spectral Distortion Using a Warped Frequency Scale

  45. 4.5.6 Spectral Distortion Using a Warped Frequency Scale

  46. 4.5.6 Spectral Distortion Using a Warped Frequency Scale

  47. 4.5.6 Spectral Distortion Using a Warped Frequency Scale Mel-frequency cepstrum: is the output power of the triangular filters Mel-frequency cepstral distance
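A sketch of the mel-frequency cepstrum as a cosine transform of the log filter-bank output powers, assuming the widely used form c_n = Σ_k log(S_k) cos(n (k − ½) π / K) for K triangular filters. The filter bank itself is not built here; `filter_powers` is a hypothetical input holding the K output powers for one frame.

```python
import math


def mel_cepstrum(filter_powers, num_coeffs):
    """Return [c_1, ..., c_num_coeffs] from K positive filter output powers."""
    num_filters = len(filter_powers)
    log_powers = [math.log(p) for p in filter_powers]
    return [
        sum(log_powers[k - 1] * math.cos(n * (k - 0.5) * math.pi / num_filters)
            for k in range(1, num_filters + 1))
        for n in range(1, num_coeffs + 1)
    ]


def mel_cepstral_distance(c, c_ref):
    """Sum of squared differences between two mel-cepstral vectors."""
    return sum((a - b) ** 2 for a, b in zip(c, c_ref))
```

A flat filter-bank output gives zero for every coefficient with 1 ≤ n < K, so these coefficients describe spectral shape rather than overall level.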

  48. 4.5.7 Alternative Spectral Representations and Distortion Measures
