Pitch synchronous windowing is a critical part of many speech processing algorithms

Linear Predictive Pitch Synchronization: Voiced speech detection, analysis and synthesis. Jim Bryan, Florida Institute of Technology, ECE5525 Final Project, December 7, 2010.

Presentation Transcript


  1. Linear Predictive Pitch Synchronization: Voiced speech detection, analysis and synthesis. Jim Bryan, Florida Institute of Technology, ECE5525 Final Project, December 7, 2010

  2. Pitch synchronous windowing is a critical part of many speech processing algorithms • Homomorphic filtering, for example, is based on the principle that the pitch frequency may be “liftered” from the vocal tract response via simple subtraction in the cepstral domain • Linear prediction based signal reconstruction is simpler with pitch synchronous windowing • The covariance method needs the pitch synchronous, glottal-closed portion of the speech.
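The liftering idea on this slide can be sketched as follows. This is a minimal illustration, not the presenter's code: it assumes NumPy, and the function names (`real_cepstrum`, `lifter_split`, `cepstral_pitch_period`) and the cutoff choice are hypothetical. The log magnitude spectrum turns the product of excitation and vocal tract spectra into a sum, so the two parts can be separated by keeping or subtracting cepstral samples.

```python
import numpy as np

def real_cepstrum(frame, n_fft):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.fft(frame, n_fft)) + 1e-12)
    return np.fft.ifft(log_mag).real

def lifter_split(cepstrum, cutoff):
    """Split a cepstrum into a low-quefrency part (spectral envelope)
    and a high-quefrency part (pitch excitation)."""
    keep = np.zeros(len(cepstrum))
    keep[:cutoff] = 1.0
    # The real cepstrum is symmetric, so keep the mirrored samples too.
    keep[len(cepstrum) - cutoff + 1:] = 1.0
    low = cepstrum * keep
    high = cepstrum - low  # the "simple subtraction"
    return low, high

def cepstral_pitch_period(frame, n_fft, min_lag, max_lag):
    """Pitch period in samples: largest high-quefrency cepstral peak."""
    _, high = lifter_split(real_cepstrum(frame, n_fft), min_lag)
    return min_lag + int(np.argmax(high[min_lag:max_lag]))
```

The `min_lag` and `max_lag` bounds are assumptions that restrict the search to plausible pitch periods; the cutoff between envelope and excitation is a design choice that depends on the sampling rate and the speaker.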

  3. Window selection for overlap and add reconstruction • Bartlett (simple triangle) • Hann (raised cosine type) • Hamming (raised cosine type)
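Whether a window suits overlap and add can be checked numerically by summing shifted copies and measuring the ripple. The sketch below assumes NumPy and uses the periodic form of each window (an assumption; the slides do not say which form was used). The names `periodic_window` and `overlap_add_ripple` are hypothetical.

```python
import numpy as np

def periodic_window(name, length):
    """Periodic (DFT-even) window of the given length."""
    n = np.arange(length)
    phase = 2 * np.pi * n / length
    if name == "bartlett":
        return 1.0 - np.abs(2.0 * n / length - 1.0)
    if name == "hann":
        return 0.5 - 0.5 * np.cos(phase)
    if name == "hamming":
        return 0.54 - 0.46 * np.cos(phase)
    if name == "blackman-harris":  # 4-term Blackman-Harris
        return (0.35875 - 0.48829 * np.cos(phase)
                + 0.14128 * np.cos(2 * phase) - 0.01168 * np.cos(3 * phase))
    raise ValueError(f"unknown window: {name}")

def overlap_add_ripple(window, hop, n_frames=16):
    """Overlap-add shifted copies of the window and return
    (mean, peak-to-peak ripple) over the fully overlapped region."""
    length = len(window)
    total = np.zeros(hop * (n_frames - 1) + length)
    for i in range(n_frames):
        total[i * hop : i * hop + length] += window
    # Skip the ramp-up and ramp-down where fewer windows overlap.
    steady = total[length - hop : len(total) - (length - hop)]
    return steady.mean(), steady.max() - steady.min()

if __name__ == "__main__":
    for name in ("bartlett", "hann", "hamming", "blackman-harris"):
        mean, ripple = overlap_add_ripple(periodic_window(name, 256), 128)
        print(f"{name:16s} hop=N/2  mean={mean:.4f}  ripple={ripple:.2e}")
```

With a hop of half the window length, the Bartlett, Hann and Hamming windows sum to a constant, while the 4-term Blackman-Harris window leaves a visible ripple and needs a hop of a quarter of the window length to sum to a constant.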

  4. Bartlett window overlap and add response

  5. Hann overlap and add response

  6. Hamming window overlap and add response

  7. Blackman-Harris overlap and add response

  8. Window selection based on “spectral leakage” and frequency resolution

  9. Hann Window

  10. Hamming window

  11. Blackman-Harris

  12. Window overlap and add: frame rate versus frame length considerations

  13. Linear Prediction wide search pitch period estimation • Single 12th order all pole model • Voiced speech is contained within the sample window • Use inverse filtering to get glottal pulses • Take autocorrelation of the residual to determine pitch period
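The wide search steps on this slide can be sketched as below. This is an illustrative version under stated assumptions, not the presenter's implementation: it uses NumPy, the autocorrelation method for the 12th order model, a Hamming window over the whole voiced segment, and a 50 to 400 Hz search range. All function names and the search range are hypothetical.

```python
import numpy as np

def lpc_autocorrelation(frame, order=12):
    """All-pole model via the autocorrelation method.
    Returns inverse filter coefficients A(z) = [1, a1, ..., ap]."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * r[0] * np.eye(order)  # light regularization (assumption)
    predictor = np.linalg.solve(R, r[1 : order + 1])
    return np.concatenate(([1.0], -predictor))

def lp_residual(frame, a):
    """Inverse filter the frame with A(z) to approximate the glottal pulses."""
    return np.convolve(frame, a)[: len(frame)]

def pitch_period_from_residual(residual, min_lag, max_lag):
    """Lag of the largest residual autocorrelation peak in [min_lag, max_lag)."""
    n = len(residual)
    acf = np.correlate(residual, residual, mode="full")[n - 1 :]
    return min_lag + int(np.argmax(acf[min_lag:max_lag]))

def estimate_pitch_period(voiced, fs, order=12, f_min=50.0, f_max=400.0):
    """Single-model pitch period estimate (in samples) for one voiced sound."""
    frame = voiced * np.hamming(len(voiced))
    a = lpc_autocorrelation(frame, order)
    residual = lp_residual(frame, a)
    return pitch_period_from_residual(residual, int(fs / f_max), int(fs / f_min))
```

Restricting the lag search to a plausible pitch range avoids picking the zero-lag peak or a multiple of the true period; the bounds would need adjusting for unusually low or high voices.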

  14. Male speaker “Moon”

  15. Female speaker “Moon”

  16. Male voice

  17. Female voice

  18. Male residual

  19. Female residual

  20. Autocorrelation of Male

  21. Autocorrelation of female

  22. Synthesize Male single model

  23. Male single model

  24. Female single model

  25. Pitch synchronous processing • Segment the speech waveform so that the frame length is 3 pitch periods. Make sure the window length is even. • Set the Hamming window length to the frame length and the frame rate to ½ the frame length. • Generate a 12-pole LP model for each frame. • Inverse filter each frame and save the AR model coefficients and residual in a matrix, where each row is a residual. • Take the autocorrelation of the residual of the frame. • Find the autocorrelation peak. • Determine the pitch period for each frame based on the autocorrelation of the residual of the frame. If the frame does not have a valid pitch period, determine if the frame is fricative or plosive. If the variance of the autocorrelation is low, the frame is fricative. Otherwise the frame is plosive. • Save the pitch period for each frame in a vector along with the peak of the autocorrelation as well as the fricative or plosive status. • Reconstruct the frame by filtering the residual with the AR coefficients, or synthesize the waveform by estimating the glottal pulse train, adding impulsive fricative noise or a single impulse for plosive frames. • Overlap and add segments to reconstruct the signal. • Compare to the original speech using the sum of squared errors (SSE).
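The analysis and reconstruction path of the steps above (framing, per-frame LP model, inverse filtering, refiltering the residual, overlap and add, SSE) can be sketched as follows. This is a simplified illustration under assumptions: it uses NumPy, a single pitch period for all frames, a periodic Hamming window, and division by the summed window to undo the window gain. The voiced, fricative and plosive classification and the synthetic excitation are left out. All names are hypothetical.

```python
import numpy as np

def lpc(frame, order):
    """Autocorrelation-method LP model; returns A(z) = [1, a1, ..., ap]."""
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * (r[0] + 1e-12) * np.eye(order)  # regularization (assumption)
    return np.concatenate(([1.0], -np.linalg.solve(R, r[1 : order + 1])))

def all_pole_filter(excitation, a):
    """Filter through 1/A(z) with zero initial state."""
    out = np.zeros(len(excitation))
    p = len(a) - 1
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, min(p, n) + 1):
            acc -= a[k] * out[n - k]
        out[n] = acc
    return out

def pitch_synchronous_ola(x, pitch_period, order=12):
    """Analyze x in frames of 3 pitch periods and rebuild it by overlap-add.
    Returns (reconstruction, coefficient matrix, residual matrix, SSE)."""
    frame_len = 3 * pitch_period
    frame_len += frame_len % 2            # make the window length even
    hop = frame_len // 2                  # frame rate is half the frame length
    n = np.arange(frame_len)
    window = 0.54 - 0.46 * np.cos(2 * np.pi * n / frame_len)  # periodic Hamming
    y = np.zeros(len(x))
    norm = np.zeros(len(x))
    coeffs, residuals = [], []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start : start + frame_len] * window
        a = lpc(frame, order)
        e = np.convolve(frame, a)[:frame_len]   # inverse filter: residual
        coeffs.append(a)
        residuals.append(e)                     # one residual per row
        y[start : start + frame_len] += all_pole_filter(e, a)
        norm[start : start + frame_len] += window
    covered = norm > 1e-8                 # samples reached by at least one frame
    y[covered] /= norm[covered]           # undo the summed window gain
    sse = float(np.sum((x[covered] - y[covered]) ** 2))
    return y, np.array(coeffs), np.array(residuals), sse
```

Because each residual is refiltered by the same model that produced it, the reconstruction here is exact up to rounding and the SSE is close to zero; the error becomes meaningful once the residual is replaced by a synthetic pulse train or noise excitation, as the slide describes.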

  26. Overlap and add Reconstruction male

  27. Overlap and add female

  28. Overlap and add reconstruction male

  29. Overlap and add reconstruction female

  30. Reconstructed Male

  31. Reconstructed female

  32. Conclusions • Many speech processing applications use a combination of windowing and overlap and add for signal reconstruction • Pitch synchronous windowing is necessary for accurate results in speech processing. Homomorphic deconvolution requires it. • A single set of coefficients for a single voiced sound appears to be a reasonable approach • The pitch period estimate is extracted from the residual of the inverse filtered voiced sound through the autocorrelation function • Pitch synchronous windowing is a good foundation for all types of signal processing applications
