
THE FUNDAMENTAL FREQUENCY VARIATION SPECTRUM

THE FUNDAMENTAL FREQUENCY VARIATION SPECTRUM. FONETIK 2008. Kornel Laskowski, Mattias Heldner and Jens Edlund. interACT, Carnegie Mellon University, Pittsburgh PA, USA; Centre for Speech Technology, KTH Stockholm, Sweden. Speaker: Hsiao-Tsung. Introduction.


Presentation Transcript


  1. THE FUNDAMENTAL FREQUENCY VARIATION SPECTRUM • FONETIK 2008 • Kornel Laskowski, Mattias Heldner and Jens Edlund • interACT, Carnegie Mellon University, Pittsburgh PA, USA • Centre for Speech Technology, KTH Stockholm, Sweden • Speaker: Hsiao-Tsung

  2. Introduction • Speech recognition systems transitioned long ago from formant localization to spectral (vector-valued) formant representations. • Prosodic processing, by contrast, continues to rely squarely on a pitch tracker's ability to identify a peak corresponding to the fundamental frequency (F0) of the speaker. • Even if a robust, local, analytic, statistical estimate of absolute pitch were available, applications require a representation of pitch variation, and go to considerable additional effort to identify a speaker-dependent quantity for normalization.

  3. The Fundamental Frequency Variation Spectrum • Instantaneous variation in pitch is normally computed by determining a single scalar, the F0, at two temporally adjacent instants and forming their difference.
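The conventional scalar approach described on this slide can be sketched in a few lines. This is only an illustration: the function name `f0_delta`, the use of NaN for unvoiced frames, and the example values are assumptions, not part of the presentation.

```python
import numpy as np

def f0_delta(f0_hz):
    """Difference between temporally adjacent F0 estimates, in Hz.

    Assumption of this sketch: frames where the pitch tracker found
    no peak are encoded as NaN, so any difference touching such a
    frame is NaN as well.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    return f0[1:] - f0[:-1]

# Hypothetical pitch track with one unvoiced frame.
track = [120.0, 122.0, 125.0, np.nan, 118.0]
delta = f0_delta(track)
```

A single frame without a reliable F0 estimate invalidates two adjacent differences, which shows how directly this representation depends on the pitch tracker.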

  4. The Fundamental Frequency Variation Spectrum • We propose a vector-valued representation of pitch variation, inspired by vanishing-point perspective. • The standard inner product between two vectors can be viewed as the summation of pair-wise products, with pairs selected by orthonormal projection onto a point at infinity. • F: the signal's spectral content (512-point FFT).

  5. The Fundamental Frequency Variation Spectrum • The proposed vanishing-point product induces a 1-point perspective projection onto a point at [symbol not preserved in transcript]

  6. The Fundamental Frequency Variation Spectrum • The FFV spectrum is then given by [equation not preserved in transcript] • [symbol not preserved in transcript] is undefined over the interval [-T0, +T0]

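  7. The Fundamental Frequency Variation Spectrum • A support for which [symbol not preserved in transcript] is continuous over [interval not preserved in transcript] • In practice, we compute [symbol not preserved in transcript] using magnitude rather than complex spectra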

  8. The Fundamental Frequency Variation Spectrum • FL and FR are 512-point Fourier transforms, computed every 8 ms. • However, the discrete transforms FL and FR are in general not defined at the corresponding dilated frequencies. • We resort to linear interpolation using the coefficients [coefficients not preserved in transcript]
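The computation outlined on slides 4 to 8 might be sketched roughly as follows. The exact expressions did not survive in the transcript, so several choices here are assumptions of this sketch and not the authors' formulation: the dilation factors 2^(-tau/2) and 2^(+tau/2), the Hann window, the zero fill beyond the last bin, and all function names. What it does take from the slides is the 512-point FFT, the use of magnitude spectra, linear interpolation at the dilated frequencies, and a normalization that makes the result independent of energy.

```python
import numpy as np

N_FFT = 512  # transform length stated on the slides

def magnitude_spectrum(frame):
    """Magnitude of the 512-point FFT of a Hann-windowed frame."""
    windowed = np.asarray(frame, dtype=float) * np.hanning(len(frame))
    return np.abs(np.fft.rfft(windowed, n=N_FFT))

def sample_dilated(mag, factor):
    """Linearly interpolate `mag` at bin positions k * factor.

    Positions beyond the last bin are filled with zero (an assumption).
    """
    bins = np.arange(len(mag))
    return np.interp(bins * factor, bins, mag, left=0.0, right=0.0)

def ffv_sketch(frame_left, frame_right, taus):
    """Normalized correlation of two dilated magnitude spectra per tau.

    tau is a pitch change in octaves; with the sign convention assumed
    here, a rising pitch gives a peak at positive tau.
    """
    fl = magnitude_spectrum(frame_left)
    fr = magnitude_spectrum(frame_right)
    out = np.empty(len(taus))
    for i, tau in enumerate(taus):
        a = sample_dilated(fl, 2.0 ** (-tau / 2.0))
        b = sample_dilated(fr, 2.0 ** (+tau / 2.0))
        denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
        out[i] = np.sum(a * b) / denom if denom > 0.0 else 0.0
    return out

def harmonic_frame(f0_hz, sr=16000, n=N_FFT, n_harm=8):
    """Hypothetical test signal: a sum of equal-amplitude harmonics."""
    t = np.arange(n) / sr
    return sum(np.sin(2 * np.pi * f0_hz * h * t) for h in range(1, n_harm + 1))

taus = np.linspace(-0.5, 0.5, 41)
low = harmonic_frame(200.0)
high = harmonic_frame(200.0 * 2 ** 0.25)  # a quarter octave higher

g_flat = ffv_sketch(low, low, taus)
g_rise = ffv_sketch(low, high, taus)
g_fall = ffv_sketch(high, low, taus)
```

With identical frames the curve peaks at tau = 0. Because each value is divided by the norms of both dilated spectra, rescaling either frame leaves the result unchanged.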

  9. The Fundamental Frequency Variation Spectrum • Energy independent

  10. Filterbank • Slowly changing • Rapidly changing

  11. Filterbank

  12. Discussion • Initial experiments along these lines show that such HMMs, when trained on dialogue data, corroborate research on human turn-taking behavior in conversations. • The FFV representation does not require peak identification, dynamic time warping, median filtering, landmark detection, linearization, or mean pitch estimation and subtraction. • Immediate next steps include fine-tuning the filter banks and the HMM topologies, and testing the results on other tasks where pitch movements are expected to play a role, such as the attitudinal coloring of short feedback utterances, speaker verification, and automatic speech recognition for tonal languages.
