Entropy and Dynamism Criteria for Voice Quality Classification Applications

Presented at VOICE QUALITY: FUNCTIONS, ANALYSIS AND SYNTHESIS, an ISCA (International Speech Communication Association) Tutorial and Research Workshop, Geneva, August 27-29, 2003.

Presentation Transcript


  1. VOICE QUALITY: FUNCTIONS, ANALYSIS AND SYNTHESIS
     Geneva, August 27-29, 2003
     ISCA Tutorial and Research Workshop, International Speech Communication Association
     Entropy and Dynamism Criteria for Voice Quality Classification Applications
     Authors: Peter D. Kukharchik, Igor E. Kheidorov, Hanna M. Lukashevich, Denis L. Mitrofanov
     Belarusian State University, Radiophysics Department, Minsk, Belarus

  2. Voice Quality Classification Applications
     • Introduction
     • System design
     • Experiment
     • Conclusion

  3. Introduction
     • Audio is a large and extremely variable data class.
     • The range of sounds is large, from music genres to animal cries to synthesizer samples.
     • Any of the above can and will occur in combination.

  4. Existing Approaches
     • Signal Processing Techniques
        • Spectrum
        • Modulation spectrum
        • Temporal Information
     • Decision Making
        • Bayesian Information Criterion (BIC)
        • Log Likelihood Ratio
        • Hidden Markov Model (HMM)

  5. Block diagram of the proposed system
     Stages: Input Data (Wave file) → Feature vector extraction → Neural network → Entropy & Dynamism → HMM → Segments
     Data passed between stages, in order: Vectors (Mel Cepstra); Probability of Russian phonemes; Entropy and Dynamism

  6. Definitions: Entropy and averaged entropy
     Entropy is a measure of the uncertainty or disorder in a given distribution.
     We use N = 40.
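The slide's formulas are not reproduced in the transcript. A minimal sketch of one common formulation, offered as an assumption rather than the authors' exact definition: the entropy of frame n is the Shannon entropy of the network's phoneme posteriors, h_n = -Σ_k p_k log2 p_k, and the averaged entropy is the mean of h over N frames. The function names and the trailing-window convention are illustrative choices.

```python
import math

def frame_entropy(posteriors):
    """Shannon entropy (in bits) of one frame's class posterior distribution."""
    # Zero-probability classes contribute nothing (limit of p*log p as p -> 0).
    return -sum(p * math.log2(p) for p in posteriors if p > 0.0)

def averaged_entropy(posterior_frames, window=40):
    """Mean entropy over the last `window` frames ending at each frame.

    The trailing window is an assumption; the slide only states N = 40.
    """
    entropies = [frame_entropy(frame) for frame in posterior_frames]
    averaged = []
    for n in range(len(entropies)):
        chunk = entropies[max(0, n - window + 1): n + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged
```

Under this reading, a flat posterior (the network cannot decide between phonemes) gives high entropy, and a peaked one gives entropy near zero.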

  7. Definitions: Dynamism and average dynamism
     Dynamism is a measure of the rate of change of a quantity.
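The formula for dynamism is likewise missing from the transcript. A minimal sketch under an assumed, commonly used definition: the dynamism between consecutive frames is the squared change in the phoneme posteriors summed over classes, d_n = Σ_k (p_k(n+1) - p_k(n))², averaged over N frames. Reusing N = 40 here, and the function names, are assumptions.

```python
def frame_dynamism(current_posteriors, next_posteriors):
    """Summed squared change in class posteriors between two consecutive frames."""
    return sum((b - a) ** 2 for a, b in zip(current_posteriors, next_posteriors))

def averaged_dynamism(posterior_frames, window=40):
    """Mean dynamism over the last `window` frame transitions (assumed convention)."""
    steps = [
        frame_dynamism(a, b)
        for a, b in zip(posterior_frames, posterior_frames[1:])
    ]
    averaged = []
    for n in range(len(steps)):
        chunk = steps[max(0, n - window + 1): n + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged
```

With this definition, posteriors that stay constant from frame to frame give zero dynamism, and posteriors that jump between phonemes give large values.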

  8. Feature vector extraction
     We use 12 Mel cepstral coefficients computed in a 30 ms window with a 10 ms frame shift, for 4-15 min wave files of Russian speech, non-Russian speech and music.
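To make the framing concrete, the following sketch computes the sample ranges of overlapping analysis frames for a 30 ms window and 10 ms shift. The 16 kHz sampling rate used in the example is an assumption (the slides do not state it), and `frame_bounds` is a hypothetical helper; the Mel cepstrum computation itself is not shown.

```python
def frame_bounds(num_samples, sample_rate, window_ms=30, shift_ms=10):
    """Return (start, end) sample indices of each full analysis frame."""
    window = sample_rate * window_ms // 1000   # samples per frame
    shift = sample_rate * shift_ms // 1000     # samples between frame starts
    if num_samples < window:
        return []
    count = 1 + (num_samples - window) // shift
    return [(i * shift, i * shift + window) for i in range(count)]

# One second of audio at an assumed 16 kHz sampling rate.
bounds = frame_bounds(num_samples=16000, sample_rate=16000)
print(len(bounds), bounds[0], bounds[-1])
```

A 10 ms shift yields roughly 100 feature vectors per second, so a 4 min file produces about 24,000 vectors of 12 coefficients each.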

  9. Hidden Markov Model
     • Define an HMM for the signal, with one HMM state for every segment type we want to find
     • Perform a Viterbi search for the optimal path using the probabilities from the previous step
     • Determine segment boundaries as the moments at which the HMM state changes
     [Figure: HMM with states S0-S6]
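The three steps above can be sketched as a standard Viterbi decode followed by a scan for state changes. This is a generic illustration, not the authors' implementation: how the per-frame scores are derived from entropy and dynamism, and the transition values, are not given in the slides, so all inputs here are assumed to be supplied as log-scores.

```python
def viterbi(log_emissions, log_transitions, log_initial):
    """Most likely state sequence given per-frame log-scores for each state."""
    num_states = len(log_initial)
    score = [log_initial[s] + log_emissions[0][s] for s in range(num_states)]
    backpointers = []
    for frame in log_emissions[1:]:
        new_score, pointers = [], []
        for s in range(num_states):
            # Best predecessor state for landing in state s at this frame.
            best_prev = max(
                range(num_states),
                key=lambda r: score[r] + log_transitions[r][s],
            )
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_transitions[best_prev][s] + frame[s])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(num_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def segment_boundaries(path):
    """Frame indices at which the decoded state differs from the previous frame."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]
```

Transition scores that favour staying in the same state discourage spurious one-frame segments, which is the usual reason to decode with an HMM rather than classify each frame independently.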

  10. Neural Network
     • Neural network for probability generation: rationale
        • Neural networks can model probability distributions with high accuracy owing to their ability to approximate a large variety of functions
        • If neural network training does not stop in a local minimum, the outputs can be considered as class probabilities

  11. Multilayer Perceptron
     • Neural network for probability generation: structure
        • Fully connected multilayer perceptron
        • Input layer size equals the feature vector size
        • Output layer size equals the number of phonemes (one probability per phoneme)
        • Number and sizes of hidden layers vary
        • Tangent activation for hidden neurons
        • Softmax activation for output neurons
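A minimal forward-pass sketch of such a network, assuming "tangent activation" means the hyperbolic tangent. The layer sizes, weights and function names are placeholders for illustration; training is not shown.

```python
import math

def softmax(values):
    """Normalise scores into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

def mlp_forward(features, layers):
    """Apply tanh on hidden layers (assumed) and softmax on the output layer.

    `layers` is a list of (weights, biases) pairs, ordered input to output.
    """
    activations = features
    for weights, biases in layers[:-1]:
        activations = [math.tanh(z) for z in dense(activations, weights, biases)]
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))
```

The softmax output is what allows the per-frame outputs to be read as a probability distribution over phonemes, from which quantities such as entropy can then be computed.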

  12. Results: Music
     Entropy histogram

  13. Results - Russian Speech

  14. Results - Foreign

  15. Results - Russian and Foreign
     Blue is Russian, pink is French.

  16. Results
     • Russian speaker (blue) and Music (pink)
     • Two Russian speakers (blue and brown) and Music (others)

  17. Results: Pure Russian & “Czech” Russian
     There is some difference even between native speech and Russian with a Czech accent.

  18. Results
     Entropy histograms of “normal” (brown) and “rough” (blue) French speech

  19. Results
     Entropy histograms for “normal” (brown), “rough” (blue) and “lips” (lips) French speech

  20. Conclusion
     • Further research
        • Parameter vectors, their size, number of context frames
        • Specialized HMM structures for a certain type of speech signals
     • Conclusion
        • Entropy and Dynamism features, as experiments show, can be successfully used for automatic signal segmentation. Further research in this area can lead to better practical results.
