
Tonal Speech without Pitch

Jerry Zhu, zhuxj@cs.cmu.edu, 2003/7/3


Presentation Transcript


  1. Tonal Speech without Pitch Jerry Zhu zhuxj@cs.cmu.edu 2003/7/3

  2. What’s in your mouth Tony Robinson, http://mi.eng.cam.ac.uk/~ajr/SA95/node15.html

  3. MFCC Tony Robinson, http://mi.eng.cam.ac.uk/~ajr/SA95/node15.html * Focus on vocal tract shape (e.g. different vowels) * No pitch

  4. Tonal languages • Tone: variation in pitch. e.g. Mandarin, Thai http://kca.org/education/ImageView.asp?ImageID=179

  5. MFCC disastrous for tones? • MFCC should have no pitch info. • Bad for Mandarin speech recognition? Not really. Why?

  6. Hypothesis 1 • Language context helps a lot? • e.g. singing overrides pitch • people *do* understand the lyrics (sort of)

  7. Hypothesis 2 • MFCC retains some pitch? • by imperfection • residual pitch info used by speech recognizers • Test: convert MFCC to speech, listen for tones. (TBD)

  8. Hypothesis 3 • Do we really need pitch to perceive tones? • Test: whispered speech • Can native speakers perceive tones in whispered speech? Tony Robinson, http://mi.eng.cam.ac.uk/~ajr/SA95/node15.html

  9. Minimum pairs • A minimum pair: two 2-character words that differ in only one tone. • Why not use • one-character words: to prevent over-articulating • multi-character words: hard to find minimum pairs.

  10. Listener listens for the ORDER within each minimum pair. [Figure: whisperer file, listener file]

  11. Experiment setup • Each whisperer/listener group works on about 100 different minimum pairs. • In a quiet room, 1 meter apart. Each pair whispered once. • Native speakers. (Liu J., Yu H., Zhang Y., Zhu X.)
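
The protocol on slides 9 to 11 can be sketched in code. This is only an illustration, not the tooling used in the experiment: the function names, the seed, the placeholder pair labels, and the idea that the listener file shows each pair in a fixed order while the whisperer file shows it in a randomized order are all assumptions.

```python
import random

def make_whisperer_sheet(pairs, seed=0):
    """Randomize the order within each pair (hypothetical helper).

    Returns the whisperer's sheet and an answer key, where key[i] is True
    when pair i was swapped relative to the listener's fixed order.
    """
    rng = random.Random(seed)
    sheet, key = [], []
    for first, second in pairs:
        swapped = rng.random() < 0.5
        sheet.append((second, first) if swapped else (first, second))
        key.append(swapped)
    return sheet, key

def score(key, responses):
    """Fraction of pairs whose order the listener judged correctly."""
    correct = sum(k == r for k, r in zip(key, responses))
    return correct / len(key)

# Placeholder labels stand in for real two-character words.
pairs = [(f"pair{i:03d}-a", f"pair{i:03d}-b") for i in range(100)]
sheet, key = make_whisperer_sheet(pairs)
print(score(key, key))  # a listener who hears every order correctly
```

Scoring only the order within a pair means the listener never has to name a tone, which keeps the task a two-way forced choice.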

  12. What to expect • If there is no tonal info in whisper, listeners would guess the order with 50% accuracy.
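
To make the 50% baseline concrete, the sketch below computes how often a listener who is purely guessing would reach a given score. It assumes independent guesses with probability 0.5 each; the function name and the example thresholds are illustrative choices, not figures from the experiment.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Exact binomial probability of k or more correct out of n guesses."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

n = 100  # roughly the number of pairs per group
for k in (50, 60, 70):  # illustrative thresholds only
    print(k, prob_at_least(k, n))
```

The further a score sits above 50 out of 100, the less plausible it is that the listener had no tonal information at all.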

  13. Result

  14. Result significant? • Flip a coin 3 times, 2 heads 1 tail. A biased coin? • Chi-square test • Accuracy significantly better than random at p < 0.0001 (that’s *really* significant).
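
A chi-square test against chance can be sketched as follows. This is an assumed form of the test (goodness of fit with one degree of freedom against a 50/50 split), not necessarily the exact computation behind the slide. The 70-out-of-100 count is hypothetical and is not the experiment's result.

```python
from math import erfc, sqrt

def chi_square_vs_chance(correct, total):
    """Chi-square goodness-of-fit test against a 50/50 split (1 d.o.f.)."""
    expected = total / 2
    wrong = total - correct
    stat = ((correct - expected) ** 2 + (wrong - expected) ** 2) / expected
    # With one degree of freedom the upper tail is erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

print(chi_square_vs_chance(2, 3))     # the coin example: 2 heads, 1 tail
print(chi_square_vs_chance(70, 100))  # hypothetical listener score
```

With only three flips the chi-square approximation is rough, but it makes the slide's point: 2 heads out of 3 is no evidence of a biased coin, whereas a large imbalance over about 100 trials is.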

  15. Accuracy breakdown (correct/total)

  16. Accuracy breakdown (accuracy %, significant at p < 0.002)

  17. Summary • People do perceive tonal differences without pitch. • How? • Strength (power)? • Duration? • Subtle vocal tract shape difference?

  18. While we are whispering... • Tonal difference (we’ve seen that) • Voiced / unvoiced consonant? time vs. dime • voice onset time http://www.indiana.edu/~hlw/PhonUnits/consonants2.html

  19. Voiced/unvoiced consonant • [p,b], [t,d], [k,g] • Mandarin speakers 94% accuracy • Aspiration

  20. Other languages? • Thai • Is tonal too; 5 tones. • Has [ph], [p], [b]: would be interesting!
