
A Corpus Study of Native and Non-native Accented Speech



Presentation Transcript


  1. A Corpus Study of Native and Non-native Accented Speech Chen-huei Wu Department of Linguistics, UIUC

  2. Outline of the presentation • Rationale and research questions • Literature review • Corpus and methodology • Future analysis

  3. Where does the impression of accent come from? • On the production side, second language (L2) learners usually have some difficulty acquiring native-like speech performance. • On the perception side, it is interesting that humans can recognize accents, and can further recognize accents close to their native languages.

  4. Where does the impression of accent come from? • What are the speech patterns of accented speech? • What influences the perception of accents? • What matters more: mispronunciation of vowels or consonants, grammar errors, word choice, or speech rhythm?

  5. This study • Vowel production in Mandarin as an L2 • Spontaneous speech • Corpus study • Perception of accentedness

  6. Research Questions • What are the speech patterns of vowel production by L1 and L2 speakers in conversational speech? • What are the similarities and differences in vowel production between L1 and L2 speakers? • If the vowel production of L2 learners is off-target, does it affect all of the vowels or only some of them? • How do native listeners perceive L2 accents? • Do native listeners tolerate mispronunciation in some areas but not others?

  7. Goals • To find the hidden reasons behind the impression of accent • To identify critical acoustic variables that affect native listeners’ perceptions of accented speech

  8. Literature Review • Segmental level: phone acquisition (Anderson-Hsieh et al., 1992; Koster and Koet, 1993; Flege, 1995) • Suprasegmental level: stress timing, peak alignment, speech rate, pause frequency and pause duration (Munro, 1995; Trofimovich and Baker, 2006)

  9. This Study • Acoustic analysis and perceptual rating: native speakers of Chinese listen to speech samples and judge the level of accentedness in the speech they hear. • Three speech communities: native speakers of Chinese, heritage speakers, and Chinese learners

  10. Chinese learner's spontaneous speech corpus • More than 150 hours of speech • More than 100 speakers • 63 hours of turn-marking • 48 hours of transcriptions • 700 disfluency labels (4 hours of speech) • 5200 phone labels (20 minutes of speech)

  11. Chinese learner's spontaneous speech corpus • Two types of classroom activities (Shih, 2006) • Variety show • Debate • Speech styles • Casual speech • Prepared speech

  12. Speech Material: the variety show, from Fall 2005 to Spring 2008

  13. Procedure • Classroom recording (video + audio) • Annotation of Speaker Turn • Transcription • Random Selection • Extracted Audio files • Automated Phone Segmentation • Manual Checking • Data Analysis • Perceptual Rating • Data Analysis • Data Analysis

  14. Annotation of speaker turn • The criteria for speaker turn-marking • No long overlapping speech • No long silent pause • No continuous laughter or clapping • However, there might still be some noise in turn-marked utterances

  15. Procedure • Classroom recording (video + audio) • Annotation of Speaker Turn • Transcription • Random Selection • Extracted Audio files • Automated Phone Segmentation • Manual Checking • Data Analysis • Perceptual Rating • Data Analysis • Data Analysis

  16. Random Selection • The utterances for each speaker will be filtered according to the following criteria: • At most 30 seconds long • At least 15 seconds long • Utterances are then randomly selected from the filtered set • Therefore, the one-minute speech data for each speaker will be composed of 2-4 utterances.
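
The filtering and random selection described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure actually used in the study: the function name `select_utterances`, the `(utterance_id, duration_seconds)` input format, and the rule of stopping once about 60 seconds have been collected are all hypothetical.

```python
import random


def select_utterances(utterances, target_s=60.0, min_s=15.0, max_s=30.0, seed=None):
    """Pick random utterances of 15-30 s until about one minute is collected.

    `utterances` is a list of (utterance_id, duration_seconds) pairs.
    This input format is an assumption made for the sketch.
    """
    rng = random.Random(seed)
    # Keep only utterances that satisfy the slide's length criteria.
    eligible = [u for u in utterances if min_s <= u[1] <= max_s]
    rng.shuffle(eligible)

    chosen, total = [], 0.0
    for utt in eligible:
        if total >= target_s:
            break
        chosen.append(utt)
        total += utt[1]
    return chosen, total
```

Because every eligible utterance lasts between 15 and 30 seconds, reaching one minute this way takes between two and four utterances, which matches the 2-4 figure on the slide. The collected total can run somewhat over 60 seconds, so a real pipeline would need its own rule for trimming or re-drawing.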

  17. Procedure • Classroom recording (video + audio) • Annotation of Speaker Turn • Transcription • Random Selection • Extracted Audio files • Automated Phone Segmentation • Manual Checking • Data Analysis • Perceptual Rating • Data Analysis • Data Analysis

  18. Automated Phone Segmentation • Jiahong Yuan's aligner • The Penn Phonetics Lab Forced Aligner (Yuan and Liberman, 2008) (http://wms-609.sas.upenn.edu/research/alignment/align.htm) • The toolkit includes • models: the acoustic models, parameter files, and CMU pronunciation dictionary • align.py: a Python script that automates the procedure of doing forced alignment

  19. Automated Phone Segmentation • Yuan’s aligner modified by Shih • In addition to the English dictionary, two Chinese dictionaries were added: • Master.big5.darpa: Chinese character, Pinyin, pronunciation, e.g. 包 BAO B AW1 • Pinyindict: Pinyin, pronunciation, e.g. BAO B AW1
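
As a rough illustration of the two dictionary layouts shown on this slide, the sketch below splits one entry of each kind into its fields. It assumes whitespace-separated fields exactly as in the slide's examples; the real files may use a different layout or encoding, and the function names `parse_char_entry` and `parse_pinyin_entry` are hypothetical.

```python
def parse_char_entry(line):
    """Parse a 'character Pinyin phone...' entry, e.g. '包 BAO B AW1'.

    Assumes whitespace-separated fields, as in the slide's example.
    """
    char, pinyin, *phones = line.split()
    return char, pinyin, phones


def parse_pinyin_entry(line):
    """Parse a 'Pinyin phone...' entry, e.g. 'BAO B AW1'."""
    pinyin, *phones = line.split()
    return pinyin, phones


# Usage with the two examples from the slide.
print(parse_char_entry("包 BAO B AW1"))
print(parse_pinyin_entry("BAO B AW1"))
```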

  20. Test on laboratory speech

  21. Perceptual rating on accentedness • Untrained and linguistically naïve raters

  22. Perceptual rating on accentedness • Question 1: Is the speaker a native speaker or not? (Native / Non-native) • Question 2: How accented is the speech? (1 2 3 4; 4: accented, 1: no accent) • Question 3: How fluent is the speech? (1 2 3 4; 4: fluent, 1: not fluent) • Question 4: How comprehensible is the speech? (1 2 3 4; 4: comprehensible, 1: not comprehensible)
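
One way a single rater's answers to the four questions could be stored is sketched below. The class name `RatingResponse` and its field names are hypothetical and not taken from the study; only the native/non-native judgment and the 4-point scales come from the slide.

```python
from dataclasses import dataclass

SCALE = range(1, 5)  # the 4-point scale from the questionnaire: 1, 2, 3, 4


@dataclass(frozen=True)
class RatingResponse:
    """One rater's answers for one speech sample (hypothetical layout)."""

    judged_native: bool     # Question 1: native or non-native
    accentedness: int       # Question 2: 1 = no accent, 4 = accented
    fluency: int            # Question 3: 1 = not fluent, 4 = fluent
    comprehensibility: int  # Question 4: 1 = not comprehensible, 4 = comprehensible

    def __post_init__(self):
        # Reject values outside the 1-4 scale used on the slide.
        for name in ("accentedness", "fluency", "comprehensibility"):
            value = getattr(self, name)
            if value not in SCALE:
                raise ValueError(f"{name} must be between 1 and 4, got {value!r}")
```

Validating the scale at construction time keeps out-of-range entries from silently entering any later analysis of the ratings.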

  23. Perceptual rating on accentedness

  24. Contribution • To provide the missing link between quantitative acoustic analysis and the impression of accent.

  25. Thank you!
