1. Using speech technology for pronunciation assessment and training
Helmer Strik
Centre for Language and Speech Technology (CLST)
Radboud University Nijmegen, the Netherlands
2. Context (Zurich, 29-01-2007)
Deviant pronunciation (e.g., pathology, non-natives) & speech technology
Applications:
AAC (Augmentative & Alternative Communication)
Improve communication
Interactive tools
Reading, listening
Assessment
Diagnosis, monitoring
Training (therapy, learning)
CAPT: Computer Assisted Pronunciation Training
3. Overview
Contents:
CAPT
Error detection
4. CAPT: Background and problem
Computer Assisted Pronunciation Training (CAPT)
ASR-based CAPT:
can provide automatic, instantaneous, individual feedback on pronunciation in a private environment
But ASR-based CAPT suffers from limitations.
Is it effective in improving L2 pronunciation?
Very few studies, with mixed results.
5. CAPT: Goal of this study
To study the effectiveness and possible advantage of automatic feedback provided by an ASR-based CAPT system.
6. ASR-based CAPT system: Dutch CAPT
Target users
adult learners of Dutch with different L1s (e.g. immigrants)
Pedagogical goal
improving segmental quality in pronunciation
7. Dutch CAPT: feedback
Content: focus on problematic phonemes
Criteria
Common across speakers of various L1s
Perceptually salient
Frequent
Persistent
Robust for automatic detection
Result:
11 targeted phonemes: 9 vowels and 2 consonants
8. Video (from Nieuwe Buren)
10. Video: dialogue
15. Dutch CAPT
Gender-specific, Dutch & English version.
4 units, each containing:
1 video (from Nieuwe Buren) with real-life + amusing situations
+ ca. 30 exercises based on video: dialogues, question-answer, minimal pairs, word repetition
Sequential, constrained navigation: at least one attempt is needed to proceed to the next exercise, with a maximum of 3
16. Method: participants & training
Regular teacher-fronted lessons: 4-6 hrs per week
Experimental group (EXP): n=15 (10 F, 5 M) Dutch CAPT
Control group 1 (NiBu): n=10 (4 F, 6 M) reduced version of Nieuwe Buren
Control group 2 (noXT): n=5 (3 F, 2 M) no extra training
Extra training: 4 weeks x 1 session of 30-60 min
1 class: 1 type of training
17. Method: testing
3 analyses:
Participants' evaluations: questionnaires on the system's usability, accessibility, usefulness, etc.
Global segmental quality: 6 experts rated stimuli on 10-point scale (pretest/posttest, phonetically balanced sentences)
In-depth analysis of segmental errors: expert annotations
18. Results: participants' evaluations
Positive reactions
Enjoyed working with the system
Believed in the usefulness of the system
19. Results: reliability global ratings
Cronbach's α:
Intrarater: 0.94 - 1.00
Interrater: 0.83 - 0.96
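For reference, the reliability measure can be sketched as follows. This is a minimal illustration of the standard Cronbach's α formula, treating each rater as an "item" and each stimulus as a "case"; the ratings below are made up, and the slides do not say how the study's data were arranged for the intrarater and interrater figures.

```python
from statistics import variance

def cronbach_alpha(ratings):
    """Cronbach's alpha for a table with one row per stimulus
    and one column per rater (standard formula, sample variances)."""
    n_raters = len(ratings[0])
    # Variance of each rater's scores across stimuli.
    rater_vars = [variance([row[j] for row in ratings]) for j in range(n_raters)]
    # Variance of the summed score per stimulus.
    total_var = variance([sum(row) for row in ratings])
    return n_raters / (n_raters - 1) * (1.0 - sum(rater_vars) / total_var)

# Hypothetical ratings on a 10-point scale: 5 stimuli, 3 raters.
example = [
    [7, 8, 7],
    [4, 5, 4],
    [9, 9, 8],
    [3, 4, 4],
    [6, 6, 7],
]
print(round(cronbach_alpha(example), 3))
```

Values close to 1 indicate that the raters order the stimuli consistently; α equals 1 when all raters give identical scores.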
20. Results: Global ratings
22. Results: Global ratings
23. In-depth analysis segm. quality
24. Conclusions
Global ratings are an appropriate measure because CAPT should ultimately improve overall pronunciation quality.
Fine-grained analyses also useful.
Participants enjoyed Dutch CAPT.
ASR-CAPT seems efficacious in improving pronunciation of targeted phonemes.
25. Video: pronouncing words
26. Possible improvements
Increase sample size (more participants)
Increase training intensity (more training)
Match training groups: L1s, proficiency, etc.
Give feedback on more phonemes
More targeted systems for fixed L1-L2 pairs.
Give feedback on suprasegmentals
Improve error detection?
27. Error detection
Detection of pronunciation errors
Goodness Of Pronunciation (GOP)
Silke Witt & Steve Young
Acoustic-phonetic approaches
Truong et al.
Goal: improve error detection
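As a rough illustration of the GOP idea, the sketch below follows the score as published by Witt & Young: the absolute difference between the log-likelihood of the target phone (from forced alignment) and that of the best-scoring phone (from a free phone loop), normalised by the number of frames. The log-likelihoods, phone labels and threshold are hypothetical values; the slides do not specify how the score was computed or thresholded in Dutch CAPT.

```python
def gop_score(target_loglik, phone_logliks, n_frames):
    """Goodness Of Pronunciation for one phone segment.

    target_loglik: log-likelihood of the segment given the target phone
    phone_logliks: dict phone -> log-likelihood of the same segment
    n_frames:      segment duration in frames (normalisation factor)
    """
    best = max(phone_logliks.values())
    return abs(target_loglik - best) / n_frames

def is_error(score, threshold):
    """Flag the phone as mispronounced when the score exceeds the threshold."""
    return score > threshold

# Hypothetical segment of 25 frames where the target is /A/.
logliks = {"A": -300.0, "a:": -250.0, "O": -320.0}
score = gop_score(logliks["A"], logliks, n_frames=25)
print(score, is_error(score, threshold=1.0))
```

A score of 0 means the target phone was also the best-matching phone; larger scores mean some other phone fits the audio better. In practice the threshold has to be tuned, possibly per phone.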
29. GOP: Accuracy
15 participants
2174 target phones
Results
30. Acoustic-phonetic approach
Selection of segmental pronunciation errors:
/A/ mispronounced as /a:/ (man - maan)
/Y/ mispronounced as /u/ or /y/ (tut - toet or tuut)
/x/ mispronounced as /k/ or /g/ (gat - kat or /g/at)
Frequent
Before we started, we selected the pronunciation errors to address in this study. A survey was carried out on an annotated non-native speech database; we selected errors by their frequency, and we selected only gross errors.
31. Here are some examples of amplitude and ROR contours of the two sounds. Top: amplitude; bottom: ROR contour. Left: fricative /x/; right: plosive /k/. The amplitude rises gradually for the fricative and abruptly for the plosive. The abrupt rise shows up as a high peak in the ROR contour of the plosive; for the fricative this high peak is missing. We mainly use these amplitude differences to discriminate /x/ from /k/.
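The contrast described here can be reproduced on toy signals. The sketch below assumes that ROR stands for the rate of rise of the log RMS amplitude envelope; the 24 ms window, 1 ms step, 16 kHz sampling rate and the synthetic 1 kHz tones are all assumptions chosen for illustration, not values taken from the slides (and a real fricative is noise, not a tone).

```python
import math

SAMPLE_RATE = 16000   # Hz (assumed)
WINDOW = 384          # 24 ms analysis window (assumed)
HOP = 16              # 1 ms step between windows (assumed)

def log_rms_envelope(samples, window=WINDOW, hop=HOP):
    """Log RMS amplitude in dB for each analysis window."""
    envelope = []
    for start in range(0, len(samples) - window + 1, hop):
        frame = samples[start:start + window]
        rms = math.sqrt(sum(s * s for s in frame) / window)
        envelope.append(20.0 * math.log10(max(rms, 1e-10)))
    return envelope

def ror_contour(envelope, hop=HOP, sample_rate=SAMPLE_RATE):
    """Frame-to-frame change of the envelope, in dB per second."""
    step = hop / sample_rate
    return [(envelope[i] - envelope[i - 1]) / step
            for i in range(1, len(envelope))]

def toy_onset(rise_samples, total=3200, onset=800, floor=0.001):
    """Synthetic 1 kHz tone whose amplitude rises from `floor` to 1.0
    over `rise_samples` samples, starting at sample `onset`."""
    signal = []
    for n in range(total):
        if n < onset:
            amp = floor
        elif n < onset + rise_samples:
            amp = floor + (1.0 - floor) * (n - onset) / rise_samples
        else:
            amp = 1.0
        signal.append(amp * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE))
    return signal

plosive_like = toy_onset(rise_samples=0)        # abrupt onset
fricative_like = toy_onset(rise_samples=1600)   # gradual onset over 100 ms

peak_plosive = max(ror_contour(log_rms_envelope(plosive_like)))
peak_fricative = max(ror_contour(log_rms_envelope(fricative_like)))
print(peak_plosive > peak_fricative)
```

The abrupt onset produces a much higher maximum in the ROR contour than the gradual one, which is the cue used to separate /k/ from /x/.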
32. Here we can see which measurements we took to train the classifiers. First, the height of the highest ROR peak. Then 4 amplitude measurements: 1 before the peak and 3 after the peak (chosen rather arbitrarily). Duration was added optionally.
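A possible way to assemble such a feature vector is sketched below. The offsets of the amplitude measurements (5 frames before the peak, and 5, 10 and 20 frames after it) are hypothetical, since the notes only say they were chosen rather arbitrarily; the envelope is assumed to be a log amplitude contour with one value per frame.

```python
def rate_of_rise(envelope):
    """Per-frame rise of the envelope; same length as the envelope."""
    return [0.0] + [envelope[i] - envelope[i - 1]
                    for i in range(1, len(envelope))]

def extract_features(envelope, offsets=(-5, 5, 10, 20), duration=None):
    """Feature vector: height of the highest ROR peak, the amplitude at
    each offset (in frames) around that peak, and optionally duration."""
    ror = rate_of_rise(envelope)
    peak = max(range(len(ror)), key=lambda i: ror[i])
    features = [ror[peak]]
    for offset in offsets:
        # Clamp to the segment so offsets near the edges stay valid.
        index = min(max(peak + offset, 0), len(envelope) - 1)
        features.append(envelope[index])
    if duration is not None:
        features.append(duration)
    return features

# Hypothetical envelope in dB: silence, abrupt onset, steady portion.
envelope = [-60.0] * 10 + [-20.0, -15.0, -12.0] + [-10.0] * 30
print(extract_features(envelope))
print(extract_features(envelope, duration=80.0))
```

Each segment then yields 5 values, or 6 when duration is included, which can be fed to a classifier.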
33. Results method II (LDA): /x/ vs /k/
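To make the classification step concrete, here is a minimal two-class Fisher LDA in plain Python. It is a generic textbook formulation, not the classifier configuration used in the study, and the two-dimensional training vectors (ROR peak height, one amplitude value) are invented for illustration; they are not results from the experiment.

```python
def mean_vector(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(len(rows[0]))]

def scatter(rows, mu):
    """Sum of outer products of the deviations from the class mean."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [r[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                s[a][b] += diff[a] * diff[b]
    return s

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_lda(class_k, class_x):
    """Return the projection vector and the decision threshold."""
    mu_k, mu_x = mean_vector(class_k), mean_vector(class_x)
    s_k, s_x = scatter(class_k, mu_k), scatter(class_x, mu_x)
    d = len(mu_k)
    dof = len(class_k) + len(class_x) - 2
    pooled = [[(s_k[i][j] + s_x[i][j]) / dof for j in range(d)] for i in range(d)]
    w = solve(pooled, [mu_k[j] - mu_x[j] for j in range(d)])
    threshold = sum(w[j] * (mu_k[j] + mu_x[j]) / 2 for j in range(d))
    return w, threshold

def classify(features, w, threshold):
    projection = sum(wi * fi for wi, fi in zip(w, features))
    return "k" if projection > threshold else "x"

# Invented training data: [ROR peak height, amplitude after the peak].
plosives = [[40, -12], [45, -10], [38, -15], [42, -11]]
fricatives = [[8, -20], [10, -18], [6, -22], [9, -25]]
w, threshold = fit_lda(plosives, fricatives)
print(classify([35, -14], w, threshold), classify([12, -19], w, threshold))
```

The same functions work for longer feature vectors, as long as each class has enough training segments for the pooled covariance matrix to be invertible.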
34. Error detection
GOP:
One general method for all sounds
Error specific knowledge is not used
Acoustic-phonetic approach
Error specific knowledge is used
Works well
How to generalize? (artic. + other features)
Combination?
Other approaches, e.g. post. probs (ANN)?
36. Error detection
Pronunciation errors
11 problematic sounds: 9 V + 2 C
Goal: give feedback on more sounds
Morpho-syntactic errors
maak / maakt / maken
Ik maak
Hij/zij maakt
Wij maken
Goal: also give feedback on morpho-syntactic aspects
37. GOP
GOP has been applied in the exp. system.
The exp. system was effective.
Evaluate GOP
Correct vs. errors
Patterns
Pros & cons
Improve