1 / 28

Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping. Tim Schlippe, 18 August 2014. Outline. Introduction Our semi-automatic pronunciation generation strategy Experiments Analysis of intervals for G2P ( grapheme - to -phoneme) model updates

Download Presentation

Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping Tim Schlippe, 18 August 2014

  2. Outline • Introduction • Our semi-automatic pronunciation generation strategy • Experiments • Analysis ofintervalsfor G2P (grapheme-to-phoneme) modelupdates • Useofsingle G2P converters • Grapheme-to-phoneme convertercombination • Methodsfor initial grapheme-to-phonemeconvertertrainingdata • Conclusion and future work Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  3. Motivation • Pronunciation dictionaries needed for text-to-speech and automatic speech recognition (ASR) • Manual production of pronunciations slow and costly • e.g. 19.2–30s / word for Afrikaans(Davel and Barnard, 2004) • Semi-automatic approaches reduce human effort but • are still time-consuming. • start with an empty dictionary.… and initial grapheme-to-phoneme (G2P) training data is generated manually • usually one single favorite G2P converter is used. Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  4. Semi-automatic pronunciation generation • Manual pronunciation generation supported with estimated pronunciation • Data-driven G2P converter for producing estimated pronunciation iteratively updated Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  5. Semi-automatic pronunciation generation 1. Word-pronunciation pairs to train initial G2P converters. Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  6. Semi-automatic pronunciation generation 2. A word is determined for pronunciation generation. Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  7. Semi-automatic pronunciation generation 3. Pronunciations ofthatwordaregenerated (by multiple G2P models). Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  8. Semi-automatic pronunciation generation 4. Phoneme-level combination (PLC) of different G2P converter outputs. Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  9. Semi-automatic pronunciation generation 5. Display pronunciation to the user. Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  10. Semi-automatic pronunciation generation 6. User edits displayed pronunciation to the correct pronunciation. Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  11. Semi-automatic pronunciation generation 7. Word and corrected pronunciation are added to the dictionary. Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  12. Semi-automatic pronunciation generation 8. After a certain number of words, the G2P converters update their grapheme-to-phoneme models. Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  13. Experimental Setup • Languages • English, German, Spanish, Vietnamese, Swahili, Haitian Creole • Differing in their grade of G2P relationship • Reference dictionaries from CMU, GlobalPhone (Schultz&Schlippe, 2014) • G2P converters • Sequitur G2P (Bisani&Ney, 2008), • Phonetisaurus(Novak et al., 2012), • Default&Refine(Davel, 2005), • CART tree (Lenzo, 1997) • Phoneme-level combination (PLC) (Schlippe et al., 2014) Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  14. Experimental Setup • G2P model update interval (CPU time vs. G2P quality) • Fixed (Update after fixed number of new pronunciations) • Linear (linearlygrowingintervals) • Adaptive • First: after eachword, later:after certain number of predicted pronunciations neededcorrections • Our tradeoff: Logistic growth • Build G2P models frequently for small dictionaries… where generation is cheap and gain is high • Stable G2P models for larger dictionaries are updated less often… while still obeying an upper bound for updates Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  15. Experimental Setup • Evaluated different sources for initial pronunciations • Pronunciations derived from 1:1 G2P mapping (most commonly associated phoneme for each grapheme) • Web-driven G2P converters /Web-derived Pronunciations (Wiktionary) Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  16. Evaluation Metric • Human editing effort measured by • Total number of edits… not comparable between different dictionary sizes or languages • Cumulated phoneme error rate as a normed value: … from beginning of editing process to current word wi… comparable across languages and dictionary sizes • Human effort is analyzed in simulated experiments • … assuming that the developer changes the displayed phoneme sequence to thereference Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  17. Experiments: Single G2Pconverters • Single G2P converters on 10k dictionaries (cPER) • Languages with closer G2P relationship  less effort • Sequiturperformsbest in three of six cases, closely followed by Phonetisaurus • Phonetisaurus, Default&Refineand CART tree perform best in one case each Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  18. Experiments: G2P converter combination • Phoneme-level combination (PLC) • cPER • Always lower user editing effort • On average: 15% relative reduction • Between 1.9% and 38.2% relative depending on the language Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  19. Experiments: Initial Pronunciations • Initial pronunciations • derived from 1:1 G2P mapping, Web-derived pronunciations (WDP) • included in phoneme level combination (PLC) • leave out after performance without them is better(cross-over based on average of all tested languages) • 1:1 G2P mapping provides benefit in the beginning • Wiktionary speeds up whole generation Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  20. Experiments: Initial Pronunciations • Initial pronunciations • derived from 1:1 G2P mapping, Web-derived pronunciations (WDP) • included in phoneme level combination (PLC) • Still 6% reduction on top of PLC with WDP • 1:1 G2P mapping only 3% on top of PLC Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  21. Conclusion and Future Work • Semi-automatic pronunciation generation strategy • Evaluated with languages with different G2P relationship • 15% effort reduction with phoneme-level combination • 6% on top with Web-driven G2P converters • Cross-lingual G2P conversion (Schlippe at al., 2013) for initial pronunciations gave only slight improvement • Future work: • Integrate additional G2P converters • Improve phoneme-level combination • Testingour semi-automaticpronunciationgenerationstrategywith real human developers Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  22. Thank you! Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  23. References

  24. References

  25. Related Work • Semi-Automatic dictionary generation is a common approach • Rapid Language Adaptation Toolkit (RLAT) / SPICE (Schultz et al., 2007), DictionaryMaker(Davel and Barnard, 2004), A. Black, J. Kominek, … (Kominek, 2006) (Maskey et al., 2004) • Different single G2P converters utilized • Words ordered by occurrence or grapheme coverage • G2P models updated after each word, in a data-adaptive way, or by manual specification • Initial hypotheses from initial word-pronunciation pairs or from 1:1 G2P • Web-derived pronunciations for fully automatic dictionary generation(Ghoshal, 2008)(Schlippe et al., 2014) Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  26. Our approach • We evaluate different common intervals for G2P model updates… and propose an optimized variant. • We combine the output of different converters at the phoneme level.… to improve the accuracy of the created hypotheses. • We investigate the effect of additional word-pronunciation pairs: • Grapheme-based pronunciations (1:1 G2P), • Web-derived pronunciations (Wiktionary), • cross-lingual G2P. • staying language-independent… conducting all experiments in a 6-fold cross-validation Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  27. Experiments: G2P model creation intervals • Frequent generation of G2P models • lowers the human effort by improved G2P converter hypotheses • costs a considerable amount of CPU time for growing dictionaries • a good tradeoff optimizes both criteria • Our tradeoff: Logistic growth • Build G2P models frequently for small dictionaries… where generation is cheap and gain is high • Stable G2P models for larger dictionaries are updated less often… while still obeying an upper bound for updates Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

  28. Conclusion Comparison of G2P converter strategies on English Tim Schlippe: Methods for Efficient Semi-Automatic Pronunciation Dictionary Bootstrapping

More Related