This guide provides an in-depth look at implementing and experimenting with language models using the HTK toolkit. It covers the steps from database preparation through n-gram generation, and concludes with observations on perplexity results and the trade-offs of higher-order n-gram models.
Language Model using HTK
Raymond Sastraputera
Overview • Introduction • Implementation • Experimentation • Conclusion
Introduction • Language model • N-gram (a window of consecutive words: word 1 … word 4) • HTK 3.3 • Windows binary
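The "word 1 … word 4" idea above can be sketched in a few lines of Python. This is an illustrative example only (not part of the HTK toolkit): it slides a window of size n over a token sequence to produce the n-grams a language model is trained on.

```python
# Illustrative sketch, independent of HTK: enumerate the n-grams
# (contiguous windows of n tokens) in a tokenised sentence.
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = ["word1", "word2", "word3", "word4"]
print(ngrams(sentence, 2))  # the three bigrams of the 4-word sentence
```

With n = 1 this yields unigrams, n = 2 bigrams, and so on, matching the model orders discussed later in the slides.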
Implementation • Database Preparation • Word map • N-gram file • Mapping OOV words • Vocabulary list
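The OOV-mapping step above can be sketched as follows. This is a hedged, toolkit-independent illustration: any token absent from the vocabulary list is replaced by an unknown-word symbol before n-gram counting, in the spirit of HTK's HLM preparation tools. The symbol name "!!UNK" follows a common HTK convention but is an assumption here.

```python
# Hypothetical sketch (not HTK code): map out-of-vocabulary (OOV)
# tokens to an unknown-word symbol, given a closed vocabulary list.
def map_oov(tokens, vocabulary, unk="!!UNK"):
    """Replace every token not in `vocabulary` with the `unk` symbol."""
    return [t if t in vocabulary else unk for t in tokens]

vocab = {"the", "cat", "sat"}
print(map_oov(["the", "dog", "sat"], vocab))  # "dog" becomes the OOV symbol
```

Keeping OOV words as a single class lets the model assign them a probability instead of failing on unseen test words.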
Implementation • Language model generation • Unigram • Bigram • Trigram • 4-gram • Perplexity
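The perplexity figure that closes this step can be made concrete with a small sketch. This is an assumption-laden illustration, not HTK's LPlex tool itself: perplexity is the inverse geometric mean of the per-word probabilities a model assigns to a test sequence, so lower values mean the model is less "surprised" by the data.

```python
import math

# Illustrative sketch (not HTK code): compute perplexity from the
# probability the model assigned to each test word, in order.
def perplexity(word_probs):
    """Perplexity = exp(-(1/N) * sum of log p(w_i))."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))

# A model that is uniform over a 10-word vocabulary assigns p = 0.1
# to every word, giving a perplexity of (approximately) 10.
print(perplexity([0.1] * 5))
```

This is the quantity compared across the unigram, bigram, trigram, and 4-gram models in the experiments.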
Conclusion and Summary • Higher-order n-grams • Lower perplexity • More memory usage • Too high risks overfitting • Multiple back-offs • Can waste computation time
Reference • 1. HTK (http://htk.eng.cam.ac.uk/)
Thank you • Any Questions?