250 likes | 475 Views
DSP homework 1 HMM Training and Testing. Advisor: Lin-Shan Lee Date: 2007.04.10. Digit Recognizer. Construct a Digit Recognizer Free tools of HMM: HTK http://htk.eng.cam.ac.uk/ Training data, testing data, and some script. http://speech.ee.ntu.edu.tw/courses/DSP/.
E N D
DSP homework 1HMM Training and Testing Advisor: Lin-Shan Lee Date: 2007.04.10
Digit Recognizer • Construct a Digit Recognizer • Free tools of HMM: HTK • http://htk.eng.cam.ac.uk/ • Training data, testing data, and some script. • http://speech.ee.ntu.edu.tw/courses/DSP/
Thanks for HTK! We will focus on these HTK tools
HCopy - Feature extraction HCopy -C lib/hcopy.cfg –S scripts/training_hcopy.scp • Convert wave to 39 dimension MFCC • -C lib/hcopy.cfg • Input and output format e.g. wav -> MFCC_Z_E_D_A • Parameters of feature extraction • Chapter 7 - Speech Signals and Front-end Processing • –S scripts/training_hcopy.scp • A mapping from Input file name to output file name speechdata/training/N110022.wav MFCC/training/N110022.mfc …
Initial MMF prototype • MMF: HTK chapter 7 ~o <VECSIZE> 39 <MFCC_Z_E_D_A> ~h "proto" <BeginHMM> <NumStates> 5 <State> 2 <Mean> 39 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … <Variance> 39 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 … <State> 3 <Mean> 13 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … <Variance> 13 … <TransP> 5 0.0 1.0 0.0 0.0 0.0 0.0 0.5 0.5 0.0 0.0 0.0 0.0 0.5 0.5 0.0 0.0 0.0 0.0 0.5 0.5 0.0 0.0 0.0 0.0 0.0 <EndHMM>
HCompV - Initialize HCompV -C lib/config.fig -o hmmdef -M hmm -S scripts/training.scplib/proto • Compute global mean and variance of features • -C lib/config.fig • Set format of input feature. E.g. MFCC_Z_E_D_A • -o hmmdef -M hmm • Set output name: hmm/hmmdef • -S scripts/training.scp • A list of training data. • lib/proto • A description of a HMM model. HTK MMF format
Initial HMM hmm/models • bin/macro • Construct each HMM • bin/models_1mixsil • Add silence HMM • These are written in C. But you can do these by a text editor. hmm/hmmdef
HERest - Adjust HMM model HERest -Clib/config.cfg–S scripts/training.scp –I labels/Clean08TR.mlf -H hmm/macros -H hmm/models –M hmm lib/models.lst • Adjust parameters to maximize P(O|) • one iteration of EM algorithm • Run this command three times => three iteration • –I labels/Clean08TR.mlf • Set label file to “labels/Clean08TR.mlf” • lib/models.lst • A list of word models. ( liN (零), #i (一), #er (二),… jiou (九), sil )
HERest - Adjust HMM model • Chapter 4 - basic problem 3 for HMM • Give O and an initial model =(A,B,),adjust to maximize P(O|)
HHEd – Modify HMMs bin/spmodel_gen hmm/models hmm/models • Add SP (short pause) HMM definition to MMF file “hmm/models” HHEd -H hmm/macros -H hmm/models –M hmm lib/sil1.hedlib/models_sp.lst • lib/sil1.hed • A list of command to modify HMM definitions • lib/models_sp.lst • A new list of model. ( liN (零), #i (一), #er (二),… jiou (九), sil, sp ) • See HTK book 3.2.2 (p. 33)
HERest - Adjust HMM model again HERest -Clib/config.cfg–S scripts/training.scp –I labels/Clean08TR_sp.mlf -H hmm/macros -H hmm/models –M hmm lib/models_sp.lst
HHEd - Increase numbers of mixture HHEd -H hmm/macros -H hmm/models -M hmm lib/mix2_10.hed lib/models_sp.lst • Increase numbers of mixture
HERest - Adjust HMM model again HERest -Clib/config.cfg–S scripts/training.scp –I labels/Clean08TR_sp.mlf-H hmm/macros -H hmm/models –M hmm lib/models_sp.lst
HParse – construct word net HParse lib/grammar_splib/wdnet_sp • lib/grammar_sp • Regular expression • Lib/wdnet_sp • Output Word net
HVite – Viterbi search HVite -H hmm/macros -H hmm/models -S scripts/testing.scp -C lib/config.cfg -w lib/wdnet_sp -l '*' -i result/result.mlf -p 0.0 -s 0.0 lib/dictlib/models_sp.lst • Chapter 8.0 - Search Algorithms for Speech Recognition • -w lib/wdnet_sp • Input word net • -i result/result.mlf • Output MLF file • lib/dict • Dictionary: a mapping from word to phone sequences • E.g. ling -> liN, er -> #er, … . 一 -> sic_i i, 七-> chi_i i
HResult – compare with answers HResults -e "???" sil -e "???" sp -I labels/answer.mlflib/models_sp.lstresult/result.mlf • Longest Common Subsequence (LCS) ====================== HTK Results Analysis ==================== Date: Mon Apr 9 19:34:02 2007 Ref : labels/answer.mlf Rec : result/result.mlf ------------------------ Overall Results -------------------------- SENT: %Correct=91.88 [H=441, S=39, N=480] WORD: %Corr=98.56, Acc=97.41 [H=1713, D=17, S=8, I=20, N=1738] ============================================================
Requirement 1 – run baseline Download homework package • Set PATH for HTK tools • execute (bash shell script): • 01_run_HCopy.sh • 02_run_HCompV.sh • 03_training.sh • 04_testing.sh • You can find acc in result/accuracy
Submit your homework • Request files: • result/accuracy – the best acc you have done • hmm/models – your HMM models • Any script you wrote or modified • A document describe your training process and accuracy • Compress all files above into a zip file named “hw1_b9xxxxxx.zip” • E-mail to TA with subject: [DSP] hw1_b9xxxxx <your name> and your zip file • Deadline: 2007.4.25 0:00
Suggestion and other information • Read HTK book chapter 3 first. (except 3.3, 3.5) • Many hints are in the shell scripts. • Perl or a good editor is your good friend. • If you have problems about bash shell script, see • http://linux.vbird.org/linux_basic/0340bashshell-scripts.php • If you want to run HTK under windows • Cygwin + HTK windows binary. • If you have any question, mail to TA • 呂東烜r95041@csie.ntu.edu.tw