CALO Decoder Progress Report for April/May
Arthur (Decoder and ICSI Training)
Jahanzeb (Decoder)
Yitao (Decoder)
Ziad (ICSI Training)
Moss (ICSI Training)
Carnegie Mellon University
Apr 13, 2004

This Presentation (5 pages)
• Progress report for April/May
• In March
  • Sphinx 3.4 was not ready
  • Script conversion for training had only just started
• In April/May
  • Sphinx 3.4 is ready for release
  • First cut of AM and LM is done. (Thanks to Rita!)
Decoder
• Speed
  • Compiler optimization
    • Doesn't beat loop unrolling
    • However, not using -D and using -ffast-math helps
  • Phoneme lookahead is complete
• Accuracy
  • Trained a continuous HMM using all Communicator data. (S2 17% -> S3.4 14%)
  • 64 mixtures will give us 12% ERR.
  • Speed not yet tuned.
• Outlook in next 3 months (in 3.5)
  • WSJ: potential speed-up problem in tasks with > 5000 words.
  • Speaker adaptation: VTLN, MLLR, and techniques for fast enrollment
  • Front-end transformation: LDA, HLDA, …
  • Model combination experiment.
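The accuracy figures on this slide are easier to compare as relative reductions. The following is a minimal sketch, assuming the quoted percentages are error rates measured on the same test set (the slide does not state this explicitly); `relative_reduction` is a hypothetical helper name, not part of Sphinx.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Relative error reduction of `new` against `baseline`, as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline - new) / baseline


# Figures quoted on the slide (assumed to be error rates in percent).
s2_err = 17.0      # Sphinx 2
s34_err = 14.0     # Sphinx 3.4, continuous HMM
s34_64mix = 12.0   # Sphinx 3.4 with 64 mixtures, as projected on the slide

print(f"S2 -> S3.4: {relative_reduction(s2_err, s34_err):.1%} relative")
print(f"S2 -> S3.4 (64 mixtures): {relative_reduction(s2_err, s34_64mix):.1%} relative")
```

Under that assumption, moving from 17% to 14% is roughly a 17.6% relative reduction, and reaching 12% would be roughly 29.4%.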
Decoder (Software)
• Release this week
  • Mainly to replace buggy s3.3
• Not included (will be in 3.5)
  • Live-mode APIs (Yitao, 80% complete)
• Outlook in next 3 months
  • Can learn from AHTK 1.3
  • Access to the models' parameters?
  • Server interface?
  • Confidence measures?
ICSI Training
• Moss – LM training (done)
• Arthur/Ziad – 1-meeting training (done)
• Rita (Thanks!) – all-meetings training (done)
• Current result (16 mixtures, LM trained on in-domain meetings) – 36.4%
  • The test set differs from the standard test set.
• Outlook in next 3 months
  • Learn the magic from Rita.
  • Wrap up the training script with Perl.
  • (Optional) Find a better test set.
  • Start to improve the performance.
    • With speaker adaptation technology
    • Class-based LM
    • PLP
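For reference, a figure such as the 36.4% above is typically computed as a word error rate. The sketch below assumes that reading (the slide does not name the metric) and uses a plain word-level edit distance; `word_error_rate` is a hypothetical helper, not the scoring tool used by the authors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up sentences for illustration only.
print(word_error_rate("schedule a meeting on friday", "schedule meeting on friday"))
```

Because insertions count as errors, this measure can exceed 100% on very poor hypotheses.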
Infrastructure/Miscellaneous
• CVS is set up for
  • MRCP
  • ICSI conversion script
  • Sphinx (will move back to SourceForge.)
• Development is transitioning.
  • Check in training scripts?
• Scylla and Karybdis are running
  • On a separate queue.
• Documentation for Sphinx.
  • 3rd draft of outline completed. (9 chapters left.)
  • …
• Outlook in next 3 months
  • Continue to maintain CVS. User education.
  • Complete the 2nd draft of the documentation.
Outlook in next 3 months
• Incorporate transform-based technology
• Speed-up for tasks with > 5k words.
• Further improve ICSI training using all available resources
• Transition development to CVS.