Mandarin STD 2006 Results • http://www.nist.gov/speech/tests/std • Jonathan Fiscus, Jérôme Ajot, George Doddington • December 14-15, 2006 • 2006 Spoken Term Detection Workshop
Outline • Evaluation results • Corpus statistics • Participants • Results • Future directions
Evaluation Corpora • Data sources: Rich Transcription 2003 and 2004 Test Set • BNews: China Central Television, Radio Free Asia, New Tang Dynasty TV (~1 hour) • CTS: Hong Kong Univ. of Science and Tech. collection (~1 hour) • Transcripts: • Transcribed by the LDC for EARS • Word segmentation done by a native speaker • We need to check the results to see whether this had an impact
Mandarin Term Profile • Followed the same selection protocol as English • Trigrams, bigrams, and unigrams • Selected by Y.C. Chang
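Because terms are word n-grams, the candidate pool depends on how the transcript was segmented. The sketch below is an illustration only, not the selection tool used for the evaluation: `ngram_counts` is a hypothetical helper and the two utterances are made-up data. It counts unigrams, bigrams, and trigrams over the same character string under two different segmentations.

```python
from collections import Counter


def ngram_counts(segmented_utterances, max_n=3):
    """Count word n-grams (n = 1..max_n) over pre-segmented utterances.

    Each utterance is a list of words. N-grams never span utterances.
    """
    counts = Counter()
    for words in segmented_utterances:
        for n in range(1, max_n + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts


# Same characters, two plausible segmentations (hypothetical data).
seg_a = [["我", "在", "北京大学", "读书"]]       # "北京大学" kept as one word
seg_b = [["我", "在", "北京", "大学", "读书"]]   # split into "北京" + "大学"

counts_a = ngram_counts(seg_a)
counts_b = ngram_counts(seg_b)

# The unigram term "北京" exists only under the second segmentation.
print(counts_a[("北京",)], counts_b[("北京",)])
```

Under the first segmentation the unigram "北京" never appears as a candidate, while under the second it does, which is one way the segmenter's choices could affect both term selection and scoring.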
Conclusions • Evaluation completed • Highest ATWV on CTS data: 0.346 • Still lots to do for this data • Study the impact of word segmentation on scoring • Conditioned analysis needs to be completed
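For readers unfamiliar with the metric, ATWV (Actual Term-Weighted Value) is the term-weighted value computed from a system's actual YES/NO decisions. As defined in the STD 2006 evaluation plan, it is one minus the average over terms of P_miss + β · P_FA, with β = (C/V)(1/Pr_term − 1) = 999.9 for C/V = 0.1 and Pr_term = 10⁻⁴. The sketch below is a minimal illustration, not the NIST scoring tool: the `TermCounts` class, the function name, and the example counts are assumptions, and the counts are invented rather than taken from the evaluation.

```python
from dataclasses import dataclass

COST_VALUE_RATIO = 0.1   # C/V in the STD 2006 evaluation plan
TERM_PRIOR = 1e-4        # Pr_term in the STD 2006 evaluation plan
BETA = COST_VALUE_RATIO * (1.0 / TERM_PRIOR - 1.0)  # approximately 999.9


@dataclass
class TermCounts:
    n_true: int      # reference occurrences of the term
    n_correct: int   # correctly detected occurrences
    n_spurious: int  # false alarms


def term_weighted_value(terms, speech_seconds, beta=BETA):
    """TWV = 1 - mean over terms of (P_miss + beta * P_FA).

    Terms with no reference occurrences are skipped because P_miss
    is undefined for them. Non-target trials are approximated as one
    per second of speech, minus the true occurrences.
    """
    penalties = []
    for t in terms:
        if t.n_true == 0:
            continue
        p_miss = 1.0 - t.n_correct / t.n_true
        p_fa = t.n_spurious / (speech_seconds - t.n_true)
        penalties.append(p_miss + beta * p_fa)
    if not penalties:
        raise ValueError("no terms with reference occurrences")
    return 1.0 - sum(penalties) / len(penalties)


# Hypothetical counts for two terms over one hour of speech.
example = [TermCounts(10, 8, 1), TermCounts(5, 5, 0)]
score = term_weighted_value(example, speech_seconds=3600)
print(round(score, 3))
```

A perfect system scores 1.0 and a system that outputs nothing scores 0.0. Because β is large, a single false alarm in an hour of speech costs roughly as much as missing 28% of a term's occurrences, so false alarms weigh heavily on the score.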