N-best list reranking using higher level phonetic, lexical, syntactic and semantic knowledge sources
Mithun Balakrishna, Dan Moldovan and Ellis K. Cave
Presenter: Hsuan-Sheng Chiu
M. Balakrishna, D. Moldovan and E. K. Cave, “N-best list reranking using higher level phonetic, lexical, syntactic and semantic knowledge sources”, ICASSP 2006 • Substantial improvements can be gained by applying a strong post-processing mechanism such as reranking, even at a small n-best depth
Proposed architecture • Reduce LVCSR WER by applying the phonetic, lexical, syntactic and semantic knowledge sources in tandem, so that they complement each other
Features • Score of hypothesis
Features (cont.) • Phonetic features • SVM Phoneme Class Posterior Probability
Features (cont.) • LVCSR-SVM Phoneme Classification Accuracy Probability
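The slide does not give the exact scoring formula, but a minimal sketch of how per-phoneme SVM class posteriors could be aggregated into a single hypothesis-level phonetic score might look as follows (function and variable names are my own, not the authors'):

# Hedged sketch: turn per-phoneme SVM class posteriors into one phonetic
# score per hypothesis (an illustrative aggregation, not the paper's exact formula).
import math

def phonetic_score(phoneme_posteriors):
    # phoneme_posteriors: one posterior P(phoneme class | acoustic segment)
    # per phoneme in the hypothesis' alignment.
    if not phoneme_posteriors:
        return float("-inf")
    # Length-normalized sum of log posteriors, so hypotheses of different
    # lengths remain comparable.
    return sum(math.log(max(p, 1e-12)) for p in phoneme_posteriors) / len(phoneme_posteriors)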
Features (cont.) • Lexical Features • Use n-best list word boundary information (avoiding string alignment) and score each hypothesis based on the presence of dominant words found across the n-best list (see the sketch below) • Syntactic Features • Use an immediate-head parser, since n-best reranking does not impose a left-to-right processing constraint • Semantic Features • Use the semantic parser ASSERT to extract statistical semantic knowledge
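As a rough illustration of the lexical feature, here is a sketch that treats "dominant words" as words occurring in a large fraction of the n-best hypotheses; this particular selection criterion and the 0.5 threshold are assumptions for illustration, not the paper's definition:

# Hedged sketch of the lexical feature: reward hypotheses that contain
# words dominant across the n-best list (criterion and threshold are illustrative).
from collections import Counter

def dominant_words(nbest, min_fraction=0.5):
    # nbest: list of hypothesis strings.
    counts = Counter()
    for hyp in nbest:
        counts.update(set(hyp.split()))
    return {w for w, c in counts.items() if c / len(nbest) >= min_fraction}

def lexical_score(hyp, dominant):
    # Fraction of the hypothesis' words that are dominant in the n-best list.
    words = hyp.split()
    return sum(1 for w in words if w in dominant) / max(len(words), 1)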
Experimental results • The reranking score is a simple linear weighted combination of the individual scores from each knowledge source (sketched below) • The proposed reranking mechanism achieves its best WER improvement at the 15-best depth, with a 2.9% absolute WER reduction • This is not very surprising, since nearly 80% of the total WER improvement indicated by the Oracle lies within the 20-best hypotheses
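A minimal sketch of the linear weighted combination used for reranking (the feature names are placeholders, and how the weights are tuned is not shown here):

# Hedged sketch: combine per-hypothesis knowledge-source scores linearly
# and return the n-best indices reordered by the combined score.
def rerank(nbest_scores, weights):
    # nbest_scores: list of dicts, one per hypothesis, e.g.
    #   {"lvcsr": ..., "phonetic": ..., "lexical": ..., "syntactic": ..., "semantic": ...}
    # weights: dict with the same keys.
    def combined(scores):
        return sum(weights[name] * value for name, value in scores.items())
    return sorted(range(len(nbest_scores)),
                  key=lambda i: combined(nbest_scores[i]),
                  reverse=True)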
Efficient estimation of language model statistics of spontaneous speech via statistical transformation model
Yuya Akita, Tatsuya Kawahara
Efficient estimation of language model statistics of spontaneous speech via statistical transformation model • Estimate LM statistics of spontaneous speech from a document-style text database • Analogous to a statistical machine translation model, with P(X|Y) as the translation (transformation) probability • The transformation model yields counts of n-word sequences for spontaneous speech (see the formula below)
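In my own notation (not reproduced from the paper), the count transformation can be sketched as follows, where N_W(y) is the count of a written-style n-word sequence y, P(x|y) the transformation probability, and N_S(x) the estimated spoken-style count:

N_S(x) \;\approx\; \sum_{y} P(x \mid y)\, N_W(y)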
Three characteristics of spontaneous speech • Insertion of fillers • Fillers must be removed from transcripts for documentation • Deletion of postpositional particles • Particles indicating the nominative case are often omitted, while those indicating the possessive case are rarely dropped • Substitution of colloquial expressions • Colloquial expressions must always be corrected in document-style text
Transformation probability • Back-off scheme for the POS-based model (see the sketch below)
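A sketch of one plausible back-off, assuming the word-based transformation probability is used when the word pattern has been observed in the parallel training data and the POS-class probability otherwise (the exact discounting or interpolation weights are not reproduced here):

P(x \mid y) =
\begin{cases}
P_{\text{word}}(x \mid y) & \text{if the word pattern } (x, y) \text{ is observed in training} \\
P_{\text{POS}}\bigl(\text{pos}(x) \mid \text{pos}(y)\bigr) & \text{otherwise}
\end{cases}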
Experimental setup • Document-style text (for the baseline model): National Congress of Japan, 71M words • Training data for the transformation model: 666K words • Test data: 63K words • Comparison corpus: Corpus of Spontaneous Japanese (CSJ), 2.9M words