
Introduction: The Structured Language Model (SLM)

Smoothing Issues in the Structured Language Model. [Figure labels: partial parse tree (VP, PP, NP nodes); N-best rescoring: Speech, Speech Recognizer (Baseline LM), 100 Best Hyp., Rescoring (New LM), 1 hypothesis.]


Presentation Transcript


  1. Smoothing Issues in the Structured Language Model

     Woosung Kim, Sanjeev Khudanpur, and Jun Wu
     The Center for Language and Speech Processing, The Johns Hopkins University
     3400 N. Charles Street, Barton Hall, Baltimore, MD 21218
     {woosung, sanjeev, junwu}@clsp.jhu.edu

     Introduction

     The Structured Language Model (SLM)
     - An attempt to exploit the syntactic structure of natural language
     - Consists of a predictor, a tagger and a parser
     - Jointly assigns a probability to a word sequence and parse structure
     - Still suffers from the data sparseness problem; Deleted Interpolation (DI) has been used
     → Use of Kneser-Ney smoothing to improve the performance

     The Structured Language Model (SLM): Example of a Partial Parse

     [Figure: partial parse tree. Node labels: ended, VP, with, PP, loss, NP, of, contract, PP, NP, cents, NP, loss, NP.]
     Words:    The contract ended with a loss of 7 cents after
     POS tags: DT NN VBD IN DT NN IN CD NNS

     Probability estimation in the SLM

     [Equations not recovered. Labels: parse tree probability, predictor, tagger, parser.]

     Language Model Perplexity

     [Equation not recovered. Label: LM PPL.]

     Kneser-Ney Smoothing

     [Equations not recovered. Labels: Backoff, Nonlinear Interpolation.]

     N-Best Rescoring

     [Figure: Speech → Speech Recognizer (Baseline LM) → 100 Best Hyp → Rescoring (New LM) → 1 hypothesis.]

     Experiment Result

     • Two corpora
       • Wall Street Journal (WSJ) UPenn Treebank: for LM PPL test
       • Switchboard (SWB): for ASR WER test as well as LM PPL
     • Tokenization
       • Original SWB tokenization. Examples: They're, It's, etc. → Not suitable for syntactic analysis
       • Treebank tokenization. Examples: They 're, It 's, etc.

     [Tables and plots not recovered: Database Size Specifications (in Words); Test Set PPL as a Function of λ; ASR WER for SWB.]

     Concluding Remarks

     • KN smoothing of the SLM shows modest but consistent improvements in both PPL and WER

     Future Work

     • SLM with Maximum Entropy Models
     • But Maximum Entropy Model training requires heavy computation
     • Fruitful results in the selection of features for the Maximum Entropy Models

     This research was partially supported by the U.S. National Science Foundation via STIMULATE grant No. 9618874
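The poster's equations for Kneser-Ney smoothing and perplexity did not survive in the transcript above. As a rough illustration of the two ideas only, here is a minimal sketch of interpolated Kneser-Ney smoothing for a plain word bigram model, together with a perplexity computation. It is not the SLM's predictor, tagger, or parser model, and it is not the authors' implementation. The function names, the toy corpus, and the fixed discount of 0.75 are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_kn_bigram(sentences, discount=0.75):
    """Return prob(w, v) = P(w | v) under interpolated Kneser-Ney (bigram only).

    Assumption: a single fixed discount; real systems usually estimate it.
    """
    bigrams = Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        bigrams.update(zip(tokens, tokens[1:]))

    context_count = Counter()        # c(v): how often v occurs as a context
    followers = defaultdict(set)     # distinct words seen after v
    preceders = defaultdict(set)     # distinct words seen before w
    for (v, w), c in bigrams.items():
        context_count[v] += c
        followers[v].add(w)
        preceders[w].add(v)
    n_bigram_types = len(bigrams)

    def prob(w, v):
        # Continuation probability: share of bigram types that end in w.
        p_cont = len(preceders.get(w, ())) / n_bigram_types
        c_v = context_count.get(v, 0)
        if c_v == 0:
            return p_cont            # unseen context: lower-order model only
        discounted = max(bigrams.get((v, w), 0) - discount, 0.0) / c_v
        # Mass removed by discounting is redistributed via p_cont.
        backoff_weight = discount * len(followers[v]) / c_v
        return discounted + backoff_weight * p_cont

    return prob


def perplexity(prob, sentences):
    """exp of the negative average log-probability per predicted token."""
    log_sum, n = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for v, w in zip(tokens, tokens[1:]):
            log_sum += math.log(prob(w, v))
            n += 1
    return math.exp(-log_sum / n)


# Toy corpus (hypothetical), loosely based on the poster's example sentence.
train_sentences = [
    "the contract ended with a loss".split(),
    "the contract ended with a gain".split(),
    "a loss of 7 cents".split(),
]
prob = train_kn_bigram(train_sentences)
ppl = perplexity(prob, ["the contract ended with a loss".split()])
print(f"test-set perplexity: {ppl:.3f}")
```

This sketch assigns zero probability to a word that never occurred in training, so the test sentence here uses only in-vocabulary words. A practical model would reserve probability mass for unknown words.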
