Effects of Query Expansion for Spoken Document Passage Retrieval • Tomoyosi Akiba, Koichiro Honda • INTERSPEECH 2011 • Pei-Ning Chen, NTNU CSIE SLP Lab
Outline • Introduction • Passage Retrieval for Spoken Document • Query Expansion for SDR • Experiments • Conclusions
Introduction • Because confirming the content of a spoken document requires playing back its audio data, browsing speech data is much more difficult and time-consuming than browsing textual data. • The authors apply relevance models, a query expansion method, to the spoken document passage retrieval task. They adapt the original relevance model to passage retrieval and extend it to draw on massive collections of Web documents for query expansion.
Retrieval Methods for Passage Retrieval • Using the Neighboring Context to Index the Passage • In passage retrieval, passages from the same lecture may be related to each other, whereas in conventional document retrieval the target documents are considered independent of each other. • Penalizing Neighboring Retrieval Results • With context indexing, neighboring passages are liable to be retrieved together because they share the same indexing words.
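
The two ideas on this slide can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `window` and `penalty` parameters, and the greedy re-ranking scheme are all assumptions made for illustration, and retrieval scores are assumed to be positive.

```python
from collections import Counter

def context_index(passages, window=1):
    """Index each passage with its own words plus those of its neighbors.

    `passages` is a list of token lists from one lecture, in temporal order.
    `window` (hypothetical parameter) is the number of neighbors on each side.
    """
    indexed = []
    for i in range(len(passages)):
        lo = max(0, i - window)
        hi = min(len(passages), i + window + 1)
        bag = Counter()
        for tokens in passages[lo:hi]:
            bag.update(tokens)
        indexed.append(bag)
    return indexed

def penalize_neighbors(scores, penalty=0.5):
    """Greedy re-ranking that scales down the neighbors of each selected passage.

    `scores[i]` is the (positive) retrieval score of passage i.
    `penalty` (hypothetical parameter) multiplies the score of passages
    adjacent to one that has already been placed in the ranking.
    """
    remaining = dict(enumerate(scores))
    ranking = []
    while remaining:
        # Highest score wins; ties go to the earlier passage.
        best = max(remaining, key=lambda i: (remaining[i], -i))
        ranking.append(best)
        del remaining[best]
        for neighbor in (best - 1, best + 1):
            if neighbor in remaining:
                remaining[neighbor] *= penalty
    return ranking
```

With `window=1`, each passage shares index terms with the passages on either side of it, which is exactly why adjacent passages tend to be retrieved together and why some form of penalty is needed.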
Query Expansion for SDR • Relevance Models • Extending Relevance Models to Context Indexing • Extending Relevance Models using Web
Extending Relevance Models using Web • Linear interpolation: the two models are linearly interpolated. • Document weighting: the Web model is used to weight the target documents.
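
As a rough illustration of a relevance model and of the linear-interpolation variant, the sketch below uses the standard textbook form of the relevance model (query-likelihood-weighted mixture of document language models with a uniform document prior). The slide does not preserve the authors' formulas, so the smoothing scheme, the parameters `mu` and `lam`, and all function names are assumptions; the document-weighting variant is not sketched.

```python
from collections import Counter
from math import prod

def doc_model(tokens, vocab, mu=0.1):
    """Additively smoothed unigram model P(w|D); `mu` is an assumed constant."""
    counts = Counter(tokens)
    total = len(tokens) + mu * len(vocab)
    return {w: (counts[w] + mu) / total for w in vocab}

def relevance_model(query, docs):
    """P(w|R) proportional to sum over D of P(w|D) * P(Q|D), uniform prior on D."""
    vocab = sorted({w for d in docs for w in d} | set(query))
    models = [doc_model(d, vocab) for d in docs]
    # Query likelihood of each document under its unigram model.
    likelihoods = [prod(m[q] for q in query) for m in models]
    z = sum(likelihoods)
    return {
        w: sum(l * m[w] for l, m in zip(likelihoods, models)) / z
        for w in vocab
    }

def interpolate(p_target, p_web, lam=0.5):
    """Linear interpolation: lam * P_target(w|R) + (1 - lam) * P_web(w|R)."""
    vocab = set(p_target) | set(p_web)
    return {
        w: lam * p_target.get(w, 0.0) + (1 - lam) * p_web.get(w, 0.0)
        for w in vocab
    }
```

The highest-probability words of the resulting distribution would then serve as expansion terms for the original query; how many to keep is a tuning choice not specified here.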
Conclusions • The authors applied relevance models to the spoken document passage retrieval task. • They also extended the models to take advantage of massive collections of Web documents for query expansion. • To improve the performance of the Web extension of relevance models, filtering out noisy Web documents might be necessary. • In future work, the authors will apply Web document filtering methods to select only the documents most related to the target documents.
Speech Indexing Using Semantic Context Inference • Chien-Lin Huang, Bin Ma, Haizhou Li and Chung-Hsien Wu • INTERSPEECH 2011
Outline • Introduction • Semantic Context Inference • Experiments • Conclusions
Introduction • The indexing techniques of text-based information retrieval have been widely adopted in spoken document retrieval. • However, because of imperfect speech recognition results, out-of-vocabulary words, and ambiguity in homophones and word tokenization, conventional text-based indexing techniques are not always appropriate for spoken document retrieval.
Semantic Context Inference (SCI) • The authors propose the semantic context inference representation, which finds the semantic relations between terms and suggests semantic term expansion for speech indexing.
Semantic relation matrix • A spoken document database comprises an accumulation of spoken documents, from which the document-by-term matrix is built.
SCI for indexing • Summing all the semantic inference vectors for the spoken document d yields the semantic context inference vector.
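
The slides do not preserve the formulas for the semantic relation matrix or the inference vectors, so the following Python sketch only illustrates the general shape of the idea under loudly stated assumptions: the relation between two terms is taken to be the cosine similarity of their columns in the document-by-term matrix, and the inference vector of a term is its row of that matrix scaled by the term's count in the document. Both choices, and the function names, are hypothetical and may differ from the paper.

```python
from math import sqrt

def relation_matrix(doc_term):
    """Term-by-term cosine similarity of the columns of a document-by-term
    matrix (ASSUMED definition of the semantic relation matrix)."""
    n_terms = len(doc_term[0])
    cols = [[row[j] for row in doc_term] for j in range(n_terms)]
    norms = [sqrt(sum(x * x for x in c)) for c in cols]
    rel = [[0.0] * n_terms for _ in range(n_terms)]
    for i in range(n_terms):
        for j in range(n_terms):
            if norms[i] and norms[j]:
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                rel[i][j] = dot / (norms[i] * norms[j])
    return rel

def sci_vector(doc_counts, rel):
    """Sum the per-term inference vectors of one document.

    The inference vector of term t is ASSUMED to be count(t, d) * rel[t].
    """
    n_terms = len(rel)
    vec = [0.0] * n_terms
    for t, count in enumerate(doc_counts):
        if count:
            for j in range(n_terms):
                vec[j] += count * rel[t][j]
    return vec
```

The point of the sketch is the expansion effect: a term that never appears in a document's transcript can still receive weight in its index vector if it is strongly related to terms that do appear.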
Retrieval model • For spoken document retrieval, the authors adopt the vector space model, which is widely used in information retrieval and offers highly efficient retrieval by representing each document as a feature vector.
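
Vector space retrieval is commonly implemented by ranking documents on the cosine similarity between the query vector and each document vector. The sketch below assumes that similarity measure, which the slide does not name explicitly; the function names are illustrative.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query_vec, doc_vecs):
    """Return document indices ordered by decreasing cosine similarity.

    Ties are broken by the lower document index.
    """
    scored = [(cosine(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    return [i for _, i in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

For example, `rank([1, 0], [[0, 1], [1, 0], [1, 1]])` places document 1 first, since it points in the same direction as the query.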
Experiments • To measure both the accuracy of the retrieved documents and the ranking positions of the relevant documents, the authors evaluate with mean average precision (MAP).
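
For reference, mean average precision can be computed as in the following sketch, which uses the standard definition (average of the precision values at the rank of each relevant document, then averaged over queries). The function names and input layout are illustrative choices, not taken from the paper.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for position, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / position  # precision at this relevant hit
    # Relevant documents that were never retrieved contribute zero.
    return total / len(relevant)

def mean_average_precision(runs):
    """`runs` is a list of (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

Because each relevant document contributes the precision at its own rank, MAP rewards systems that place relevant documents near the top rather than merely retrieving them somewhere in the list.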
Conclusions • The proposed semantic context inference explores latent semantic information and extends speech indexing with semantically related terms. • The semantic context inference vector can be regarded as a re-weighted indexing vector, a form of query expansion that helps overcome speech recognition errors.