
Pei-Ning Chen, NTNU CSIE SLP Lab

Effects of Query Expansion for Spoken Document Passage Retrieval. Tomoyosi Akiba, Koichiro Honda. INTERSPEECH 2011. Pei-Ning Chen, NTNU CSIE SLP Lab. Outline: Introduction; Passage Retrieval for Spoken Document; Query Expansion for SDR; Experiments; Conclusions.


Presentation Transcript


  1. Effects of Query Expansion for Spoken Document Passage Retrieval. Tomoyosi Akiba, Koichiro Honda. INTERSPEECH 2011. Pei-Ning Chen, NTNU CSIE SLP Lab

  2. Outline • Introduction • Passage Retrieval for Spoken Document • Query Expansion for SDR • Experiments • Conclusions

  3. Introduction • Because confirming the content of a spoken document requires playing back its audio, browsing speech data is much more difficult and time-consuming than browsing text. • The authors apply relevance models, a query expansion method, to the spoken document passage retrieval task. They adapt the original relevance model to passage retrieval and extend it to exploit massive collections of Web documents for query expansion.

  4. Retrieval Methods for Passage Retrieval • Using the Neighboring Context to Index the Passage • In passage retrieval, passages from the same lecture may be related to each other, whereas conventional document retrieval treats the target documents as independent. • Penalizing Neighboring Retrieval Results • With context indexing, neighboring passages tend to be retrieved together because they share the same indexing words.
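
The two ideas on this slide can be illustrated with a minimal sketch. The transcript does not give the authors' actual formulas, so everything here is an assumption made for illustration: the window size, the `context_weight` and `penalty` values, the toy term-overlap score, and the greedy re-ranking are all hypothetical choices, not the paper's method.

```python
from collections import Counter


def context_index(passages, window=1, context_weight=0.5):
    """Index each passage with its own terms plus down-weighted neighbor terms."""
    indexed = []
    for i in range(len(passages)):
        bag = Counter()
        lo, hi = max(0, i - window), min(len(passages), i + window + 1)
        for j in range(lo, hi):
            weight = 1.0 if j == i else context_weight  # assumed weighting
            for term in passages[j].lower().split():
                bag[term] += weight
        indexed.append(bag)
    return indexed


def score(query, bag):
    """Toy score: summed index weights of the query terms."""
    return sum(bag[t] for t in query.lower().split())


def rank_with_neighbor_penalty(query, passages, penalty=0.5, window=1):
    """Greedy ranking that discounts passages adjacent to ones already chosen."""
    bags = context_index(passages, window)
    scores = [score(query, b) for b in bags]
    remaining = set(range(len(passages)))
    ranking = []
    while remaining:
        # Highest score first; ties go to the earlier passage.
        best = max(remaining, key=lambda i: (scores[i], -i))
        ranking.append(best)
        remaining.remove(best)
        for j in (best - 1, best + 1):
            if j in remaining:
                scores[j] *= penalty  # assumed multiplicative penalty
    return ranking


lecture = [
    "today we discuss speech recognition",
    "query expansion adds related terms",
    "relevance models estimate term probabilities",
    "thank you for listening",
    "query logs are useful",
]
ranking = rank_with_neighbor_penalty("query expansion", lecture)
print(ranking)
```

In this toy lecture, passages 0 and 2 match the query only through terms borrowed from passage 1, so the penalty pushes them below the more distant passage 4.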

  5. Query Expansion for SDR • Relevance Models • Extending Relevance Models to Context Indexing • Extending Relevance Models using Web

  6. Linear interpolation: • the two models are linearly interpolated: • Document weighting: • the Web model is used to weight the target documents:
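
The formulas for this slide did not survive in the transcript. As a rough sketch of the linear-interpolation variant only, the code below estimates a standard RM1-style relevance model, P(w|R) proportional to the sum over documents of P(w|D) times the query likelihood, once from the target collection and once from Web documents, and then mixes the two. The smoothing constant `mu`, the mixing weight `lam`, and all function names are assumptions, not values or definitions from the paper.

```python
from collections import Counter


def doc_model(text, vocab, mu=0.1):
    """Unigram P(w|D) with additive smoothing over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    total = sum(counts.values()) + mu * len(vocab)
    return {w: (counts[w] + mu) / total for w in vocab}


def relevance_model(query, docs, vocab):
    """RM1-style estimate: P(w|R) ~ sum_D P(w|D) * prod_q P(q|D)."""
    rm = dict.fromkeys(vocab, 0.0)
    for text in docs:
        p = doc_model(text, vocab)
        likelihood = 1.0
        for q in query.lower().split():
            likelihood *= p[q]
        for w in vocab:
            rm[w] += p[w] * likelihood
    norm = sum(rm.values())
    return {w: v / norm for w, v in rm.items()}


def interpolate(target_rm, web_rm, lam=0.5):
    """Linear interpolation of two relevance models (lam is an assumed weight)."""
    return {w: lam * target_rm[w] + (1 - lam) * web_rm[w] for w in target_rm}


query = "speech retrieval"
target_docs = ["speech retrieval uses recognition output",
               "passage retrieval for lectures"]
web_docs = ["retrieval of web search results",
            "speech synthesis and recognition"]

# The vocabulary covers both collections and the query so every P(q|D) is defined.
vocab = sorted(set(" ".join(target_docs + web_docs + [query]).lower().split()))

target_rm = relevance_model(query, target_docs, vocab)
web_rm = relevance_model(query, web_docs, vocab)
mixed = interpolate(target_rm, web_rm, lam=0.7)

# Expansion terms would be the highest-probability words of the mixed model.
top_terms = sorted(mixed, key=mixed.get, reverse=True)[:5]
print(top_terms)
```

The document-weighting variant named on the slide is not sketched here, because the transcript gives no detail on how the Web model weights the target documents.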

  7. Experiments

  8. Experiments

  9. Conclusions • The authors applied relevance models to the spoken document passage retrieval task. • They also extended the models to take advantage of massive collections of Web documents for query expansion. • To improve the performance of the Web extension of relevance models, filtering out noisy Web documents might be necessary. • In future work, they plan to apply Web document filtering methods to select only the documents most related to the target documents.

  10. Speech Indexing Using Semantic Context Inference. Chien-Lin Huang, Bin Ma, Haizhou Li and Chung-Hsien Wu. INTERSPEECH 2011

  11. Outline • Introduction • Semantic Context Inference • Experiments • Conclusions

  12. Introduction • The indexing techniques of text-based information retrieval have been widely adopted in spoken document retrieval. • However, because of imperfect speech recognition results, out-of-vocabulary words, and ambiguity in homophones and word tokenization, conventional text-based indexing techniques are not always appropriate for spoken document retrieval.

  13. Semantic Context Inference (SCI) • The authors propose the semantic context inference representation, which finds the semantic relations between terms and suggests semantic term expansion for speech indexing.

  14. Semantic relation matrix • A spoken document database comprises an accumulation of spoken documents, from which the document-by-term matrix is constructed.

  15. SCI for indexing • By summing all the semantic inference vectors for a spoken document d, the authors obtain its semantic context inference vector.
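
The transcript omits the equations behind slides 14 and 15, so the following is only a rough sketch of the general idea: build a document-by-term matrix, derive a term-by-term relation matrix from it, and sum the relation rows of the terms in a document to get an expanded indexing vector. Using cosine similarity between term columns as the "semantic relation", and weighting by raw term counts, are both assumptions made here for illustration; the paper's actual construction may differ.

```python
import math


def doc_term_matrix(docs, vocab):
    """Rows are documents, columns are raw term counts."""
    return [[doc.lower().split().count(t) for t in vocab] for doc in docs]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def semantic_relation(matrix):
    """Term-by-term matrix: cosine similarity between term columns (assumed)."""
    n_terms = len(matrix[0])
    cols = [[row[j] for row in matrix] for j in range(n_terms)]
    return [[cosine(cols[i], cols[j]) for j in range(n_terms)]
            for i in range(n_terms)]


def sci_vector(doc_row, relation):
    """Sum the relation rows of the document's terms, weighted by their counts."""
    n = len(doc_row)
    return [sum(doc_row[i] * relation[i][j] for i in range(n)) for j in range(n)]


docs = ["stock market price", "stock price index", "weather rain forecast"]
vocab = sorted(set(" ".join(docs).split()))
matrix = doc_term_matrix(docs, vocab)
relation = semantic_relation(matrix)
sci = sci_vector(matrix[0], relation)

for term, weight in zip(vocab, sci):
    print(f"{term:10s} {weight:.3f}")
```

In this toy collection, the first document never contains "index", yet its expanded vector gives that term a positive weight because "index" co-occurs with "stock" and "price" elsewhere. Unrelated terms such as "rain" stay at zero.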

  16. Retrieval model • For spoken document retrieval, the authors adopt the vector space model, which is widely used in information retrieval because it offers highly efficient retrieval with a feature-vector representation of each document.
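
As a reminder of how vector space retrieval works in general, here is a minimal sketch that ranks documents by cosine similarity between term-count vectors. The raw-count weighting and the function names are assumptions for illustration; the transcript does not say which term weighting the authors used.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def retrieve(query, docs):
    """Return (score, doc_index) pairs, best match first."""
    qv = Counter(query.lower().split())
    scored = [(cosine(qv, Counter(d.lower().split())), i)
              for i, d in enumerate(docs)]
    return sorted(scored, key=lambda s: (-s[0], s[1]))


docs = ["spoken document retrieval",
        "speech recognition errors",
        "document indexing"]
results = retrieve("spoken document", docs)
for s, i in results:
    print(f"{s:.3f}  {docs[i]}")
```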

  17. Experiments • To measure both the accuracy of the retrieved documents and the ranking positions of the relevant documents, the authors evaluate with mean average precision (MAP).
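
For reference, mean average precision can be computed as in the sketch below. This is the standard definition, not code from the paper; the document identifiers and the two example queries are made up.

```python
def average_precision(ranked, relevant):
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Average the per-query AP over all (ranking, relevant-set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d2", "d1"], {"d1"}),              # AP = 1/2
]
map_score = mean_average_precision(runs)
print(round(map_score, 4))
```

Because each relevant document contributes the precision at its own rank, MAP rewards systems that place relevant documents early, which is why it reflects ranking position as well as accuracy.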

  18. Conclusions • The proposed semantic context inference explores latent semantic information and adds semantically related terms to the speech index. The semantic context inference vector can be regarded as a re-weighted indexing vector, a form of query expansion that helps overcome speech recognition errors.
