
Deep Processing for Restricted Domain QA

Yi Zhang, Universität des Saarlandes, yzhang@coli.uni-sb.de


Presentation Transcript


  1. Deep Processing for Restricted Domain QA Yi Zhang Universität des Saarlandes yzhang@coli.uni-sb.de

  2. Why Deep? Is Shallow Processing Enough?
     • For TREC-like QA evaluation: (in most cases) YES
     • However, for restricted domain QA:
       • More complicated questions
       • Less information redundancy for data-intensive approaches
       • Domain knowledge available

  3. Deep Processing Provides
     • More fine-grained linguistic analysis
       • Long-distance dependencies
       • Agreement
       • …
     • Semantic representation
       • MRS/RMRS

  4. General Problems with Deep Processing
     • Robustness
       • Lexicon
       • Compound NPs
     • Specificity
       • “John saw Mary”
     • Efficiency (not discussed here)

  5. Deep Processing
     • MRS/RMRS
       • (Robust) semantic representation with underspecification.
     • HPSG grammars
       • LinGO ERG
       • Other grammars (German, Japanese, Modern Greek, Norwegian, Chinese, …)
     • HoG
       • Hybrid shallow & deep processing architecture with a uniform semantic representation (RMRS).

  6. QA in QUETAL (1)
     • Hybrid shallow & deep approach
     • Cross-lingual QA
     • QA on
       • Texts
       • Semi-structured documents
       • Databases

  7. QA in QUETAL (2)
     [Architecture diagram; only the labels survive: NLQ; Syntax Ana. (dependency parser, TAG for En/De); Seman Q. Ana. (Q-type, A-type, Q-focus); Q. IR Schema; IR Query; Planner; GetData; Result Merge; Ans. Planning & Generation; Info Source (Texts, IE, Fact DB).]

  8. QA in QUETAL (3): Deep processing in QUETAL
     • An HPSG grammar is used for question analysis.
     • Documents are processed with relatively shallow methods.
     • Answer matching with RMRS.

  9. Restricted Domain QA
     • More complicated questions
     • Fewer documents, of better quality
     • Domain-specific ontology available

  10. Restricted Domain QA: an Example
     • Question: Where is the City Hall of Shanghai?
     • Passage: Shanghai City Planning Exhibition Hall [LOC_1] is located to the east of the City Hall [LOC_2], …, setting off with the crystal-like Grand Theatre [LOC_3] to the west.
     • Answer: Between Shanghai City Planning Exhibition Hall and the Grand Theatre.
     • (Diagram label: Domain Onto.)
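The inference behind this example can be sketched in a few lines, assuming the passage has already been reduced to directional facts about the three locations. The fact tuples and the function name `locate_between` are hypothetical illustrations, not part of QUETAL, and the sketch reads "to the west" as west of the City Hall, which is what the answer on the slide implies.

```python
# Directional facts assumed to be extracted from the passage.
# (a, "east_of", b) means that a lies to the east of b.
facts = [
    ("Shanghai City Planning Exhibition Hall", "east_of", "City Hall"),
    ("Grand Theatre", "west_of", "City Hall"),
]

def locate_between(target, facts):
    """Return (west_landmark, east_landmark) if the target has a landmark on each side."""
    east = [a for a, rel, b in facts if rel == "east_of" and b == target]
    west = [a for a, rel, b in facts if rel == "west_of" and b == target]
    if east and west:
        return west[0], east[0]
    return None  # not enough information for a "between" answer

result = locate_between("City Hall", facts)
if result is not None:
    west_landmark, east_landmark = result
    print(f"Between {east_landmark} and {west_landmark}.")
```

The point of the sketch is that the answer is not a span of the passage: it has to be derived from relations between entities, which is where domain knowledge comes in.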

  11. Open Topics
     • Grammar extension & automated lexicon acquisition
     • Robust deep processing
     • Semantic answer matching
     • Cross-lingual QA

  12. Grammar Extension: Tourism Domain
     • ERG extended for
       • “RONDANE”: tourism in a Norwegian mountain area
         • 1.4K sentences
         • 15 words/sentence
         • coverage > 74%
       • Shanghai tourist guide from http://www.shanghai.gov.cn
         • 1,600 sentences
         • 18 words/sentence

  13. Test on RONDANE corpus

  14. Test on RONDANE Corpus

  15. Grammar Extension
     • ERG lexicon
     • Lexicon acquisition is relatively easy to automate for nouns

  16. Automated Lexicon Acquisition
     • POS tagging
     • Named entity recognition
     • Statistical models that find the best lexical type for an unknown noun.
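One way to picture the last bullet is a small naive Bayes classifier that uses POS tags and named-entity labels as features. This is only a sketch of the general idea: the slides do not say which model is used, and the feature strings, the toy training data and the type labels (`count_noun`, `mass_noun`, `proper_noun`) are hypothetical placeholders rather than actual ERG lexical types.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count how often each feature occurs with each lexical type."""
    type_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for feats, lex_type in examples:
        type_counts[lex_type] += 1
        for f in feats:
            feat_counts[lex_type][f] += 1
            vocab.add(f)
    return type_counts, feat_counts, vocab

def predict(feats, model):
    """Return the lexical type with the highest add-one-smoothed log probability."""
    type_counts, feat_counts, vocab = model
    total = sum(type_counts.values())
    best, best_score = None, -math.inf
    for lex_type, n in type_counts.items():
        score = math.log(n / total)
        denom = sum(feat_counts[lex_type].values()) + len(vocab)
        for f in feats:
            score += math.log((feat_counts[lex_type][f] + 1) / denom)
        if score > best_score:
            best, best_score = lex_type, score
    return best

# Toy training data: features from a POS tagger and an NE recogniser (hypothetical).
examples = [
    ({"pos=NN", "det=a"}, "count_noun"),
    ({"pos=NNS", "det=none"}, "count_noun"),
    ({"pos=NN", "det=none"}, "mass_noun"),
    ({"pos=NN", "det=much"}, "mass_noun"),
    ({"pos=NNP", "ne=LOC"}, "proper_noun"),
    ({"pos=NNP", "ne=PER"}, "proper_noun"),
]
model = train(examples)
print(predict({"pos=NNP", "ne=LOC"}, model))
```

In a real setting the training examples would come from words already in the lexicon, and the predicted type would be used to create an entry for the unknown noun.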

  17. Robust Deep Processing
     • Back off to RMRS generated by intermediate or shallow parsers (HoG architecture).
     • Keep the charts of incomplete parses and the corresponding MRS fragments for semantic answer matching.
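The back-off strategy can be sketched as a chain of analysers tried from deepest to shallowest. The two analysers below are hypothetical stand-ins written for this illustration; they do not call a real HPSG parser or the HoG API, and the predicate strings only mimic the shape of an RMRS.

```python
KNOWN_WORDS = {"john", "saw", "mary"}  # toy lexicon of the "deep" grammar

def deep_parse(sentence):
    """Stand-in for a deep parser: fails on any out-of-lexicon word."""
    words = sentence.lower().split()
    if all(w in KNOWN_WORDS for w in words):
        return [f"_{w}_rel" for w in words]
    return []

def shallow_parse(sentence):
    """Stand-in for a shallow analyser: always yields underspecified predicates."""
    return [f"_{w}_unknown_rel" for w in sentence.lower().split()]

def analyse(sentence, analysers):
    """Try each analyser in order and return the first non-empty analysis."""
    for name, analyser in analysers:
        result = analyser(sentence)
        if result:
            return name, result
    return None, []

ANALYSERS = [("deep", deep_parse), ("shallow", shallow_parse)]

for text in ["John saw Mary", "John saw Rondane"]:
    source, predicates = analyse(text, ANALYSERS)
    print(source, predicates)
```

Because every level returns the same kind of structure, the answer-matching step does not need to know which analyser produced it.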

  18. Parse Disambiguation
     • Select the best parse with statistical models (Toutanova et al. 2002)
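As a rough illustration of statistical parse selection, the sketch below ranks candidate parses with a generic log-linear model. It is not the model of Toutanova et al. (2002); the feature names, weights and candidate parses are invented for the example.

```python
import math

def score(features, weights):
    """Linear score: sum of weight * feature value (unknown features weigh 0)."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def rank_parses(parses, weights):
    """Return (parse_id, probability) pairs, most probable first."""
    scores = [score(p["features"], weights) for p in parses]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    ranked = [(p["id"], e / total) for p, e in zip(parses, exps)]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical weights, as if learned from a treebank.
weights = {"head_comp_rule": 0.8, "adjunct_rule": -0.3, "rare_lexical_type": -1.2}

# Two candidate readings of one sentence, described by rule and type counts.
parses = [
    {"id": "reading_1", "features": {"head_comp_rule": 2, "adjunct_rule": 1}},
    {"id": "reading_2", "features": {"head_comp_rule": 1, "rare_lexical_type": 1}},
]

ranking = rank_parses(parses, weights)
best_id, best_prob = ranking[0]
print(best_id, round(best_prob, 3))
```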

  19. Answer Matching with (R)MRS
     • Semantic answer matching
       • Create semantic patterns for each question type.
         • where -> locate_v(e, x1, x2)
       • Semantic distance measurement.
         • pred1(x)&pred2(x) <-> pred1(x)&pred2(y)
     • Query expansion
       • Synonym substitution
       • Semantic structure replacement
         • give_v(e1, x1, x2, x3) => receive_v(e2, x2, x1, x3)
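The three ideas on this slide can be sketched with predications stored as simple tuples. This is a minimal illustration and not the QUETAL implementation: the `EP` class, the function names and the 0.5 penalty for a variable mismatch are assumptions made for the example.

```python
from typing import NamedTuple

class EP(NamedTuple):
    """A simplified elementary predication: predicate name plus argument variables."""
    pred: str
    args: tuple

# Semantic pattern per question type: where -> locate_v(e, x1, x2)
QUESTION_PATTERNS = {"where": EP("locate_v", ("e", "x1", "x2"))}

def expand_give(ep):
    """Structure replacement: give_v(e1, x1, x2, x3) => receive_v(e2, x2, x1, x3)."""
    if ep.pred != "give_v" or len(ep.args) != 4:
        return None
    _event, x1, x2, x3 = ep.args
    return EP("receive_v", ("e2", x2, x1, x3))

def shares_variable(a, b):
    return bool(set(a.args) & set(b.args))

def distance(query, candidate):
    """Missing predicates cost 1.0; a lost variable link between two matched
    predicates costs 0.5 (both values are arbitrary choices)."""
    cand = {ep.pred: ep for ep in candidate}
    d = float(sum(1 for ep in query if ep.pred not in cand))
    matched = [ep for ep in query if ep.pred in cand]
    for i, a in enumerate(matched):
        for b in matched[i + 1:]:
            if shares_variable(a, b) and not shares_variable(cand[a.pred], cand[b.pred]):
                d += 0.5
    return d

query = [EP("pred1", ("x",)), EP("pred2", ("x",))]
same = [EP("pred1", ("x",)), EP("pred2", ("x",))]
unlinked = [EP("pred1", ("x",)), EP("pred2", ("y",))]

print(QUESTION_PATTERNS["where"])
print(expand_give(EP("give_v", ("e1", "x1", "x2", "x3"))))
print(distance(query, same), distance(query, unlinked))
```

The `unlinked` case corresponds to the slide's pred1(x)&pred2(x) <-> pred1(x)&pred2(y): the predicates match but the shared variable is lost, so the candidate is close to the query without being identical.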

  20. Work Plan
     • Narrow my focus to one of the topics above.
     • Continue development of the Chinese HPSG grammar.

  21. References
     • Timothy Baldwin, Emily M. Bender, Dan Flickinger, Ara Kim, and Stephan Oepen. To appear. Road-testing the English Resource Grammar over the British National Corpus. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal.
     • Ulrich Callmeier. 2002. PET – a platform for experimentation with efficient HPSG processing techniques. In Collaborative Language Engineering. CSLI Publications, Stanford, USA.
     • Hans Uszkoreit. 2002. New chances for deep linguistic processing. In Proc. of the 19th International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan.
     • Ann Copestake, Dan Flickinger, Ivan A. Sag, and Carl Pollard. 2003. Minimal recursion semantics: An introduction. Under review.
     • Timothy Baldwin and Francis Bond. 2003. Learning the countability of English nouns from corpus data. In Proc. of the 41st Annual Meeting of the ACL, pages 463–70, Sapporo, Japan.
     • J. Carroll and A. Fang. Automatic Acquisition of Verb Subcategorisations and their Impact on the Performance of an HPSG Parser. IJCNLP 2004.
     • Stephan Oepen, Dan Flickinger, Kristina Toutanova, and Christopher D. Manning. 2002. LinGO Redwoods: A Rich and Dynamic Treebank for HPSG. In Proceedings of the First Workshop on Treebanks and Linguistic Theories (TLT2002), Sozopol, Bulgaria.
     • Kristina Toutanova, Christopher D. Manning, and Stephan Oepen. 2002. Parse Ranking for a Rich HPSG Grammar. In Proceedings of the First Workshop on Treebanks and Linguistic Theories (TLT2002), Sozopol, Bulgaria.
     • Stephan Oepen. [incr tsdb()] - Competence and Performance Laboratory. User Manual. Technical Report. Computational Linguistics, Saarland University (in preparation).
     • Robert Malouf and Gertjan van Noord. 2004. Wide coverage parsing with stochastic attribute value grammars. In IJCNLP-04 Workshop: Beyond shallow analyses - Formalisms and statistical modeling for deep analyses.
     • Kristina Toutanova, Christopher D. Manning, Stuart M. Shieber, Dan Flickinger, and Stephan Oepen. 2002. Parse Disambiguation for a Rich HPSG Grammar. In First Workshop on Treebanks and Linguistic Theories (TLT2002), pp. 253-263, Sozopol, Bulgaria.
