
QUALIFIER in TREC-12 QA Main Task




Presentation Transcript


  1. QUALIFIER in TREC-12 QA Main Task Hui Yang, Hang Cui, Min-Yen Kan, Mstislav Maslennikov, Long Qiu, Tat-Seng Chua School of Computing National University of Singapore Email: yangh@comp.nus.edu.sg

  2. Outline • Introduction • Factoid Subsystem • List Subsystem • Definition Subsystem • Result • Conclusion and Future Work

  3. Introduction • Given a question and a large text corpus, return an “answer” rather than relevant “documents” • QA is at the intersection of IR + IE + NLP • Our system - QUALIFIER • Consists of 3 subsystems • External Resources: Web, WordNet, Ontology • Event-based Question Answering • New modules introduced

  4. Outline • Introduction • Factoid Subsystem • List Subsystem • Definition Subsystem • Result • Conclusion and Future Work

  5. Factoid System Overview

  6. Factoid Subsystem • Detailed Question Analysis • QA Event Construction • QA Event Mining • Answer Selection • Answer Justification • Fine-grained Named Entity Recognition • Anaphora Resolution • Canonicalization & Coreference • Successive Constraint Relaxation


  8. Why Event-based QA - I • The world consists of two basic types of things: entities and events, and people often ask questions about both. • From Question Answering’s point of view • Questions = “enquiries about entities or events”.

  9. Why Event-based QA - II • QA Entities • “Anything having existence (living or nonliving)” • E.g. “What is the Democratic Party symbol?” • QA Events • “Something that happens at a given place and time”. • E.g. “How did the donkey become the Democratic Party symbol?” (Image: Thomas Nast’s 1870 Harper’s Weekly cartoon)

  10. Why Event-based QA - III • Entity Questions ask about properties, or about the entities themselves (definition questions). • Event Questions ask about the elements of events: location, time, subject, object, quantity, description, action, etc. • Table 1: Correspondence of WH-Questions & Event Elements:
  question ::= event | event_element | entity | entity_property
  event ::= { event_element }
  event_element ::= time | location | subject | object | quantity | description | action | other
  entity ::= object | subject
  entity_property ::= quantity | description | other
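The correspondence above could be operationalized as a lookup from WH-cues to the event element a question leaves unknown. A minimal sketch (the cue table is one plausible reading of Table 1, not the paper's actual question classifier):

```python
# Hypothetical mapping from WH-cues to QA event elements (after Table 1).
WH_TO_ELEMENT = {
    "when": "time",
    "where": "location",
    "who": "subject",
    "whom": "object",
    "how many": "quantity",
    "how much": "quantity",
    "why": "description",
    "how": "description",   # manner/means questions
    "what": "other",        # needs finer-grained analysis
}

def unknown_element(question: str) -> str:
    """Return the event element a question asks for (the unknown)."""
    q = question.lower()
    # Check multi-word cues ("how many") before single-word ones ("how").
    for cue in sorted(WH_TO_ELEMENT, key=len, reverse=True):
        if q.startswith(cue) or f" {cue} " in q:
            return WH_TO_ELEMENT[cue]
    return "other"

print(unknown_element("When did Bob Marley die?"))  # -> time
```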

  11. Event-based QA Hypothesis • Equivalency: ∀ QA events Ei, Ej: if all_elements(Ei) = all_elements(Ej), then Ei = Ej, and vice versa; • Generality: if all_elements(Ei) is a subset of all_elements(Ej), then Ei is more general than Ej; • Cohesiveness: if elements a, b both belong to an event Ei, and a, c do not belong to a known event, then co-occurrence(a, b) is greater than co-occurrence(a, c); • Predictability: if elements a, b both belong to an event Ei, then a => b and b => a.

  12. QA Event Space • Consider an event to be a point in a multi-dimensional QA event space. • If we know all the elements of an event, then we can easily answer different questions about it • E.g. “When did Bob Marley die?” • Since there are innate associations among elements that belong to the same event (Cohesiveness), we can use what is already known • To narrow the search scope • To find the rest of the unknown event elements, i.e., the answer (Predictability)

  13. Problems to be Solved • However, in most cases it is difficult to find the correct unknown element(s), i.e., the correct answer • Two major problems: • Insufficient known elements • Inexact known elements • Solution: • Explore the use of world knowledge (the Web and WordNet glosses) to find more known elements • Exploit lexical knowledge (WordNet synsets and morphology) to find exact forms

  14. How to Find a QA Event • Using the Web • From the original query terms q(0), retrieve the top N web documents • ∀ qi(0) ∈ q(0), extract nearby non-trivial words within the same sentence or n words away (into Cq) and rank them by their probability of correlation with qi(0) • Using WordNet • ∀ qi(0) ∈ q(0), extract terms that are lexically related to qi(0) by locating them in the gloss Gq and synset Sq • Combine the external knowledge resources to form the term collection: Kq = Cq + (Gq ∪ Sq)
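A simplified sketch of this two-source expansion: the Web side ranks nearby non-trivial words by raw co-occurrence counts (a stand-in for the correlation probability above), and the WordNet side assumes NLTK's WordNet interface, which the slides do not specify:

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "was", "by"}

def expand_with_web(query_terms, web_sentences, window=10, top_k=20):
    """Cq: non-trivial words co-occurring with query terms within
    `window` tokens, ranked by raw co-occurrence count."""
    query_terms = {t.lower() for t in query_terms}
    counts = Counter()
    for sent in web_sentences:                    # sentences from top-N web docs
        tokens = [t.lower().strip(".,?!") for t in sent.split()]
        hits = [i for i, t in enumerate(tokens) if t in query_terms]
        for i in hits:
            for j, t in enumerate(tokens):
                if abs(i - j) <= window and t.isalpha() \
                        and t not in query_terms and t not in STOPWORDS:
                    counts[t] += 1
    return {w for w, _ in counts.most_common(top_k)}

def expand_with_wordnet(query_terms):
    """Gq ∪ Sq: terms lexically related to the query via WordNet
    glosses and synsets (requires NLTK with the WordNet corpus)."""
    from nltk.corpus import wordnet as wn
    related = set()
    for term in query_terms:
        for syn in wn.synsets(term):
            related |= {l.name().lower() for l in syn.lemmas()}        # Sq
            related |= {w for w in syn.definition().lower().split()    # Gq
                        if w.isalpha() and w not in STOPWORDS}
    return related - {t.lower() for t in query_terms}

def knowledge_terms(query_terms, web_sentences):
    """Kq = Cq + (Gq ∪ Sq), read here as a set union."""
    return expand_with_web(query_terms, web_sentences) | expand_with_wordnet(query_terms)
```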

  15. QA Event Construction • Structured Query Formulation • We perform structural analysis on Kq to form semantic groups of terms. Given any two distinct terms ti, tj ∈ Kq, we compute their • Lexical correlation • Co-occurrence correlation • Distance correlation

  16. QA Event Construction • For example, “What Spanish explorer discovered the Mississippi River?” The final Boolean query becomes: “(Mississippi) & (French|Spanish) & (Hernando & Soto & De) & (1541) & (explorer) & (first | European |river)”.
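A rough sketch of how such a structured query could be assembled once terms from Kq have been grouped: here a single `correlated` score stands in for the combined lexical/co-occurrence/distance correlations of slide 15, and the grouping is greedy single-link:

```python
def group_terms(terms, correlated, threshold=0.5):
    """Greedy single-link grouping: terms whose pairwise correlation
    exceeds `threshold` end up in the same semantic group."""
    groups = []
    for t in terms:
        for g in groups:
            if any(correlated(t, u) >= threshold for u in g):
                g.append(t)
                break
        else:
            groups.append([t])
    return groups

def boolean_query(groups, alternatives=()):
    """Join terms inside a group with AND ('&'); groups marked as
    alternatives (e.g. {'French', 'Spanish'}) are joined with OR ('|')."""
    alt_sets = [set(a) for a in alternatives]
    parts = []
    for g in groups:
        op = "|" if set(g) in alt_sets else "&"
        parts.append("(" + f" {op} ".join(g) + ")")
    return " & ".join(parts)

# Example reproducing the query on slide 16:
groups = [["Mississippi"], ["French", "Spanish"], ["Hernando", "Soto", "De"],
          ["1541"], ["explorer"], ["first", "European", "river"]]
print(boolean_query(groups, alternatives=[["French", "Spanish"],
                                          ["first", "European", "river"]]))
# (Mississippi) & (French | Spanish) & (Hernando & Soto & De) & (1541) & (explorer) & (first | European | river)
```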

  17. QA Event Mining • Extract important association rules among the elements by using data mining techniques. • Given a QA event Ei, we define X, Y as two sets of event elements. • Event mining studies rules of the form X → Y, where X, Y are QA event element sets, X ∩ Y = ∅, and Y ∩ {element_original} = ∅. • If X ∩ Y ≠ ∅, ignore X → Y. • If cardinality(Y) > 1, ignore X → Y. • If Y ∩ {element_original} ≠ ∅, ignore X → Y.
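A compact sketch of this rule enumeration with the three pruning conditions applied (element extraction is assumed done; names are illustrative):

```python
from itertools import combinations

def candidate_rules(elements, original):
    """Enumerate rules X -> Y over a QA event's elements, applying the
    pruning on slide 17: X and Y disjoint, cardinality(Y) == 1, and
    Y must not contain elements already given in the question."""
    elements, original = set(elements), set(original)
    rules = []
    for y in elements - original:        # |Y| == 1, Y outside the question
        rest = elements - {y}            # guarantees X and Y are disjoint
        for r in range(1, len(rest) + 1):
            for x in combinations(sorted(rest), r):
                rules.append((frozenset(x), y))
    return rules

# Known question elements vs. all elements seen in retrieved passages:
rules = candidate_rules({"Bob Marley", "die", "1981", "Miami"},
                        original={"Bob Marley", "die"})
# e.g. ({'Bob Marley', 'die'}, '1981') predicts an unknown element
```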

  18. Passage & Answer Selection • Select passages based on the Answer Event Score (AES) from the relevant documents in the QA corpus: • Support(X → Y) = the fraction of passages containing all elements of X ∪ Y • Confidence(X → Y) = Support(X ∪ Y) / Support(X) • The weight for answer candidate j is computed from the support and confidence of the rules that predict j.
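Support and confidence here are the standard association-mining measures; a sketch of computing them over retrieved passages, with each passage reduced to the set of event elements it contains (the exact AES combination is not reproduced here):

```python
def support(passages, items):
    """Fraction of passages containing every element in `items`."""
    hits = sum(1 for p in passages if all(i in p for i in items))
    return hits / len(passages)

def confidence(passages, x, y):
    """Confidence(X -> Y) = Support(X ∪ Y) / Support(X)."""
    s_x = support(passages, x)
    return support(passages, x | {y}) / s_x if s_x else 0.0

# Toy passages, pre-tokenized into element sets:
passages = [
    {"Bob Marley", "die", "1981", "Miami"},
    {"Bob Marley", "1981", "reggae"},
    {"Miami", "1981"},
]
x = {"Bob Marley", "die"}
print(confidence(passages, x, "1981"))  # 1.0: every passage with X also has Y
```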

  19. Related Modules: Fine-grained Named Entity Recognition • Fine-grained NE Tagging • Non-ASCII Character Remover • Number Format Converter • E.g. “one hundred eleven” => 111 • Rule Conflict Resolver • Longer Length • Ontology • Handcrafted Priorities
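A self-contained sketch of the number-format conversion, covering only simple phrases like the slide's example:

```python
UNITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
         "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
         "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
         "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
SCALES = {"thousand": 1_000, "million": 1_000_000}

def words_to_number(text: str) -> int:
    """Convert simple English number phrases to integers,
    e.g. 'one hundred eleven' -> 111."""
    current, total = 0, 0
    for word in text.lower().replace("-", " ").replace(" and ", " ").split():
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word in SCALES:          # thousand / million close out a group
            total += current * SCALES[word]
            current = 0
    return total + current

print(words_to_number("one hundred eleven"))     # 111
print(words_to_number("two thousand and five"))  # 2005
```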

  20. Related Modules: Answer Justification • We generate axioms based on our manually constructed ontology. For example, • q1425: What is the population of Maryland? • Sentence: “Maryland’s population is 50,000 and growing rapidly.” • Ontology Axiom (OA): Maryland (c1) & population (c1, c2) -> 5000000 (c2) • In this way, we can identify “50,000” as a wrong answer, even though it is what the surface text shows.
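A toy sketch of this justification check: a hypothetical axiom store maps (entity, relation) pairs to expected values, and numeric candidates far from the expected value are rejected (the tolerance is illustrative; the paper's axioms are hand-built over its ontology):

```python
# Hypothetical axiom store: (entity, relation) -> expected value.
AXIOMS = {("Maryland", "population"): 5_000_000}

def justified(entity, relation, candidate, tolerance=0.5):
    """Accept a numeric candidate only if it is within `tolerance`
    (relative error) of the value the ontology axiom predicts."""
    expected = AXIOMS.get((entity, relation))
    if expected is None:
        return True          # no axiom: nothing to contradict
    return abs(candidate - expected) / expected <= tolerance

print(justified("Maryland", "population", 50_000))     # False: rejected
print(justified("Maryland", "population", 5_300_000))  # True
```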

  21. Factoid Results

  22. Factoid Results

  23. Outline • Introduction • Factoid Subsystem • List Subsystem • Definition Subsystem • Result • Conclusion and Future Work

  24. List System Overview

  25. List Subsystem • Multiple Answers from the Same Paragraph • Canonicalization Resolution: map variants such as “the States”, “USA”, “United States”, etc. to a unique answer • Pattern-based Answer Extraction • <same_type_NE>, <same_type_NE> and <same_type_NE> + verb … • … include: <same_type_NE>, <same_type_NE>, <same_type_NE> … • “list of …” • “top” + number + adj-superlative
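A regex-style sketch of the first enumeration pattern, with plain capitalized-word runs standing in for <same_type_NE> (the real system matches typed NE tags, not raw capitalization):

```python
import re

# Hypothetical stand-in for <same_type_NE>: runs of capitalized words.
NE = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"

def extract_enumeration(sentence):
    """Match the '<NE>, <NE> and <NE>' enumeration pattern from slide 25
    and return the individual items."""
    pattern = rf"({NE})(?:, ({NE}))* and ({NE})"
    m = re.search(pattern, sentence)
    if not m:
        return []
    # Recover all items in the matched span, since repeated regex
    # groups only retain their last occurrence.
    return re.findall(NE, m.group(0))

print(extract_enumeration("Winners include: France, Brazil and Germany today."))
# ['France', 'Brazil', 'Germany']
```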

  26. List Results

  27. Outline • Introduction • Factoid Subsystem • List Subsystem • Definition Subsystem • Result • Conclusion and Future Work

  28. System Overview

  29. Definition Subsystem

  30. Definition Subsystem • Pre-processing • Document filter • Anaphora resolution • Building the sentence “positive set” and “negative set” • Sentence Ranking • Sentence weighting in the corpus • Sentence weighting on the Web • Overall weighting combining the two

  31. Definition Subsystem • Answer Generation (Progressive Maximal Marginal Relevance) • 1) Order all sentences in descending order of weight. • 2) Add the first sentence to the summary. • 3) Examine the following sentences: if Weight(stc) - Weight(next_stc) > avg_sim(stc), add next_stc to the summary. • 4) Repeat step 3 until the length limit of the target summary is reached.
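A direct sketch of these four steps, implementing the inclusion condition exactly as stated; `avg_sim` is read as the candidate's average similarity to sentences already in the summary, which the slide leaves implicit:

```python
def progressive_mmr(sentences, weight, similarity, max_len):
    """sentences: list of strings; weight: dict sentence -> score;
    similarity(a, b): pairwise similarity in [0, 1]."""
    ranked = sorted(sentences, key=weight.get, reverse=True)  # step 1
    summary = [ranked[0]]                                     # step 2
    length = len(ranked[0])
    for stc, next_stc in zip(ranked, ranked[1:]):             # step 3
        avg_sim = sum(similarity(next_stc, s) for s in summary) / len(summary)
        if weight[stc] - weight[next_stc] > avg_sim:          # condition as on the slide
            summary.append(next_stc)
            length += len(next_stc)
        if length >= max_len:                                 # step 4: length limit
            break
    return summary
```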

  32. Definition Results • We empirically set the length of the summary for People and Objects based on question classification results.

  33. Outline • Introduction • Factoid Subsystem • List Subsystem • Definition Subsystem • Result • Conclusion and Future Work

  34. Overall Performance

  35. Conclusion and Future Work • Conclusion • Event-based Question Answering • Factoid and list questions exploit the power of event-based QA • Definition question answering combines IR and summarization • Ontology is used to boost the performance of our NE and answer justification modules • Future Work • Give a formal proof of our QA event hypothesis • Work towards an online question answering system • Interactive QA • Analysis and opinion questions • VideoQA: question answering on news video
