The Question Generation Task


Presentation Transcript


  1. The Question Generation Task Vasile Rus, Zhiqiang Cai, and Art Graesser

  2. Outline • Shared task for NLG? • Why is question generation important? • Landscape of example questions • Definition of Question Generation • Subtasks • Evaluation Methodologies • Black-box vs. Glass-box • Manual vs. Automatic • Data sets

  3. NLG: Shared Task(s) or Not? • Shared Tasks • Pros • Define evaluation metrics • Compare approaches to the chosen task • Monitor progress on the task • Community-wide efforts are needed for building resources and infrastructure • Bring the community together • Increase the visibility of NLG • Cons • Too much effort may be spent on the chosen task • May overshadow other basic research efforts

  4. What Shared Task(s)? • Principle: given the inherent difficulty of language generation, choose a (relatively) simple task • Question Answering has avoided deep questions • Summarization focuses on extractive summaries • Textual Entailment = text understanding? • Full-fledged NLU evaluation?

  5. Why is Question Generation Important? • Help systems and FAQ facilities need example questions for users to model • Information retrieval queries need suggested revised questions • A need for automated systems with proactive question asking and answering • Intelligent tutoring systems need automated hints and other question probes

  6. Who may care about Question Generation? • Natural Language Generation community • Learning Technologies community • Intelligent Tutoring Systems • Subject testing (ETS) • Question Answering community

  7. Landscape of Questions to Generate (Graesser and Person, 1994; Lehnert, 1978)
LEVEL 1: SIMPLE or SHALLOW
1. Verification: Is X true or false? Did an event occur?
2. Disjunctive: Is X, Y, or Z the case?
3. Concept completion: Who? What? When? Where?
4. Example: What is an example or instance of a category?
LEVEL 2: INTERMEDIATE
5. Feature specification: What qualitative properties does entity X have?
6. Quantification: What is the value of a quantitative variable? How much?
7. Definition: What does X mean?
8. Comparison: How is X similar to Y? How is X different from Y?
LEVEL 3: COMPLEX or DEEP
9. Interpretation: What concept/claim can be inferred from a pattern of data?
10. Causal antecedent: Why did an event occur?
11. Causal consequence: What are the consequences of an event or state?
12. Goal orientation: What are the motives or goals behind an agent’s action?
13. Instrumental/procedural: What plan or instrument allows an agent to accomplish a goal?
14. Enablement: What object or resource allows an agent to accomplish a goal?
15. Expectation: Why did some expected event not occur?
16. Judgmental: What value does the answerer place on an idea or advice?
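For concreteness, the taxonomy above can be encoded as a small lookup table. The Python sketch below follows the slide's category numbers and levels; the dict layout and helper function are our own convenience, not part of the cited work.

```python
# Illustrative encoding of the question taxonomy above.
# Category numbers, names, and levels follow the slide; the structure
# itself is just one convenient representation.
QUESTION_TAXONOMY = {
    "LEVEL 1: SIMPLE or SHALLOW": {
        1: "Verification",
        2: "Disjunctive",
        3: "Concept completion",
        4: "Example",
    },
    "LEVEL 2: INTERMEDIATE": {
        5: "Feature specification",
        6: "Quantification",
        7: "Definition",
        8: "Comparison",
    },
    "LEVEL 3: COMPLEX or DEEP": {
        9: "Interpretation",
        10: "Causal antecedent",
        11: "Causal consequence",
        12: "Goal orientation",
        13: "Instrumental/procedural",
        14: "Enablement",
        15: "Expectation",
        16: "Judgmental",
    },
}

def level_of(category: int) -> str:
    """Return the depth level of a taxonomy category number."""
    for level, categories in QUESTION_TAXONOMY.items():
        if category in categories:
            return level
    raise ValueError(f"unknown category: {category}")
```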

  8. Question Generation • Input: one or more sentences • Output: set of questions related to the input text

  9. Examples • AutoTutor • INPUT: There are no horizontal forces on the packet after release. • OUTPUT: What can you say about the horizontal forces on the packet? • NIST QA track • INPUT: But here is who will actually direct Dreamgirls -- none other than Frank Oz, the voice of Miss Piggy on the Muppets. • OUTPUT: Who is the voice of Miss Piggy?
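A minimal sketch of how a rule-based generator might produce the AutoTutor-style question above; the regular-expression pattern and function name are hypothetical, cover only this one sentence shape, and are not AutoTutor's actual generation mechanism.

```python
import re

def concept_question(sentence: str) -> str | None:
    """Toy rule: turn 'There are no <X> on <Y> ...' into
    'What can you say about the <X> on <Y>?'.
    Illustrative only; not AutoTutor's actual generator."""
    m = re.match(r"There are no (.+?) on (the \w+)", sentence.strip())
    if m:
        topic, target = m.groups()
        return f"What can you say about the {topic} on {target}?"
    return None

print(concept_question(
    "There are no horizontal forces on the packet after release."))
# -> What can you say about the horizontal forces on the packet?
```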

  10. Subtasks - Input • INPUT • Input one sentence • Input one paragraph • Input specified in a formalism appropriate for Language Generation

  11. Subtasks - Output • OUTPUT • Subtask 1: generate questions containing only words from the input • Subtask 2: generate questions containing only words from the input, except for one word • Subtask 3: generate questions containing replaced phrases from the input • Subtask 4: generate WHO questions, WHEN questions, etc. • Subtask 5: freely generate questions
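To make the most constrained subtasks concrete, here is a hedged sketch of Subtasks 1 and 2 for simple copular sentences of the form "X is Y."; the pattern matching is an assumption for illustration, not a proposed system.

```python
def subtask1_verification(sentence: str) -> str | None:
    """Subtask 1 sketch: a question built only from input words,
    here by inverting a simple copular clause 'X is Y.'."""
    words = sentence.rstrip(".").split()
    if "is" not in words:
        return None
    i = words.index("is")
    return " ".join(["Is"] + words[:i] + words[i + 1:]) + "?"

def subtask2_who(sentence: str) -> str | None:
    """Subtask 2 sketch: all words come from the input except one,
    the WH-word that replaces the subject of 'X is Y.'."""
    words = sentence.rstrip(".").split()
    if "is" not in words:
        return None
    i = words.index("is")
    return " ".join(["Who", "is"] + words[i + 1:]) + "?"

print(subtask1_verification("Frank Oz is the voice of Miss Piggy."))
# -> Is Frank Oz the voice of Miss Piggy?
print(subtask2_who("Frank Oz is the voice of Miss Piggy."))
# -> Who is the voice of Miss Piggy?
```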

  12. Evaluation • Black-box • Simply look at the quality of the output • Glass-box • Some subtasks are designed to test particular components of language generation • Subtask 1 is suitable for testing syntactic variability and microplanning • Subtask 2 is suitable for testing lexical generation

  13. Evaluation • Manual • Human experts judge the questions on quality and/or relevance • What is a good question? • Automatic • Suitable for some subtasks • Reuse automatic evaluation techniques from (extractive) summarization
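One way to borrow an automatic technique from extractive summarization, as suggested above, is a ROUGE-1-style unigram-overlap score between a generated question and a reference question. The scoring function below is a sketch under that assumption; the slides do not prescribe a specific metric.

```python
from collections import Counter

def unigram_overlap(generated: str, reference: str) -> float:
    """ROUGE-1-style recall: fraction of reference-question tokens
    that also appear in the generated question (illustrative only;
    no stemming or stopword handling)."""
    gen = Counter(generated.lower().split())
    ref = Counter(reference.lower().split())
    hits = sum(min(gen[tok], ref[tok]) for tok in ref)
    return hits / max(sum(ref.values()), 1)

print(unigram_overlap("Who voices Miss Piggy?",
                      "Who is the voice of Miss Piggy?"))  # ~0.43
```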

  14. Evaluation - Metrics • Precision • Recall • Prepare a set of good questions for each input • Reuse existing data, e.g., NIST QA data • Use the NIST pooling method: • Collect all good questions from all submissions and use them as the pool of GOLD STANDARD questions • Ranking: MRR (mean reciprocal rank) • Confidence measure: confidence-weighted measure
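A minimal sketch of the pooled evaluation just described, assuming each system submits a ranked list of questions and the gold pool is the union of judged-good questions across submissions; the function names and the exact-string-match criterion are our assumptions.

```python
def precision_recall(generated: set[str], gold_pool: set[str]) -> tuple[float, float]:
    """Precision/recall of one system's questions against the pooled
    gold standard (sketch; assumes exact string matching)."""
    hits = len(generated & gold_pool)
    p = hits / len(generated) if generated else 0.0
    r = hits / len(gold_pool) if gold_pool else 0.0
    return p, r

def mean_reciprocal_rank(ranked_lists: list[list[str]], gold_pool: set[str]) -> float:
    """MRR: average over inputs of 1/rank of the first gold question
    in the system's ranked output (0 if none appears)."""
    total = 0.0
    for ranking in ranked_lists:
        for rank, q in enumerate(ranking, start=1):
            if q in gold_pool:
                total += 1.0 / rank
                break
    return total / len(ranked_lists) if ranked_lists else 0.0
```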

  15. Data • AutoTutor • Hints and prompts to elicit physics principles • Expert-generated questions in curriculum scripts • NIST QA track • Thousands of Question-Answer pairs • Manipulate existing data • New data

  16. Pros and Cons • Pros: • Textual input could help with wide adoption • Suitable for glass- and black-box evaluation • Automatic evaluation is possible • Data sets already available or almost available • Cons: • Discourse planning • Alternative: generate set of related questions where anaphora and other discourse aspects are present • Pre-posed context clause • Fundamental issue: • What is a good question?

  17. Summary • Simple and attractive • Automatic evaluation possible • Data sets available

  18. Thank You!
