Advanced Question Answering: Plenty of Challenges to Go Around


Presentation Transcript


    1. Dr. John D. Prange AQUAINT Program Director JPrange@nsa.gov 301-688-7092 http://www.ic-arda.org 25 March 2002 Advanced Question Answering: Plenty of Challenges to Go Around

    2. Outline Introducing ARDA Advanced Question Answering There is Room for Multiple Approaches The AQUAINT Program Challenges from an AQUAINT Perspective Some Final Thoughts . . . Questions and Comments

    3. Introducing ARDA MISSION: Incubate revolutionary R&D for the shared benefit of the Intelligence Community. Don’t feel bad if you have not previously heard of ARDA. We are only about one and a half years old. The acronym ARDA stands for Advanced Research and Development Activity in Information Technology. We are a joint Department of Defense and Intelligence Community organization that was established in December 1998. We have a simply stated mission -- Incubate revolutionary R&D for the shared benefit of the Intelligence Community. Easily stated, but not as easily accomplished. But we are trying very hard. ARDA has a modest yet significant budget. Not in the same league as DARPA or NSF, but few organizations are. We are a very small operation -- a total of 6 government staff personnel plus several SETA contractors who assist us with the execution of our R&D Programs. Our office is currently located in the R&D Building of the National Security Agency at Fort George G. Meade.

    4. What ARDA Does We originate and manage R&D programs With fundamental impact on future operational needs and strategies That demand substantial, long-term venture investment to spur risk-taking That progress measurably toward mid-term and final goals That take many forms and employ many delivery vehicles

    5. How ARDA Interacts Community organizations Plans, forecasts, oversight Customer champions Thrust panels / managers R&D problem statements Internal peer review Industry and academia Principal funding recipients External peer review and staff ARDA may be small, but we utilize our Intelligence Community partners to the maximum extent to: identify R&D Challenges that are worthy of ARDA’s investment and commitment of time, effort and funding; serve as Contracting Agents for the vast majority of individual R&D projects; and fully participate on R&D Thrust Panels, which are chaired by the ARDA R&D Thrust Manager. ARDA is especially interested in soliciting your assistance and involvement with our R&D Thrusts. Virtually all of our budget (minus necessary overhead and administrative costs) is used to fund R&D projects in Industry, Academia, and our National Labs.

    6. Where Is ARDA?

    7. Current ARDA Programs When we talk about ARDA’s Advanced R&D Program, we must start with the participation of our IC partners. They provide us with their most challenging, long-term R&D problems. Working through our R&D Thrust Panel, we have collectively established three current R&D Thrusts: Digital Networking, High Performance Computing, and the Thrust that you are most interested in, Information Exploitation. In addition to these R&D Programs, ARDA also has a Fellowship-like program in which we are attempting to encourage world class researchers and scholars from Industry, Academia and the National Labs to spend a year working in one of the Research organizations within the IC. In the time that I have left I want to concentrate on our Information Exploitation Thrust. My last slide will tell you how you can contact ARDA or me for more information on any of our programs.

    8. Outline Introducing ARDA Advanced Question Answering There is Room for Multiple Approaches The AQUAINT Program Challenges from an AQUAINT Perspective Some Final Thoughts . . . Questions and Comments

    9. Question Answering à la Gary Larson

    10. Open Domain Factoid Question Answering

    11. TREC QA Track Results ARDA & DARPA are co-sponsoring the Question Answering Track in the NIST-organized Text Retrieval Conference (TREC) Program (starting with TREC-8 in Nov 1999). TREC-10 Results (Nov 2001): 500 factual questions; about 50 questions had no answer in the TREC-10 data sources; used “Real” questions. Data source: approx. 3 GByte database of ~980K news stories. 36 US & international organizations participated; 92 separate runs evaluated. System output: top 5 regions (50 bytes) in a single story believed to contain the answer to the given question.

    12. Pilot Evaluations TREC 10 QA Track The “List Task” Sample Questions: “Name 4 US cities that have a “Shubert” Theater” “Name 30 individuals who served as a cabinet officer under Ronald Reagan” Evaluation Metric: Number of distinct instances divided by the target number of instances, averaged over 25 questions. Top System among 18 runs: Achieved 76% Accuracy The “Context Task” Sample Series of Questions: “How many species of spiders are there?” “How many are poisonous to humans?” “What percentage of spider bites in the US are fatal?” Evaluation Metric: Same as Main Task (10 Series of Questions; 42 total Questions) Top System: Found answer for 34 of the 42 total questions (81%)
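
The List Task metric on this slide can be sketched in a few lines of Python. This is only an illustration of the arithmetic as the slide describes it, not NIST's scoring code: the function name, the question ids, and the sample responses are hypothetical, and the responses are assumed to contain only instances already judged correct.

```python
def list_task_accuracy(correct_responses, targets):
    """Average over questions of (distinct correct instances / target count).

    correct_responses: question id -> list of instances judged correct (assumed input)
    targets: question id -> number of instances the question asked for
    """
    per_question = []
    for qid, target in targets.items():
        distinct = set(correct_responses.get(qid, []))  # duplicates count once
        per_question.append(min(len(distinct), target) / target)
    return sum(per_question) / len(per_question)


# Hypothetical run: 3 of 4 distinct cities found, nothing returned for q2.
targets = {"q1": 4, "q2": 30}
responses = {"q1": ["New York", "Chicago", "Boston", "Boston"]}
score = list_task_accuracy(responses, targets)  # (0.75 + 0.0) / 2
print(score)

# The Context Task figure on the slide is a plain ratio.
context_accuracy = 34 / 42
print(round(context_accuracy * 100))
```

Capping the distinct count at the target keeps a single question from scoring above 1.0; that cap is a design choice of this sketch rather than something the slide states.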

    13. “Ask Jeeves” Approach

    14. Tailored Question Answering Approaches FAQ (Frequently Asked Questions) Help Desks / Customer Service Phone Centers Accessing a Complex Set of Technical Maintenance Manuals Integrating QA in Knowledge Management and Portals Wide variety of Other E-Business Applications

    15. Structured Knowledge-Base Approach

    16. AQUAINT Advanced QUestion & Answering for INTelligence

    17. AQUAINT Advanced QUestion & Answering for INTelligence

    18. Outline Introducing ARDA Advanced Question Answering There is Room for Multiple Approaches The AQUAINT Program Challenges from an AQUAINT Perspective Some Final Thoughts . . . Questions and Comments

    19. AQUAINT: ARDA’s Plan of Attack ARDA’s newest major Info-X R&D Program Envisioned as a high risk, long term R&D Program: Phase I Fall 2001 - Fall 2003 Phase II Fall 2003 - Fall 2005 Phase III Fall/Winter 2005 - Fall/Winter 2007 Focus on Final Objective from start Incrementally add media, data sources, & complexity of questions & answers during each phase Each of AQUAINT’s 3 Phases: Use Zero-Based, Open BAA-styled Solicitations Focus on Key Research Objectives Be Closely Linked to Parallel System Integration/Testbed Efforts & Data Collection/Preparation and Evaluation Efforts

    20. AQUAINT: R&D Focused on Three Functional Components

    21. AQUAINT: Cross Cutting/Enabling Technologies R&D Areas Specifically Solicited Research Areas include: 1) Advanced Reasoning for Question Answering 2) Sharable Knowledge Sources 3) Content Representation 4) Interactive Question Answering Sessions 5) Role of Context 6) Role of Knowledge 7) Deep, Human Language Processing and Understanding

    22. AQUAINT: Separate, Coordinated Activities

    23. AQUAINT: User Testbed / System Integration Pull together best available system components emerging from AQUAINT Program research efforts Couple AQUAINT components with existing GOTS and COTS software Develop end-to-end AQUAINT prototype(s) aimed at specific Operational QA environments Government-led effort: Directly Linked into Sponsoring Agency’s Technology Insertion Organizations Close, working relationship with working Analysts Provide external system development support Mitre/Bedford will lead External System Integration / Testbed efforts Plan to also utilize additional external researchers as Consultants / Advisors

    24. AQUAINT: Data & Evaluation Issues Data Start by Using Existing Data Collections NIST’s TREC Text Corpora Linguistic Data Consortium (LDC) Human Language Corpora (e.g. TDT, Switchboard, Call Home, Call Friend Corpora) Existing Knowledge Bases and Other Structured Databases Future Data Collection & Annotation and Question/Answer Key Development will be a major effort Will likely use combined efforts of NIST and LDC Evaluation Build upon highly successful TREC Q&A Track Evaluations -- NIST has the lead and is currently developing a Phased Evaluation Plan tied to AQUAINT Program Plans Cooperate to maximum extent possible with DARPA’s RKF (Rapid Knowledge Formation) Program Evaluation Efforts

    25. ARDA’s AQUAINT Partners

    26. AQUAINT Program Contractors

    27. AQUAINT Phase I Projects (Fall 01 - Fall 03) Total End-to-End Systems (6)

    28. AQUAINT Phase I Projects (Fall 01- Fall 03)

    29. AQUAINT Phase I Projects (Fall 01- Fall 03)

    30. Northeast Regional Research Center Conduct 6-8 week workshops on multiple AQUAINT-related challenge problems during FY 2002 Sep 2001: Planning Workshop held at MITRE. Attended by Government Technical Leaders, MITRE, and an invited set of industrial, FFRDC and Academic researchers in the field Four Potential Challenge Problems identified; Formal Proposals developed for each Challenge Problem Two Full Workshops Funded (Temporal Issues & Multiple Perspectives) One Mini Workshop to further explore challenge problem planned (Re-Use of Accumulated Knowledge)

    31. FY2002 NRRC Wkshp Challenge Problems Temporal Issues Generate Sequence of events and activities along evolving timeline, resolving multiple levels of time references across series of documents/sources. Leader: James Pustejovsky, Brandeis University Multiple Perspectives Develop approaches for handling situations where relevant information is obtained from multiple sources on the same topic but generated from different perspectives (e.g. cultural or political differences). Leader: Jan Wiebe, University of Pittsburgh

    32. NRRC Planning Workshops Re-Use of Accumulated Knowledge Investigate strategies for structuring and maintaining previously generated knowledge for possible future use. E.g. previous knowledge might include questions and answers (original and amplified) as well as relevant and background information retrieved and processed. Leaders: Marc Light, MITRE and Abraham Ittycheriah, IBM

    33. Supporting Roles

    34. Outline Introducing ARDA Advanced Question Answering There is Room for Multiple Approaches The AQUAINT Program Challenges from an AQUAINT Perspective Some Final Thoughts . . . Questions and Comments

    35. Top 10 Challenges

    36. Professional Information Analysts: Target Audience for AQUAINT -- Who are They? For ARDA and AQUAINT they are: Intelligence Community and Military Analysts But there are other Potential Target Audiences of “Professional Information Analysts”: Investigative / “CNN-type” Reporters Financial Industry Analysts / Investors Historians / Biographers Lawyers / Law Clerks Law Enforcement Detectives And Others

    37. Professional Information Analysts: What do They have in Common? They are far more than just casual users of information They work in an information rich environment where they have access to large quantities of heterogeneous data They are almost always subject matter experts within their assigned task areas They track and follow a given event, scenario, problem, or situation for an extended period of time They frequently have extensive collaboration with other analysts They are focused on their assigned task or mission and will do whatever it takes to accomplish it The end product that results from their analysis is often judged against the standards of: Timeliness Accuracy Usability Completeness Relevance

    38. Top 10 Challenges 1) Satisfy QA requirements of the “Professional” Information Analyst 2) Pursue QA Scenarios and not just isolated, factually based QA

    39. Implications of QA Scenarios Requires handling a Full Range of Complexity & Continuity of Questions Need to understand & track the analysts’ line of reasoning and flow of argument QA System requires significantly greater insight into knowledge, desires, past experiences, likes and dislikes of “Questioner” Place much higher value on recognizing and capturing “background” information Questioner/System dialogue is now more than just a means for clarification

    40. AQUAINT: Intermediate Goals

    41. Top 10 Challenges 1) Satisfy QA requirements of the “Professional” Information Analyst 2) Pursue QA Scenarios and not just isolated, factually based QA 3) Support a collaborative, multiple analyst environment

    42. Collaboration within QA Standard Collaboration (From an Analyst Perspective) Who else is working all or a portion of my task? What do they know that I don’t and vice versa? Can we share/work together? Non-Standard Discovery (From a System Perspective) Identify previous QA Scenarios that have “similarity” to current QA Scenario. Compare & Contrast Use / Build-on / Update previous results Uncover new data sources Borrow a successful “line of reasoning” or “argument flow” Alerts analyst to different interpretations or to overlooked / undervalued data

    43. Top 10 Challenges 1) Satisfy QA requirements of the “Professional” Information Analyst 2) Pursue QA Scenarios and not just isolated, factually based QA 3) Support a collaborative, multiple analyst environment 4) Sometimes SMALL things really matter and other times BIG things don’t

    44. “Small & Big” - Can we tell the difference? Sometimes SMALL differences can produce significantly different results/interpretations: Stop Words “Books {by; for; about} kids” Attachments “The man saw the woman in the park with the telescope.” Co-reference “John {persuaded; promised} Bill to go. He just left.” “Mary took the pill from the bottle. She swallowed it.” Other times BIG differences can produce the same/similar results: “Name the films in which Denzel Washington starred.” “Denzel Washington played a leading role in which movies?” “In what Hollywood productions did Denzel Washington receive top billing?”
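
Both halves of this slide can be illustrated with a small Python sketch. It assumes a conventional keyword pipeline that drops stop words and compares the remaining terms; the stop-word list, the function names, and the use of Jaccard overlap are assumptions made for this example, not something taken from the talk.

```python
import re

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"a", "about", "by", "for", "in", "of", "the"}


def content_terms(query):
    """Lower-case the query, keep alphabetic tokens, drop stop words."""
    words = re.findall(r"[a-z]+", query.lower())
    return [w for w in words if w not in STOP_WORDS]


def jaccard(query_a, query_b):
    """Share of content terms the two queries have in common."""
    a, b = set(content_terms(query_a)), set(content_terms(query_b))
    return len(a & b) / len(a | b)


# SMALL difference, different meaning: the stop word is the whole point,
# yet all three queries collapse to the same content terms.
small = ["Books by kids", "Books for kids", "Books about kids"]
print([content_terms(q) for q in small])

# BIG surface difference, same meaning: term overlap is low even though
# the questions ask for the same answer.
big = [
    "Name the films in which Denzel Washington starred.",
    "Denzel Washington played a leading role in which movies?",
    "In what Hollywood productions did Denzel Washington receive top billing?",
]
print(jaccard(big[0], big[1]), jaccard(big[0], big[2]))
```

Surface term matching gets both cases wrong in this sketch: it merges the three "Books" queries and keeps the three paraphrases apart.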

    45. Top 10 Challenges 1) Satisfy QA requirements of the “Professional” Information Analyst 2) Pursue QA Scenarios and not just isolated, factually based QA 3) Support a collaborative, multiple analyst environment 4) Sometimes SMALL things really matter and other times BIG things don’t 5) Advanced QA must attack the “Data Chasm”

    46. Attacking the Data Chasm

    47. Some Challenges: Alternative Wording *

    48. Some Challenges: Synthesizing Info *

    49. Some Challenges: Evolving Info *

    50. Attacking the Data Chasm

    51. AQUAINT: Data Types

    52. AQUAINT: Data Types

    53. AQUAINT: Phase I Data Dimensions

    54. AQUAINT: Phase I Data Dimensions

    55. Top 10 Challenges 1) Satisfy QA requirements of the “Professional” Information Analyst 2) Pursue QA Scenarios and not just isolated, factually based QA 3) Support a collaborative, multiple analyst environment 4) Sometimes SMALL things really matter and other times BIG things don’t 5) Advanced QA must attack the “Data Chasm” 6) Time is of the Essence

    56. Time: Our Achilles Heel? The Obvious Timeliness Issue: The timeliness of the system’s response to our question(s) -- we’ll need at least “near real time responses” But Real Difficulties Still Exist in: Extracting, correctly interpreting time references & then creating manageable timelines Estimating & updating changing reliability of information over time Processing information in time sequence e.g. Tracking the details of an evolving event over time -- A whole different set of problems

    57. Temporal Issues Time References vary from precise to vague Precise/Pinpointed: “0930-1030 hours 25 March 2002” Vague: “Recently” or “a year or so ago” or “In my youth” Nested Time References: (e.g. Within a Newspaper article) Current time of Reader Time Article was Published Time of the Reported Event(s) Time References into Past or Future Temporally-based Questions are difficult because they refer to: Temporal properties of the entities being questioned Relative ordering of events in the world Events that are mentioned in news articles, but which have not occurred or did not occur at all
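
The nested time references on this slide can be made concrete with a short Python sketch. It assumes a relative expression in a news article should be anchored to the article's publication date rather than to the reader's current date; the lookup table, the function name, and the reader date are hypothetical, and vague expressions are left unresolved on purpose.

```python
from datetime import date, timedelta

# Hypothetical, deliberately tiny table of relative expressions (in days).
RELATIVE_OFFSETS = {"today": 0, "yesterday": -1, "tomorrow": 1}


def resolve(expression, anchor):
    """Resolve a relative time expression against an anchor date.

    Returns None for expressions the table does not cover, such as
    "recently" or "a year or so ago".
    """
    offset = RELATIVE_OFFSETS.get(expression.lower())
    if offset is None:
        return None
    return anchor + timedelta(days=offset)


published = date(2002, 3, 25)   # time the article was published
reader_now = date(2002, 4, 2)   # current time of the reader (made up)

print(resolve("yesterday", published))   # anchored to publication time
print(resolve("yesterday", reader_now))  # anchored to the wrong level
print(resolve("recently", published))    # vague: stays unresolved
```

Anchoring to the wrong level moves the reported event by as many days as separate publication from reading, which is one way a timeline built across documents can go wrong.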

    58. Top 10 Challenges 7) Must extract, represent and preserve information uncovered when searching for answers

    59. QA Scenarios: A Different Paradigm? A Different Paradigm may be useful when handling QA Scenarios: Current Analytic Paradigm:

    60. Different Paradigm: “Casting a Net”

    61. Top 10 Challenges 7) Must extract, represent and preserve information uncovered when searching for answers 8) Rapidly increasing importance of Knowledge of all types -- regardless of the approach

    62. Complex QA: The Need for Ever Increasing Knowledge -- Of All Types

    63. Increasing Knowledge Requirements Types of Knowledge Needed Factual Knowledge & Linguistic Knowledge Common Sense Knowledge & World Knowledge Procedural Knowledge & Explanatory Knowledge Domain Knowledge & Modal Knowledge Tacit Knowledge Etc. Sources Hand Crafted by experts; supplemented by end-users Results from application of: Learning algorithms Bootstrapping / Hill-climbing Methods Extracted from large data corpora Obtained via “Re-Use”

    64. WordNet Extensions * WordNet: WordNet is a lexical database of English nouns, verbs, adjectives, and adverbs Entries are lexicalized concepts that consist of one or more synonyms, a definitional gloss, and links to semantically related entries Extensions: Moving towards WordNet 2.0 Derivational Connections: Adding links between morphologically related nouns and verbs (e.g. digest and digestion) Disambiguated Definitions: Demonstrate and Demonstration each have multiple definitions – Adding links between meanings that match Topical Connections: Adding topical access by creating lists of lexicalized concepts that frequently co-occur in discussions of a given topic
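
The entry structure described on this slide (synonyms, a gloss, and links to related entries) can be modelled with a small Python sketch. This is a toy data structure, not WordNet's file format or any WordNet API; the class name, the concept ids, the relation name, and the gloss strings are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One lexicalized concept: synonyms, a gloss, and typed links."""
    synonyms: list
    gloss: str
    links: dict = field(default_factory=dict)  # relation name -> list of concept ids


# Glosses below are placeholders written for this example.
lexicon = {
    "digest.v": Concept(["digest"], "break down food in the body"),
    "digestion.n": Concept(["digestion"], "the process of breaking down food"),
}


def add_link(lexicon, source, relation, target):
    """Record a typed link from one concept to another."""
    lexicon[source].links.setdefault(relation, []).append(target)


# A derivational connection joins the morphologically related verb and noun,
# recorded in both directions.
add_link(lexicon, "digest.v", "derivation", "digestion.n")
add_link(lexicon, "digestion.n", "derivation", "digest.v")

print(lexicon["digest.v"].links)
```

Keeping links as a mapping from relation name to targets means a new relation type, such as a topical connection, needs no change to the structure in this sketch.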

    65. Knowledge Evolution Tools * KB development requires knowledge evolution Debugging, refining, structuring, modularizing, … Power tools are needed to support KB evolution KB diagnosis Bugs, omissions, heuristic warnings, architectural advice KB merging To enable interoperation of KBs with overlapping content KB partitioning To enable effective reasoning To produce reusable KB building blocks

    66. Merging Knowledge Bases *

    67. Using Knowledge within Advanced QA Systems * Use Formalized knowledge for: Semantic understanding of queries; Discovery of Answers by Reasoning; Justification of answers; Use Formalized knowledge as: Format for data normalization ‘Glue’ for data integration of: information extracted from unstructured data SQL queries against structured DBs Cyc’s knowledge

    68. Discovery of Answers by Reasoning *

    69. Where Knowledge-Systems Help * Heuristic of finding short passages with all the query terms/semantic classes is good but not sufficient. e.g. from TREC9:

    70. Different Solution Approaches * What is the largest city in England? Text Match Find text that says “London is the largest city in England” (or paraphrase). Confidence is confidence of NL parser * confidence of source. Find multiple instances and confidence of source -> 1. “Superlative” Search Find a table of English cities and their populations, and sort. Find a list of the 10 largest cities in the world, and see which are in England. Uses logic: if L > all objects in set R then L > all objects in set E ⊂ R. Find the population of as many individual English cities as possible, and choose the largest. Heuristics London is the capital of England. (Not guaranteed to imply it is the largest city, but this is very frequently the case.) Complex Inference E.g. “Birmingham is England’s second-largest city”; “Paris is larger than Birmingham”; “London is larger than Paris”; “London is in England”.
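
The "Complex Inference" route on this slide can be sketched in Python using only the four statements it quotes. The representation (a set of ordered pairs plus two dictionaries) and the function names are assumptions made for this example; a real QA system would use a far richer knowledge representation.

```python
# Facts taken from the slide's example sentences.
larger_than = {("London", "Paris"), ("Paris", "Birmingham")}
located_in = {"London": "England", "Birmingham": "England"}
second_largest = {"England": "Birmingham"}


def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are both known."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def largest_city(country):
    """A city in the country that is larger than its second-largest city
    must be the largest one."""
    runner_up = second_largest[country]
    closure = transitive_closure(larger_than)
    for city, where in located_in.items():
        if where == country and (city, runner_up) in closure:
            return city
    return None


print(largest_city("England"))
```

No single fact states the answer. It follows only after chaining "London is larger than Paris" with "Paris is larger than Birmingham" and combining the result with the second-largest and location facts.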

    71. Top 10 Challenges 7) Must extract, represent and preserve information uncovered when searching for answers 8) Rapidly increasing importance of Knowledge of all types -- regardless of the approach 9) Expanding requirements for more advanced learning and reasoning methods/approaches

    72. Improved Reasoning & Learning

    73. Improved Reasoning & Learning

    74. Unsolved Problems Developing / Implementing a Detailed, Complex Plan to Solve the QA Task at Hand Decomposing Complex Questions into a series / sequence of Simpler Questions whose Answers can be found Selecting the appropriate sources to search Knowing when No Answer is Available; Being able to then give a partial, incomplete answer Giving understandable explanations of the “Plan”, the “Reasoning Used” and the “Answers Found”

    75. Increased Emphasis on Planning * QA as Planning Create a general QA planning system How should a QA system represent its chain of reasoning? QA and Auditability How can we improve a QA system’s ability to justify its steps? How can we make QA systems open to machine learning?

    76. Utility Function Supports QA * Utility-Based Information Fusion Perceived utility is a function of many different factors Create and tune utility metrics, e.g.:

    77. Planner * Specify planning representation Identify decision points Represent & manage uncertainty Model states and operations Model justification network Find acceptable trade-offs: Ratio of planning to execution Answer utility vs. available resources

    78. An Asymmetric Threat Scenario *

    79. An Asymmetric Threat Scenario *

    80. Computational Implicatures * The Problem A professional analyst cannot separate his/her intentions and beliefs from the formulation of a question. Sometimes the analyst makes a proposal or assertion. Implied information, important for the interpretation of a question. Not recognizable at syntactic or semantic level. Determines the quality of answers returned by the Q/A system.

    81. Example of Computational Implicature * “Will Prime Minister Mori survive the crisis?” Implied belief: the position of Prime Minister is in jeopardy. Problem: none of the question words indicate directly danger. Question expected answer type: survival Implicature: DANGER

    82. Top 10 Challenges 7) Must extract, represent and preserve information uncovered when searching for answers 8) Rapidly increasing importance of Knowledge of all types -- regardless of the approach 9) Expanding requirements for more advanced learning and reasoning methods/approaches 10) Discovering the correct answer will be hard enough; but crafting an appropriate, articulate, succinct, explainable response will be even harder

    83. Difficulties in Generating Answers Natural Language Generation continues to be a difficult, open research area. Adding the requirement to generate multimedia answers makes this problem even harder. Providing the ability to explain and/or justify answers also continues to be a difficult, open research area. The more complex the line or chain of reasoning, the more complex the explanation and/or justification QA Scenarios and differences across analysts add additional levels of complexity. The Same Question asked within different scenarios by different analysts could easily produce substantially: Different Answer content Different Answer format, structure, depth and/or breadth of coverage Or both

    84. Outline Introducing ARDA Advanced Question Answering There is Room for Multiple Approaches The AQUAINT Program Challenges from an AQUAINT Perspective Some Final Thoughts . . . Questions and Comments

    85.

    86. Five Final Thoughts 1. Is ARDA and AQUAINT’s Vision for Advanced Question Answering Achievable? I strongly believe that it can be done. Maybe not exactly in the form envisioned and to the full extent hoped for. But having such a vision allows us to: Identify key, strategic objectives Attack the final goal simultaneously across a broad front and along multiple avenues Take bigger R&D “steps” with greater confidence

    87. Five Final Thoughts 2. Research is about discovering the unexpected. We must be willing to change direction and course to capitalize on our “discoveries”.

    88. Five Final Thoughts 3. Failing is ok, even expected; It’s what we do with our failures that matters

    89. Five Final Thoughts 4. We must not forget that the ultimate goal is to transfer research results into operational use. So we need to constantly strive to have a measurable, practical impact.

    90. Five Final Thoughts 5. The Technical Challenges are many, and the Road towards our Final Vision may be long and bumpy . . . But the final results will make these struggles well worth the effort!

    91. Contact Information Dr. John Prange, AQUAINT Program Director ARDA Web Pages: http://www.ic-arda.org Email arda@nsa.gov JPrange@nsa.gov Phones: 301-688-7092 800-276-3747 301-688-7410 (Fax) Mailing: ARDA Room 12A69 NBP#1 STE 6644 9800 Savage Road Fort Meade, MD 20755-6644

    92. Advanced Question Answering: Plenty of Challenges to Go Around

    93. Dr. John D. Prange AQUAINT Program Director JPrange@nsa.gov 301-688-7092 http://www.ic-arda.org 25 March 2002 Advanced Question Answering: Plenty of Challenges to Go Around
