AQUAINT Program: Overview. Dr. John Prange, Info-X R&D Thrust Director; Dr. Lynn Franklin, Dep Info-X R&D Thrust Director. jprange@nsa.gov; lynn.franklin@pnl.gov. 443-479-8006 (Prange) / 443-479-6604 (Franklin); 301-688-7092 (ARDA Office). http://www.ic-arda.org. October 2004.
Let’s Start with a Simple, Factual Question: How Do We Find Information Today? Where is the Taj Mahal?
Traditional Information Retrieval (IR) Approach
[Figure: Question → System-Specific Query (e.g., Boolean keyword equation) → Traditional Information Retrieval over a Data Source (e.g., large text archive) → Ranked List of Hopefully “Relevant” Documents]
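The traditional IR flow above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the three-document archive, the function names, and the ranking rule (total keyword frequency) are invented and do not come from the briefing.

```python
import re
from collections import Counter

# Hypothetical three-document archive, invented for illustration only.
ARCHIVE = {
    "doc1": "The Taj Mahal is a mausoleum in Agra, India.",
    "doc2": "The Taj Mahal Hotel overlooks the harbor in Mumbai.",
    "doc3": "Agra is a city in northern India.",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def boolean_and_search(archive, keywords):
    """Return ids of documents containing every keyword, ranked by keyword frequency."""
    terms = [k.lower() for k in keywords]
    scored = []
    for doc_id, text in archive.items():
        counts = Counter(tokenize(text))
        if all(counts[t] > 0 for t in terms):  # Boolean AND over the keywords
            scored.append((sum(counts[t] for t in terms), doc_id))
    # Highest score first; ties broken by document id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]

ranked = boolean_and_search(ARCHIVE, ["taj", "mahal"])
```

Note that the result is a list of documents, not an answer: the user still has to read them, and the query cannot tell the mausoleum from the hotel.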
Use Your Favorite Search Engine: Where is the Taj Mahal? Answer: Agra, India. Or Is It? It Depends!
Alternative Answer #1: Where is the Taj Mahal (“Hotel”)? Answer: Bombay (Mumbai), India
Alternative Answer #2: Where is the (“Trump”) Taj Mahal? Answer: Atlantic City, NJ
Alternative Answer #3: Where is the Taj Mahal (“Restaurant”)? Answer: Utrecht, Netherlands
Next Generation Approaches: Question Answering (QA) Systems
[Figure: Single, Factoid Question → Move Closer to the Question (e.g., Question Classification) → System-Specific Query, often Tailored to Question Type → Traditional Information Retrieval over a Single Data Source → Ranked List of Hopefully “Relevant” Documents → Shallow Analysis; Move Closer to the Answer (e.g., Passage Retrieval) → “Answer”]
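The two moves named in the figure, question classification and passage retrieval, can be sketched as follows. This is a toy illustration under stated assumptions: the answer-type table, the stop-word list, and both function names are hypothetical, and real QA systems use far richer classifiers and retrieval models.

```python
import re

# Hypothetical mapping from leading question word to expected answer type.
ANSWER_TYPES = {"where": "LOCATION", "who": "PERSON", "when": "DATE"}

# Hypothetical stop-word list, kept tiny on purpose.
STOP_WORDS = {"where", "who", "when", "what", "is", "the", "a", "in", "of"}

def words(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def classify_question(question):
    """Move closer to the question: guess the expected answer type."""
    tokens = words(question)
    first = tokens[0] if tokens else ""
    return ANSWER_TYPES.get(first, "OTHER")

def best_passage(question, passages):
    """Move closer to the answer: pick the passage sharing the most content words."""
    q_terms = set(words(question)) - STOP_WORDS
    return max(passages, key=lambda p: len(q_terms & set(words(p))))

PASSAGES = [
    "Agra is a city in northern India.",
    "The Taj Mahal stands in Agra, India.",
]
answer_type = classify_question("Where is the Taj Mahal?")
passage = best_passage("Where is the Taj Mahal?", PASSAGES)
```

The output is still a passage plus an expected answer type rather than the answer itself; extracting "Agra, India" would need a further step such as named-entity recognition.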
“Ask Jeeves” Approach
• Start with your question
• Identify key words and classify the type of question
• Respond with rephrased “questions” for which “Ask Jeeves” knows the answer
• Provide additional web sites as a fallback position (a la a more traditional web search engine)
Structured Knowledge-Base Approach
• Create comprehensive Knowledge Base(s) or other Structured Data Base(s)
• At the 10K Axiom Level: capable of answering factual questions within domain
• At the 100K Axiom Level: answer cause & effect / capability questions
• At the 1000K Axiom Level: answer novel questions; ID alternatives
Deepest QA but Limited to Given Subject Domain
Advanced Question Answering
Overarching Context / Operational Requirement (Information Analysts): In a foreign news broadcast a team of analysts observes a previously unknown individual conferring with the Foreign Minister. They suspect that he/she is really a new senior advisor.
• What influence does he/she have on FM?
• Does this signal that other policy changes are coming?
• What are his/her views?
• What do we know about him/her?
• Who is this advisor?
• And still more questions ???
Advanced Question Answering
[Figure: Information Analysts, working within an Overarching Context / Operational Requirement, pose a Complex QA Scenario (Factoid, Why, Judgement, Predictive, Interpretive, and Other Questions) that is interpreted within a larger context → System-Specific Queries, Fully Tailored to the Series of Questions → Extended Traditional Information Retrieval over Multiple Heterogeneous Data Sources (Voice, Text, Multi-Media, Structured, Other) → Ranked Lists of “Relevant” Data Objects → Deeper Automated Understanding; Extract & Analyze Results → Interpret Results & Formulate the Answers → Provide Answers in a Form Analysts Want]
Advanced Question Answering Is Skipping Ahead Two Generations
• Commercial World & Current R&D Efforts Are Addressing the Next Generation, But Only Selected Content Understanding Barriers Are Being Aggressively Attacked
• Advanced Question Answering: Multiple Key Barriers to Content Understanding Will Be Aggressively Attacked
AQUAINT: Advanced QUestion & Answering for INTelligence
What it is and What it is not . . .
• Question & Answering Aimed at the “Information Professional”, Not just the Casual User
• Rich, Contextually-based Question Scenarios and the Full Range of Questions, Not just Isolated, Factoid Questions
• Places much higher premium on knowledge and reasoning across very broad domains
• Open Domain, Multiple Media, Multiple Languages, Multiple Genre, Structured and Unstructured Data, Not just a Focused Data Environment
Advanced QA: Ramping up to the Full Complexity of Questions & Answers
Increasing Complexity Levels of Questions & Answers:
• Level 1 (Current): “Simple Factual QA’s”
• Level 2 (Near Term): “Template & Multi-valued QA’s”
• Level 3 (Mid Term): “Cross Media & Cross Document QA’s”
• Level 4 (Long Term): “Context-Based QA Scenarios”
Advanced QA: Attacking the Data Chasm
[Figure: Questions and Answers on opposite sides of a Data Chasm, progressing from Today through Level I, Level II, and Level III to the Future]
Questions, in increasing complexity:
• Single Factual, Isolated Questions
• Multi-Valued Factual Questions
• Questions: Cross Media; Cross Document; Simple Judgement
• Full Context-Based Question Scenario
Data Chasm:
• Increasing Volumes (Petabyte & up)
• MANY Heterogeneous Data Sources; All Types, Sizes, Locations
• Synthesis Across “Documents”/Media
• Contradictory Data
• Missing Data
• Multiple Perspectives
• Reliability of Data & Source
Answers, in increasing complexity:
• 50/250 Byte Passage from Single Text Document
• Fixed Templates or Tabular Lists
• Variable Narrative Summary; Multi-Media Presentations; Simple Interpreted Results
• Fully Intersected; Automatically Generated; Variable Structure/Format; Full Context Responses
Advanced QA: Complex QA Across Data Types
• Structured / Semi-Structured: KB’s; DB’s; “Tagged Data” (e.g. Web Data)
• Unstructured:
  • Human Language
    • Media: Text; Speech; Multi-Media
    • Genre: Newswire / News Broadcast; Technical Documents; Formal / Informal Communication; Other
    • Language: English; Foreign Language 1; Foreign Language 2; Foreign Language N
  • Technical / Abstract: Sensor; Geospatial; Economic; Other
  • Visual Data: Still Images; Video
Advanced QA: Much Deeper Understanding of Human Language is Required
• Sometimes SMALL differences can produce significantly different results/interpretations:
  • Stop Words: “Books {by; for; about} kids”
  • Attachments: “The man saw the woman in the park with the telescope.”
  • Co-reference: “John {persuaded; promised} Bill to go. He just left.” and “Mary took the pill from the bottle. She swallowed it.”
• Other times BIG differences can produce the same/similar results:
  • “Name the films in which Jude Law starred.”
  • “Jude Law played a leading role in which movies?”
  • “In what Hollywood productions did Jude Law receive top billing?”
Advanced QA: Is Time Our Achilles Heel?
[Figure: timeline, March through August]
• Real Difficulties Exist in:
  • Extracting and correctly interpreting time references, then creating manageable timelines
  • Estimating & updating changing reliability of information over time
  • Processing information in time sequence, e.g. tracking the details of an evolving event over time (a whole different set of problems)
• And of course:
  • We can’t forget all of the issues related to the timeliness of the system’s response to our question(s); we’ll need at least “near real time responses”
Advanced QA: The Challenge of Time in Analysis
[Figure: timeline from H-n through H-1, H Hour, and H+1, spanning Event Planning, the Event, and the Aftermath of the Event. Collector 1 observes event planning & reports; Collector 4 observes event-related info & reports; Collectors 1, 2, 3 observe the event and each reports; Collector 5 picks up historical event planning material in a raid.]
• Different sources do not report simultaneously on an event.
• Data from different sources may be near real-time or take years to arrive.
• The hypothesis of today may be thrown out by new data arriving next week.
• Analysis is dependent on a time continuum where data on a future event is found in the historical patterns established in event planning stages. As incoming data is evaluated against historical data, outcomes may change.
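The gap between when something happened and when the analyst learns of it can be made concrete with a small sketch. The `Report` type, its field names, and all hour values below are hypothetical and chosen only to mirror the shape of the timeline; none of them come from the briefing.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Report:
    collector: str
    event_hour: int    # when the observed activity happened, relative to H hour
    arrival_hour: int  # when the report reached the analyst, relative to H hour

# Invented values: Collector 5's material describes early planning but arrives last.
REPORTS = [
    Report("Collector 1", event_hour=-1, arrival_hour=0),
    Report("Collector 5", event_hour=-48, arrival_hour=72),
    Report("Collector 2", event_hour=0, arrival_hour=1),
]

def timeline(reports, as_of_hour):
    """Return the reports available by `as_of_hour`, ordered by when the events occurred."""
    known = [r for r in reports if r.arrival_hour <= as_of_hour]
    return sorted(known, key=lambda r: r.event_hour)

early_view = [r.collector for r in timeline(REPORTS, as_of_hour=2)]
late_view = [r.collector for r in timeline(REPORTS, as_of_hour=100)]
```

The two views differ even though the underlying events did not change: the late-arriving report is inserted at the start of the timeline, which is why a hypothesis formed from the early view may need to be revised.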
Advanced QA: The Need for Ever Increasing Knowledge of All Types
[Figure: two panels, DIMENSIONS OF THE QUESTION PART OF THE QA PROBLEM and DIMENSIONS OF THE ANSWER PART OF THE QA PROBLEM, each with an arrow labeled Increasing Knowledge Requirements **. Question axes: Scope, Context, Judgement, running from Simple Factual Question out to the Advanced QA R&D Program. Answer axes: Multiple Sources, Fusion, Interpretation, running from Simple Answer, Single Source out to the Advanced QA R&D Program.]
** Knowledge Requirement would be better represented with a whole “quiver of arrows” of different sizes, lengths and types
Advanced QA: The Need for a Different Paradigm
[Figure: Space of Data Objects and Sources → Data → Processing & Analysis → Results (Answers), with Discarded material under the current paradigm and retained Background under the alternative]
Current Analytic Paradigm:
• Sequentially “Filter Down” to the final result
• Works when QA’s are independent, isolated activities
A Different Paradigm may be useful when handling QA Scenarios:
• Cast a “wider net” while searching for “golden nuggets” (Answers)
• Automatically Extract, Represent, and Preserve “closely related” background information within context of the QA Scenario
Open questions: How Wide to Cast the “Net”? What Info to Retain? In what form? For how long?
Advanced QA: Need for Improved Reasoning & Learning
Overarching Context / Operational Requirement (Information Analysts): In a foreign news broadcast a team of analysts observes a previously unknown individual conferring with the Foreign Minister. They suspect that he/she is really a new senior advisor.
• What influence does he/she have on FM?
• Does this signal that other policy changes are coming?
FOCUS
• What are his/her views?
• What do we know about him/her?
• Who is this advisor?
• And still more questions ???
Advanced QA: Need for Improved Reasoning & Learning
[Figure: Raw “Bio” Information Collected about the New Senior Advisor from TV & Radio Broadcasts, Newspapers & Other Archives (Associates, Education, Past Positions, Views, Family, Travels, Other Activities), with Follow-up Leads and Cross Fertilization, yielding Summarized Results: “Bio” and “Views: Past & Present”]
• Advanced Reasoning:
  • Use Multi-level Plans
  • Create and evaluate chains of reasoning
  • Reason across heterogeneous data sources
  • Infer answers from data extracted from multiple sources when the answer is not explicitly stated
  • Utilize Link Analysis & Evidence Discovery
  • Plus other strategies
• Advanced Learning:
  • Automatically learn new or modify existing reasoning strategies
ARDA’s Info-X Program Partners
• Active IC / Government Partners
• Interested External Stakeholders
• Recent Additions: NGIC, DHS
AQUAINT: R&D Focused on Three Functional Components
[Figure: within the Operational Requirement / Cognitive Environment, a QUESTION passes through three functional components to a FINAL ANSWER, with Clarification, Query Refinement, and Analyst Feedback loops. Data inputs shown: Knowledge Bases; Technical Databases; Partially Annotated & Structured Data; Multiple Sources; Multiple Media; Multi-Lingual; Multiple Agencies; Automatic Metadata Creation; Other Analysts; Supplemental Use.]
• Question Understanding and Interpretation
  • Natural Statement of Question; Use of Multimedia Examples
  • Question & Requirement Context; Analyst Background Knowledge
  • Query Assessment, Advisor, Collaboration
  • Translate Queries into Source Specific Retrieval Languages (KB Queries; Multiple Source Specific Queries)
• Determine the Answer
  • Multiple Ranked Lists; Single, Merged Ranked List of Relevant “Documents”; Relevant “Knowledge”
  • Query Refinement based on Analyst Feedback
  • Relevant information extracted and combined where possible
  • Accumulation of Knowledge across “Documents”; Cross “Document” Summaries created
  • Language/Media Independent Concept Representation
  • Inconsistencies noted; Proposed Conclusions and Inferences Generated
• Answer Formulation
  • Results of Analysis; Question & Answer Context; Proposed Answer
  • Formulate Answer for Analyst in form they want
  • Multimedia Navigation Tools for Analyst Review
  • Iterative Refinement of Results based on Analyst Feedback
AQUAINT: Separate, Coordinated Activities
[Figure: AQUAINT Phase I Solicitation: QUESTION → Question Understanding and Interpretation → Determine the Answer (Information Retrieval Process; Analysis & Synthesis Process) → Answer Formulation → FINAL ANSWER]
Separate Coordinated Activities:
• Component Integration and System Architecture Issues
• Component Level / End-to-End Testing & Evaluation
• Cross Cutting / Enabling Technologies Research Issues
• Annotated and ‘Ground Truthed’ Data
AQUAINT Program Contractors
[Figure: map of contractor locations; legend: Original + New]
Carnegie Mellon Univ. (2); Carnegie Mellon Univ.; Univ. of Colorado-Boulder; CoGenTex; Univ. of Massachusetts; Univ. of Albany; IBM; Univ. of California-Berkeley; BBN (2); Columbia Univ.; Stanford Univ.; Rutgers Univ.; Princeton Univ.; SRI; Univ. of Southern California / Info Science Institute; Univ. of Maryland – Baltimore County (UMBC); Univ. of Texas-Dallas; SAIC; Univ. of Southern California / Info Science Institute; Cycorp; HNC Software; Language Computer Corp. (2); New Mexico State University (2); Language Computer Corp.
AQUAINT Program Phase 2 Contractors
[Figure: map of contractor locations; legend: Prime Contractors (18), Sub Contractors (16)]
Univ. of Pittsburgh; Univ. of Albany; Cornell Univ.; Univ. of Illinois-Urbana-Champaign; Univ. of Colorado; Univ. of Utah; IBM T. J. Watson Center; Carnegie Mellon Univ. (2); MITRE; BBN; UC-Berkeley (ICSI); MIT; Brandeis Univ.; Palo Alto Research Center; Lehman College; Columbia Univ.; Stanford Univ. (2); USC / ISI (2); Rutgers Univ.; USC; Monmouth Univ.; Arizona State Univ.; Texas Tech; Cycorp; Univ. of Texas at Dallas; Princeton Univ.; SPAWAR; Language Computer Corporation (2); Univ. of Pennsylvania; Georgetown Univ.
AQUAINT Phase 2 Projects (Spring 04 – Spring 06) Total End-to-End Systems (10) (Systems 1-5)
AQUAINT Phase 2 Projects (Spring 04 – Spring 06) Total End-to-End Systems (10) (Systems 6-10)
AQUAINT Phase 2 Projects (Spring 04 – Spring 06) Emphasis on One or more Advanced QA System Components (2)
AQUAINT Phase 2 Projects (Spring 04 – Spring 06) Focused Effort -- Cross Cutting / Enabling Technologies (6)
AQUAINT: Advanced QUestion & Answering for INTelligence
HIGHLIGHTS
• Dramatic progress on linguistic approach that converts question and relevant passages into logical forms and then arrives at answer through a powerful combination of an extended “WordNet” and a logic prover
HIGHLIGHTS (continued)
• Made significant strides in extending QA from isolated, factoid questions to far more complex “Who is / What is” questions that require combining information from multiple, potentially duplicative or contradictory document sources
More Complex Question Types
• Definitions: What is Tikrit?
• Biographies: Who is Mahmoud Abbas?
• Events: What happened in Baghdad on Thanksgiving?
• Different Perspectives / Opinions: What do people think of Mahmoud Abbas’ resignation?
• Lists: What names of chewing gums are found in the AQUAINT corpus?
• Relationships: The analyst is interested in the line of succession of the Saudi government, and the relationship between the individuals in their royal family. King Fahd is the current ruler, but is in poor health. Who is next in line, and what is his relationship to King Fahd? Who, if anyone, has been designated as second in line?
Example Definition *
What is Tikrit?
Tikrit is a power center for Sunni Arab tribes that may hold out for as long as possible out of fear of losing power to the nation’s Shiite majority (12). Baghdad may be the capital of Iraq, but Tikrit is Saddam country (15). Other experts caution that the years of preferential treatment towards the residents of Tikrit may cause them to stand by Saddam Hussein to the end (4). …
* Reference: Columbia Univ. / Univ. of Colorado AQUAINT Briefing
HIGHLIGHTS (continued)
• Progress made on developing multi-engine QA system that combines linguistic, statistical & KB approaches
Available Answering Agents *
• Predictive Annotation Agent: general-purpose agent, used in almost all cases.
• Statistical Query Agent: also general-purpose. Courtesy Roukos/Ittycheriah.
• Description Agent: generic descriptions (appositions, parentheticals etc.)
• Structured Knowledge Agent: answers from WordNet/KSP/Cyc
• Pattern-Based Agent: looks for specific syntactic patterns based on semantic form
• Dossier Agent: calls PIQUANT recursively with multiple factoid questions
• Profile Agent: currently standalone; used for Relationship Pilot
* Reference: IBM AQUAINT Briefing
PIQUANT Architecture *
[Figure: Question → Question Analysis (Question Classification; QFrame; QGoals) → QPlan Generator → QPlan Executor → Answering Agents (Predictive Annotation; Statistical; Definitional Q; KSP-Based; Pattern-Based) → Answer Resolution → Answers. The agents draw on Semantic Search and Keyword Search over AQUAINT, TREC, EB, CNS, and the Web, and on the Knowledge Source Portal (WordNet, Cyc).]
* Reference: IBM AQUAINT Briefing
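The Answer Resolution step, where candidates from several answering agents are merged into one ranked list, might look something like the sketch below. The briefing does not describe PIQUANT's actual resolution method, so the rule used here (normalize the answer string, then sum agent confidences) and every name and number are assumptions for illustration only.

```python
from collections import defaultdict

def resolve_answers(candidates):
    """Merge (agent, answer, confidence) triples and rank answers by summed confidence.

    Answers are compared after trimming whitespace and lowercasing, so that
    "London" and "london " count as the same candidate.
    """
    totals = defaultdict(float)
    for _agent, answer, confidence in candidates:
        totals[answer.strip().lower()] += confidence
    # Highest combined confidence first; ties broken alphabetically.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

# Invented candidate answers and confidences from three hypothetical agents.
CANDIDATES = [
    ("predictive_annotation", "London", 0.6),
    ("statistical", "london ", 0.5),
    ("pattern_based", "Birmingham", 0.7),
]
ranking = resolve_answers(CANDIDATES)
```

In this toy case no single agent ranks London first, but agreement between two agents outweighs the one higher-confidence outlier, which is the basic appeal of a multi-engine design.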
Multiple QA Agents Approach *
What is the largest city in England?
• Text Match
  • Find text that says “London is the largest city in England” (or paraphrase). Confidence is confidence of NL parser × confidence of source.
• “Superlative” Search
  • Find a table of English cities and their populations, and sort.
  • Find a list of the 10 largest cities in the world, and see which are in England.
    • Uses logic: if L > all objects in set R, then L > all objects in any set E ⊂ R.
  • Find the population of as many individual English cities as possible, and choose the largest.
• Heuristics
  • London is the capital of England. (Not guaranteed to imply it is the largest city, but this is very frequently the case.)
• Complex Inference
  • E.g. “Birmingham is England’s second-largest city”; “Paris is larger than Birmingham”; “London is larger than Paris”; “London is in England”.
* Reference: IBM AQUAINT Briefing
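Three of the strategies on this slide can be written down directly. This is a hedged sketch, not IBM's implementation: the function names are hypothetical, the population table uses placeholder cities and numbers, and the inference routine handles only the specific "larger than" chain given in the slide's example.

```python
def text_match_confidence(parser_conf, source_conf):
    """Text Match: confidence of the NL parser times confidence of the source."""
    return parser_conf * source_conf

def superlative_search(populations):
    """'Superlative' Search: given {city: population}, return the largest city."""
    return max(populations, key=populations.get)

def larger_than_closure(facts):
    """Transitive closure of a set of (bigger, smaller) pairs."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def infer_largest(second_largest, larger_than, located_in, region):
    """Complex Inference: a city in `region` that is larger than the region's
    second-largest city must be the region's largest city."""
    for bigger, smaller in larger_than_closure(larger_than):
        if smaller == second_largest and located_in.get(bigger) == region:
            return bigger
    return None

# The facts from the slide's Complex Inference example.
LARGER_THAN = {("Paris", "Birmingham"), ("London", "Paris")}
LOCATED_IN = {"London": "England", "Paris": "France", "Birmingham": "England"}
answer = infer_largest("Birmingham", LARGER_THAN, LOCATED_IN, "England")

# Placeholder table: city names and numbers are not real data.
largest = superlative_search({"A": 10, "B": 30, "C": 20})
```

The inference path needs no population figures at all, which is the point of the slide: different agents can reach the same answer from entirely different evidence.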
HIGHLIGHTS (continued)
• Executed Pilot Evaluations for multiple complex QA Types; Developed Metrics for evaluating QA Systems at the Scenario Task Level; Full Evaluation of all End-to-End QA Systems late in Phase 2
Your Questions & Comments
[Photo: June Sunrise over Kirkwall Bay in the Orkney Islands of Scotland]
Contact Information
Dr. John Prange, Info-X R&D Thrust Program Director
Dr. Lynn Franklin, Info-X R&D Thrust Program Dep Dir
• Web Pages: http://www.ic-arda.org (Internet)
• Phones: 443-479-8006 (Prange); 443-479-6604 (Franklin); 301-688-7092 (ARDA Office); 800-276-3747 (ARDA Office)
• FAX: 301-688-7401 (ARDA Office)
• E-Mail: arda@nsa.gov; jprange@nsa.gov; lynn.franklin@pnl.gov (Internet E-Mail)
• Location: Room 12A69, NBP #1, Suite 6644, 9800 Savage Road, Fort Meade, MD 20755-6644