Knowledge-Based Question-Answering Bruce Porter, Peter Clark, and John Thompson
Terminology and Key Points
• Knowledge base = a formal representation of knowledge with associated inference methods
• Explanation generation = selecting, organizing, and presenting information to answer a question. Inference plays a supporting role in each step.
• Key points:
  • Ways of evaluating explanation systems
  • Architectural principles for explanation systems
  • Composition: a way to reduce the up-front “knowledge requirement”
Botany Knowledge Base
• 10-year effort by a full-time domain expert and support staff
• Contains 20,000 concepts and 100,000 facts
• Much more information is available via inheritance and rules
• Performance goal: robust, expert-level ability to answer questions with good explanations
An Example of Explanation Generation: The KNIGHT System
(J. Lester and B. Porter, Developing and Empirically Evaluating Robust Explanation Generators, Computational Linguistics Journal, 23(1), 1997.)
Q: What happens during embryo sac formation?
A: Embryo sac formation is a kind of female gametophyte formation. During embryo sac formation, the embryo sac is formed from the megaspore mother cell. Embryo sac formation occurs in the ovule. Embryo sac formation is a step of angiosperm sexual reproduction. It consists of megasporogenesis and embryo sac generation. During megasporogenesis, the megaspore mother cell divides in the nucellus to form 4 megaspores. During embryo sac generation, the embryo sac is generated from the megaspore.
KNIGHT approach:
1. Representation: of the situation the user is asking about
2. Elaboration: of that representation, guided by an answer schema (an Explanation Design Plan, EDP)
3. Assembly: of the results into natural-language text
4. Presentation
KNIGHT System Architecture
[Figure: architecture diagram. Labels: user, requests, explanations, KNIGHT, information requests, views, view retriever, (virtual) KB, base facts, BKB.]
• Worked well to provide an “arm's-length relationship” between application programs and the KB
View Retriever
(L. Acker and B. Porter, Extracting Viewpoints from Knowledge Bases, AAAI-94)
• Given: a specification of the desired information
• Returns: a subgraph of the knowledge base representing a coherent, comprehensive set of facts pertinent to the specification
The Viewpoint of Photosynthesis as Production
(L. Acker and B. Porter, Extracting Viewpoints from Knowledge Bases, AAAI-94)
[Figure: two aligned graphs. Production has the relations product, location, raw materials, producer, and energy source, with the generic fillers Substance, Place, and Thing. Photosynthesis has the same relations, with the fillers Oxygen, Glucose, Chloroplast, ATP, Water, Carbon-Dioxide, and Photosynthetic Cell.]
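The view retriever idea can be illustrated with a minimal Python sketch. This is not the Acker and Porter algorithm, which also checks coherence; it only filters a toy fact list by the relations that a "production" viewpoint cares about. The `Fact` type, the `retrieve_view` function, and the pairing of fillers with relations are assumptions made for illustration, and the `subevents` fact is hypothetical.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    subject: str
    relation: str
    value: str


# Toy knowledge base. Which filler belongs to which relation is an
# assumed reading of the figure labels, not taken from the Botany KB.
KB = [
    Fact("Photosynthesis", "product", "Oxygen"),
    Fact("Photosynthesis", "product", "Glucose"),
    Fact("Photosynthesis", "raw materials", "Water"),
    Fact("Photosynthesis", "raw materials", "Carbon-Dioxide"),
    Fact("Photosynthesis", "location", "Chloroplast"),
    Fact("Photosynthesis", "producer", "Photosynthetic Cell"),
    Fact("Photosynthesis", "energy source", "ATP"),
    # Hypothetical fact that lies outside the production viewpoint.
    Fact("Photosynthesis", "subevents", "Light Reactions"),
]

# The relations that define the "as Production" viewpoint.
PRODUCTION_RELATIONS = {
    "product", "raw materials", "location", "producer", "energy source",
}


def retrieve_view(kb, concept, relations):
    """Return the facts about `concept` whose relation is in the viewpoint."""
    return [f for f in kb if f.subject == concept and f.relation in relations]


view = retrieve_view(KB, "Photosynthesis", PRODUCTION_RELATIONS)
for fact in view:
    print(f"{fact.subject} --{fact.relation}--> {fact.value}")
```

The specification here is just a concept plus a set of relations; a richer specification language would be needed for combination viewpoints.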
A Combination Viewpoint: Flower Structure vis-à-vis Plant Reproduction
[Figure: graph linking the process Angiosperm Sexual Reproduction (with Pollen Grain Formation, Embryo Sac Formation, Pollen Grain Transfer, Pollen Grain Germination, and Double Fertilization) to the structure Flower (with Androecium and Gynoecium). Edge labels: subevents, has parts, location, source, destination, surrounds.]
Explanation Design Plan for Processes
[Figure: tree rooted at Explain Process. Topic nodes: Process Overview, Fates of patients, Temporal information, Process details. Viewpoint nodes: As-kind-of viewpoint, Location description, Black-box viewpoint, Temporal step-of viewpoint, Temporal steps viewpoints; "For each patient: change viewpoint"; "For each subevent: Black-box viewpoint".]
• Nodes contain programs with iteration and conditionals
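A rough Python sketch of how an Explanation Design Plan might be applied: each topic holds small programs, some conditional (emit a sentence only if the fact is known) and some iterative (one sentence per subevent). The process data is taken from the embryo sac example earlier in the deck; the function names, the dictionary layout, and the assignment of viewpoints to topics are assumptions, not KNIGHT's actual design.

```python
# Process description, using facts from the embryo sac formation answer.
PROCESS = {
    "name": "Embryo sac formation",
    "kind_of": "female gametophyte formation",
    "location": "the ovule",
    "subevents": [
        {"name": "megasporogenesis",
         "summary": "the megaspore mother cell divides in the nucellus "
                    "to form 4 megaspores"},
        {"name": "embryo sac generation",
         "summary": "the embryo sac is generated from the megaspore"},
    ],
}


def as_kind_of(process):
    # Conditional: say nothing if the superclass is unknown.
    if "kind_of" not in process:
        return []
    return [f"{process['name']} is a kind of {process['kind_of']}."]


def location_description(process):
    # Conditional: say nothing if the location is unknown.
    if "location" not in process:
        return []
    return [f"{process['name']} occurs in {process['location']}."]


def subevent_black_boxes(process):
    # Iteration: one black-box summary per subevent.
    return [f"During {s['name']}, {s['summary']}."
            for s in process.get("subevents", [])]


# Hypothetical plan: (topic name, programs to run for that topic).
EDP = [
    ("Process Overview", [as_kind_of, location_description]),
    ("Process details", [subevent_black_boxes]),
]


def apply_edp(edp, process):
    """Walk the plan in order and collect the sentences each program emits."""
    sentences = []
    for _topic, programs in edp:
        for program in programs:
            sentences.extend(program(process))
    return sentences


explanation = apply_edp(EDP, PROCESS)
print(" ".join(explanation))
```

Because missing facts simply yield no sentences, the same plan can be reused on sparsely described processes.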
KNIGHT Evaluation
[Figure: evaluation design. Questions (60) go to KNIGHT (60) and to four biologists (15 each); the resulting explanations go to a panel of judges (8 biologists), who produce the evaluations.]
Results of the Evaluation

Author   Overall      Content      Organization  Writing      Correctness
KNIGHT   2.37±0.13    2.65±0.13    2.45±0.16     2.40±0.13    3.07±0.15
Human    2.85±0.15    2.95±0.16    3.07±0.16     2.93±0.16    3.16±0.15

               Overall  Content  Organization  Writing  Correctness
Difference     0.48     0.30     0.62          0.53     0.09
T statistic    -2.36    -1.47    -2.73         -2.54    -0.42
Significance   0.02     0.14     0.07          0.01     0.67
Significant?   yes      no       no            yes      no
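As a quick arithmetic check, the Difference row can be recomputed from the two mean-score rows, and the "Significant?" row is consistent with a 0.05 threshold on the reported significance values. The threshold is an assumption (the slide does not state it), and the variable names are illustrative.

```python
# Mean scores from the slide: (KNIGHT, Human) per dimension.
MEANS = {
    "Overall": (2.37, 2.85),
    "Content": (2.65, 2.95),
    "Organization": (2.45, 3.07),
    "Writing": (2.40, 2.93),
    "Correctness": (3.07, 3.16),
}

# Significance values as reported on the slide.
SIGNIFICANCE = {
    "Overall": 0.02,
    "Content": 0.14,
    "Organization": 0.07,
    "Writing": 0.01,
    "Correctness": 0.67,
}

ALPHA = 0.05  # assumed threshold; not stated in the source

differences = {dim: round(human - knight, 2)
               for dim, (knight, human) in MEANS.items()}
significant = {dim: p < ALPHA for dim, p in SIGNIFICANCE.items()}

for dim in MEANS:
    flag = "yes" if significant[dim] else "no"
    print(f"{dim:13s} difference={differences[dim]:.2f} significant={flag}")
```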
Another example (DCE Application)
Question (user): Describe a binding event, between
- the client Payday running on Slowbox
- the server Oracle running on Speedy
Answer (KB-generated):
• First, Payday queries the cell directory server for the network-id of Oracle.
• Then Payday queries the endpoint mapper of Speedy for Oracle’s endpoint.
• Finally, Payday assembles a binding from the network-id and the endpoint.
1. Representation of situation in question
Describe a binding event, between
- the client Payday running on Slowbox
- the server Oracle running on Speedy
[Figure: graph of Binding-Event01 with client Payday (host Slowbox) and server Oracle (host Speedy).]
2. Elaboration (guided by answer schema)
Schema/EDP (paraphrased): “For each subevent, present summary, and pointers to sub-subevents.”
[Figure: the situation graph extended with Network01, CDS01 (cds), and the subevents of Binding-Event01: Query01, then Query02, then Assemble01. Slots not yet filled (agent, queried, request) are marked "?".]
2. Elaboration (guided by answer schema)
Schema/EDP (paraphrased): “For each subevent, present summary, and pointers to sub-subevents.”
[Figure: the elaborated graph. The agent, queried, and request slots of Query01 and Query02 are now filled; new nodes are NetId01 (id), Endpoint01 (endpoint), and Endpoint Mapper01 (epm); Assemble01 has a components slot.]
3. Assembly of text answer
• Template: “First” (the agent of Query01) “queries” (the queried of Query01) “for” (the request of Query01)
• Result: “First, Payday queries the cell directory server for the network-id of Oracle.”
[Figure: the elaborated graph of Binding-Event01, with Query01, Query02, Assemble01, CDS01, Endpoint Mapper01, NetId01, and Endpoint01.]
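The assembly step can be sketched as slot lookup plus string templating. In the minimal Python sketch below, the graph fragment mirrors the slots named on the slide (agent, queried, request), while the dictionary layout, the `PHRASES` table that maps instances to noun phrases, and the function names are assumptions for illustration.

```python
# Fragment of the elaborated graph: slots of the two query subevents.
GRAPH = {
    "Query01": {"agent": "Payday", "queried": "CDS01", "request": "NetId01"},
    "Query02": {"agent": "Payday", "queried": "Endpoint Mapper01",
                "request": "Endpoint01"},
}

# Assumed noun phrases for each instance (a real generator would
# derive these from the KB rather than from a lookup table).
PHRASES = {
    "Payday": "Payday",
    "CDS01": "the cell directory server",
    "NetId01": "the network-id of Oracle",
    "Endpoint Mapper01": "the endpoint mapper of Speedy",
    "Endpoint01": "Oracle's endpoint",
}


def describe_query(event, connective):
    """Fill the template: <connective>, <agent> queries <queried> for <request>."""
    slots = GRAPH[event]
    return (f"{connective}, {PHRASES[slots['agent']]} queries "
            f"{PHRASES[slots['queried']]} for {PHRASES[slots['request']]}.")


first_sentence = describe_query("Query01", "First")
print(first_sentence)
print(describe_query("Query02", "Then"))
```

The same template serves both query subevents; only the slot fillers change.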
The Application Environment (Hyperlinked text) (run-time generated pages)
Critique
• Approach used in the Botany KB and three smaller applications
• Benefits:
  • Customized answers
  • Controllable level of detail
  • Flexibility (in theory)
• Well received, but:
  • KBs still highly incomplete
  • Laborious to build
  • Difficult to achieve reuse
• Conclusion: want a more modular approach
A Component-Based Approach to Knowledge-Base Construction
Observation: Concept representations contain numerous abstractions
Approach:
1. Component theories = abstract, reusable models
2. More specific concepts: specified as compositions
3. Inference = construct compositions as needed to answer questions
Lessons from a Dictionary...
Specific concepts are defined as compositions of abstract concepts:
• Car: a Vehicle for Passengers
• Vehicle: a Means for Transporting something
• Transport: to Move from one Place to another
Most abstract concepts appeal to core, foundational theories:
• Move: to Go
• Go: to Move
1. Component Theories
• A coherent, encapsulated system of concepts and relations
• Contains:
  • an ontology (vocabulary of concepts and relations)
  • axioms (rules) relating these
• Provides semantics for these concepts in the KB
• Specific theories can be defined using general ones
Example: Electrical Circuits
[Figure: a circuit containing Fuel Cells, Switches, a Light, and a Motor.]
Electrical Circuit
• Carries electricity
• If there is a closed circuit from Fuel Cell to Device, then the Device is powered
• Switches can open/close the circuit
Example: Electrical Circuits
[Figure: a circuit containing Fuel Cells, Switches, a Light, and a Motor, shown beside a Distribution-Network graph whose nodes are labeled P, I, and C.]
Electrical Circuit
• Carries electricity
• If there is a closed circuit from Fuel Cell to Device, then the Device is powered
• Switches can open/close the circuit
Distribution-Network
• Carries transport-element
• If there is an unblocked path from Producer to Consumer, then the Consumer is supplied
• connects is transitive
• ….
Circuits as Distribution Networks
[Figure: a circuit containing Fuel Cells, Switches, a Light, and a Motor, shown beside a Distribution-Network graph whose nodes are labeled P, I, and C.]
Electrical Circuit
• Carries electricity
• If there is a closed circuit from Fuel Cell to Device, then the Device is powered
• Switches can open/close the circuit
Distribution-Network
• Carries transport-element
• If there is an unblocked path from Producer to Consumer, then the Consumer is supplied
• connects is transitive
• ….
Distribution Networks as DAGs
[Figure: a Distribution-Network graph with nodes labeled P, I, and C, shown beside a Blockable-DAG graph with nodes N1 to N6.]
Blockable-DAG
• Nodes can connect with other nodes.
• X reaches Y if X connects with Y.
• X reaches Z if X connects with Y and Y reaches Z.
• ….
Distribution-Network
Imports: Blockable-DAG
And:
• Producers, Intermediaries, and Consumers are Nodes
• If there is an unblocked path from Producer to Consumer, then the Consumer is supplied.
• ...
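The two layers of axioms can be sketched in Python: `reaches` plays the role of the Blockable-DAG rule (reachability that refuses to pass through blocked nodes), and `supplied` plays the role of the Distribution-Network rule built on top of it. The function names, the treatment of an open switch as a blocked node, and the example wiring are assumptions made for illustration.

```python
def reaches(connects, start, goal, blocked=frozenset()):
    """True if `goal` can be reached from `start` without touching a blocked node.

    Implements: X reaches Y if X connects with Y; X reaches Z if
    X connects with Y and Y reaches Z.
    """
    if start in blocked:
        return False
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in connects.get(node, ()):
            if nxt in blocked or nxt in seen:
                continue
            if nxt == goal:
                return True
            seen.add(nxt)
            stack.append(nxt)
    return False


def supplied(connects, producers, consumer, blocked=frozenset()):
    """A consumer is supplied if some producer has an unblocked path to it."""
    return any(reaches(connects, p, consumer, blocked) for p in producers)


# Hypothetical wiring: the fuel cell feeds the motor directly and the
# light through a switch. An open switch is modeled as a blocked node.
CIRCUIT = {"Fuel Cell": ["Switch", "Motor"], "Switch": ["Light"]}
PRODUCERS = ["Fuel Cell"]

print(supplied(CIRCUIT, PRODUCERS, "Light"))                      # switch closed
print(supplied(CIRCUIT, PRODUCERS, "Light", blocked={"Switch"}))  # switch open
```

An electrical-circuit theory would then only need to map Fuel Cell to Producer, Device to Consumer, and "powered" to "supplied" to reuse these rules.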
Component theories in KB-PHaSE
• Component theories: DAG, Blockable DAG, Discrete Event Model, Processing Network, Distribution Network, Two-state Object, Optical Circuits, Electrical Circuits, Machines, Spatial Relations
• PHaSE KB: ontology, compositions, basic facts about the domain
2. Composition
• Describe domain-specific concepts as compositions:
  • a Bulb is a Resistor to Electricity producing Light
  • a Camera is a Device for the Recording of Images
  • a Battery is a Producer of Electricity
  • a Wire is a Conduit of Electricity
• Inference: compute properties of the compound concept
  • using axioms from each component
  • on demand, in response to questions
2. Composition (example)
Composition: Camera = a Device for the Recording of Images
  (Camera has (superclasses (Device)))
  (every Camera has
    (behavior ((a Recording with (input (Image))))))
[Figure: a Device whose behavior is a Recording, whose input is an Image.]
Query: Failure modes of a camera?
Component Theory: Devices
[Figure: a Device (a Physobj) with a behavior Activity; the failure modes of the Activity's participants (each a Physobj) propagate to become failure modes of the Device. Shown alongside the Camera composition: Device, behavior Recording, input Image.]
  (Device has (superclasses (Physobj)))
  (every Device has
    (behavior ((a Activity)))
    (failure-modes (
      (the failure-modes of (the participants of the behavior of Self)))))
Component Theory: Recording
[Figure: a Recording with input Signal, participants Receptor and Memory-Unit (each a Physobj), and subevents Receiving and Writing. Edge labels include input, output, participant, agent, patient, subevents, and failure-mode. Shown alongside the Camera composition: Device, behavior Recording, input Image.]
  (Recording has (superclasses (Activity)))
  (every Recording has
    (input ((a Signal)))
    (participants (
      (a Receptor with (input ((the input of Self)))
      ...
Run-Time Classification: Aperture = a Receptor of Images
[Figure: the camera graph: Device, behavior Recording (input Image, a Signal), participants Receptor and Memory-Unit, subevents Receiving and Writing. Edge labels include input, output, part., agent, patient, subevents, and failure-mode.]
Run-Time Classification: Aperture = a Receptor of Images
Aperture
- inputs an image
- outputs an image
- might be blocked
- ...
[Figure: the camera graph, in which the Receptor that takes an Image as input is classified as an Aperture (input Image, output Image, failure mode Blockage).]
Query: Failure modes of a camera? Blockage, ...
Sub-query: Participants in its behavior? Aperture, ...
[Figure: the fully classified camera graph: Device, behavior Recording (input Image), participants Aperture and Memory-Unit, subevents Receiving and Writing. Other labels: Blockage, Aging, Chemical, Sheet, covering, sensitive-to.]
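The camera query can be traced with a small Python sketch of composition at question time. It is not KM and not the authors' implementation: the table names and functions are hypothetical, only the Aperture and Blockage facts are taken from the slides, and the failure modes of the Memory-Unit are left out because the figure does not make them recoverable.

```python
# Composition: Camera = a Device for the Recording of Images.
COMPOSITIONS = {"Camera": {"superclass": "Device",
                           "behavior": ("Recording", "Image")}}

# Run-time classification rule: a Receptor of Images is an Aperture.
RUN_TIME_CLASSES = {("Receptor", "Image"): "Aperture"}

# Facts attached to the more specific class.
KNOWN_FAILURE_MODES = {"Aperture": ["Blockage"]}


def classify(generic_class, input_type):
    """Specialize a participant if a classification rule matches."""
    return RUN_TIME_CLASSES.get((generic_class, input_type), generic_class)


def participants(activity, input_type):
    """Recording theory: a Receptor and a Memory-Unit take part."""
    if activity == "Recording":
        return [classify("Receptor", input_type),
                classify("Memory-Unit", input_type)]
    return []


def failure_modes(concept):
    """Device theory: failure modes are those of the behavior's participants."""
    activity, input_type = COMPOSITIONS[concept]["behavior"]
    modes = []
    for participant in participants(activity, input_type):
        modes.extend(KNOWN_FAILURE_MODES.get(participant, []))
    return modes


print(participants("Recording", "Image"))
print(failure_modes("Camera"))
```

Nothing about cameras and blockage is stored directly; the answer is assembled on demand from the Device theory, the Recording theory, and the classification rule.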
Compound Concepts are Ubiquitous
• Botany:
  • photosynthesis
  • plant material distribution
  • ...
• Aerospace:
  • turbine gearbox assembly
  • case drain fluid
  • …(43k acronyms!)…
• Sentences also:
  • “The aircraft overshot the runway.”
  • “The air-conditioning unit had no power.”
  • ...
Overall Architecture
1. Ontology: Thing, ...
2. Component theories: DAG, Blockable DAG, Process Network, Distribution Network, Optical Circuits, Electrical Circuits, Discrete events, 2-state Object, Machine
Overall Architecture
1. Ontology: Thing, ...
2. Component theories: DAG, Blockable DAG, Process Network, Distribution Network, Optical Circuits, Electrical Circuits, ...
3. Definitions and Descriptions: Camera = a Device for the Recording of Images
Overall Architecture
1. Ontology: Thing, ...
2. Component theories: DAG, Blockable DAG, Process Network, Distribution Network, Optical Circuits, Electrical Circuits, ...
3. Definitions and Descriptions: Camera = ...
4. Basic facts about domain: PHaSE physical structure, PH. Circuit, PH. Science Checklists
Summary
• Explanation generators select, organize, and present information in response to questions.
• Inference plays a supporting role in each step.
• Explanation Design Plans are built for each type of explanation.
• Composition at run-time reduces the up-front “knowledge requirement”.
Discussion
• Technical: The component approach is still a work in progress. In particular, although we can isolate the general theories, the “basic facts” can still be highly interdependent.
• Philosophical: We need a library of reusable components. Will the idiosyncrasies of real-world concepts overwhelm the generality of patterns?