Trusting Answers Text Analysis Applications: Inference Web Approach
Deborah McGuinness
Co-Director and Senior Research Scientist, Knowledge Systems, Artificial Intelligence Laboratory, Stanford University
dlm@ksl.stanford.edu | http://www.ksl.stanford.edu/people/dlm
Inference Web is joint work with Pinheiro da Silva, Fikes, Chang, Deshwal, Narayanan, Glass, Makarios, Jenkins, Millar, …
Background
AT&T Bell Labs AI Principles Dept
• Description Logics, CLASSIC, explanation, ontology environments
• Semantic Search, FindUR, Collaborative Ontology Building Environment
• Apps: Configurators, PROSE/Questar, Data Mining, …
Stanford Knowledge Systems, Artificial Intelligence Lab
• Ontology Evolution Environments (Diagnostics and Merging): Chimaera
• Explanation and Trust: Inference Web
• Semantic Web Representation and Reasoning Languages: DAML-ONT, DAML+OIL, OWL, …
• Rules and Services: SWRL, OWL-S, Explainable SDS, KSL Wine Agent
McGuinness Associates
• Ontology Environments: Sandpiper, VerticalNet, Cisco, …
• Knowledge Acquisition and Ontology Building: VSTO, GEON, ImEp, …
• Applications: GM (search, etc.), Cisco (metadata organization, etc.)
• Boards: Network Inference, Sandpiper, Buildfolio, Tumri, Katalytik
Semantic Web Layers
Ontology Level
• Languages (CLASSIC, DAML-ONT, DAML+OIL, OWL, …)
• Environments (FindUR, Chimaera, OntoBuilder/Server, Sandpiper Tools, …)
• Standards (NAPLPS, …, W3C's WebOnt, W3C's Semantic Web Best Practices, EU/US Joint Committee, OMG ODM, …)
Rules
• SWRL (previously CLASSIC Rules, explanation environment, extensibility issues, contracts, …)
Logic
• Description Logics
Proof
• PML, Inference Web Services and Infrastructure
Trust
• IWTrust, NSF proposals with W3C/MIT and IHMC
http://www.w3.org/2004/Talks/0412-RDF-functions/slide4-0.html
Motivation – Trust and Understanding
If users (humans and agents) are to use, reuse, and integrate system answers, they must trust them. System transparency supports understanding and trust. Even simple "lookup" systems benefit from providing information about their sources. Systems that manipulate information (with sound deduction or potentially unsound heuristics) benefit from providing information about their manipulations.
Goal: provide interoperable infrastructure that supports explanations of sources, assumptions, and answers as an enabler for trust.
Requirements gathered from…
• DARPA Agent Markup Language (DAML): enable the next generation of the web
• DARPA Personal Assistant that Learns (PAL): enable computer systems that can reason, learn, be told what to do, explain what they are doing, reflect on their experience, and respond robustly to surprise
• DARPA Rapid Knowledge Formation (RKF): allow distributed teams of subject matter experts to quickly and easily build, maintain, and use knowledge bases without need for specialized training
• DARPA High Performance Knowledge Base (HPKB): advance the technology of how computers acquire, represent, and manipulate knowledge
• ARDA Novel Intelligence for Massive Data (NIMD): avoid strategic surprise by helping analysts be more effective (focus attention on critical information and help analyze/prune/refine/explain/reuse/…)
• ARDA Advanced Question & Answering for Intelligence (AQUAINT): advance QA against structured and unstructured information
• Consulting, including search, e-commerce, configuration, …
Requirements
• Information manipulation traces: hybrid, distributed, portable, shareable, combinable encoding of proof fragments supporting multiple justifications
• Presentation: multiple display formats supporting browsing, visualization, summaries, …
• Abstraction: understandable summaries
• Interaction: multi-modal, mixed-initiative options including natural-language and GUI dialogues; adaptive, context-sensitive interaction
• Trust: source and reasoning provenance, automated trust inference
[McGuinness & Pinheiro da Silva, ISWC 2003; J. Web Semantics 2004]
Selected History
• Historical explanation research motivated by explaining theorem provers in practice
• Web version originally aimed at explaining hybrid (FOL / special-purpose) reasoners in a distributed environment like the web
• User demand drove focus on provenance extensions
• Current web environment and programs, such as NIMD, drove connections with extraction engines
• Current view: any question answering system can be viewed as some kind of information manipulator that may benefit from and/or require explanation
Inference Web*
Framework for explaining question answering tasks by abstracting, storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by question answerers
• IW's Proof Markup Language (PML) is an interlingua for proof interchange, written in the W3C-recommended Web Ontology Language (OWL)
• IWBase is a distributed repository of meta-information related to proofs and their explanations
• IW Registration services provide support for proof generation and checking
• IW Browser provides display capabilities for PML documents containing proofs and explanations (possibly from multiple inference engines)
• IW Abstractor provides rewriting capabilities enabling more understandable presentations
• IW Explainer provides multi-modal dialogue options including alternative strategies for presenting explanations, visualizations, and summaries
*Work with Pinheiro da Silva
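Since PML is an OWL vocabulary rather than a programming API, the following is only a minimal Python sketch of the kind of structure a PML document captures: a conclusion, the inference step (rule, engine, antecedents) that justified it, and the sources it rests on. The class and field names are illustrative, not the actual PML terms.

```python
# Minimal illustrative sketch (not the actual PML/OWL vocabulary): a proof is a
# DAG of node sets, each pairing a conclusion with one or more justifications.
from dataclasses import dataclass, field
from typing import List

@dataclass
class InferenceStep:
    rule: str                      # e.g. "Modus Ponens" (registered in IWBase)
    engine: str                    # e.g. "JTP" (registered in IWBase)
    antecedents: List["NodeSet"] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)  # provenance, e.g. document identifiers

@dataclass
class NodeSet:
    conclusion: str                # sentence in some registered language, e.g. KIF
    justified_by: List[InferenceStep] = field(default_factory=list)  # multiple justifications allowed

# A two-step toy proof: (A ^ B) derived from A and A -> (A ^ B) by Modus Ponens.
a = NodeSet("A", [InferenceStep("Direct Assertion", "extractor-1", sources=["doc-1"])])
a_implies = NodeSet("(=> A (and A B))", [InferenceStep("Direct Assertion", "extractor-2", sources=["doc-2"])])
a_and_b = NodeSet("(and A B)", [InferenceStep("Modus Ponens", "JTP", antecedents=[a, a_implies])])
```

In the actual system, node sets are OWL/RDF documents that may live locally or remotely, which is what allows proof fragments to be browsed and combined across engines and sites (see Browsing Proofs below).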
Explainable System Structure
[Figure: layered architecture. Explanation (understanding, trust, interaction, presentation, abstraction) sits on the PML proof interlingua and InferenceML, which in turn draw on information manipulation data, source provenance data, and inference rule specifications.]
Registry Information
IWBase has core and domain-specific repositories of meta-data useful for disclosing knowledge provenance and reasoning information, such as descriptions of:
• Question answering systems (inference engines, extractors, …) along with their supported inference rules
• Information sources such as organizations, publications, and ontologies
• Representation languages along with their axioms
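Purely as a sketch of the data model (the actual IWBase schema is not shown here), registry entries of the three kinds listed above might look like the following; all field names and the example URI are hypothetical.

```python
# Illustrative only: IWBase-style registry entries as plain dictionaries.
# Field names are hypothetical, not the actual IWBase schema.
registry = {
    "inference_engines": [
        {"name": "JTP", "organization": "Stanford KSL",
         "supported_rules": ["Modus Ponens", "And Introduction"]},
    ],
    "sources": [
        {"name": "fbi_01.txt", "type": "document"},
    ],
    "languages": [
        {"name": "KIF", "axioms": "http://example.org/kif-axioms"},  # placeholder URI
    ],
}

def rule_is_registered(engine_name, rule_name):
    """Check whether a proof step's rule is registered for the engine that used it."""
    for engine in registry["inference_engines"]:
        if engine["name"] == engine_name:
            return rule_name in engine["supported_rules"]
    return False
```

Meta-data of this kind is what lets the browser and checker annotate each proof step with who produced it, under which rule, and from which source.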
Browsing Proofs
• Enables the visualization of proofs (and abstracted proofs)
• Proofs can be "extracted" and browsed from both local and remote PML node sets and can be combined
• Links provide access to proof-related meta-information
Explainer
Presents:
• Query
• Answer
• Abstraction of justification (PML information)
• Limited meta-information
• Suggested drill-down options (also provides feedback options)
Explaining Extracted Entities (Techies)
[Figure: the same content shown as sentences in English, sentences in annotated English, and sentences in logical format, i.e., KIF.]
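As a concrete, entirely hypothetical illustration of the three forms named above, using entity names that appear later in the extraction example: the annotation tags and the KIF relation name are invented for this sketch and are not actual extractor output.

```python
# Hypothetical illustration of the three representations named on the slide.
# The annotation markup and the KIF relation name are invented for this example.

english = "Abdul Ramazi owns Select Gourmet Foods."

annotated_english = (
    "<Person id='E1'>Abdul Ramazi</Person> owns "
    "<Organization id='E2'>Select Gourmet Foods</Organization>."
)

kif = "(hasOwner SelectGourmetFoods AbdulRamazi)"
```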
Further Observations on Explaining Extracted Entities
• Source: fbi_01.txt; source usage: span from 01 to 78
• Same conclusion reached by multiple extractors; conflicting conclusion from one extractor
• Example: one extractor decided that Person_fbi-01.txt_46 is a Person and not an Occupation
Knowledge Provenance Elicitation
[Figure: proof trees for the answer 'A^B', built by direct assertion, Modus Ponens, and And Introduction over statements attributed ("has opinion") to sources such as CNN, BBC, and NYT.]
XYZ says 'A^B' is the answer to my question. Why should I believe this?
• Provenance information may be essential for users to trust answers.
• Data provenance (aka data lineage) is defined and studied in the database literature [Buneman et al., ICDT 2001; Cui and Widom, VLDB 2001].
• Knowledge provenance extends data provenance by adding data derivation provenance information [Pinheiro da Silva, McGuinness & McCool, Data Eng. Bulletin, 2003].
IWTrust: Trust in Action
[Figure: the same 'A^B' proof trees, now with trust values (ranging from unknown '?' through 0, +, and ++) attached to sources such as CNN, NYT, and FSP and propagated to the answer.]
Google-2.0 says 'A^B' is the answer to my question. Why should I trust the answer?
• Trust can be inferred from a web of trust.
• IWTrust provides infrastructure for building webs of trust.
• The infrastructure includes a trust component responsible for computing trust values for answers.
• IWTrust is described in [Zaihrayeu, Pinheiro da Silva & McGuinness, iTrust 2005].
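The actual IWTrust computation is specified in the iTrust 2005 paper cited above; the sketch below only illustrates the general idea under a simplifying assumption, namely that an answer inherits the minimum of the trust values of the sources its proof depends on. The real component instead combines trust inferred from the web of trust with the answer's provenance.

```python
# Illustrative only; IWTrust itself is defined in the iTrust 2005 paper.
# A proof is represented here as a plain nested dict: each node has a list of
# source identifiers and a list of antecedent nodes it was derived from.

def answer_trust(node, source_trust):
    """Combine source trust values into a trust value for the node's conclusion.

    Uses the minimum ("an answer is only as trustworthy as its weakest source");
    sources the user has no opinion about default to 0.0.
    """
    values = [source_trust.get(s, 0.0) for s in node.get("sources", [])]
    values += [answer_trust(a, source_trust) for a in node.get("antecedents", [])]
    return min(values) if values else 0.0

# Toy proof of A^B from A (attributed to CNN) and A->(A^B) (attributed to NYT).
proof = {
    "conclusion": "A^B",
    "antecedents": [
        {"conclusion": "A", "sources": ["CNN"]},
        {"conclusion": "A->(A^B)", "sources": ["NYT"]},
    ],
}
print(answer_trust(proof, {"CNN": 0.9, "NYT": 0.6}))  # -> 0.6
```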
Explanation Application Areas
• Theorem proving: first-order theorem provers – Stanford (JTP: KIF/OWL/…), SRI (SNARK), University of Texas at Austin (KM); SATisfiability solvers – University of Trento (JSAT)
• Information extraction – IBM (UIMA), Stanford (TAP)
• Information integration/aggregation – USC ISI (Prometheus, Mediator -> Fetch); Rutgers; Stanford (TAP)
• Task processing – SRI International (SPARK)
• Service composition – Stanford, U. of Toronto, UCSF (SDS)
• Semantic matching – University of Trento (S-MATCH)
• Debugging ontologies – U. of Maryland, College Park (SWOOP/Pellet)*
• Problem solving – University of Fortaleza
• Trust networks – U. of Trento (IWTrust), UMD*
Inference Web Contributions
1. Language for encoding hybrid, distributed proof fragments based on web technologies, with support for both formal and informal proofs (information manipulation traces)
2. Support (registry, language, services) for knowledge provenance
3. Declarative inference rule representation for checking hybrid, distributed proofs
4. Multiple strategies for proof abstraction, presentation, and interaction
5. End-to-end trust value computation for answers
6. Comprehensive solution for explainable systems
[Figure: the contributions keyed to the explainable-system architecture: explanation (understanding, trust, interaction, presentation, abstraction) over the Proof Markup Language / inference meta-language layer, with information manipulation data, provenance meta-data, and inference rule specs beneath.]
Status
• Inference Web infrastructure (PML, browser, explainer, registry, toolkit) is being used in government programs such as PAL and NIMD, in commercial research labs (IBM, Boeing, SRI), and in universities (USC, U MD, …)
• Integration and registration process underway with the extraction community
• Useful now for helping decide whether information is trustworthy, comes from authoritative sources, and is consistent and reliable
• Benefits from more meta-data and a larger information population, but is useful incrementally
Technical Status
Some focus areas:
• Follow-up question support
• Trust
• Contradiction support
• Abstraction techniques
• Extraction extensions
• Task-oriented reasoning support
• Query manager explanation support
• Toolkit for embedding
Open issues for explanation:
• Granularity of explanations
• Meta-information filtering
• Abstraction techniques
Requests / suggestions?
More Info
• Inference Web: http://iw.stanford.edu
• OWL: http://www.w3.org/TR/owl-features/ and http://www.w3.org/TR/owl-guide/
• DAML+OIL: http://www.daml.org/
• Chimaera: http://www.ksl.stanford.edu/software/chimaera/
• OWL-QL/DQL: http://www.ksl.stanford.edu/projects/dql/
• UIMA: http://www.research.ibm.com/UIMA/
dlm@ksl.stanford.edu
KSL Wine Agent: Semantic Web Integration (Toy) Example
• Uses emerging web standards to enable a "smart" web application
• Given a meal description (Deborah's Specialty, a crab dish, …)
• Describe matching wines (white, dry, full-bodied, …)
• Retrieve some specific options from the web (Forman Chardonnay from DLM's cellar, ThreeSteps from wine.com, …)
• Explain the description or a specific suggestion
• Info: http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/
KSL Wine Agent: Semantic Web Integration Technology
• OWL: for representing a domain ontology of foods, wines, their properties, and relationships between them
• JTP theorem prover: for deriving appropriate pairings
• Chimaera: ontology diagnostics and ontology merging
• DQL/OWL-QL: for querying a knowledge base
• Inference Web: for explaining and validating answers (descriptions or instances)
• Web Services: for interfacing with vendors
• Connections to online web agents/information services
• Utilities for conducting and caching the above transactions
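The actual agent derives pairings from the OWL food-and-wine ontology using the JTP theorem prover and explains them with Inference Web; the toy Python sketch below only mimics the flow (meal description to wine description to specific options) with hand-written lookup tables. The pairing rule and wine property values are illustrative, loosely based on the example on the previous slide.

```python
# Toy illustration only: the real Wine Agent derives pairings with OWL + JTP,
# not with hand-written rules like these.

# Meal descriptions -> recommended wine descriptors (hypothetical pairing rule).
pairing_rules = {
    "crab dish": {"color": "white", "sugar": "dry", "body": "full"},
}

# Candidate wines retrievable from the web or a cellar (illustrative entries).
wines = [
    {"name": "Forman Chardonnay", "source": "DLM's cellar",
     "color": "white", "sugar": "dry", "body": "full"},
    {"name": "ThreeSteps", "source": "wine.com",
     "color": "white", "sugar": "dry", "body": "full"},
]

def suggest(meal):
    """Return wines whose properties match the descriptors derived for `meal`."""
    wanted = pairing_rules.get(meal, {})
    return [w for w in wines if all(w.get(k) == v for k, v in wanted.items())]

print(suggest("crab dish"))  # both example wines match white / dry / full-bodied
```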
Knowledge Provenance: Multiple Sources
[Figure: an answer derived from multiple sources.]
Chimaera: Trend: Ontology Environment
An interactive web-based tool aimed at supporting:
• Ontology analysis (correctness, completeness, style, …)
• Merging of ontological terms from varied sources
• Maintaining ontologies over time
• Validation of input
Features: multiple I/O languages, loading and merging into multiple namespaces, collaborative distributed environment support, integrated browsing/editing environment, extensible diagnostic rule language
Used in commercial and academic environments; basis of some commercial re-implementations (OntoBuilder/OntoServer, Sandpiper, Cisco MDF, GM, …)
Available as a hosted service from www-ksl-svc.stanford.edu
Information: www.ksl.stanford.edu/software/chimaera
More features required: versioning, diffs, …
FindUR Architecture
[Figure: architecture diagram relating content, domain knowledge, search technology, and user interface.]
• Content to search (web pages or databases, classified using domain knowledge): research site, technical memoranda, calendars (Summit 2005, Research), yellow pages (Westfield directory), newspapers (Leader), internal sites (rapid prototyping), AT&T Solutions, WorldNet customer care
• Domain knowledge: CLASSIC Knowledge Representation System; GUI supporting browsing and selection; collaborative topic-set tool
• Search technology: Verity search engine (and topic sets)
• User interface: Verity SearchScript, JavaScript, HTML, CGI, CLASSIC; results in standard and domain-specific formats
Trend: Semantic Search
Inferences Drawn by Information Extraction
Documents: CIA Report 117, FBI Report 282
Extracted entities and relations:
• AER: (Person "Mr. Ramazi"); AER: (Org "Select Gourmet Foods"); AER: (Person "Abdul Ramazi")
• BER: (Org "Select Gourmet Foods"); BER: (Person "Abdul Ramazi"); BER: (Org "SGF")
• ARR: (hasOwner "Select Gourmet Foods", "Mr. Ramazi")
• BRR: (hasOwner "SGF", "Abdul Ramazi")
Merged results:
• MCR: (equals "Abdul Ramazi", "Mr. Ramazi", AbdulRamazi)
• MCR: (equals "Select Gourmet Foods", "SGF", SelectGourmetFoods)
• MCR: (hasOwner SelectGourmetFoods, AbdulRamazi)
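Purely as an illustration of the merge step above (the real work is done by the extraction and co-reference components being explained), the sketch below shows how the "equals" assertions let relations extracted from different documents be rewritten onto the same canonical individuals; the function name is invented for the example.

```python
# Toy illustration of the merge step shown above: "equals" assertions map
# extracted mentions onto canonical entities, so relations extracted from
# different documents can be rewritten onto the same individuals.

equals = {
    "Abdul Ramazi": "AbdulRamazi",
    "Mr. Ramazi": "AbdulRamazi",
    "Select Gourmet Foods": "SelectGourmetFoods",
    "SGF": "SelectGourmetFoods",
}

extracted_relations = [
    ("hasOwner", "Select Gourmet Foods", "Mr. Ramazi"),   # from one document
    ("hasOwner", "SGF", "Abdul Ramazi"),                  # from another document
]

def merge(relations, mapping):
    """Rewrite mention-level relations onto canonical entities, de-duplicated."""
    return sorted({(p, mapping.get(s, s), mapping.get(o, o)) for p, s, o in relations})

print(merge(extracted_relations, equals))
# -> [('hasOwner', 'SelectGourmetFoods', 'AbdulRamazi')]
```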
Infrastructure: Core IWBase
Statistics for relevant domain-independent meta-data:
• Inference engines: 29
• Axioms: 56
• Declarative rules: 38
• Method rules: 10
• Derived rules: 6
• Languages: 12
Explaining Answers: GUI Explainer
• Users can ask for alternative explanations or summaries
• Users can exit the explainer, providing feedback about their satisfaction with the explanation(s)