Trusting Answers Text Analysis Applications: Inference Web Approach
Deborah McGuinness
Co-Director and Senior Research Scientist, Knowledge Systems, Artificial Intelligence Laboratory, Stanford University
dlm@ksl.stanford.edu | http://www.ksl.stanford.edu/people/dlm
Inference Web is joint work with Pinheiro da Silva, Fikes, Chang, Deshwal, Narayanan, Glass, Makarios, Jenkins, Millar, …
Background
AT&T Bell Labs AI Principles Dept
• Description Logics, CLASSIC, explanation, ontology environments
• Semantic Search, FindUR, Collaborative Ontology Building Environment
• Apps: Configurators, PROSE/Questar, Data Mining, …
Stanford Knowledge Systems, Artificial Intelligence Lab
• Ontology Evolution Environments (Diagnostics and Merging): Chimaera
• Explanation and Trust: Inference Web
• Semantic Web Representation and Reasoning Languages: DAML-ONT, DAML+OIL, OWL, …
• Rules and Services: SWRL, OWL-S, Explainable SDS, KSL Wine Agent
McGuinness Associates
• Ontology Environments: Sandpiper, VerticalNet, Cisco, …
• Knowledge Acquisition and Ontology Building: VSTO, GEON, ImEp, …
• Applications: GM (search, etc.), Cisco (metadata organization, etc.)
• Boards: Network Inference, Sandpiper, Buildfolio, Tumri, Katalytik
Semantic Web Layers
Ontology Level
• Languages (CLASSIC, DAML-ONT, DAML+OIL, OWL, …)
• Environments (FindUR, Chimaera, OntoBuilder/Server, Sandpiper Tools, …)
• Standards (NAPLPS, …, W3C's WebOnt, W3C's Semantic Web Best Practices, EU/US Joint Committee, OMG ODM, …)
Rules
• SWRL (previously CLASSIC Rules, explanation environment, extensibility issues, contracts, …)
Logic
• Description Logics
Proof
• PML, Inference Web Services and Infrastructure
Trust
• IWTrust, NSF proposals with W3C/MIT and IHMC
http://www.w3.org/2004/Talks/0412-RDF-functions/slide4-0.html
Motivation – Trust and Understanding
If users (humans and agents) are to use, reuse, and integrate system answers, they must trust them. System transparency supports understanding and trust. Even simple "lookup" systems benefit from providing information about their sources. Systems that manipulate information (with sound deduction or potentially unsound heuristics) benefit from providing information about their manipulations.
Goal: provide interoperable infrastructure that supports explanations of sources, assumptions, and answers as an enabler for trust.
Requirements gathered from…
• DARPA Agent Markup Language (DAML): enable the next generation of the web
• DARPA Personal Assistant that Learns (PAL): enable computer systems that can reason, learn, be told what to do, explain what they are doing, reflect on their experience, and respond robustly to surprise
• DARPA Rapid Knowledge Formation (RKF): allow distributed teams of subject matter experts to quickly and easily build, maintain, and use knowledge bases without need for specialized training
• DARPA High Performance Knowledge Base (HPKB): advance the technology of how computers acquire, represent, and manipulate knowledge
• ARDA Novel Intelligence for Massive Data (NIMD): avoid strategic surprise by helping analysts be more effective (focus attention on critical information and help analyze/prune/refine/explain/reuse/…)
• ARDA Advanced Question & Answering for Intelligence (AQUAINT): advance QA against structured and unstructured information
• Consulting, including search, e-commerce, configuration, …
Requirements
• Information manipulation traces: hybrid, distributed, portable, shareable, combinable encoding of proof fragments supporting multiple justifications
• Presentation: multiple display formats supporting browsing, visualization, summaries, …
• Abstraction: understandable summaries
• Interaction: multi-modal, mixed-initiative options including natural-language and GUI dialogues; adaptive, context-sensitive interaction
• Trust: source and reasoning provenance, automated trust inference
[McGuinness & Pinheiro da Silva, ISWC 2003; J. Web Semantics 2004]
Selected History
• Historical explanation research motivated by explaining theorem provers in practice
• Web version originally aimed at explaining hybrid (FOL / special-purpose) reasoners in a distributed environment like the web
• User demand drove focus on provenance extensions
• Current web environment and programs, such as NIMD, drove connections with extraction engines
• Current view: any question answering system can be viewed as some kind of information manipulator that may benefit from and/or require explanation
Inference Web*
Framework for explaining question answering tasks by abstracting, storing, exchanging, combining, annotating, filtering, segmenting, comparing, and rendering proofs and proof fragments provided by question answerers
• IW's Proof Markup Language (PML) is an interlingua for proof interchange, written in the W3C-recommended Web Ontology Language (OWL)
• IWBase is a distributed repository of meta-information related to proofs and their explanations
• IW Registration services provide support for proof generation and checking
• IW Browser provides display capabilities for PML documents containing proofs and explanations (possibly from multiple inference engines)
• IW Abstractor provides rewriting capabilities enabling more understandable presentations
• IW Explainer provides multi-modal dialogue options including alternative strategies for presenting explanations, visualizations, and summaries
*Work with Pinheiro da Silva
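Since PML is an OWL vocabulary rather than a programming API, the following is only a minimal Python sketch of the kind of structure a PML document captures: a conclusion, the inference step (rule, engine, antecedents) that justified it, and the sources it rests on. The class and field names are illustrative, not the actual PML terms.

```python
# Minimal illustrative sketch (not the actual PML/OWL vocabulary): a proof is a
# DAG of node sets, each pairing a conclusion with one or more justifications.
from dataclasses import dataclass, field
from typing import List

@dataclass
class InferenceStep:
    rule: str                      # e.g. "Modus Ponens" (registered in IWBase)
    engine: str                    # e.g. "JTP" (registered in IWBase)
    antecedents: List["NodeSet"] = field(default_factory=list)
    sources: List[str] = field(default_factory=list)  # provenance, e.g. document identifiers

@dataclass
class NodeSet:
    conclusion: str                # sentence in some registered language, e.g. KIF
    justified_by: List[InferenceStep] = field(default_factory=list)  # multiple justifications allowed

# A two-step toy proof: (A ^ B) derived from A and A -> (A ^ B) by Modus Ponens.
a = NodeSet("A", [InferenceStep("Direct Assertion", "extractor-1", sources=["doc-1"])])
a_implies = NodeSet("(=> A (and A B))", [InferenceStep("Direct Assertion", "extractor-2", sources=["doc-2"])])
a_and_b = NodeSet("(and A B)", [InferenceStep("Modus Ponens", "JTP", antecedents=[a, a_implies])])
```

In the actual system, node sets are OWL/RDF documents that may live locally or remotely, which is what allows proof fragments to be browsed and combined across engines and sites (see Browsing Proofs below).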
Explainable System Structure
[Figure: layered architecture. Explanation (understanding, trust, interaction, presentation, abstraction) sits on the PML proof interlingua and InferenceML, which in turn draw on information manipulation data, source provenance data, and inference rule specifications.]
Registry Information
IWBase has core and domain-specific repositories of meta-data useful for disclosing knowledge provenance and reasoning information, such as descriptions of:
• Question answering systems (inference engines, extractors, …) along with their supported inference rules
• Information sources such as organizations, publications, and ontologies
• Representation languages along with their axioms
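Purely as a sketch of the data model (the actual IWBase schema is not shown here), registry entries of the three kinds listed above might look like the following; all field names and the example URI are hypothetical.

```python
# Illustrative only: IWBase-style registry entries as plain dictionaries.
# Field names are hypothetical, not the actual IWBase schema.
registry = {
    "inference_engines": [
        {"name": "JTP", "organization": "Stanford KSL",
         "supported_rules": ["Modus Ponens", "And Introduction"]},
    ],
    "sources": [
        {"name": "fbi_01.txt", "type": "document"},
    ],
    "languages": [
        {"name": "KIF", "axioms": "http://example.org/kif-axioms"},  # placeholder URI
    ],
}

def rule_is_registered(engine_name, rule_name):
    """Check whether a proof step's rule is registered for the engine that used it."""
    for engine in registry["inference_engines"]:
        if engine["name"] == engine_name:
            return rule_name in engine["supported_rules"]
    return False
```

Meta-data of this kind is what lets the browser and checker annotate each proof step with who produced it, under which rule, and from which source.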
Browsing Proofs
• Enables the visualization of proofs (and abstracted proofs)
• Proofs can be "extracted" and browsed from both local and remote PML node sets and can be combined
• Links provide access to proof-related meta-information
Explainer
Presents:
• Query
• Answer
• Abstraction of justification (PML information)
• Limited meta-information
• Suggested drill-down options (also provides feedback options)
Explaining Extracted Entities (Techies)
[Figure: the same content shown as sentences in English, sentences in annotated English, and sentences in logical format, i.e., KIF.]
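As a concrete, entirely hypothetical illustration of the three forms named above, using entity names that appear later in the extraction example: the annotation tags and the KIF relation name are invented for this sketch and are not actual extractor output.

```python
# Hypothetical illustration of the three representations named on the slide.
# The annotation markup and the KIF relation name are invented for this example.

english = "Abdul Ramazi owns Select Gourmet Foods."

annotated_english = (
    "<Person id='E1'>Abdul Ramazi</Person> owns "
    "<Organization id='E2'>Select Gourmet Foods</Organization>."
)

kif = "(hasOwner SelectGourmetFoods AbdulRamazi)"
```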
Further Observations on Explaining Extracted Entities
• Source: fbi_01.txt; source usage: span from 01 to 78
• Same conclusion reached by multiple extractors; conflicting conclusion from one extractor
• Example: one extractor decided that Person_fbi-01.txt_46 is a Person and not an Occupation
Knowledge Provenance Elicitation
[Figure: proof trees for the answer 'A^B', built by direct assertion, Modus Ponens, and And Introduction over statements attributed ("has opinion") to sources such as CNN, BBC, and NYT.]
XYZ says 'A^B' is the answer to my question. Why should I believe this?
• Provenance information may be essential for users to trust answers.
• Data provenance (aka data lineage) is defined and studied in the database literature [Buneman et al., ICDT 2001; Cui and Widom, VLDB 2001].
• Knowledge provenance extends data provenance by adding data derivation provenance information [Pinheiro da Silva, McGuinness & McCool, Data Eng. Bulletin, 2003].
IWTrust: Trust in Action
[Figure: the same 'A^B' proof trees, now with trust values (ranging from unknown '?' through 0, +, and ++) attached to sources such as CNN, NYT, and FSP and propagated to the answer.]
Google-2.0 says 'A^B' is the answer to my question. Why should I trust the answer?
• Trust can be inferred from a web of trust.
• IWTrust provides infrastructure for building webs of trust.
• The infrastructure includes a trust component responsible for computing trust values for answers.
• IWTrust is described in [Zaihrayeu, Pinheiro da Silva & McGuinness, iTrust 2005].
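The actual IWTrust computation is specified in the iTrust 2005 paper cited above; the sketch below only illustrates the general idea under a simplifying assumption, namely that an answer inherits the minimum of the trust values of the sources its proof depends on. The real component instead combines trust inferred from the web of trust with the answer's provenance.

```python
# Illustrative only; IWTrust itself is defined in the iTrust 2005 paper.
# A proof is represented here as a plain nested dict: each node has a list of
# source identifiers and a list of antecedent nodes it was derived from.

def answer_trust(node, source_trust):
    """Combine source trust values into a trust value for the node's conclusion.

    Uses the minimum ("an answer is only as trustworthy as its weakest source");
    sources the user has no opinion about default to 0.0.
    """
    values = [source_trust.get(s, 0.0) for s in node.get("sources", [])]
    values += [answer_trust(a, source_trust) for a in node.get("antecedents", [])]
    return min(values) if values else 0.0

# Toy proof of A^B from A (attributed to CNN) and A->(A^B) (attributed to NYT).
proof = {
    "conclusion": "A^B",
    "antecedents": [
        {"conclusion": "A", "sources": ["CNN"]},
        {"conclusion": "A->(A^B)", "sources": ["NYT"]},
    ],
}
print(answer_trust(proof, {"CNN": 0.9, "NYT": 0.6}))  # -> 0.6
```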
Explanation Application Areas
• Theorem proving: first-order theorem provers – Stanford (JTP: KIF/OWL/…), SRI (SNARK), University of Texas at Austin (KM); SATisfiability solvers – University of Trento (JSAT)
• Information extraction – IBM (UIMA), Stanford (TAP)
• Information integration/aggregation – USC ISI (Prometheus, Mediator -> Fetch); Rutgers; Stanford (TAP)
• Task processing – SRI International (SPARK)
• Service composition – Stanford, U. of Toronto, UCSF (SDS)
• Semantic matching – University of Trento (S-MATCH)
• Debugging ontologies – U. of Maryland, College Park (SWOOP/Pellet)*
• Problem solving – University of Fortaleza
• Trust networks – U. of Trento (IWTrust), UMD*
Inference Web Contributions
1. Language for encoding hybrid, distributed proof fragments based on web technologies, with support for both formal and informal proofs (information manipulation traces)
2. Support (registry, language, services) for knowledge provenance
3. Declarative inference rule representation for checking hybrid, distributed proofs
4. Multiple strategies for proof abstraction, presentation, and interaction
5. End-to-end trust value computation for answers
6. Comprehensive solution for explainable systems
[Figure: the contributions keyed to the explainable-system architecture: explanation (understanding, trust, interaction, presentation, abstraction) over the Proof Markup Language / inference meta-language layer, with information manipulation data, provenance meta-data, and inference rule specs beneath.]
Status
• Inference Web infrastructure (PML, browser, explainer, registry, toolkit) is being used in government programs such as PAL and NIMD, in commercial research labs (IBM, Boeing, SRI), and in universities (USC, U MD, …)
• Integration and registration process underway with the extraction community
• Useful now for helping decide whether information is trustworthy, comes from authoritative sources, and is consistent and reliable
• Benefits from more meta-data and a larger information population, but is useful incrementally
Technical Status
Some focus areas:
• Follow-up question support
• Trust
• Contradiction support
• Abstraction techniques
• Extraction extensions
• Task-oriented reasoning support
• Query manager explanation support
• Toolkit for embedding
Open issues for explanation:
• Granularity of explanations
• Meta-information filtering
• Abstraction techniques
Requests / suggestions?
More Info
• Inference Web: http://iw.stanford.edu
• OWL: http://www.w3.org/TR/owl-features/ and http://www.w3.org/TR/owl-guide/
• DAML+OIL: http://www.daml.org/
• Chimaera: http://www.ksl.stanford.edu/software/chimaera/
• OWL-QL/DQL: http://www.ksl.stanford.edu/projects/dql/
• UIMA: http://www.research.ibm.com/UIMA/
dlm@ksl.stanford.edu
KSL Wine Agent: Semantic Web Integration (Toy) Example
• Uses emerging web standards to enable a "smart" web application
• Given a meal description (Deborah's Specialty, a crab dish, …)
• Describe matching wines (white, dry, full-bodied, …)
• Retrieve some specific options from the web (Forman Chardonnay from DLM's cellar, ThreeSteps from wine.com, …)
• Explain the description or a specific suggestion
• Info: http://www.ksl.stanford.edu/people/dlm/webont/wineAgent/
KSL Wine Agent: Semantic Web Integration Technology
• OWL: for representing a domain ontology of foods, wines, their properties, and relationships between them
• JTP theorem prover: for deriving appropriate pairings
• Chimaera: ontology diagnostics and ontology merging
• DQL/OWL-QL: for querying a knowledge base
• Inference Web: for explaining and validating answers (descriptions or instances)
• Web Services: for interfacing with vendors
• Connections to online web agents/information services
• Utilities for conducting and caching the above transactions
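The actual agent derives pairings from the OWL food-and-wine ontology using the JTP theorem prover and explains them with Inference Web; the toy Python sketch below only mimics the flow (meal description to wine description to specific options) with hand-written lookup tables. The pairing rule and wine property values are illustrative, loosely based on the example on the previous slide.

```python
# Toy illustration only: the real Wine Agent derives pairings with OWL + JTP,
# not with hand-written rules like these.

# Meal descriptions -> recommended wine descriptors (hypothetical pairing rule).
pairing_rules = {
    "crab dish": {"color": "white", "sugar": "dry", "body": "full"},
}

# Candidate wines retrievable from the web or a cellar (illustrative entries).
wines = [
    {"name": "Forman Chardonnay", "source": "DLM's cellar",
     "color": "white", "sugar": "dry", "body": "full"},
    {"name": "ThreeSteps", "source": "wine.com",
     "color": "white", "sugar": "dry", "body": "full"},
]

def suggest(meal):
    """Return wines whose properties match the descriptors derived for `meal`."""
    wanted = pairing_rules.get(meal, {})
    return [w for w in wines if all(w.get(k) == v for k, v in wanted.items())]

print(suggest("crab dish"))  # both example wines match white / dry / full-bodied
```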
Knowledge Provenance: Multiple Sources
[Figure: an answer derived from multiple sources.]
Chimaera: Trend: Ontology Environment
An interactive web-based tool aimed at supporting:
• Ontology analysis (correctness, completeness, style, …)
• Merging of ontological terms from varied sources
• Maintaining ontologies over time
• Validation of input
Features: multiple I/O languages, loading and merging into multiple namespaces, collaborative distributed environment support, integrated browsing/editing environment, extensible diagnostic rule language
Used in commercial and academic environments; basis of some commercial re-implementations (OntoBuilder/OntoServer, Sandpiper, Cisco MDF, GM, …)
Available as a hosted service from www-ksl-svc.stanford.edu
Information: www.ksl.stanford.edu/software/chimaera
More features required: versioning, diffs, …
FindUR Architecture
[Figure: architecture diagram relating content, domain knowledge, search technology, and user interface.]
• Content to search (web pages or databases, classified using domain knowledge): research site, technical memoranda, calendars (Summit 2005, Research), yellow pages (Westfield directory), newspapers (Leader), internal sites (rapid prototyping), AT&T Solutions, WorldNet customer care
• Domain knowledge: CLASSIC Knowledge Representation System; GUI supporting browsing and selection; collaborative topic-set tool
• Search technology: Verity search engine (and topic sets)
• User interface: Verity SearchScript, JavaScript, HTML, CGI, CLASSIC; results in standard and domain-specific formats
Trend: Semantic Search
Inferences Drawn by Information Extraction
Documents: CIA Report 117, FBI Report 282
Extracted entities and relations:
• AER: (Person "Mr. Ramazi"); AER: (Org "Select Gourmet Foods"); AER: (Person "Abdul Ramazi")
• BER: (Org "Select Gourmet Foods"); BER: (Person "Abdul Ramazi"); BER: (Org "SGF")
• ARR: (hasOwner "Select Gourmet Foods", "Mr. Ramazi")
• BRR: (hasOwner "SGF", "Abdul Ramazi")
Merged results:
• MCR: (equals "Abdul Ramazi", "Mr. Ramazi", AbdulRamazi)
• MCR: (equals "Select Gourmet Foods", "SGF", SelectGourmetFoods)
• MCR: (hasOwner SelectGourmetFoods, AbdulRamazi)
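Purely as an illustration of the merge step above (the real work is done by the extraction and co-reference components being explained), the sketch below shows how the "equals" assertions let relations extracted from different documents be rewritten onto the same canonical individuals; the function name is invented for the example.

```python
# Toy illustration of the merge step shown above: "equals" assertions map
# extracted mentions onto canonical entities, so relations extracted from
# different documents can be rewritten onto the same individuals.

equals = {
    "Abdul Ramazi": "AbdulRamazi",
    "Mr. Ramazi": "AbdulRamazi",
    "Select Gourmet Foods": "SelectGourmetFoods",
    "SGF": "SelectGourmetFoods",
}

extracted_relations = [
    ("hasOwner", "Select Gourmet Foods", "Mr. Ramazi"),   # from one document
    ("hasOwner", "SGF", "Abdul Ramazi"),                  # from another document
]

def merge(relations, mapping):
    """Rewrite mention-level relations onto canonical entities, de-duplicated."""
    return sorted({(p, mapping.get(s, s), mapping.get(o, o)) for p, s, o in relations})

print(merge(extracted_relations, equals))
# -> [('hasOwner', 'SelectGourmetFoods', 'AbdulRamazi')]
```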
Infrastructure: Core IWBase
Statistics for relevant domain-independent meta-data:
• Inference engines: 29
• Axioms: 56
• Declarative rules: 38
• Method rules: 10
• Derived rules: 6
• Languages: 12
Explaining Answers: GUI Explainer
• Users can ask for alternative explanations or summaries
• Users can exit the explainer, providing feedback about their satisfaction with the explanation(s)