Knowledge Base Diagnostics Richard Fikes (Stanford KSL) Adam Pease (Teknowledge) Mala Mehrotra (Pragati Synergetic Research Inc.) Yolanda Gil (USC ISI) Deborah McGuinness (Stanford KSL) 10/18/01
Knowledge Evolution Tools • KB development requires knowledge evolution • Debugging, refining, structuring, modularizing, … • Power tools are needed to support KB evolution • KB diagnosis • Bugs, omissions, heuristic warnings, architectural advice • KB partitioning • To enable effective reasoning • To produce reusable KB building blocks • KB merging • To enable interoperation of KBs with overlapping content • KSL is developing knowledge evolution tools
Chimaera • A Knowledge Evolution Tool Environment • Tools for KB diagnosis and merging • Available as a Web service or an OKBC client • www.ksl.stanford.edu/software/chimaera • Usable from a Web browser • Online user manual, tutorial, and demonstration movie • Performs KB diagnostics in batch mode • Uploads and analyzes user’s KB • Accepts KBs in OKBC, KIF, MELD, RDF, DAML, … • Provides results as HTML pages linked to frames and axioms • Provides user selectable set of diagnostic tests • Analyzes both the structure and content of a KB • Uses reasoners to analyze content
Classification of Diagnostic Results • Errors • Logical inconsistencies E.g., contradictory type constraints • Content structure errors E.g., terms used but not defined • Anomalies • Missing information E.g., type constraints • Redundancies E.g., redundant superclass and type links • Extraneous structure or content E.g., terms defined but not used • Summaries E.g., counts of term references • Suggestions E.g., use consistent naming conventions
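As a rough illustration of the classification above, the following sketch reports terms used but not defined (an error), terms defined but not used (an anomaly), and term reference counts (a summary). The flat data layout (a set of defined names plus a list of term references per axiom) and the function name `term_usage_report` are assumptions made for the example, not Chimaera's actual representation.

```python
from collections import Counter

def term_usage_report(definitions, axioms):
    """Classify term usage in a toy KB.

    definitions: set of defined term names (hypothetical layout).
    axioms: list of lists, each holding the term names one axiom references.
    """
    counts = Counter(term for axiom in axioms for term in axiom)
    used = set(counts)
    return {
        "used_but_not_defined": sorted(used - definitions),  # content structure error
        "defined_but_not_used": sorted(definitions - used),  # extraneous content (anomaly)
        "reference_counts": dict(counts),                    # summary
    }

report = term_usage_report(
    {"Pump", "Valve", "Pipe"},
    [["Pump", "Valve"], ["Pump", "Tank"]],
)
print(report)
```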
“Background” Reasoning Analysis • Reasoning diagnostics that may take substantial time • Performed in background • Results incrementally posted on Web page • Completion notification sent to user via e-mail • Example reasoning diagnostics • Redundant axioms that are inferred by the KB (anomaly) • Inconsistent axioms whose negations are inferred by the KB (error) • Determine which relations in KB are primitive and non-primitive (summary) • Show relations on which each non-primitive relation depends • Determine classes that are disjoint (suggest adding results to KB) • Derive subclass and instance links (suggest adding links to KB) I.e., classification and recognition • Suggest reordering of an implication’s antecedents based on number of inferable instances of each antecedent (suggestion)
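The antecedent reordering suggestion can be sketched as a sort over instance counts. The slide does not say which order is preferred; this example assumes the common heuristic of testing the antecedent with the fewest inferable instances first. The function name and the count table are hypothetical.

```python
def suggest_antecedent_order(antecedents, instance_counts):
    """Return antecedents sorted by ascending number of inferable instances.

    Antecedents with no known count are placed last (assumption).
    """
    return sorted(antecedents, key=lambda a: instance_counts.get(a, float("inf")))

# Hypothetical counts, as a reasoner might report them.
counts = {"Person": 10000, "Employee": 500, "Manager": 20}
order = suggest_antecedent_order(["Person", "Employee", "Manager", "Auditor"], counts)
print(order)
```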
Integration Into SHAKEN • Chimaera is a KB diagnostics tool in the SHAKEN system • Used to diagnose both pump priming and SME KBs • OKBC was used to do the integration • Chimaera is an OKBC client • Interacts with any OKBC server using the OKBC API • The Chimaera Web service uses Ontolingua as its OKBC server • SRI added an OKBC wrapper to the KM system • Enabled KM to be an OKBC server usable by OKBC clients • Enabled Chimaera’s diagnostics to run directly on KM KBs
Chimaera Useful To SRI Team “Overall, we found that Chimaera was quite useful. It found 2 concepts (Indole and Imidazole) that were corrupted, several occurrences of redundant superclasses, and several incorrect domain and range constraints (due to our poor representation of "Information"). … We're currently fixing the bugs it revealed. It would be helpful if we could run Chimera on the component library frequently.” – Bruce Porter
Next Steps: SME-Oriented Support • Provide interactive repair oriented follow-up to diagnostics • Identify KB content on which diagnosis result is based • Suggest repairs or repair strategies • Guide user through repair procedure • Examples • Class is a direct subclass of “THING” • Provide direct subclasses of THING as candidate superclasses • Step down through the class hierarchy • Class has redundant superclass links • Suggest removal of link(s) to most general classes • Type, cardinality, or bounds conflict • Suggest changing local conflicting constraint(s) • Missing information • Initiate acquisition dialogues for missing information
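The redundant superclass case can be sketched as follows: a direct link from a class to a superclass is flagged when that superclass is also reachable through another direct superclass, so the more general link is the one suggested for removal. The dictionary layout and function names are assumptions for illustration only.

```python
def ancestors(cls, direct_supers):
    """All classes reachable from cls by following superclass links."""
    seen = set()
    stack = list(direct_supers.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:  # guards against cycles in the hierarchy
            seen.add(current)
            stack.extend(direct_supers.get(current, ()))
    return seen

def redundant_superclass_links(direct_supers):
    """Return (class, superclass) links implied by another direct superclass."""
    redundant = []
    for cls, supers in direct_supers.items():
        for sup in supers:
            if any(other != sup and sup in ancestors(other, direct_supers)
                   for other in supers):
                redundant.append((cls, sup))
    return redundant

hierarchy = {
    "Pump": ["Device", "Thing"],  # the link to Thing is implied via Device
    "Device": ["Artifact"],
    "Artifact": ["Thing"],
}
links = redundant_superclass_links(hierarchy)
print(links)
```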
Next Steps: Architectural Analysis • Summarize architectural features of a KB • Percentage of • Relations that are functions • Axioms that are propositional, first order, higher order • Axioms that are not Horn clauses • Distribution of • Axioms by type (using the HPKB, RKF types) • Axiom lengths by number of literals • Functions by number of arguments • Relations by number of arguments • Direct subclasses per class • Direct subproperties per property • Restrictions per object • Property values per object
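Two of these measures, the distribution of axiom lengths and the percentage of axioms that are not Horn clauses, can be sketched on a toy clause representation. The example assumes each axiom is already in clausal form as a list of literal strings, with a leading "-" marking a negative literal; that encoding is an assumption of the example, not a format the slides specify.

```python
from collections import Counter

def is_horn(clause):
    """A Horn clause has at most one positive literal."""
    return sum(1 for lit in clause if not lit.startswith("-")) <= 1

def architecture_summary(clauses):
    """Summarize axiom lengths and the share of non-Horn axioms."""
    lengths = Counter(len(clause) for clause in clauses)
    non_horn = sum(1 for clause in clauses if not is_horn(clause))
    percent = 100.0 * non_horn / len(clauses) if clauses else 0.0
    return {
        "length_distribution": dict(sorted(lengths.items())),
        "percent_non_horn": percent,
    }

clauses = [["-a", "b"], ["-a", "-b", "c"], ["a", "b"], ["-c"]]
summary = architecture_summary(clauses)
print(summary)
```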
Next Steps: Partitioning and Beyond • Integration of KB partitioning tools into Chimaera • Provide automatic KB partitioning to enhance usability • Automatic running of test cases E.g., queries and expected answers • Support regression testing of evolving KB • Provide result summaries from failed tests • Help with typographical errors • Spelling correction for undefined names E.g., classes, slots, relations, functions, constants • Spelling correction for anomalously occurring variables • Suggest that the variable is the same as another variable in the sentence
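A minimal sketch of the test-case idea, assuming a test case is a (query, expected answer) pair and the KB is reachable through some `ask` function. Both the pair layout and `ask` are hypothetical stand-ins; the slides do not describe the actual interface.

```python
def run_regression(ask, test_cases):
    """Run (query, expected) pairs against ask and summarize the failures."""
    failures = []
    for query, expected in test_cases:
        actual = ask(query)
        if actual != expected:
            failures.append({"query": query, "expected": expected, "actual": actual})
    return {"total": len(test_cases), "failed": len(failures), "failures": failures}

# Toy stand-in for a KB: a table of facts known to hold.
facts = {"(subclass-of Pump Device)": True}

def ask(query):
    return facts.get(query, False)

result = run_regression(ask, [
    ("(subclass-of Pump Device)", True),
    ("(subclass-of Valve Device)", True),  # fails: not in the toy KB
])
print(result)
```

Rerunning the same cases after each KB change gives a simple regression check, and the `failures` list is the raw material for a result summary.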
Summary • KSL is developing Chimaera to support KB evolution • Chimaera was integrated into the SHAKEN Y1 system Using OKBC(!) • Incrementally adding diagnostics E.g., “background” diagnostics that use sophisticated reasoning • Next steps • KB partitioning tools • Repair dialogues for SMEs • KB architectural analysis • Regression testing
Role of Diagnostics in Systems • KE support • SME support • Increase productivity (“lightly trained”) • Step in managing KB development • Focus attention (e.g., redundant links) • Evaluation support • Diagnose KBs produced during evaluation • Batch mode • Foreground • Background • Changes in “patterns” in the KB between versions
Sharing Diagnostics Information • Diagnostic specifications • Logical specifications • English specifications • Test cases • Diagnostic classifications • Learnings • Tricks of the trade • Sharing facilitators: • Working group • Mailing list • Findings data • Author, group, or team specific • Repair strategies • Alignments during collaborative development
Developer Needs and Desires • Reasoner-specific diagnostics • Highly informative diagnostic results • Reporting architectural bias in a KB • Binary versus higher order relations • First order versus higher order axioms • Weakly versus strongly higher order • Disjunctions or conjunctions • Existential versus universal quantifiers • Frames to axioms ratios • Horn clauses • Axiom lengths • Functions • Confusion of existential and universal quantifiers • Type restrictions too general • Misspelling of variables
Developer Needs and Desires • Domain-specific tests • Semantic tests • Maintainability measures • Recognizing typographical errors • Spell check undefined or unused terms • Redefining (e.g., breaking up) a predicate • Large scale modification techniques • Prioritizing diagnostics
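Spell checking undefined terms can be approximated with Python's standard `difflib.get_close_matches`, which ranks candidates by string similarity. This is a sketch of the idea only; the cutoff of 0.8 and the sample names are assumptions, and nothing here reflects how any of the tools mentioned actually implement it.

```python
import difflib

def suggest_spellings(undefined_names, defined_names, cutoff=0.8):
    """Map each undefined name to similar defined names, if any exist."""
    suggestions = {}
    for name in undefined_names:
        matches = difflib.get_close_matches(name, defined_names, n=3, cutoff=cutoff)
        if matches:
            suggestions[name] = matches
    return suggestions

defined = ["Imidazole", "Indole", "Information"]
suggestions = suggest_spellings(["Imidazol", "Pump"], defined)
print(suggestions)
```

Names with no sufficiently similar defined term (here "Pump") get no suggestion and would remain reported as undefined.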
Integration Issues • Architecture • Use hosted services (like KSL) • Integrate special code • Take specifications from library • API • Interaction Mode - Batch versus Interactive/Repair • Translation issues • One major use of diagnostics is also in testing translators • Certain translations need to be done to do better analysis • Output integration
Evaluation • Record types and numbers of errors • Comparing KBs produced by SMEs versus KEs • Record use of repair strategies • Evaluate during testing • Feedback from SMEs about diagnostics
Classification of Diagnostic Results • Errors • Logical inconsistencies • Content structure errors • (See Randy Davis thesis) • Anomalies • Missing information • Missing portions of descriptions • Redundancies • Extraneous structure or content • Summaries • Architectural biases • Suggestions • Stylistic suggestions • Static versus operational tests • Use of expertise about KR paradigms
Diagnostic Issues/Goals • Role of Diagnostics in Systems • KE support, SME support • Evaluators of KBs • How to Share Diagnostics • Working Group? • Logical specification, English descriptions, tests, … • Know the Main Contributors • Possible Diagnostics • What do users want? • What can tool builders provide? • Integration Issues • Developer Needs/Desires • Evaluation
The Role of KB Diagnostics • KE support • SME support • Increase productivity (“lightly trained”) • Management of KB • Inference dependent quality improvement • Focus attention (e.g., redundant links) • Evaluation support • Abstract patterns – average fanout of specialization, statistics of number of uses of a predicate – big picture view • Version comparison • Regression testing
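The "abstract patterns" bullet can be illustrated with two small statistics. The sketch below assumes that average fanout means the mean number of direct subclasses over classes that have at least one subclass, and that predicate uses are counted per axiom occurrence; both definitions are assumptions, since the slide gives none.

```python
from collections import Counter
from statistics import mean

def average_fanout(direct_subclasses):
    """Mean number of direct subclasses over classes that have any."""
    fanouts = [len(subs) for subs in direct_subclasses.values() if subs]
    return mean(fanouts) if fanouts else 0.0

def predicate_use_counts(axioms):
    """Count how often each predicate heads a literal; axioms are lists of (predicate, args)."""
    return Counter(pred for axiom in axioms for pred, _args in axiom)

subclasses = {"Thing": ["Device", "Substance"], "Device": ["Pump", "Valve", "Pipe"], "Pump": []}
fanout = average_fanout(subclasses)
uses = predicate_use_counts([
    [("connects", ("Pump", "Pipe"))],
    [("connects", ("Pipe", "Valve")), ("contains", ("Pipe", "Fluid"))],
])
print(fanout, dict(uses))
```

Comparing these numbers across KB versions is one simple way to spot changes in structure between releases.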
Diagnostic Sharing • Diagnostic specifications • Logical specifications • English specifications • Test cases • Diagnostic classifications • Taxonomy of errors – bottlenecks, … • Quantification • Alignments across systems – inconsistencies among SMEs • Repair strategies • How informative a system is (core dump vs. useful explanation) • Learnings • Tricks of the trade • Sharing facilitators: • Working group • Mailing list
Sharing facilities • Working group • Mailing list • Posting of papers • Utilize Teknowledge
Biases • Binary vs. higher arity • First order vs. higher order • Weakly vs. strongly higher order • Universal over existential • Disjunction vs. conjunction • Frame-ism • Horn clauses • Lisp style • Relations -> functions • Depth vs. breadth in hierarchy • … • Maybe report in summarizations • At least document biases
Organizations/People • Cycorp – many special purpose – Kahlert • ISI – Why Not? – Chalupsky; KANAL – Gil; EXPECT – Gil • Pragati – Clustering – Mehrotra • Stanford FRG/KSL – Partitioning – McCarthy, Amir, McIlraith • Stanford KSL – Chimaera – Fikes, McGuinness
Diagnostics • Errors – provable logical inconsistencies • Anomalies – redundancies, cycles, … • Summaries – word counts, … • Suggestions – naming conventions • Incompletenesses – explicit salient assertions or statistics • Stylistics – length of rule, …, bad factoring • Randy Davis – errors – incompleteness, inconsistency • Get this: top ten list of things people do wrong in Cyc – Goolsbey • Perspectives/units: frame-like content vs. axioms vs. problem solving technology vs. learning to correct components
Style • Static • Reasoner • Simulation / execution • Using examples • Summarization/improvements/critiquer
Integration Issues • Architecture • Use hosted services (like KSL) • Integrate special code • Take specifications from library • API • Interaction Mode – Batch vs. Interactive/Repair • Translation issues • One major use of diagnostics is also in testing translators • Certain translations need to be done to do better analysis • Background ontologies – MELD starter ontology • Output integration
Developer Needs/Desires • Missing existentials • Too high a type specification • Variable name mismatch • Semantic requests: wrong semantic paradigm? • Typos • Spell check • Large scale modification tools and their integration • Example: removing or fixing the top level • Prioritizing • Diagnostics to minimize cost, ease maintenance
Evaluation • Record types of errors • Fine granularity • KB differences between SME-developed and KE-developed ontologies across teams • Record use of repair strategies… • Evaluate during testing… • Feedback from SMEs on features, usefulness, etc. • Attempt to keep extremely complete audit trails for future analysis • Important to be careful with diagnostic reporting
Action Items • Working Group • Diagnostics repository • Web site • Follow up briefing • Mailing list
Chimaera • A Knowledge Evolution Environment • Tools for KB diagnosis and merging • Available as a Web service • www-ksl-svc.stanford.edu • www.ksl.stanford.edu/software/chimaera • Usable from a Web browser • Online user manual, tutorial, and demonstration movie • Provides user selectable set of diagnostic tests • Performs KB diagnostics in batch mode • Uploads and analyzes user’s KB • Accepts KBs in MELD, KIF, OKBC, DAML, RDF, XML, … • Provides results as HTML pages linked to frames and axioms • Analyzes both the structure and content of a KB • Uses hybrid reasoners to analyze content • Currently runs 28 diagnostic tests
Collection/Specification • Logical specification of diagnostic • English specification • Example KB that triggers diagnostic output
Classification of Diagnostic Results II • Axiom Analysis • Axiom Syntax Problems E.g., an implication with no consequent • Axiom Redundancy E.g., given (1) A => B, (2) A => C, (3) C => B, axiom 1 is redundant • Axiom Variable Usage E.g., variable used in antecedent but not in consequent • Axiom Consistency E.g., A => not A • Axiom Tautology E.g., consequent repeats (a portion of) the antecedent
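The axiom redundancy example can be checked mechanically for the simple case of implications between atomic propositions: an implication is redundant if its consequent is still derivable from its antecedent by chaining the remaining implications. The pair representation and function names below are assumptions for the sketch; real axioms with variables and quantifiers need a full reasoner.

```python
def derivable(start, goal, implications):
    """True if goal follows from start by chaining the given implications."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for antecedent, consequent in implications:
            if antecedent == node and consequent not in seen:
                if consequent == goal:
                    return True
                seen.add(consequent)
                stack.append(consequent)
    return False

def redundant_implications(axioms):
    """Return 1-based numbers of axioms implied by the other axioms."""
    redundant = []
    for i, (antecedent, consequent) in enumerate(axioms):
        others = axioms[:i] + axioms[i + 1:]
        if derivable(antecedent, consequent, others):
            redundant.append(i + 1)
    return redundant

# The slide's example: 1. A => B, 2. A => C, 3. C => B
axioms = [("A", "B"), ("A", "C"), ("C", "B")]
redundant = redundant_implications(axioms)
print(redundant)
```

Each axiom is tested against all the others independently, so two identical axioms would both be reported; a repair step would then remove only one of them.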