ONTACWG: Coordinating Knowledge Classifications. Patrick Cassidy MITRE Corporation* Presented at the ONTACWG Organization Meeting October 5, 2005 McLean, Virginia
* NOTE: The author’s affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended to convey or imply MITRE’s concurrence with, or support for, the positions, opinions or viewpoints expressed by the author.
ONTACWG: Ontology and Taxonomy Coordinating Working Group. A working group of the Semantic Interoperability Community of Practice (SICoP). (1) To assist in the development and cross-referencing of Knowledge Classification Systems (KCSs: ontologies, taxonomies, thesauri, graphical knowledge representations) by maintaining on-line resources where such efforts can share data, utilities to help create such resources, and pilot programs that demonstrate how to use such knowledge classifications for practical purposes. (2) To adopt and extend, as a community, a higher-level ontology that can serve as the “defining conceptual vocabulary” adequate to specify the meanings of the terms used within all of the participating communities, and relate the community terms to each other precisely.
Where Are We? • Many Taxonomies and Ontologies • Few Mappings of One to the Other • No Agreed Standard of Meaning
Where Do We Want To Go? • Powerful Search • Semantic Interoperability • Automatic Knowledge Extraction
How Do We Get There? • Create an Agreed Standard of Meaning: a Common Semantic Model • Use an Existing Upper Ontology or adapt one for our own use • Define (map) terms in Existing Taxonomies and Ontologies by use of Common Defining Concepts
Why Is a Top-Level Ontology Needed? • To support semantic interoperability by serving as a Common Semantic Model, functioning as a common defining vocabulary, allowing systems developed in different locations to share their definitions and reason with each other’s data • To provide a well-tested inventory of basic concepts that can be combined to specify the meaning of domain-specific concepts in a form suitable for reasoning
What Does it Mean to “Specify the meaning of a term”?
• “The biological mother of a person is a woman who has given birth to that person”
• {{?Mother isTheBiologicalMotherOf ?Child} impliesThat
    (ThereExists {((exactly one) ?Event) and ((exactly one) ?Date)} suchThat
      {{?Event isa BirthEvent} and
       {?Event occurredOn ?Date} and
       {?Mother is (The Mother in ?Event)} and
       {?Child is (The Baby in ?Event)} and
       {(The BirthDate of ?Child) is ?Date}})}
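The rule above can be read as a condition that a small fact base either satisfies or does not. The following is a minimal sketch of that reading in Python, not the author's formalism: the `BirthEvent` record, the `is_biological_mother` function, and the sample names and dates are all hypothetical, and the function only checks the consequent of the rule (exactly one matching birth event whose date equals the child's birth date).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BirthEvent:
    """Hypothetical record standing in for an instance of BirthEvent."""
    mother: str
    baby: str
    date: str  # ISO date string, e.g. "1990-04-02"


def is_biological_mother(mother, child, events, birth_dates):
    """Check the consequent of the rule for one (mother, child) pair.

    True only if exactly one birth event has this mother and this baby,
    and the date of that event equals the child's recorded birth date.
    """
    matching = [e for e in events if e.mother == mother and e.baby == child]
    if len(matching) != 1:
        return False
    return birth_dates.get(child) == matching[0].date


# Hypothetical sample data, for illustration only.
events = [BirthEvent(mother="Alice", baby="Bob", date="1990-04-02")]
birth_dates = {"Bob": "1990-04-02"}

print(is_biological_mother("Alice", "Bob", events, birth_dates))
print(is_biological_mother("Carol", "Bob", events, birth_dates))
```

A real reasoner would work from the logical form directly; the point here is only that each clause of the definition corresponds to a checkable condition.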
The Integrating Function of the Common Semantic Model [Diagram: GenericObligation linked by SameAs to Obligation and by SameAs to Duty]
The Integrating Function of the Common Semantic Model, via Domain-level Mapping [Diagram labels: GenericObligation, SameAs, Obligation, SameAs, Duty]
Taxonomy Mapping for Search • When a category in one taxonomy can be identified with a category in another taxonomy, the documents associated with each node are relevant to the other • When documents indexed by another taxonomy are not of interest to a local community, they can nevertheless be used to train an associative document classifier, which can find the documents in the community document collection that are relevant to that topic
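The first point above, that identified categories make each other's documents relevant, can be sketched as a simple lookup. This is an illustrative sketch only: the two indexes, the document ids, and the `same_as` pairs are hypothetical, and only the Obligation/Duty pairing echoes the earlier slide.

```python
# Hypothetical document indexes: category name -> document ids.
taxonomy_a = {"Obligation": ["a1", "a2"], "Contract": ["a3"]}
taxonomy_b = {"Duty": ["b1"], "Agreement": ["b2", "b3"]}

# Hypothetical identifications between categories of the two taxonomies.
same_as = [("Obligation", "Duty"), ("Contract", "Agreement")]


def documents_for(category):
    """Return documents indexed under the category or under any
    category that has been identified with it."""
    equivalents = {category}
    for left, right in same_as:
        if left == category:
            equivalents.add(right)
        if right == category:
            equivalents.add(left)
    docs = []
    for name in sorted(equivalents):
        docs.extend(taxonomy_a.get(name, []))
        docs.extend(taxonomy_b.get(name, []))
    return docs


print(documents_for("Duty"))
```

A search on "Duty" in one community then also returns the documents filed under "Obligation" in the other.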
ONTACWG for Search • ONTACWG might maintain, for each topic within a community KCS: • a set of sample documents that can be used to classify a local document collection by associative document-matching techniques • one or more sample queries that are known to find pages on the www relevant to the topic (possibly different for each search engine) • a list of www pages relevant to the topic
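One way sample documents could drive classification of a local collection is plain bag-of-words similarity. The sketch below is an assumption about what "associative document-matching" might look like in its simplest form, not a description of any ONTACWG tool; the topics, sample texts, and function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(document, topic_samples):
    """Return the topic whose sample documents are, on average,
    most similar to the given document."""
    doc_vec = vectorize(document)
    scores = {}
    for topic, samples in topic_samples.items():
        total = sum(cosine(doc_vec, vectorize(s)) for s in samples)
        scores[topic] = total / len(samples)
    return max(scores, key=scores.get)


# Hypothetical sample documents maintained per topic.
topic_samples = {
    "Birds": ["avocets nest near shallow water",
              "birdwatching guide to wading birds"],
    "Aircraft": ["jet airplane engine maintenance",
                 "airplane wing design"],
}

print(classify("a guide to nesting avocets and other birds", topic_samples))
```

A production classifier would add stemming, stop-word handling, and term weighting; the sketch only shows how a shared set of sample documents can label a collection that was never indexed by the taxonomy.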
Taxonomy Mapping For Interoperability • Communities build and maintain their own terminologies and KCSs, using them in any way they wish for their own community purposes • When community members want their semantic information to interoperate with other domain knowledge, where logical inference is needed, they can use the mappings to the Common Semantic Model
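The mapping to a Common Semantic Model acts as a hub: each community maps its own terms once, and translation between any two communities goes through the shared concept. A minimal sketch follows, assuming a hypothetical mapping table; the community names "legal" and "military" are invented for illustration, while GenericObligation, Obligation, and Duty come from the earlier diagram.

```python
# Hypothetical mappings: (community, community term) -> common concept.
to_common = {
    ("legal", "Obligation"): "GenericObligation",
    ("military", "Duty"): "GenericObligation",
}


def translate(term, source, target):
    """Translate a term from the source community to the target
    community by way of the shared common concept.

    Returns None when either side has no mapping."""
    concept = to_common.get((source, term))
    if concept is None:
        return None
    for (community, candidate), mapped in to_common.items():
        if community == target and mapped == concept:
            return candidate
    return None


print(translate("Obligation", "legal", "military"))
```

With N communities this needs N mappings to the hub rather than a mapping for every pair of communities.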
Taxonomy Mapping for Natural Language Understanding • Language understanding requires recognition of the context in which linguistic statements are made • Maintaining a large public set of documents or document fragments illustrating particular topics can help natural language programs to recognize known textual contexts
The Long-Term Goal: Semantic Interoperability. The ability of computers to accurately communicate conceptual information; to correctly interpret the meanings of communicated information and make appropriate decisions. The means: adopting or building a common conceptual language for computers, which can be used to specify and relate the meanings of terms in any community terminology.
What A Common Semantic Model Is: A means to allow computers to accurately communicate conceptual information (in effect, a common language for computers), for use when the users want to communicate
What A Common Semantic Model Isn’t • A controlled vocabulary: Each community can choose its own words to refer to concepts • A mandated standard: Users can use any common ontology or none, as their own needs dictate
Communities and Controlled Vocabularies • Whenever a community of interest or community of practice is sufficiently homogeneous to agree on a controlled vocabulary, that vocabulary can serve as a linguistic signature of a particular context, which will be helpful in machine interpretation of text documents. • i.e., multiple controlled vocabularies are good things. The Common Semantic Model can specify the relations between terms in community vocabularies.
Concepts vs. Words
[Diagram: a mathematical theory drawn as a graph of nodes, shown alongside the ontological theory and its terminology]
Axioms: • (Every Cat has (( 4) Legs)) • (Every House has ((atLeast 1) Door))
Ontological Theory (concepts): Cat, House, Siamese
Terminology for House: “House”, “Residential House”, “Haus”, “maison”, “дом”
Terminology for Siamese: “Siamese”, “Siamese feline”, “Siamese Cat”, “chat siamois”, “Siamesische Katze”, シャム猫
Intentionally Ambiguous Word Use: Not at issue in formal classification • Metaphor • Poetry • Double entendre • Rhetoric • “Jack went fishing last weekend and caught three trout and a cold.” Categorical ambiguity can be represented as a union of categories.
Who Needs a Common Semantic Model? • Any computer system that needs to accurately communicate conceptual information needs a language in common with the receiving system. “Money is being spent on labs and hiring smart people who make products do unnatural acts together.” (Alan Shockley, manager of Enterprise Information Technology at EDS) The estimated cost of the lack of data interoperability nationwide is over $100B/yr.
Will Any Upper Ontology Serve? Lenat’s Dictum (Building Large Knowledge-Based Systems, 1990, p. 20): • Do the top layers of the global ontology correctly • Relate all the rest of human knowledge to those top layers
Will Any Upper Ontology Serve?
Publicly Available Upper Ontologies: • OpenCyc • SUMO • DOLCE • Omega (SENSUS) • OCHRE • BFO • WordNet (?) • MSO
Comparison of Upper Ontologies: http://www.mitre.org/work/tech_papers/tech_papers_04/04_0603/04_1175.pdf
European Initiative: WonderWeb
New American Initiative: NCOR
A Merged Upper Ontology: One Possible Method • Merge the compatible elements of Cyc, Omega, SUMO, MidLevel, and DOLCE, add Other concepts as desired by participants, and map the result to WordNet: • => COSMO • COmmon Semantic MOdel • or Cyc, Omega, Sumo, Midlevel, Other
COSMO: The Common Semantic Model • We need an inventory of logically defined higher-level concepts adequate to specify the meanings of the terms and concepts in all domain Knowledge Classification Systems used by participants. • Structured as a set of precisely interrelated ontologies without duplicated concepts and with a set of logically consistent default core concepts
How Many Defining Concepts? Clues: • LDOCE uses controlled defining vocabulary of ~ 2000 words • Japanese students learn ~1850 kanji • AMESLAN dictionary has ~5000 signs
When Do We Need a New Primitive Defining Concept? • If any of the content words in the natural-language definition have no corresponding concepts in the existing COSMO • If it is necessary to use a “disjoint” relation to distinguish a new concept from others in the ontology
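The first criterion above suggests a mechanical first pass: list the content words of a natural-language definition that have no concept in the current inventory. The sketch below assumes a hypothetical word-to-concept lexicon and a tiny hand-written stop-word list; it does no lemmatization or word-sense disambiguation, so it only flags candidates for a human to review.

```python
# Small hand-written list of function words to ignore (an assumption,
# not a standard list).
STOP_WORDS = {"a", "an", "the", "of", "who", "has", "is", "to", "that", "in"}


def missing_concepts(definition, lexicon):
    """Return the content words of the definition that have no
    corresponding concept in the lexicon (word -> concept name)."""
    words = [w.strip(".,;").lower() for w in definition.split()]
    return [w for w in words if w and w not in STOP_WORDS and w not in lexicon]


# Hypothetical fragment of a word-to-concept lexicon.
lexicon = {"woman": "Woman", "given": "Give", "person": "Person"}

definition = "a woman who has given birth to that person"
print(missing_concepts(definition, lexicon))
```

Here "birth" has no entry, so it is reported as a word that may call for a new defining concept.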
Mid-Level Ontologies and Extensions • In addition to the primitive defining concepts, communication and mapping will be enhanced by maintaining mid-level ontologies and cross-domain ontology extensions in which the concepts are specified in terms of the defining conceptual vocabulary. • This reduces the chances that meanings intended to be identical will inadvertently differ due to differences in non-essential attributes
Requirements • Tools to make the COSMO easy to understand and easy to use • Tools to view and extract only those concepts of interest for a particular application • Pilot and Demonstration applications that illustrate the benefits of using the COSMO
TOOLS • KCS Building and Maintenance Tools • Protege http://protege.stanford.edu/ • UML http://www.uml.org/ • Concept Maps http://cmap.ihmc.us/ • Representation Formalisms • KIF/SKIF/ESKIF/Conceptual Graphs/SCL • OWL • OWL extensions (SWRL, RuleML, OWL-Flight, ?)
CONTROLLED ENGLISH • ClearTalk (Skuce, 1996) http://www.csi.uottawa.ca/~kavanagh/Ikarus/Cleartalk.html • Effective NL Paraphrasing of Ontologies on the Semantic Web http://www.mindswap.org/papers/nlpowl.pdf • Sowa’s “Common Logic Controlled English” http://www.jfsowa.com/clce/specs.htm • ESKIF (developmental)
Example of Problem without a COSMO
Class: Wine
<owl:Class rdf:ID="Wine">
  <rdfs:subClassOf rdf:resource="&food;PotableLiquid" />
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#locatedIn"/>
      <owl:someValuesFrom rdf:resource="&vin;Region"/>
    </owl:Restriction>
  </rdfs:subClassOf>
  <rdfs:label xml:lang="en">wine</rdfs:label>
</owl:Class>
ObjectProperty: locatedIn
From wine.rdf: http://www.w3.org/2001/sw/WebOnt/guide-src/wine.rdf
<owl:ObjectProperty rdf:ID="locatedIn">
  <rdf:type rdf:resource="&owl;TransitiveProperty" />
  <rdfs:domain rdf:resource="http://www.w3.org/2002/07/owl#Thing" />
  <rdfs:range rdf:resource="#Region" />
</owl:ObjectProperty>
<Region rdf:ID="MedocRegion">
  <locatedIn rdf:resource="#BordeauxRegion" />
</Region>
<Region rdf:ID="BordeauxRegion">
  <locatedIn rdf:resource="#FrenchRegion" />
</Region>
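Because locatedIn is declared a TransitiveProperty, a reasoner can conclude that MedocRegion is located in FrenchRegion even though that fact is never stated. The sketch below illustrates the inference in Python over the two asserted facts; the dictionary encoding and the `all_locations` helper are hypothetical, and the sketch assumes each region has at most one directly asserted container.

```python
# Asserted locatedIn facts from the wine.rdf excerpt: region -> container.
located_in = {
    "MedocRegion": "BordeauxRegion",
    "BordeauxRegion": "FrenchRegion",
}


def all_locations(region):
    """Follow locatedIn transitively and return every containing region,
    nearest first. The 'seen' set guards against cyclic data."""
    result = []
    seen = {region}
    current = located_in.get(region)
    while current is not None and current not in seen:
        result.append(current)
        seen.add(current)
        current = located_in.get(current)
    return result


print(all_locations("MedocRegion"))
```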
Medoc (Wine)
<owl:Class rdf:ID="Medoc">
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasColor" />
      <owl:hasValue rdf:resource="#Red" />
    </owl:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasSugar" />
      <owl:hasValue rdf:resource="#Dry" />
    </owl:Restriction>
  </rdfs:subClassOf>
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#Bordeaux" />
    <owl:Restriction>
      <owl:onProperty rdf:resource="#locatedIn" />
      <owl:hasValue rdf:resource="#MedocRegion" />
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>
<owl:Class rdf:ID="Medoc">
  <owl:equivalentClass>
    <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Bordeaux" />
        <owl:Restriction>
          <owl:onProperty rdf:resource="#locatedIn" />
          <owl:hasValue rdf:resource="#MedocRegion" />
        </owl:Restriction>
      </owl:intersectionOf>
    </owl:Class>
  </owl:equivalentClass>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasColor" />
      <owl:hasValue rdf:resource="#Red" />
    </owl:Restriction>
  </rdfs:subClassOf>
  <rdfs:subClassOf>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasSugar" />
      <owl:hasValue rdf:resource="#Dry" />
    </owl:Restriction>
  </rdfs:subClassOf>
</owl:Class>
Fig. 1. OWL Class ‘Medoc’ in the Wine Ontology Serialized in RDF/XML
‘Medoc is a sweet, red color wine located in the Medoc region.’
ESKIF Version
• {{Medoc isaTypeOf Wine} and (Every Medoc is {Dry and RedColored and (ProducedIn (the MedocRegion))})}
SKIF:
• (isaSubclassOf Medoc Wine)
• (necessarily Medoc hasAttribute Dry)
• (necessarily Medoc hasAttribute RedColored)
• (necessarily Medoc hasAttribute (ProducedIn MedocRegion))
ESKIF • Like SKIF, but statements in braces have first two arguments inverted • {ColonelMustard killed MissScarlet} ≡ (killed ColonelMustard MissScarlet) {{ColonelMustard killed MissScarlet}, (in (the Conservatory)) (with (A Knife))} {(The Person named “Albert Einstein”) proposed (The Theory called “The Theory of Relativity”)}
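The brace rule stated above (the first two arguments are inverted) can be illustrated with a small converter. This is a sketch under a strong simplifying assumption: it handles only flat statements whose arguments are single whitespace-free tokens, not nested terms or the modifier forms shown in the longer examples. The function name is hypothetical.

```python
def eskif_to_skif(statement):
    """Convert a flat ESKIF statement '{Subject relation Object ...}'
    to SKIF prefix form '(relation Subject Object ...)' by swapping
    the first two arguments. Nested terms are not supported."""
    text = statement.strip()
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("expected a statement enclosed in braces")
    parts = text[1:-1].split()
    if len(parts) < 2:
        raise ValueError("expected at least two arguments")
    parts[0], parts[1] = parts[1], parts[0]
    return "(" + " ".join(parts) + ")"


print(eskif_to_skif("{ColonelMustard killed MissScarlet}"))
```

The output matches the equivalence given on the slide: (killed ColonelMustard MissScarlet).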
Basic Components of An Ontology • Hierarchy of Classes • Semantic Relations (slots/associations) • Instances • Functions • Axioms • Procedural Methods
Handling Different Perspectives • It is widely recognized that different communities are interested in different aspects of the same entities • These can be represented in a logically consistent manner by allowing dynamic creation of classes with only some of the known attributes and relations of the physically realistic class • This corresponds to the use of anonymous classes in an OWL restriction
Different Interests How big is the diamond? How much does it cost?
Flexible View Creation [Diagram labels: Entity, SelectiveView, DetailedEntity, isaSubViewOf, PricedObject, DiamondRing]
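The idea of a selective view, a class exposing only some of the attributes of the detailed entity, can be sketched as a projection. The attribute names and values below are hypothetical; only the DiamondRing and PricedObject labels come from the diagram, and `make_view` is an invented helper rather than part of any ontology tool.

```python
def make_view(entity, attributes):
    """Return a view of the entity restricted to the named attributes.

    Raises KeyError if a requested attribute is not known for the
    entity, so a view can never assert more than the detailed entity."""
    missing = [name for name in attributes if name not in entity]
    if missing:
        raise KeyError("unknown attributes: " + ", ".join(missing))
    return {name: entity[name] for name in attributes}


# Hypothetical detailed entity with all known attributes.
diamond_ring = {"carats": 1.5, "price_usd": 4000, "metal": "gold"}

# One community asks "How big is the diamond?", another "How much does it cost?"
sized_object = make_view(diamond_ring, ["carats"])
priced_object = make_view(diamond_ring, ["price_usd"])

print(sized_object, priced_object)
```

Because each view is derived from the same detailed entity, the two perspectives cannot contradict each other.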
Topic Taxonomies vs. Inheritance Taxonomies • Topic classifications (as in library systems) may be thesauri, or conceptually “part-of” classifications, in which a more general topic has multiple included topics. e.g. “Birds” might include “ornithologist” or “birdwatching” as well as “avocets” • Useful for browsing search in document collections
Topic Taxonomies vs. Inheritance Taxonomies • Inheritance taxonomies are organized by the “is a subclass of” relation, in which the instances of the subclass have all of the characteristics of the parent class, plus others. • e.g. “airplane” would subsume “jet airplane” • useful for logical inference on formal knowledge
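The inheritance property described above (a subclass has all the characteristics of its parent, plus others) can be sketched as attribute lookup along the subclass chain. The class table and attribute names below are hypothetical; only the airplane and jet airplane pairing comes from the slide.

```python
# Hypothetical class table: name -> parent class and locally stated attributes.
classes = {
    "Airplane": {"parent": None, "attributes": {"has_wings": True}},
    "JetAirplane": {"parent": "Airplane", "attributes": {"engine": "jet"}},
}


def inherited_attributes(name):
    """Collect the attributes of a class together with everything
    inherited from its ancestors; nearer classes override farther ones."""
    chain = []
    while name is not None:
        chain.append(name)
        name = classes[name]["parent"]
    merged = {}
    for cls in reversed(chain):  # start at the root, end at the class itself
        merged.update(classes[cls]["attributes"])
    return merged


print(inherited_attributes("JetAirplane"))
```

A topic taxonomy offers no such guarantee: "ornithologist" filed under "Birds" does not inherit the characteristics of birds.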
DOLCE relations on description: 3 130 total relations, most inherited Most applications will use only a small fraction Most applications will use only the SALIENT relations
Needed: Tools to Provide Community Views of Knowledge Held in Common • An integrated comprehensive knowledge base is not necessarily inconsistent with local community views • But tools are needed to make extracting, viewing, and modifying such views easy
Knowledge: Search and Deploy
[Diagram labels: Community Goals; Action; Retrieved Knowledge; Comprehensive Formalized Knowledge; Document Collection; Automated Reasoning; Community Knowledge Needs; Keyword Search; Graphic Search; Browsing Search; Human Friendly; Machine Friendly; Community Graphical View; TS; Community Taxonomy or Thesaurus. Edge labels: enables, dictate, provides, provides.]
Registries for KOSs • A registry will provide information to allow the public to determine whether a KOS is suitable for their purposes – metadata about the KOS. • A registry that can describe the relations between KOS systems (dependency, similarity) requires special types of metadata.
Special KOS registry requirement • In order to be reusable outside the originating community, a KOS should have information specifying whether the meanings of its terms depend on any other KOS, or are related to terms in any other KOS. • In the event that an upper ontology is used to specify meanings in a KOS, that needs to be explicitly represented. • If an ontology is intended to be independent and self-describing, that needs to be specified.
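The registry requirements above amount to a few metadata fields that every entry should fill in one way or another. The sketch below is one possible shape for such a record, not a proposed standard: the `KosRecord` fields, the `check_record` helper, and the sample entry are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KosRecord:
    """Hypothetical registry entry describing one KOS."""
    name: str
    depends_on: List[str] = field(default_factory=list)   # KOSs its meanings depend on
    related_to: List[str] = field(default_factory=list)   # KOSs with related terms
    upper_ontology: Optional[str] = None                   # upper ontology used, if any
    self_describing: bool = False                          # independent and self-describing


def check_record(record):
    """Return a list of problems with the dependency metadata."""
    problems = []
    has_dependency = bool(record.depends_on) or record.upper_ontology is not None
    if record.self_describing and has_dependency:
        problems.append("declared self-describing but lists a dependency")
    if not record.self_describing and not has_dependency:
        problems.append("declare dependencies, an upper ontology, or independence")
    return problems


wine = KosRecord(name="WineTaxonomy", upper_ontology="COSMO")
print(check_record(wine))
```

The check forces each entry to state explicitly either what it depends on or that it is independent, which is the information an outside community needs before reusing it.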