Explore the ontology spectrum, from weak models like taxonomies to strong models like logical theories, for improved semantic interoperability and machine interpretability. Learn about the range of semantic models and migration paths.
The Ontology Spectrum & Semantic Models • Dr. Leo Obrst, MITRE • Information Semantics Group, Information Discovery & Understanding, Center for Innovative Computing & Informatics • November 28, 2006
Agenda • Semantic Models: What & How to Decide? • Tightness of Coupling & Semantic Explicitness • Ontology and Ontologies • The Ontology Spectrum • Preliminary Concepts • Taxonomies • Thesauri • Conceptual Models: Weak Ontologies • Logical Theories: Strong Ontologies • Upper, Middle, Domain Ontologies • More: Logic Spectrum
Tightness of Coupling & Semantic Explicitness [Figure: technologies arranged along two axes, looseness of coupling (Local to Far) and explicitness of semantics (Implicit to Explicit), running from "Implicit, Tight" to "Explicit, Loose". Annotations: Performance = k / Integration_Flexibility; From Synchronous Interaction to Asynchronous Communication. Labels toward the tight end: 1 System: Small Set of Developers; Same Process Space; Same Address Space; Same CPU; Same Programming Language; Same OS; Same DBMS; Compiling; Linking; OOP. Labels in between: Same Local Area Network; Same Wide Area Network; Same Intranet; Client-Server; Distributed Systems; Systems of Systems; Federated DBs; Data Marts; Data Warehouses; N-Tier Architecture; Enterprise Middleware; Workflow; Web; Applets, Java; XML, XML Schema; Conceptual Models; Data; EAI; SOA. Labels toward the loose end: Application; Community; Enterprise; Peer-to-peer; Internet; Web Services (SOAP, UDDI, WSDL); RDF/S, OWL; Ontologies; Enterprise Ontologies; Semantic Mappings; Semantic Brokers; Agent Programming; OWL-S; Proof, Rules, Modal Policies (SWRL, FOL+); EA; EA Semantics; EA Brokers; EA Ontologies.]
Ontology Spectrum: The Range of Semantic Models & a Migration Path [Figure: semantic models ordered from weak semantics to strong semantics, from less to more expressive.] • Models and their parent-child relations, weakest first: Taxonomy (Is Sub-Classification of); Thesaurus (Has Narrower Meaning Than); Conceptual Model (Is Subclass of); Logical Theory (Is Disjoint Subclass of with transitivity property) • Languages, from less to more expressive: Relational Model, XML; DB Schemas, XML Schema; ER; Extended ER; XTM; RDF/S; UML; DAML+OIL, OWL; Description Logic; First Order Logic; Modal Logic • Interoperability levels, lowest first: Syntactic Interoperability; Structural Interoperability; Semantic Interoperability
Ontology Spectrum: The Range of Semantic Models & a Migration Path [Figure: the same spectrum as on the previous slide, with four callouts added.] • Problem: Local; Semantic Expressivity: Low • Problem: General; Semantic Expressivity: Medium • Problem: General; Semantic Expressivity: High • Problem: Very General; Semantic Expressivity: Very High
Ontology Spectrum: Application (ordered by expressivity, weak to strong) • Term-based • Taxonomy: Categorization, Simple Search & Navigation, Simple Indexing • Thesaurus: Synonyms, Enhanced Search (Improved Recall) & Navigation, Cross Indexing • Concept-based (Ontology) • Conceptual Model (weak): Enterprise Modeling (system, service, data), Question-Answering (Improved Precision), Querying, SW Services • Logical Theory (strong): Real World Domain Modeling, Semantic Search (using concepts, properties, relations, rules), Machine Interpretability (M2M, M2H semantic interoperability), Automated Reasoning, SW Services
Triangle of Signification Intension <Joe_ Montana > Concepts Semantics: Meaning Reference/ Denotation Sense Real (& Possible) World Referents Terms “Joe” + “Montana” Syntax: Symbols Pragmatics: Use Extension
Term vs. Concept • Term (terminology): • Natural language words or phrases that act as indices to the underlying meaning, i.e., the concept (or composition of concepts) • The syntax (e.g., string) that stands in for or is used to indicate the semantics (meaning) • Concept: • A unit of semantics (meaning), the node (entity) or link (relation) in the mental or knowledge representation model [Figure: Concept Relations (Subclass of) among Concept Vehicle, Concept Ground_Vehicle, and Concept Automobile; Term Relations (Narrower than, Synonym) among Term “Vehicle”, Term “Automobile”, and Term “Car”.]
Data Objects Classification Objects Terminology Objects Meaning Objects XML DTD Data Schema XML Schema Thesaurus Ontology Keyword List Term (can be multi-lingual) Data Attribute Data Element Data Value Documents Conceptual Model Taxonomy Instance Value Attribute Property Relation Privileged TaxonomicRelation Namespace Class Concept Example: Metadata Registry/Repository – Contains Objects + Classification
Taxonomy: Definition • Taxonomy: • A way of classifying or categorizing a set of things, i.e., a classification in the form of a hierarchy (tree) • The classification of information entities in the form of a hierarchy (tree), according to the presumed relationships of the real world entities which they represent • A taxonomy is a semantic (term or concept) hierarchy in which information entities are related by either: • The subclassification of relation (weak taxonomies) or • The subclass of relation (strong taxonomies) for concepts or the narrower than relation (thesauri) for terms • Only the subclass/narrower than relation is a generalization-specialization relation (subsumption)
Taxonomies: Weak • Example: Your Folder/Directory Structure • No consistent semantics for the parent-child relationship: an arbitrary Subclassification Relation • NOT a generalization/specialization taxonomy • Example: UNSPSC
Taxonomies: Strong • Consistent semantics for the parent-child relationship: Narrower than (terms) or Subclass (concepts) Relation • A generalization/specialization taxonomy • For concepts: Each information entity is distinguished by a property of the entity that makes it unique as a subclass of its parent entity (a synonym for property is attribute or quality) • For terms: each child term implicitly refers to a concept which is a subset of the concept referred to by its parent term • Example: HAMMER: Claw, Ball Peen, Sledge • What are the distinguishing properties between these three hammers? • Form (physical property) • Function (functional property) • “Purpose proposes property” (form follows function) – for human artifacts, at least
Two Examples of Strong Taxonomies: Many representations of trees • Simple HR Taxonomy (Subclass of): animate object, agent, person, organization, employee, manager • Linnaeus Biological Taxonomy
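The subsumption reading of a strong taxonomy can be sketched in a few lines of Python. This is only an illustration: the child-to-parent links below are an assumed reading of the Simple HR Taxonomy figure (the exact tree shape is not recoverable from the slide text), and the names `SUBCLASS_OF`, `ancestors`, and `is_subclass_of` are hypothetical.

```python
# Assumed child -> parent links for the slide's Simple HR Taxonomy.
# The tree shape is a guess at the original figure, not taken from it.
SUBCLASS_OF = {
    "agent": "animate object",
    "person": "agent",
    "organization": "agent",
    "employee": "person",
    "manager": "employee",
}


def ancestors(concept):
    """Return every superclass of `concept`, nearest first."""
    chain = []
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        chain.append(concept)
    return chain


def is_subclass_of(child, parent):
    """True if `parent` subsumes `child`; subclass-of is treated as transitive."""
    return parent in ancestors(child)


print(ancestors("manager"))
print(is_subclass_of("manager", "agent"))
```

Because every link carries the same subclass-of meaning, walking up the chain is a valid inference. In a weak taxonomy (a folder structure, say) the same walk would tell you nothing reliable about the child.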
When is a Taxonomy enough? • Weak taxonomy: • When you want semantically arbitrary parent-child term or concept relations, when the subclassification relation is enough • I.e., sometimes you just want users to navigate down a hierarchy for your specific purposes, e.g., a quasi-menu system where you want them to see locally (low in the taxonomy) what you had already displayed high in the taxonomy • Application-oriented taxonomies are like this • Then, in general, you are using weak term relations because the nodes are not really meant to be concepts, but only words or phrases that will be significant to the user or to you as a classification device • Strong taxonomy: • When you really want to use the semantically consistent narrower-than (terms) or subclass (concepts) relation (a true subsumption or subset relation) • When you want to partition your general conceptual space • When you want individual conceptual buckets • Note: the subclass relation only applies to concepts; it is not equivalent (but is similar) to the narrower-than relation that applies to terms in thesauri • You need more than a taxonomy if you need to either: • Using the narrower than relation: Define term synonyms and cross-references to other associated terms, or • Using the subclass relation: Define properties, attributes and values, relations, constraints, rules, on concepts
Thesaurus: Definition • From ANSI/NISO Z39.19-1993 (Revision of Z39.19-1980): • A thesaurus is a controlled vocabulary arranged in a known order and structured so that equivalence, homographic, hierarchical, and associative relationships among terms are displayed clearly and identified by standardized relationship indicators • The primary purposes of a thesaurus are to facilitate retrieval of documents and to achieve consistency in the indexing of written or otherwise recorded documents and other items • A consistent semantics for the hierarchical parent-child relationship: broader than, narrower than, i.e., generalization/specialization • A thesaurus is a term taxonomy • Unlike Strong subclass-based Taxonomy, Conceptual Model, & Logical Theory: the relation is between Terms, NOT Concepts
Center For Army Lessons Learned (CALL) Thesaurus Example [Figure: terms linked by Narrower than and Related to relations. Terms shown: imagery, aerial imagery, infrared imagery, radar imagery, combat support equipment, radar photography, moving target indicators, intelligence and electronic warfare equipment, imaging systems, imaging radar, infrared imaging systems.]
When is a Thesaurus enough? • When you don’t need to define the concepts of your model, but only the terms that refer to those concepts, i.e., to at least partially index those concepts • Ok, what does that mean? • If you need an ordered list of terms and their synonyms and loose connections to other terms (cross-references) • Examples: • If you need term buckets (sets or subsets) for term expansion in a keyword-based search engine • If you need a term classification index for a registry/repository, to guarantee uniqueness of terms and synonyms within a Community of Interest or namespace that might point to/index a concept node • You need more than a thesaurus if you need to define properties, attributes and values, relations, constraints, rules, on concepts • You need either a conceptual model (weak ontology) or a logical theory (strong ontology)
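Term expansion for a keyword search, the first example above, might look like the following Python sketch. The term names come from the CALL thesaurus slide, but which term is narrower than or related to which is an assumption made for illustration; `THESAURUS` and `expand` are hypothetical names, not part of any real thesaurus tool.

```python
# Illustrative thesaurus fragment. NT = narrower terms, RT = related terms.
# The specific NT/RT links are assumed, not read from the CALL thesaurus.
THESAURUS = {
    "imagery": {
        "NT": ["aerial imagery", "infrared imagery", "radar imagery"],
        "RT": ["imaging systems"],
    },
    "imaging systems": {
        "NT": ["imaging radar", "infrared imaging systems"],
        "RT": ["imagery"],
    },
}


def expand(term, include_related=False):
    """Expand a query term with all narrower terms (followed transitively)
    and, optionally, the directly related terms of the original term."""
    expanded = [term]
    queue = [term]
    while queue:
        current = queue.pop(0)
        for narrower in THESAURUS.get(current, {}).get("NT", []):
            if narrower not in expanded:
                expanded.append(narrower)
                queue.append(narrower)
    if include_related:
        for related in THESAURUS.get(term, {}).get("RT", []):
            if related not in expanded:
                expanded.append(related)
    return expanded


print(expand("imagery"))
print(expand("imagery", include_related=True))
```

Narrower terms are followed transitively because they keep the query within the original term's scope; related terms are added only one hop out, since the "related to" link is deliberately loose.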
Conceptual Models: Weak Ontologies • Most conceptual domains cannot be expressed adequately with a taxonomy • Nor with a thesaurus, which models term relationships rather than concept relationships • Conceptual models seek to model a portion of a domain for a database or a system • UML is the paradigmatic modeling language • Drawbacks: • Models are mostly used for documentation and require human semantic interpretation • Limited machine usability, because machines cannot directly interpret the models semantically • Primary reason: UML is not based on a logic • You need more than a Conceptual Model if you need machine-interpretability (more than machine-processing) • You need a logical theory (high-end ontology)
Conceptual Model: UML Example • Human Resource Conceptual Model
Logical Theories: Strong Ontologies • Emphasize Real World Semantics • Frame-based: • Node-and-link structured in languages which hide the logical expressions • Entity-centric, like object-oriented modeling • Centered on the entity class, its attributes, properties, relations/associations, and constraints/rules • Axiomatic: • Expressed as logical expressions • Non-entity-centric, focus on predicates, relations, properties • Enables automated inference
Logical Theories: More Formally [Figure, after Guarino (1998), p. 7: Conceptualization C; Language L; Models M(L); Intended models IM(L); Ontology.] * N. Guarino. 1998. Formal ontology in information systems, pp. 3-15. In Formal Ontology in Information Systems, N. Guarino, ed., Amsterdam: IOS Press. Proceedings of the First International Conference (FOIS’98), June 6-8, Trent, Italy.
Axioms, Inference Rules, Theorems, Theory [Figure: a Theory containing Axioms and Theorems.] • (1) Theorems are licensed by a valid proof using inference rules such as Modus Ponens • (2) Theorems proven to be true can be added back in, to be acted on subsequently like axioms by inference rules • (3) Possible other theorems (as yet unproven) • (4) Ever expanding theory
Axioms, Inference Rules, Theorems
• Axioms: Class(Thing); Class(Person); Class(Parent); Class(Child); SubClass(Person, Thing); SubClass(Parent, Person); SubClass(Child, Person); ParentOf(Parent, Child); NameOf(Person, String); AgeOf(Person, Integer)
• If SubClass(X, Y) then X is a subset of Y. This also means that if A is a member of Class(X), then A is a member of Class(Y)
• If X is a member of Class(Parent) and Y is a member of Class(Child), then ¬(X = Y)
• Inference Rules:
• And-introduction: given P, Q, it is valid to infer P ∧ Q
• Or-introduction: given P, it is valid to infer P ∨ Q
• And-elimination: given P ∧ Q, it is valid to infer P
• Excluded middle: P ∨ ¬P (i.e., either something is true or its negation is true)
• Modus Ponens: given P → Q, P, it is valid to infer Q
• Theorems:
• If P ∧ Q are true, then so is P ∨ Q
• If X is a member of Class(Parent), then X is a member of Class(Person)
• If X is a member of Class(Child), then X is a member of Class(Person)
• If X is a member of Class(Child), then NameOf(X, Y) and Y is a String
• If Person(JohnSmith), then ¬ParentOf(JohnSmith, JohnSmith)
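How theorems get "added back in" and acted on like axioms can be illustrated with a small forward-chaining loop. The sketch below uses only the SubClass axioms from the slide and the membership rule stated there; `SUBCLASS` and `forward_chain` are hypothetical names, and this is a toy illustration rather than how any particular reasoner works.

```python
# SubClass axioms from the slide, as (subclass, superclass) pairs.
SUBCLASS = {
    ("Person", "Thing"),
    ("Parent", "Person"),
    ("Child", "Person"),
}


def forward_chain(facts):
    """Derive all class memberships implied by the SubClass axioms.

    `facts` is a set of (class_name, individual) pairs. The rule applied is:
    if SubClass(X, Y) and A is a member of X, then A is a member of Y.
    Each derived fact is added back and can trigger further derivations.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for cls, individual in list(derived):
            for sub, sup in SUBCLASS:
                if sub == cls and (sup, individual) not in derived:
                    derived.add((sup, individual))
                    changed = True
    return derived


theory = forward_chain({("Parent", "JohnSmith")})
print(sorted(theory))
```

Starting from the single fact that JohnSmith is a Parent, the loop derives that he is a Person and then, using that derived fact, that he is a Thing. It stops when a full pass adds nothing new.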
Ontology Representation Levels Language Meta-Level to Object-Level Ontology (General) Meta-Level to Object-Level Knowledge Base (Particular)
CYC MELD Expression Example
(implies
  (isa ?BATTALION InfantryBattalion)
  (thereExistExactly 1 ?COMPANY
    (and
      (isa ?COMPANY Company-UnitDesignation)
      (isa ?COMPANY WeaponsUnit-MilitarySpecialty)
      (subOrgs-Direct ?BATTALION ?COMPANY)
      (subOrgs-Command ?BATTALION ?COMPANY))))
Ontology/KR Expressible as Language and Graph • In ontology and knowledge bases, nodes are predicate, rule, variable, constant symbols, hence graph-based indexing, viewing • Links are connections between these symbols: Semantic Net! • What’s important is the logic! [Figure: the expression above drawn as a graph whose nodes are implies, isa, thereExistExactly, 1, and, ?BATTALION, ?COMPANY, InfantryBattalion, Company-UnitDesignation, WeaponsUnit-MilitarySpecialty, subOrgs-Direct, and subOrgs-Command.]
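The claim that the same expression can be read as language or as graph can be made concrete with a small S-expression reader. The Python sketch below is a generic parser written for this illustration, not Cyc tooling; `tokenize`, `parse`, and `edges` are hypothetical names. It turns the expression into nested lists and then lists the operator-to-argument links a graph view would draw.

```python
EXPR = """
(implies
  (isa ?BATTALION InfantryBattalion)
  (thereExistExactly 1 ?COMPANY
    (and
      (isa ?COMPANY Company-UnitDesignation)
      (isa ?COMPANY WeaponsUnit-MilitarySpecialty)
      (subOrgs-Direct ?BATTALION ?COMPANY)
      (subOrgs-Command ?BATTALION ?COMPANY))))
"""


def tokenize(text):
    """Split an S-expression into parentheses and symbols."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Build nested lists from a token list (the list is consumed)."""
    token = tokens.pop(0)
    if token == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # discard the closing parenthesis
        return node
    return token


def edges(tree):
    """Yield (operator, argument) links; a nested argument is named by its operator."""
    if isinstance(tree, list):
        head = tree[0]
        for child in tree[1:]:
            yield (head, child[0] if isinstance(child, list) else child)
            yield from edges(child)


tree = parse(tokenize(EXPR))
for link in edges(tree):
    print(link)
```

The edge list is a lossy view: it records which symbols are connected but not the argument order or nesting, which is one reason the slide stresses that the logic, not the drawing, is what matters.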
A Military Example of Ontology • Ontology: defines the terms used to describe and represent an area of knowledge (subject matter): vocabulary + meaning + machine understandable (Axiomatized)
[Figure: two service-specific record layouts mapped onto one ontology. Labels on the figure: Sexagesimal, Decimal, UTM Coordinate, Geographic Coordinates, Commander, S2, S3.]
Ontology (Aircraft) fields: Identifier | Signature | Location | Time Observed | …
Navy fields: Tid | Type | Long | Lat | T stamp | …
Navy records: 330296, F-14D, 121°8'6", 2.35; 330298, AH-1G C, 121°2'2", 2.45
Army fields: S-code | Model | Coord | Sense Time | …
Army records: CNM023, MIG-29, 121.135°, 13458; CNM035, Tupolev TU154, 121.25°, 13465
Combined view fields: Service | Identifier | Signature | Location | Time Observed | …
Combined view records: Navy, 330296, F-14D, 121°8'6", 2.35; Army, CNM023, MIG-29, 121.135°, 13458; Navy, 330298, AH-1G C, 121°2'2", 2.45; Army, CNM035, Tupolev TU154, 121.25°, 13465
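One piece of meaning the example ontology has to capture is that sexagesimal and decimal values are two notations for the same kind of Location. A minimal Python sketch of that conversion follows; the function name `dms_to_decimal` and the accepted input format are assumptions made for this illustration.

```python
import re

# Matches strings such as 121°8'6" (degrees, minutes, seconds).
DMS_PATTERN = re.compile(r"(\d+)°(\d+)'(\d+(?:\.\d+)?)" + '"')


def dms_to_decimal(dms):
    """Convert a degrees/minutes/seconds string to decimal degrees."""
    match = DMS_PATTERN.fullmatch(dms.strip())
    if match is None:
        raise ValueError("not a degrees/minutes/seconds value: " + repr(dms))
    degrees, minutes, seconds = (float(part) for part in match.groups())
    return degrees + minutes / 60 + seconds / 3600


navy_value = "121°8'6\""
print(round(dms_to_decimal(navy_value), 6))
```

Applied to the slide's sexagesimal value 121°8'6", the conversion gives 121 + 8/60 + 6/3600 = 121.135, which is the decimal value 121.135° shown in the other record layout. A shared ontology lets a machine make that connection instead of relying on a human who knows both conventions.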
Upper, Middle, Domain Ontologies But Also These! Most General Thing Identity Time Upper Ontology (Generic Common Knowledge) Part Space Processes Material Locations People Organizations Middle Ontology (Domain-spanning Knowledge) Facilities Terrorist Lower Ontology (individual domains) Terrorist Org Financier Jihadist Terrorist Lowest Ontology (sub-domains) Al Queda Areas of Interest
Summary of Ontology Spectrum: Scope, KR Construct, Parent-Child Relation, Processing Capability [Flattened table; the row alignment is not recoverable from the slide text, so values are listed per column.] • Scope: Term; Concept • KR Construct: Taxonomy (Weak Taxonomy, Strong Taxonomy); Thesaurus; Ontology (Conceptual Model (weak ontology), Logical Theory (strong ontology)) • Parent-Child Relation: Sub-classification of; Narrower Than; SubClass of; Disjoint SubClass of with Transitivity, etc. • Processing: Machine-readable; Machine-processible; Machine-interpretable
Thanks! Questions? lobrst@mitre.org
Ontology & Ontologies 1 • An ontology defines the terms used to describe and represent an area of knowledge (subject matter) • An ontology also is the model (set of concepts) for the meaning of those terms • An ontology thus defines the vocabulary and the meaning of that vocabulary • Ontologies are used by people, databases, and applications that need to share domain information • Domain: a specific subject area or area of knowledge, like medicine, tool manufacturing, real estate, automobile repair, financial management, etc. • Ontologies include computer-usable definitions of basic concepts in the domain and the relationships among them • They encode domain knowledge (modular) • Knowledge that spans domains (composable) • Make knowledge available (reusable)
Ontology & Ontologies 2 • The term ontology has been used to describe models with different degrees of structure (Ontology Spectrum) • Less structure: Taxonomies (Semio/Convera taxonomies, Yahoo hierarchy, biological taxonomy, UNSPSC), Database Schemas (many) and metadata schemes (ICML, ebXML, WSDL) • More structure: Thesauri (WordNet, CALL, DTIC), Conceptual Models (OO models, UML) • Most structure: Logical Theories (Ontolingua, TOVE, CYC, Semantic Web) • Ontologies are usually expressed in a logic-based language • Enabling detailed, sound, meaningful distinctions to be made among the classes, properties, & relations • More expressive meaning, while maintaining “computability” • Using ontologies, tomorrow's applications can be “intelligent” • Work at the human conceptual level • Ontologies are usually developed using special tools that can model rich semantics
Tree vs. Graph [Figure: three structures side by side: a Tree (with Root, Node, and Directed Edge labeled), a Directed Acyclic Graph, and a Directed Cyclic Graph.]
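The three structures on this slide can be told apart mechanically. The Python sketch below classifies a directed graph as a tree, a directed acyclic graph, or a cyclic graph; `classify` is a hypothetical helper, cycle detection uses Kahn's topological-sort algorithm, and "tree" is taken here to mean a single root with exactly one parent for every other node.

```python
def classify(nodes, edges):
    """Classify a directed graph as 'tree', 'dag', or 'cyclic'.

    `nodes` is a list of node names; `edges` is a list of (parent, child) pairs.
    """
    children = {node: [] for node in nodes}
    indegree = {node: 0 for node in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Kahn's algorithm: repeatedly remove nodes with no remaining parents.
    remaining = dict(indegree)
    ready = [node for node in nodes if remaining[node] == 0]
    visited = 0
    while ready:
        node = ready.pop()
        visited += 1
        for child in children[node]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)
    if visited < len(nodes):
        return "cyclic"  # some nodes could never be removed: they sit on a cycle

    roots = [node for node in nodes if indegree[node] == 0]
    single_parent = all(indegree[node] == 1 for node in nodes if node not in roots)
    if len(roots) == 1 and single_parent:
        return "tree"
    return "dag"


print(classify(["a", "b", "c"], [("a", "b"), ("a", "c")]))
print(classify(["a", "b", "c"], [("a", "b"), ("a", "c"), ("b", "c")]))
print(classify(["a", "b"], [("a", "b"), ("b", "a")]))
```

The distinction matters for taxonomies: a strict tree gives each entry one parent, while a directed acyclic graph allows multiple parents, as when a concept is a subclass of two others.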
Thesaurus vs. Ontology [Figure: Terms, Concepts, and Real (& Possible) World Referents.]
• Thesaurus (Controlled Vocabulary): Term Semantics (Weak)
• Terms: Metal working machinery, equipment and supplies, metal-cutting machinery, metal-turning equipment, metal-milling equipment, milling insert, turning insert, etc.
• Relations: use, used-for, broader-term, narrower-term, related-term
• ‘Semantic’ Relations: Equivalent (=); Used For (Synonym) UF; Broader Term BT; Narrower Term NT; Related Term RT
• Ontology (Logical Concepts): Logical-Conceptual Semantics (Strong)
• Entities: Metal working machinery, equipment and supplies, metal-cutting machinery, metal-turning equipment, metal-milling equipment, milling insert, turning insert, etc.
• Relations: subclass-of; instance-of; part-of; has-geometry; performs; used-on; etc.
• Properties: geometry; material; length; operation; UN/SPSC-code; ISO-code; etc.
• Values: 1; 2; 3; “2.5 inches”; “85-degree-diamond”; “231716”; “boring”; “drilling”; etc.
• Axioms/Rules: If milling-insert(X) & operation(Y) & material(Z)=HG_Steel & performs(X, Y, Z), then has-geometry(X, 85-degree-diamond).
• Semantic Relations: Subclass Of; Part Of; Arbitrary Relations; Meta-Properties on Relations
Conceptualization B: Buyer Conceptualization S: Seller Conceptualization B1: Technical Buyer Conceptualization S1: Manufacturer Seller Conceptualization B2: Non-Technical Buyer Conceptualization S1: Distributor Seller Language LB1 Language LS1 Language LB2 Language LS2 Models MB1(LB1) Models MB2(LB2) Models MS2(LS2) Models MS1(LS1) Ontology Intended models IMB1(LB1) Intended models IMB1(LB1) Intended models IMB2(LB2) Intended models IMB1(LB1) A More Complex Picture (from E-Commerce)
Upper Ontological Distinctions 1 Focus here is on a few of the many possible upper ontological distinctions to be made • Descriptive vs. Revisionary: how one characterizes the ‘ontological stance’, i.e., what an ontological engineering product is or should be • Revisionary: every model construct (concept) is a temporal object, i.e., necessarily has temporal properties • Descriptive: model constructs are not necessarily temporal objects • Multiplicative vs. Reductionist: how one characterizes the kinds and number of concepts to be modeled • Multiplicative: Concepts can include anything that reality seems to require or any distinction that is useful to make • Reductionist: Concepts are reduced to the fewest primitives from which it is possible to generate complex reality
Upper Ontological Distinctions 2 • Universal vs. Particular: the kinds of entities that ontologies address (the ‘universe of discourse’(s) of the ontology) • Universals: generic entities, which can have instances; classes • Particulars: specific entities, which are instances and can have no instances themselves • Continuant vs. Occurrent • Continuant: An entity whose identity continues to be recognizable over some extended interval of time (Sowa, 2000) • Occurrent: An entity that does not have a stable identity during any interval of time (Sowa, 2000) • 3-dimensional (endurant) vs. 4-dimensional (perdurant) • 3D view/ Endurant: an object that goes through time (endures), with identity/essence-defining properties that perhaps depend on occurrent objects but are not essentially constituted by those occurrent objects • 4D view/ Perdurant: an object that persists (perdures) through spacetime by way of having different temporal parts at what would be different times
Upper Ontological Distinctions 3 • Part & Whole: Mereology, Topology, Mereotopology, the ‘part of’ relation • Mereology: parthood, what constitutes a ‘part’? • Topology: connectedness among objects, what constitutes ‘connected to’? • Mereotopology: the typical contemporary analysis of ‘part of’ says that the relation requires both the notion of part and the notion of connectedness; neither is sufficient alone to describe what we mean by saying that something is a part of another thing
Ontology Spectrum [Figure: the Ontology Spectrum shown earlier, repeated with one added note on its upper, strong-semantics portion: “Logic Spectrum on Next Slide will cover this area”.]
Logic Spectrum: Classical Logics: PL to HOL (listed from most expressive to less expressive)
• Higher Order Logic (HOL): SOL + Complex Types + Higher-order Predicates (i.e., those that take one or more other predicates as arguments)
• Second Order Logic (SOL): FOL + Quantifiers (∀, ∃) over Predicates
• Modal Predicate Logic (Quantified Modal Logic): FOL + Modal operators
• First-Order Logic (FOL): Predicate Logic, Predicate Calculus: PL + Predicates + Functions + Individuals + Quantifiers (∀, ∃) over Individuals
• Logic Programming (Horn Clauses): Syntactic Restriction of FOL
• Description Logics: Decidable fragments of FOL: unary predicates (concepts) & binary relations (roles) [max 3 vars]
• Modal Propositional Logic: PL + Modal operators (□, ◇): necessity/possibility, obligatory/permitted, future/past, etc. Axiomatic systems: K, D, T, B, S4, S5
• Propositional Logic (PL): Propositions (True/False) + Logical Connectives (¬, ∧, ∨, →, ↔)
• Substructural Logics: focus on structural rules
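At the propositional end of the spectrum, validity can be checked by brute force over truth assignments, which is part of why the less expressive logics are easier to compute with. The Python sketch below checks a few propositional schemas by truth table; `is_tautology` and `implies` are hypothetical helper names introduced for this illustration.

```python
from itertools import product


def implies(p, q):
    """Material implication: P -> Q is false only when P is true and Q is false."""
    return (not p) or q


def is_tautology(formula, arity):
    """True if `formula` (a function of `arity` booleans) holds for every assignment."""
    return all(formula(*values) for values in product([False, True], repeat=arity))


# Excluded middle: P or not P
print(is_tautology(lambda p: p or not p, 1))
# Modus Ponens as a formula: ((P -> Q) and P) -> Q
print(is_tautology(lambda p, q: implies(implies(p, q) and p, q), 2))
# Not valid: (P or Q) -> P fails when P is false and Q is true
print(is_tautology(lambda p, q: implies(p or q, p), 2))
```

The table has 2^n rows for n propositions, so the check always terminates. Full first-order logic has no comparable exhaustive procedure, which motivates restricted fragments such as Description Logics and Horn clauses.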
Logic Spectrum: Semantic Web Languages: Ontologies & Rules [Figure: the logic spectrum from the Classical Logics slide (Propositional Logic, Substructural Logics, Modal Propositional Logic, Description Logics, Logic Programming, First-Order Logic, Modal Predicate Logic, Second Order Logic, Higher Order Logic), with SOL extensions and Linear Logic (consume antecedents) added, and annotated with the languages below.]
• OWL-FOL
• SWRL: OWL + Horn-like Rules
• OWL Full: Almost FOL, but Classes as Instances goes to SOL
• OWL DL: Mostly SHOIN(D): Close to the SHIQ and SHOQ Description Logics
• OWL Lite: Almost SHIF(D) (technically, it’s a variant of SHIN(D))
• RDF/S: Positive existential subset of FOL: no negation, no universal quantification
• RuleML: Expressed syntactically in XML, requires binding to a logic, ranges over all logics
Logic Spectrum: Other KR Languages, Query Languages [Figure: the same logic spectrum, annotated with the languages below, in the order they appear on the slide from the more expressive end downward.]
• Knowledge Interchange Format (KIF), Common Logic (CL, SCL)
• CycL
• Constraint Logic Programming languages
• OWL-QL
• Open Knowledge Base Connectivity Language (OKBC)
• Datalog
• RDQL, SPARQL
• XQuery, XPath
• SQL
Logic Spectrum: Tools [Figure: the same logic spectrum and languages, annotated with the tools below, in the order they appear on the slide from the more expressive end downward.]
• HOL
• Vampire, OntologyWorks, Otter, SNARK
• Ontolingua/Chimaera
• Cyc
• Constraint Logic Tools: ECLIPSE, etc.
• Prologs: Amzi!, XSB, SWI, Ciao, BinProlog, Quintus, Sextus
• Cerebra, Jena, L&C’s LinkFactory, KAON2, Racer, FaCT, Swoop, Pellet
• Protégé
• CLIPS, JESS
What do we want the future to be? • 2100 A.D.: models, models, models • There are no human-programmed programming languages • There are only Models [Figure: Ontological Models, Knowledge Models, Belief Models, Application Models, Presentation Models, Target Platform Models, and Executable Code, linked by Transformations and Compilations, over INFRASTRUCTURE.]