510 likes | 540 Views
Ontology Engineering & Maintenance. Semantic Web - Fall 2005 Computer Engineering Department Sharif University of Technology. Outline. Ontology Engineering Ontology Maintenance Mapping Merging Integration. Introduction. Why do we use ontology?
E N D
Ontology Engineering & Maintenance Semantic Web - Fall 2005 Computer Engineering Department Sharif University of Technology
Outline • Ontology Engineering • Ontology Maintenance • Mapping • Merging • Integration
Introduction • Why do we use ontology? • To describe the semantics of the data (which we name as Meta-Data) • Why do we describe the semantics? • In order to provide a uniform way to make different parties to understand each other • Which data? • Any data (on the web, or in the existing legacy databases)
Introduction • Formal definition on Ontology: • Ontologies are knowledge bodies that provide a formal representation of a shared conceptualization of a particular domain. • Ontologies are widely used in the Semantic Web. • Recently ontologies have become increasingly common on WWW where they provide semantics of annotations in web pages
What Is “Ontology Engineering”? Ontology Engineering: Defining terms in the domain and relations among them • Defining concepts in the domain (classes) • Arranging the concepts in a hierarchy (subclass-superclass hierarchy) • Defining which attributes and properties(slots) classes can have and constraints on their values • Defining individuals and filling in slot values
determinescope determinescope considerreuse considerreuse considerreuse considerreuse enumerate terms enumerate terms enumerate terms defineclasses defineclasses defineclasses defineclasses defineclasses defineproperties defineproperties defineproperties defineproperties defineconstraints defineconstraints defineconstraints createinstances createinstances createinstances createinstances Ontology-Development Process here: In reality - an iterative process:
Determine Domain and Scope • What is the domain that the ontology will cover? • For what we are going to use the ontology? • For what types of questions the information in the ontology should provide answers? determinescope considerreuse enumerate terms defineclasses defineproperties defineconstraints createinstances
Consider Reuse • Why reuse other ontologies? • to save the effort • to interact with the tools that use other ontologies • to use ontologies that have been validated through use in applications considerreuse determinescope enumerate terms defineclasses defineproperties defineconstraints createinstances
What to Reuse? • Ontology libraries • DAML ontology library (www.daml.org/ontologies) • Ontolingua ontology library (www.ksl.stanford.edu/software/ontolingua/) • Protégé ontology library (protege.stanford.edu/plugins.html) • Upper ontologies • IEEE Standard Upper Ontology (suo.ieee.org) • Cyc (www.cyc.com)
What to Reuse? (II) • General ontologies • DMOZ (www.dmoz.org) • WordNet (www.cogsci.princeton.edu/~wn/) • Domain-specific ontologies • UMLS Semantic Net • GO (Gene Ontology) (www.geneontology.org)
Enumerate Important Terms • What are the terms we need to talk about? • What are the properties of these terms? • What do we want to say about the terms? enumerate terms considerreuse determinescope defineclasses defineproperties defineconstraints createinstances
Define Classes and the Class Hierarchy • A class is a concept in the domain • a class of wines • a class of wineries • a class of red wines • A class is a collection of elements with similar properties • Instances of classes • a glass of California wine you’ll have for lunch defineclasses considerreuse enumerate terms determinescope defineproperties defineconstraints createinstances
Class Inheritance • Classes usually constitute a taxonomic hierarchy (a subclass-superclass hierarchy) • A class hierarchy is usually an IS-A hierarchy: an instance of a subclass is an instance of a superclass • If you think of a class as a set of elements, a subclass is a subset • e.g., Apple is a subclass of Fruit Every apple is a fruit
Top level Middle level Bottom level Levels in the Hierarchy
Modes of Development • top-down – define the most general concepts first and then specialize them • bottom-up – define the most specific concepts and then organize them in more general classes • combination – define the more salient concepts first and then generalize and specialize them
Documentation • Classes (and Properties) usually have documentation • Describing the class in natural language • Listing domain assumptions relevant to the class definition • Listing synonyms • Documenting classes and slots is as important as documenting computer code!
Define Properties (Slots) of Classes • Properties in a class definition describe attributes of instances of the class and relations to other instances Each wine will have color, sugar content, producer, etc. defineproperties determinescope considerreuse enumerate terms defineclasses defineconstraints createinstances
Properties (Slots) • Types of properties • “intrinsic” properties: flavor and color of wine • “extrinsic” properties: name and price of wine • parts: ingredients in a dish • relations to other objects: producer of wine (winery) • Simple and complex properties • simple properties (attributes): contain primitive values (strings, numbers) • complex properties: contain (or point to) other objects (e.g., a winery instance)
Property Constraints (facets) • Property constraints (facets) describe or limit the set of possible values for a property The name of a wine is a string The wine producer is an instance of Winery A winery has exactly one location defineconstraints determinescope considerreuse enumerate terms createinstances defineclasses defineproperties
An Example: Domain and Range DOMAIN RANGE class slot allowed values • When defining a domain or range for a slot, find the most general class or classes • Consider the flavor slot • Domain: Red wine, White wine, Rosé wine • Domain: Wine • Consider the produces slot for a Winery: • Range: Red wine, White wine, Rosé wine • Range: Wine
Create Instances • Create an instance of a class • The class becomes a direct type of the instance • Any superclass of the direct type is a type of the instance • Assign slot values for the instance frame • Slot values should conform to the facet constraints • Knowledge-acquisition tools often check that createinstances determinescope considerreuse enumerate terms defineclasses defineconstraints defineproperties
Defining Classes and a Class Hierarchy • The things to remember: • There is no single correct class hierarchy • But there are some guidelines • The question to ask: “Is each instance of the subclass an instance of its superclass?”
Transitivity of the Class Hierarchy • The is-a relationship is transitive: B is a subclass of A C is a subclass of B C is a subclass of A • A direct superclass of a class is its “closest” superclass
Multiple Inheritance • A class can have more than one superclass • A subclass inherits slots and facet restrictions from all the parents • Different systems resolve conflicts differently
Classes are disjoint if they cannot have common instances Disjoint classes cannot have any common subclasses either Red wine, White wine,Rosé wine are disjoint Dessert wine and Redwine are not disjoint Disjoint Classes Wine Dessert wine Red wine White wine Rosé wine
Avoiding Class Cycles • Danger of multiple inheritance: cycles in the class hierarchy • Classes A, B, and C have equivalent sets of instances • By many definitions, A, B, and C are thus equivalent
The Perfect Family Size • If a class has only one child, there may be a modeling problem • If the only Red Burgundy we have is Côtes d’Or, why introduce the sub-hierarchy? • Compare to bullets in a bulleted list
The Perfect Family Size (II) • If a class has more than a dozen children, additional subcategories may be necessary • However, if no natural classification exists, the long list may be more natural
Class instance-of Instance Single and Plural Class Names • A “wine” is not a kind-of “wines” • A wine is an instance of the class Wines • Class names should be either • all singular • all plural
Classes and Their Names • Classes represent concepts in the domain, not their names • The class name can change, but it will still refer to the same concept • Synonym names for the same concept are not different classes • Many systems allow listing synonyms as part of the class definition
Content: Top-Level Ontologies • What does “top-level” mean? • Objects: tangible, intangible • Processes, events, actors, roles • Agents, organizations • Spaces, boundaries, location • Time • IEEE Standard Upper Ontology effort • Goal: Design a single upper-level ontology • Process: Merge upper-level of existing ontologies
Analysis: semantic consistency Violation of property constraints Cycles in the class hierarchy Terms which are used but not defined Interval restrictions that produce empty intervals (min > max) Analysis: style Classes with a single subclass Classes and slots with no definitions Slots with no constraints (value type, cardinality) Tools for automated analysis Chimaera (Stanford KSL) DAML validator Ontology Analysis
Ontology Evaluation • One of the hardest problems in ontology design • Ontology design is subjective • What does it mean for an ontology to be correct (objectively)? • The best test is the application for which the ontology was designed
Ontology Maintenance • Ontology mapping • Ontology merging • Versioning and evolution • Compatibility between different versions of the same ontology • Compatibility between versions of an ontology and instance data
Ontology Mapping - Introduction • This distributed nature of ontology development has led to a large number of different ontologies covering the same or overlapping domains. • In order to two parties to understand each other, they should use the same formal representation for the shared conceptualization (so the same ontology) • Unfortunately it is not easy to make everybody to agree on the same ontology for a domain. • And when you have different ontologies for the same domain the problem shows up. • Parties with different ontologies do not understand each other. Here comes the ontologymapping into the play
Ontology Mapping • Given two ontologies O1 and O2, mapping one ontology onto another means that for each entity (concept C, relation R, or instance I) in ontology O1, we try to find a corresponding entity, which has the same intended meaning, in ontology O2
People Staff … … Staff Faculty Academic Technical Lecturer Professor Senior Lecturer Asst. Professor Professor Assoc. Professor James Cook PhD, U Sydney Name Education Data Instance Semantic Mapping An Example Find Prof. Cook, a professor in a Seattle college, earlier an assoc. professor at his alma mater in Australia www.cs.washington.edu www.cs.usyd.edu.au
Sim(Fac,Staff) ? Academic Sim(Fac,Acad) Sim(Fac,Prof) Compute Similarity Satisfy Constraints Overview of Process People Staff Staff Faculty Academic Technical Faculty Lecturer Professor Senior Lecturer Asst. Professor Professor Assoc. Professor Define Similarity
Ontology Merging • Creating one new ontology from two or more ontologies. • The most prominent approaches are the unionand the intersectionapproaches • In the union approach, the merged ontology is the union of all entities in both source ontologies, where differences in representation of similar concepts have been resolved • In the intersection approach, the merged ontology consists only of the parts of the source ontology which overlap
From two source ontologies, create a new ontology that: Contains the information from both sources Does not contain redundant information Does not contain inconsistent information Ontology Merging
Ontology Merging • One-to-one mapping of ontologies • Mappings are created between pairs of ontologies • Problems with this approach arise when many such mappings need to be created • The complexity is O(n^2), where n is the number of ontologies • Using a global ontology • Each ontology is mapped to the central ontology • Drawbacks of using a global ontology are similar to those of using any standard • It is hard to reach a consensus on a standard shared by many people who use different terminologies for the same domain • A standard impedes changes in an organization