Learn about ontology, conceptualization, logic representation, and historical context in creating shared models. Explore development phases, roles, and principles for effective ontology engineering. Dive into tools and techniques for quality control and resource management. Discover the significance of formal specification and granularity in ontology.
Ontology engineering Valentina Tamma Based on slides by A. Gomez Perez, N. Noy, D. McGuinness, E. Kendal, A. Rector and O. Corcho
Content • Background on ontology; • Ontology and ontological commitment; • Logic as a form of representation; • Ontology development phases; • Modelling problems and patterns • N-ary relationships • Part whole relationships
What Is “Ontology Engineering”? Ontology Engineering: Defining terms in the domain and relations among them • Defining concepts in the domain (classes) • Arranging the concepts in a hierarchy (subclass-superclass hierarchy) • Defining which attributes and properties (slots) classes can have and constraints on their values • Defining individuals and filling in slot values
Methodological Questions • How can tools and techniques best be applied? • Which languages and tools should be used in which circumstances, and in which order? • What about issues of quality control and resource management? • Many of these questions for ontology engineering have been studied in other contexts • E.g. software engineering, object-oriented design, and knowledge engineering
Historical context • Artificial Intelligence • Philosophy • Ontology • Knowledge Representation and logic
Philosophical roots • Socrates’ questions of being, Plato’s studies of epistemology: – the nature of knowledge • Aristotle’s classifications of things in the world and contribution to syllogism and inductive inference: – logic as a precise method for reasoning about knowledge • Anselm of Canterbury and ontological arguments deriving the existence of God • Descartes, Leibniz, …
In computer science… • Cross-disciplinary field with historical roots in philosophy, linguistics, computer science, and cognitive science • The goal is to provide an unambiguous description of the concepts and relationships that can exist for an agent or a community of agents, so they can understand, share, and use this description to accomplish some task on behalf of users
So what is an ontology then? • An ontology is a (formal), explicit specification of a shared conceptualisation (T. Gruber, 1993; R. Studer, V. R. Benjamins, and D. Fensel, 1998) • formal: an ontology should be machine-readable • explicit: the types of concepts used, and the constraints on their use, are explicitly defined • shared: an ontology captures consensual knowledge, that is, knowledge not private to some individual but accepted by a group • conceptualisation: an abstract model of some phenomenon in the world which identifies the relevant concepts of that phenomenon
What is a conceptualisation? • Conceptualisation: the formal structure of reality as perceived and organized by an agent, independently of: – the vocabulary used (i.e., the language used; e.g. English “apple” vs. Italian “mela”) – the actual occurrence of a specific situation • Different situations involving the same objects, described by different vocabularies, may share the same conceptualisation.
Logic as a representation formalism • Predicate logic is more precise than natural language, but it is harder to read: • “Every trailer truck has 18 wheels” From John F. Sowa: Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks/Cole, 2000.
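One possible first-order rendering of the example sentence, as a sketch only: the predicate names TrailerTruck, Wheel and Part are illustrative assumptions, and this is not necessarily the formulation given in Sowa's book. The counting quantifier is shorthand; written out in plain predicate logic it expands into a much longer formula, which is exactly the readability cost the slide points at.

```latex
\forall x \, \bigl( \mathit{TrailerTruck}(x) \rightarrow
  \exists^{=18} w \, ( \mathit{Wheel}(w) \wedge \mathit{Part}(x, w) ) \bigr)
```

Choosing different predicates (for example, a single predicate stating the wheel count of x) would give a different granularity for the same sentence.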
Logic as a representation formalism • Logic is a simple language with few basic symbols. • The granularity of representation depends on the choice of predicates – i.e. an ontology of the relevant concepts in the domain. • Different choices of predicates (with different interpretations) represent different ontological commitments. From John F. Sowa: Knowledge Representation: Logical, Philosophical, and Computational Foundations, Brooks/Cole, 2000.
Ontological commitment • Agreement on the meaning of the vocabulary used to share knowledge. • Example exchange: “We need a pipe.” “A pipe?!?”
Knowledge engineering • Knowledge engineering is the application of logic and ontology to the task of building computable models of some domain for some purpose. – John Sowa
Level of Granularity An ontology specifies a rich description of the: • Terminology, concepts, vocabulary • Properties explicitly describing concepts • Relations among concepts • Rules distinguishing concepts, refining definitions and relations (constraints, restrictions, regular expressions) relevant to a particular domain or area of interest. Based on the AAAI’99 Ontology Panel – McGuinness, Welty, Uschold, Gruninger, Lehman
Ontology based information systems • Ontologies provide a common vocabulary, together with rules defining how that vocabulary is used by independently developed resources, processes, and services • Companies and organizations sharing common services can reach agreement on how those services are used, and the meaning of relevant concepts can be expressed unambiguously
Ontology based information systems • By composing component ontologies, mapping ontologies to one another and mediating terminology among participating resources and services, independently developed systems, agents and services can work together to share information and processes consistently, accurately, and completely.
Ontology based information systems • Ontologies also facilitate conversations among agents to collect, process, merge, and exchange information. • Improve search accuracy by enabling contextual search through the use of concept definitions and relations among them. • Used instead of/in addition to statistical relevance of keywords.
Ontology design process [figure: the design process as planned, and what it is “really more like…”]
Requirement analysis Performing Requirements, Domain & Use Case Analysis is a critical stage as in any software engineering design. It allows ontology engineers to ground the work and prioritise. The analysis has to elicit and make explicit: • The nature of the knowledge and the questions (competency questions) that the ontology (through a reasoner) needs to answer. This process is crucial for scoping and designing the ontology, and for driving the architecture; • Architectural issues; • The effectiveness of using traditional approaches with knowledge intensive approaches;
Aim: The main goal of this phase is to support the application in dealing with: – Changing assumptions – Hypothesis generation (analogy) – System evolution, or dynamic knowledge evolution, where time and situations change, necessitating re-evaluation of assumptions – Support for interoperation with other (potentially legacy) systems – Generation of explanations for dialogue generation, to facilitate the interface with users – Standardization of terminology, to reflect the engineers’ different backgrounds • Separation of concerns is crucial when dealing with knowledge: – Declarative domain knowledge (what?) needs to be treated differently from procedural knowledge (how?) – Ontologies vs problem solving methods – Background (unchanging) knowledge vs changing information – Provenance and level of trust of knowledge
Application requirements Application requirements can be acquired by: • Identifying any controlled vocabulary used in the application; • Identifying hierarchical or taxonomic structures intrinsic in the domain that might be used for query expansion: • Vegetarian pizza such as: margherita, funghi, grilled vegetables pizza • Analysing structured queries and the knowledge they require • Expressive power required: Efficient inference (requiring limited expressive power) vs. increased expressivity (requiring expensive or resource bounded computation) • Ad-hoc reasoning to deal with particular domain requirements: • temporal relations, geospatial, process-specific, conditional operations • Computational tractability • Need for Explanations, Traces, Provenance
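The query-expansion idea above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the taxonomy dictionary, the extra class names (MeatPizza, Pepperoni) and the function name expand_query are all assumptions made for the example.

```python
# Hypothetical taxonomy: each class maps to its direct subclasses.
# Only the vegetarian pizzas come from the slide; the rest is made up.
SUBCLASSES = {
    "Pizza": ["VegetarianPizza", "MeatPizza"],
    "VegetarianPizza": ["Margherita", "Funghi", "GrilledVegetablesPizza"],
    "MeatPizza": ["Pepperoni"],
}


def expand_query(term: str) -> list[str]:
    """Return the term followed by all of its direct and indirect subclasses."""
    expanded = [term]
    for child in SUBCLASSES.get(term, []):
        expanded.extend(expand_query(child))
    return expanded


if __name__ == "__main__":
    # A search for "VegetarianPizza" also matches its more specific kinds.
    print(expand_query("VegetarianPizza"))
```

A search front end could then match documents against any of the returned terms rather than the literal query word alone.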
Domain requirements • Take into account heterogeneity, distribution, and autonomy needs • software agents based applications; • Open vs. Closed World (does lack of information imply negative information?) • Static vs dynamic ontology processes: • Evolution, alignment • Limited or incomplete knowledge • Knowledge evolution over time • Analysis and consistency checking of instance data • Use Case analysis should facilitate the understanding of: – The information that is likely to be available – The questions that are likely to be asked – Types and roles of users
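The open vs. closed world distinction can be illustrated with a small sketch. The fact base, the explicit negative facts, and both function names are hypothetical; real reasoners handle negation in far richer ways.

```python
from typing import Optional

Triple = tuple[str, str, str]

# Hypothetical fact base of (subject, predicate, object) triples.
FACTS: set[Triple] = {
    ("Margherita", "hasTopping", "Tomato"),
    ("Margherita", "hasTopping", "Mozzarella"),
}
# Hypothetical facts explicitly stated to be false.
NEGATIVE_FACTS: set[Triple] = {("Margherita", "hasTopping", "Ham")}


def holds_closed_world(triple: Triple) -> bool:
    """Closed world: anything not known to be true is taken to be false."""
    return triple in FACTS


def holds_open_world(triple: Triple) -> Optional[bool]:
    """Open world: lack of information yields None (unknown), not False."""
    if triple in FACTS:
        return True
    if triple in NEGATIVE_FACTS:
        return False
    return None


if __name__ == "__main__":
    query = ("Margherita", "hasTopping", "Basil")
    print(holds_closed_world(query), holds_open_world(query))
```

Under the closed-world reading the basil query is simply false; under the open-world reading it stays undecided until more information arrives.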
Conceptual modelling “A data model describes data, or database schemas – an ontology describes the world” Adam Farquhar, “Ontology 101”, Stanford University, 1997 • Resources and their relationships are described from an objective standpoint, and they do not reflect the definitions in databases, or the views of programmers. • Experts from different backgrounds with significant domain knowledge will classify knowledge differently from someone interested in optimizing algorithms, forcing information into an existing framework, or supporting legacy applications • Shortcuts at the top levels do not help; automation and mapping among ontologies and terminology at lower levels provide significant benefit
Determine ontology scope • Addresses straightforward questions such as: • What is the ontology going to be used for? • How is the ontology ultimately going to be used by the software implementation? • What do we want the ontology to be aware of, and what is the scope of the knowledge we want to have in the ontology?
Competency Questions • Which investigations were done with a high-fat-diet study? • Which study employs microarray in combination with metabolomics technologies? • List those studies in which the fasting phase lasted one day. • What is a vegetarian pizza? • What type of wine can accompany seafood?
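A competency question can be treated as a query the finished ontology (plus its data) must be able to answer. The sketch below phrases the fasting-phase question over made-up records; the Study class, its fields, and the sample data are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """Hypothetical record; field names are not taken from any real schema."""
    name: str
    diet: str
    fasting_days: int


# Made-up sample data.
STUDIES = [
    Study("S1", "high-fat", 1),
    Study("S2", "control", 2),
    Study("S3", "high-fat", 1),
]


def studies_with_fasting_duration(studies: list[Study], days: int) -> list[str]:
    """Competency question: which studies had a fasting phase of `days` days?"""
    return [s.name for s in studies if s.fasting_days == days]


if __name__ == "__main__":
    print(studies_with_fasting_duration(STUDIES, 1))
```

If the model cannot express such a query at all (for instance, because fasting duration is not represented), that gap is a scoping finding in itself.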
Consider Reuse • We rarely have to start from scratch when defining an ontology: • There is almost always an ontology available from a third party that provides at least a useful starting point for our own ontology • Reuse allows us to: • save effort • interact with the tools that use other ontologies • use ontologies that have been validated through use in applications
Consider Reuse • Standard vocabularies are available for most domains, many of which are overlapping • Identify the set that is most relevant to the problem and application issue • A component-based approach based on modules facilitates dealing with overlapping domains: • Reuse an ontology module as one would reuse a software module • In standards, complex relationships are defined such that term usage and overlap are unambiguous and machine interpretable • Initial brainstorming with domain experts can be highly productive; subsequent refinement and iteration then lead to the level required by the application
What to Reuse? • Ontology libraries • DAML ontology library (www.daml.org/ontologies) • Protégé ontology library (protege.stanford.edu/plugins.html) • Upper ontologies • IEEE Standard Upper Ontology (suo.ieee.org) • Cyc (www.cyc.com) • General ontologies • DMOZ (www.dmoz.org) • WordNet (www.cogsci.princeton.edu/~wn/) • Domain-specific ontologies • UMLS Semantic Net • GO (Gene Ontology) (www.geneontology.org)
Enumerate terms • Write down in an unstructured list all the relevant terms that are expected to appear in the ontology • Nouns form the basis for class names • Verbs (or verb phrases) form the basis for property names • Card sorting is often the best way: • Write down each concept/idea on a card • Organise them into piles • Link the piles together • Do it again, and again • Works best in a small group
Example: animals & plants ontology • Dog • Cat • Cow • Person • Tree • Grass • Herbivore • Male • Female • Carnivore • Plant • Animal • Fur • Child • Parent • Mother • Father • Dangerous • Pet • Domestic Animal • Farm animal • Draft animal • Food animal • Fish • Carp • Goldfish
Define classes and their taxonomy • A class is a concept in the domain: – Animal (cow, cat, fish) – A class of properties (father, mother) • A class is a collection of elements with similar properties • A class contains necessary conditions for membership (type of food, dwelling) • Instances of classes – A particular farm animal, a particular person – Tweety the penguin
Organise the concepts • Example: Animals & Plants • Dog • Cat • Cow • Person • Tree • Grass • Herbivore • Male • Female • Healthy • Pet • Domestic Animal • Farm animal • Draft animal • Food animal • Fish • Carp • Goldfish • Carnivore • Plant • Animal • Fur • Child • Parent • Mother • Father
Extend the concepts: “Laddering” • Take a group of things and ask what they have in common • Then what other ‘siblings’ there might be • e.g. • Plant, Animal → Living Thing • Might add Bacteria and Fungi but not now • Cat, Dog, Cow, Person → Mammal • Others might be Goat, Sheep, Horse, Rabbit, … • Cow, Goat, Sheep, Horse → Hoofed animal (“Ungulate”) • What others are there? Do they divide amongst themselves? • Wild, Domestic → Domestication • What other states – “Feral” (domestic returned to wild)
Choose some main axes • Add abstractions where needed • e.g. “Living thing” • identify relations (this feeds into the next step) • e.g. “eats”, “owns”, “parent of” • Identify definable things • e.g. “child”, “parent”, “Mother”, “Father” • Things where you can say clearly what it means • Try to define a dog precisely – very difficult • A “natural kind” • make names explicit
Example • Living Thing – Animal: Mammal (Cat, Dog, Cow, Person); Fish (Carp, Goldfish) – Plant: Tree, Grass, Fruit • Modifiers – domestic, pet, Farmed, Draft, Food, Wild – Health: healthy, sick – Sex: Male, Female – Age: Adult, Child • Relations – eats – owns – parent-of – … • Definable – Carnivore – Herbivore – Child – Parent – Mother – Father – Food Animal – Draft Animal
Identify self-standing entities • Things that can exist on their own • People, animals, houses, actions, processes, … • Roughly nouns • Modifiers • Things that modify (“inhere in”) other things • Roughly adjectives and adverbs
Reorganise everything but “definable” things into pure trees – these will be the “primitives” • Self_standing: Living Thing – Animal: Mammal (Cat, Dog, Cow, Person, Pig); Fish (Carp, Goldfish) – Plant: Tree, Grass, Fruit • Modifiers – Domestication: Domestic, Wild – Use: Draft, Food, pet – Risk: Dangerous, Safe – Sex: Male, Female – Age: Adult, Child • Relations – eats – owns – parent-of – … • Definables – Carnivore – Herbivore – Child – Parent – Mother – Father – Food Animal – Draft Animal
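One property of a “pure tree” is that no primitive has more than one parent. The sketch below checks only that single-parent condition (it does not check for cycles). The edge list is one reading of the flattened slide hierarchy, and the function name is an assumption.

```python
from collections import Counter

# (child, parent) pairs: an assumed reading of the self-standing tree.
EDGES = [
    ("Living Thing", "Self_standing"),
    ("Animal", "Living Thing"), ("Plant", "Living Thing"),
    ("Mammal", "Animal"), ("Fish", "Animal"),
    ("Cat", "Mammal"), ("Dog", "Mammal"), ("Cow", "Mammal"),
    ("Person", "Mammal"), ("Pig", "Mammal"),
    ("Carp", "Fish"), ("Goldfish", "Fish"),
    ("Tree", "Plant"), ("Grass", "Plant"), ("Fruit", "Plant"),
]


def classes_with_multiple_parents(edges: list[tuple[str, str]]) -> list[str]:
    """Return, sorted, every class that appears as a child of 2+ parents."""
    parent_counts = Counter(child for child, _parent in set(edges))
    return sorted(cls for cls, n in parent_counts.items() if n > 1)


if __name__ == "__main__":
    print(classes_with_multiple_parents(EDGES))
    # Asserting Dog directly under a modifier would break the tree:
    print(classes_with_multiple_parents(EDGES + [("Dog", "pet")]))
```

Classes such as “Draft Animal” are deliberately left out of the tree: they are the definables, to be described through relations and modifiers rather than asserted as extra parents.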
Comments can help to clarify • Self_standing: Living Thing (comment: abstract ancestor concept including all living things – restrict to plants and animals for now) – Animal: Mammal (Cat, Dog, Cow, Person, Pig); Fish (Carp, Goldfish) – Plant: Tree, Grass, Fruit
Class inheritance • Classes are organized into subclass-superclass (or generalization-specialization) hierarchies: • Classes are “is-a” related if every instance of the subclass is also an instance of the superclass • Classes may be viewed as sets • The instances of a subclass form a subset of the instances of its superclass • Examples • Mammal is a subclass of Animal • Every penguin is a bird, i.e. every instance of Penguin (like Tweety) is an instance of Bird • Draft animal is a subclass of Animal
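The set view of classes can be sketched directly with Python sets. The instance names other than Tweety are made up, and the check below is only a necessary condition: a subset relation observed in one data set is consistent with a subclass axiom but does not prove it.

```python
# Hypothetical class extensions (sets of instance names).
ANIMAL = {"Tweety", "Daisy", "Rex"}
BIRD = {"Tweety"}
PENGUIN = {"Tweety"}
MAMMAL = {"Daisy", "Rex"}


def consistent_with_subclass(sub: set[str], sup: set[str]) -> bool:
    """True if every known instance of `sub` is also an instance of `sup`."""
    return sub <= sup


if __name__ == "__main__":
    print(consistent_with_subclass(PENGUIN, BIRD))   # Penguin is-a Bird
    print(consistent_with_subclass(MAMMAL, ANIMAL))  # Mammal is-a Animal
    print(consistent_with_subclass(ANIMAL, MAMMAL))  # not the other way round
```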
Levels in the class hierarchy • Different modes of development – Top-down: define the most general concepts first and then specialize them – Bottom-up: define the most specific concepts and then organize them in more general classes – Combination (typical): breadth at the top level and depth along a few branches to test design • Class inheritance is transitive – A is a subclass of B – B is a subclass of C – therefore A is a subclass of C
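Transitivity means that only direct subclass links need to be asserted; the rest can be derived. A minimal sketch, assuming a single-parent, acyclic hierarchy stored in a hypothetical dictionary:

```python
# Hypothetical direct subclass assertions: class -> its direct superclass.
DIRECT_SUPERCLASS = {
    "Carp": "Fish",
    "Fish": "Animal",
    "Animal": "Living Thing",
}


def superclasses(cls: str) -> list[str]:
    """Follow direct links upward; assumes one parent per class, no cycles."""
    result = []
    while cls in DIRECT_SUPERCLASS:
        cls = DIRECT_SUPERCLASS[cls]
        result.append(cls)
    return result


def is_subclass_of(sub: str, sup: str) -> bool:
    """Derived (transitive) subclass test."""
    return sup in superclasses(sub)


if __name__ == "__main__":
    print(superclasses("Carp"))
    print(is_subclass_of("Carp", "Living Thing"))
```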
Levels in the class hierarchy • Top level • Middle level • Bottom level
Define properties • Often interleaved with the previous step • Properties (or roles in DL) describe the attributes of the members of a class • The semantics of subClassOf demands that whenever A is a subclass of B, every property statement that holds for instances of B must also apply to instances of A • It makes sense to attach properties to the highest class in the hierarchy to which they apply
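The advice to attach a property to the highest class it applies to can be sketched as a lookup that walks up the hierarchy. The hierarchy, the property names (eats, hasFur) and the function name are assumptions for illustration.

```python
from typing import Optional

# Hypothetical hierarchy: class -> direct superclass (None at the root).
PARENT: dict[str, Optional[str]] = {
    "Animal": None,
    "Mammal": "Animal",
    "Cow": "Mammal",
}
# Each property is attached once, at the highest class where it applies.
PROPERTIES: dict[str, set[str]] = {
    "Animal": {"eats"},
    "Mammal": {"hasFur"},
}


def applicable_properties(cls: Optional[str]) -> set[str]:
    """Collect properties declared on the class and on all its ancestors."""
    props: set[str] = set()
    while cls is not None:
        props |= PROPERTIES.get(cls, set())
        cls = PARENT.get(cls)
    return props


if __name__ == "__main__":
    # Cow declares nothing itself but inherits from Mammal and Animal.
    print(sorted(applicable_properties("Cow")))
```

Declaring “eats” on Animal rather than separately on Cow, Cat and Dog keeps the statement in one place and lets every subclass pick it up.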
Define properties • Types of properties – “intrinsic” properties: flavor and color of wine – “extrinsic” properties: name and price of wine – parts: ingredients in a dish – relations to other objects: producer of wine (winery) • They are represented by data and object properties – simple (datatype) properties contain primitive values (strings, numbers) – complex (object) properties contain other objects (e.g., a winery instance)
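The difference between datatype and object properties can be sketched with plain Python dataclasses. This is an analogy rather than an ontology language: the class names, field names and sample values are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Winery:
    name: str  # datatype property: a primitive string value


@dataclass
class Wine:
    name: str         # datatype property ("extrinsic")
    price: float      # datatype property ("extrinsic")
    flavor: str       # datatype property ("intrinsic")
    producer: Winery  # object property: its value is another individual


if __name__ == "__main__":
    # Made-up individuals for illustration.
    winery = Winery(name="Example Winery")
    wine = Wine(name="Example Red", price=12.5, flavor="dry", producer=winery)
    print(wine.producer.name)
```

Following the producer property leads to another individual that has properties of its own, which is what distinguishes it from a primitive value such as the price.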