Introduction to Knowledge Representation and Formal Ontologies Early Career Faculty and Postdoctoral Training Shawn Bowers Genome Center University of California, Davis (sbowers@ucdavis.edu)
Our schedule … • 8:30 – 9:30 Introduction to knowledge representation • 9:30 – 10:30 Introduction to ontologies and concept mapping • 11:15 – 12:00 Concept mapping exercise • 12:00 – 1:15 LUNCH • 1:15 – 2:00 Exercise continued • 2:00 – 2:30 Report back • 2:30 – 2:45 BREAK • 2:45 – 5:00 Exercise continued (Protégé)
Outline • Preliminaries • Ontologies • Data and Knowledge • Do-It-Yourself Ontologies Part I • Do-It-Yourself Ontologies Part II
What is KR? • A sub-field of Computer Science (AI) and Logic • Deals with all aspects of storing “knowledge” • Codify a domain of interest • Leverage the specifications (inference/deduction) • Typical mechanisms to represent a domain: • Rules: if “x scored more than y” then “x wins” • Concepts: Baseball, Sport, Baseball is-a Sport • Properties: blue jerseyColor, mvp winner • Facts: “Mariners scored more than Padres”
Sets • Sets are collections of unique elements • {1, 2, 3} is a set • 1 ∈ {1, 2, 3}, 2 ∈ {1, 2, 3}, etc. • {1, 1} = {1} is a set • {} is the empty set • {{1, 2}, 3} is a set that contains the set {1, 2} • {1, 2} ⊆ {1, 2, 3} (subset relation) • Set Intersection (A ∩ B), Union (A ∪ B), Difference (A − B)
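The set operations on this slide map almost one-to-one onto Python's built-in set type. The following is only an illustrative sketch; the variable names are made up for the example.

```python
# Membership, uniqueness, and the empty set.
s = {1, 2, 3}
print(1 in s)            # membership test: is 1 an element of {1, 2, 3}?
print({1, 1} == {1})     # duplicates collapse, so both denote the same set
empty = set()            # note: a bare {} creates a dict in Python, not a set

# A set that contains another set: the inner set must be immutable (frozenset).
nested = {frozenset({1, 2}), 3}

# Subset relation.
print({1, 2} <= s)

# Intersection, union, difference.
a, b = {1, 2, 3}, {3, 4}
print(a & b, a | b, a - b)
```
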
Propositional Logic • A symbol P, Q, ... is either true or false • A symbol is a proposition – a statement that is either true or false • For example, P = “Socrates is a human” • Formulas (propositions) are built from operators: • P ∧ Q (true iff P and Q are true) • P ∨ Q (true iff either P is true or Q is true) • ¬P (true iff P is false) • P → Q (false iff P is true and Q is false) • P ↔ Q (same as (P → Q) ∧ (Q → P)) • ( ) (for precedence)
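One way to check the operator definitions is to print a truth table. This is a minimal sketch; `implies` and `iff` are helper names introduced here because Python has no built-in operators for → and ↔.

```python
from itertools import product

def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q

def iff(p, q):
    """Biconditional, defined as implication in both directions."""
    return implies(p, q) and implies(q, p)

# One row per truth assignment to P and Q.
rows = []
for p, q in product([True, False], repeat=2):
    rows.append((p, q, p and q, p or q, not p, implies(p, q), iff(p, q)))

print("P      Q      P∧Q    P∨Q    ¬P     P→Q    P↔Q")
for row in rows:
    print("  ".join(f"{str(v):5}" for v in row))
```
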
Propositional Logic: Building Sentences • “Mary is playing Baseball and it is raining” • B = Mary is playing Baseball • R = It is raining • B ∧ R • “If Mary is playing Baseball then it is raining” • B → R • “Mary is playing Baseball or Soccer and it is raining” • S = Mary is playing Soccer • (B ∨ S) ∧ R • “Mary only plays Baseball when it is raining” • B → R (this is the same as: ¬R → ¬B)
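The last sentence is the one people most often formalize backwards. A brute-force check over all truth assignments suggests that "only when it is raining" reads as B → R, that this agrees with its contrapositive ¬R → ¬B, and that it differs from the converse R → B. The helper names below are assumptions made for the sketch.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def equivalent(f, g, n_vars):
    """True if f and g agree under every assignment of n_vars truth values."""
    return all(f(*vals) == g(*vals)
               for vals in product([True, False], repeat=n_vars))

# B = "Mary is playing Baseball", R = "It is raining"
only_when_raining = lambda b, r: implies(b, r)          # B → R
contrapositive = lambda b, r: implies(not r, not b)     # ¬R → ¬B
converse = lambda b, r: implies(r, b)                   # R → B

print(equivalent(only_when_raining, contrapositive, 2))
print(equivalent(only_when_raining, converse, 2))
```
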
Propositional Logic: Formal Reasoning • Tautologies: formulas that are always true, e.g., A ∨ ¬A • Modus Ponens (an example inference rule): A → B, A ∴ B • 1. If Mary is playing baseball then it is raining. P → Q • 2. Mary is playing baseball. P • ∴ Therefore, it is raining. Q • Reasoning can be fully automated in propositional logic
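Because a formula with n symbols has only 2^n truth assignments, reasoning in propositional logic can be automated by enumeration. The sketch below is one simple (exponential-time) way to do it, not the method a real prover would use; `is_tautology` and `entails` are names chosen for this example.

```python
from itertools import product

def implies(p, q):
    return (not p) or q

def assignments(n_vars):
    return product([True, False], repeat=n_vars)

def is_tautology(formula, n_vars):
    """A tautology is true under every assignment."""
    return all(formula(*vals) for vals in assignments(n_vars))

def entails(premises, conclusion, n_vars):
    """The premises entail the conclusion if the conclusion holds in
    every assignment that makes all premises true."""
    return all(conclusion(*vals)
               for vals in assignments(n_vars)
               if all(p(*vals) for p in premises))

# A ∨ ¬A is a tautology.
excluded_middle = is_tautology(lambda a: a or not a, 1)

# Modus ponens: from P → Q and P, conclude Q.
modus_ponens = entails([lambda p, q: implies(p, q), lambda p, q: p],
                       lambda p, q: q, 2)

# A non-rule for contrast: from P → Q and Q, P does not follow.
affirming_consequent = entails([lambda p, q: implies(p, q), lambda p, q: q],
                               lambda p, q: p, 2)

print(excluded_middle, modus_ponens, affirming_consequent)
```
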
Propositional Logic: Exercise • With a partner, formalize these sentences: • Red wine is produced from red grapes • White wine is produced from red or white grapes • Good wines age well • A good wine is an expensive wine and an expensive wine is a good wine • Inexpensive wines don’t age well • … (Make up your own sentence)
Predicate Logic • First-Order Predicate Logic • Can further describe sentences / propositions • More expressive (predicates, objects) • Can describe systems (models/domains) • A predicate describes a property of an object • person(mary) “mary is a person” • plays(mary, baseball) “mary plays baseball” • Unary predicates classify objects, binary and beyond relate objects
Predicate Logic • Quantifiers (existential, universal) • ∃x person(x) ∧ plays(x, baseball) “there exists some person that plays baseball” • ∀x person(x) → plays(x, baseball) “every person plays baseball” • ∀x person(x) → ∃y plays(x, y) ∧ sport(y) “every person plays a sport” • Reasoning • 1. Socrates is human: human(socrates) • 2. All humans are mortal: ∀x human(x) → mortal(x) • ∴ Socrates is mortal: mortal(socrates) • Reasoning cannot be fully automated in first-order predicate logic (validity is undecidable in general)
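Over a small finite domain, the quantified formulas can be evaluated directly with `any` (∃) and `all` (∀). This only checks the formulas against one particular model; it is not general first-order reasoning. The individuals and facts below are invented for the illustration.

```python
# A tiny, made-up model: who is a person, what is a sport, who plays what.
people = {"mary", "john"}
sports = {"baseball", "soccer"}
plays = {("mary", "baseball"), ("john", "soccer")}

# ∃x person(x) ∧ plays(x, baseball)
some_person_plays_baseball = any((x, "baseball") in plays for x in people)

# ∀x person(x) → plays(x, baseball)
# Ranging over people only is enough: non-persons satisfy the implication trivially.
every_person_plays_baseball = all((x, "baseball") in plays for x in people)

# ∀x person(x) → ∃y plays(x, y) ∧ sport(y)
every_person_plays_a_sport = all(
    any((x, y) in plays for y in sports) for x in people
)

print(some_person_plays_baseball,
      every_person_plays_baseball,
      every_person_plays_a_sport)
```
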
Predicate Logic • What is the difference between: • ∀x human(x) → mortal(x) • ∀x human(x) ∧ mortal(x)
Predicate Logic • What is the difference between: • ∀x human(x) → mortal(x) “All humans are mortal” • ∀x human(x) ∧ mortal(x) “Everything is mortal and human”
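The difference shows up as soon as the domain contains something that is not human. In this sketch the domain, including the "rock", is an assumption made for the example.

```python
# A made-up domain with two humans and one non-human object.
domain = {"socrates", "plato", "rock"}
human = {"socrates", "plato"}
mortal = {"socrates", "plato"}

# ∀x human(x) → mortal(x): non-humans satisfy the implication vacuously.
all_humans_are_mortal = all((x not in human) or (x in mortal) for x in domain)

# ∀x human(x) ∧ mortal(x): fails because the rock is neither.
everything_is_human_and_mortal = all(
    (x in human) and (x in mortal) for x in domain
)

print(all_humans_are_mortal, everything_is_human_and_mortal)
```
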
Predicate Logic: Exercise • With a partner, formalize these sentences: • Riesling is a wine • All wines are potable liquids • The color of wine is either red, white, or rose • Some wines have a strong flavor • There are no wines that have a light body and a strong flavor • … (Make up your own sentence)
Outline • Preliminaries • Ontologies • Data and Knowledge • Do-It-Yourself Ontologies Part I • Do-It-Yourself Ontologies Part II
Knowledge Representation: Ontologies • What is KR? • Codify a domain of interest • Some part of the “real or abstract world” • Leverage the specifications (inference/deduction) • An ontology: Specifies a theory (a “model”) … by defining and relating general concepts … representing features of the world (a domain)
Concepts, Symbols, and Things • Humans use symbols (e.g., words) to communicate • Words are mapped to things indirectly through concepts that denote (refer to) things (Diagram labels: Concept, “Jaguar”) Ogden, C. K. & Richards, I. A. 1923. "The Meaning of Meaning." 8th Ed. New York, Harcourt, Brace & World, Inc. [Carole Goble, Nigel Shadbolt]
Concepts, Symbols, and Things • Symbols and concepts are imprecise • The same symbol can stand for multiple things • The same thing can have multiple symbols • Concepts are usually not well-defined
Concepts, Symbols, and Things • An ontology attempts to define and relate specific concepts for certain sets of things via agreed-upon symbols
What are ontologies? Ontologies are typically created to: • Commit to a definition (a model) of a domain • Explicitly state assumptions concerning the definition • Have a wide scope (be general) • Support exchange and integration of heterogeneous data sources and applications (more on this later…)
What are ontologies? Ontologies may be expressed • Informally using natural language (e.g., in philosophy and sometimes biology) • Formally using a mathematical language, e.g., first-order logic (or a fragment) • We focus on formal ontologies • To be precise about what the theory proposes • To enable automated processing
What are ontologies? Formal ontologies can vary in detail (listed in order of increasing expressiveness) • Controlled Vocabulary (list of terms) • Simple Thesaurus (synonyms) • Thesaurus (broader/narrower terms) • Classification (class, instance, is-a, maybe part-of) • Classification (value, cardinality constraints) • Classification (axioms such as disjoint, union, etc.) • Classification (general logic constraints)
What are ontologies? • A conceptualization proposes a theory of the domain of interest • An ontology is a (possibly incomplete) representation of the conceptualization (Diagram labels: set of all theories that can be expressed in the language; ontology; conceptualization (of the theory)) [Guarino]
Class, Instance, and Is-a • “Mulac is a Jaguar”: Jaguar(Mulac) • Mulac instance-of Jaguar • “Mulac” names an instance in the set “Jaguar”, the set of things (instances) denoted by the class Jaguar
Class, Instance, and Is-a • “Every Jaguar is a kind of Animal”: ∀x Jaguar(x) → Animal(x) • Jaguar is-a Animal • The set of things (instances) denoted by the class Jaguar lies inside the set of things (instances) denoted by the class Animal
Defined Properties • Carnivores can be related to other animals through the “eats” property • ∀x,y eats(x, y) ∧ Carnivore(x) → Animal(y) “if x eats y and x is a carnivore then y is an animal” (Diagram: Jaguar is-a Carnivore is-a Animal; Carnivore eats Animal) • NOTE: Is-a is also a property that relates classes, but it has a specific, formal interpretation!
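The formal interpretation of is-a (subset, and therefore transitive) is what lets a reasoner infer class memberships that were never stated. The sketch below walks the is-a links from these slides to find every class an instance belongs to; the dictionary layout and function names are assumptions made for the example.

```python
# Direct is-a links from the slides: Jaguar is-a Carnivore, Carnivore is-a Animal.
IS_A = {"Jaguar": {"Carnivore"}, "Carnivore": {"Animal"}, "Animal": set()}

# Stated instance-of facts.
INSTANCE_OF = {"Mulac": "Jaguar"}

def superclasses(cls):
    """All classes reachable from cls by following is-a links."""
    found = set()
    stack = [cls]
    while stack:
        for parent in IS_A.get(stack.pop(), set()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def classes_of(instance):
    """The stated class of an instance plus everything inferred via is-a."""
    direct = INSTANCE_OF[instance]
    return {direct} | superclasses(direct)

print(sorted(classes_of("Mulac")))
```

Only "Mulac is a Jaguar" is stated; membership in Carnivore and Animal follows from the is-a links.
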
Value Restrictions • We are restricting the eats relationship for Carnivores to only Animals (we are saying that carnivores only eat animals …) (Diagram: Jaguar is-a Carnivore is-a Animal; Carnivore eats Animal)
Value Restrictions • Jaguars restrict the eats relationship to Marsh Deer • ∀x,y eats(x, y) ∧ Jaguar(x) → MarshDeer(y) (Diagram: Carnivore and Herbivore under Animal; Jaguar under Carnivore; Marsh Deer under Herbivore; Carnivore eats Animal; Jaguar eats Marsh Deer) • QUESTION: Does this definition violate the other eats property?
Value Restrictions • The formulas … 1. ∀x Carnivore(x) → Animal(x) 2. ∀x Jaguar(x) → Carnivore(x) 3. ∀x Herbivore(x) → Animal(x) 4. ∀x MarshDeer(x) → Herbivore(x) 5. ∀x,y eats(x, y) ∧ Carnivore(x) → Animal(y) 6. ∀x,y eats(x, y) ∧ Jaguar(x) → MarshDeer(y) • If x is a Jaguar, from (2) we know x is a Carnivore • If y is a MarshDeer, from (4) we know y is a Herbivore, and if y is a Herbivore, from (3) we know y is an Animal • So anything a Jaguar eats under (6) is an Animal, which is what (5) requires: the two eats restrictions do not conflict
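The argument on this slide amounts to two subsumption checks: the restricted class (Jaguar) sits under the class that carries the inherited restriction (Carnivore), and the narrower value class (MarshDeer) sits under the inherited value class (Animal). A minimal sketch of that check follows; the function names and the "Grass" counterexample are invented for illustration.

```python
# Formulas 1-4 as direct is-a links (child -> parent).
IS_A = {
    "Carnivore": "Animal",
    "Jaguar": "Carnivore",
    "Herbivore": "Animal",
    "MarshDeer": "Herbivore",
}

def subsumed_by(cls, ancestor):
    """True if cls equals ancestor or reaches it by following is-a links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = IS_A.get(cls)
    return False

def narrows(sub_cls, sub_range, sup_cls, sup_range):
    """True if a restriction (sub_cls eats only sub_range) is guaranteed by the
    hierarchy to respect an inherited one (sup_cls eats only sup_range)."""
    return subsumed_by(sub_cls, sup_cls) and subsumed_by(sub_range, sup_range)

# Formula 6 against formula 5.
print(narrows("Jaguar", "MarshDeer", "Carnivore", "Animal"))

# A hypothetical restriction to a class outside Animal is not guaranteed to fit.
print(narrows("Jaguar", "Grass", "Carnivore", "Animal"))
```
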
Value Restrictions • ∀x,y eats(x, y) ∧ Jaguar(x) → (MarshDeer(y) ∨ Peccary(y)) (Diagram: Animal, Carnivore, Herbivore, Marsh Deer, Peccary, Jaguar; Carnivore eats Animal; Jaguar eats Marsh Deer and Peccary)
Value Restrictions • QUESTION: Does anyone see a potential problem with this choice of representation? (Diagram: Animal, Carnivore, Herbivore, Marsh Deer, Peccary, Jaguar; Carnivore eats Animal; Jaguar eats Marsh Deer and Peccary)
Value Restrictions • These different representations propose the same basic underlying theory (Diagram: Animal, Herbivore, Carnivore, JaguarFood, Marsh Deer, Peccary, Jaguar; Carnivore eats Animal; Jaguar eats JaguarFood)
Cardinality Constraints • Restrict the number of participants in a property • Here we say a Carnivore must eat at least one Animal (cardinality 1:n on eats) (Diagram: Jaguar is-a Carnivore is-a Animal; Carnivore eats Animal, 1:n)
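Read as a data-validation rule, the 1:n constraint can be checked by counting eats facts per carnivore. The sketch below treats the fact base as complete (a closed-world reading); under the open-world reading used by logic-based ontology languages such as OWL, a missing fact would not count as a violation. All individuals other than Mulac are made up.

```python
# A small, made-up fact base.
animals = {"Mulac", "Leo", "Bambi"}
carnivores = {"Mulac", "Leo"}
eats = {("Mulac", "Bambi")}

def min_cardinality_violations(min_count=1):
    """Carnivores with fewer than min_count recorded eats facts
    whose object is an Animal."""
    violating = set()
    for c in carnivores:
        count = sum(1 for (x, y) in eats if x == c and y in animals)
        if count < min_count:
            violating.add(c)
    return violating

print(min_cardinality_violations())
```
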
Basic Properties • Defines a primitive “characteristic” of the concept • We sometimes call these “properties” (datatype properties) • And those between concepts “relationships” (object properties) • The cardinality constraint states that Jaguars have exactly one spotCount property (Diagram: Jaguar spotCount int, 1:1)
Basic Properties • We can restrict values of (datatype) properties • Is our Jaguar-friend Mulac a FiveSpottedJaguar? • Depends … (Diagram: Jaguar spotCount int, 1:1; FiveSpottedJaguar spotCount 5; Mulac spotCount 5)
What are ontologies? • An (informal) ontology of wine: • Wines are potable liquids made by wineries within regions and with specific vintages • Wines are characterized by the type of grape they are made with, their color (white, rose, red), their sugar (dry, offdry, or sweet), their body (light, medium, full), and their flavor (delicate, moderate, strong) • Sauvignon Blanc, Merlot, Pinot Noir, and Riesling are types of wines [OWL Guide]
Exercise • With a partner, take 10 minutes and try to define a “formal” ontology for the wine example • Select two or three classes • Identify some relationships between them • List any constraints (cardinality or value restrictions) that exist between them
Why ontologies? (Philosophy) An ontological theory can answer “ontological” questions • Is Merlot a potable liquid? • Are there wines made of things other than grapes? • How are Pinot Gris and Pinot Noir related? • Are there white wines that are dry, full, and strong made in Napa Valley? There are more uses … [Bunge]
Outline • Preliminaries • Ontologies • Data and Knowledge • Do-It-Yourself Ontologies Part I • Do-It-Yourself Ontologies Part II
Ontologies and Data Management • How do ontologies fit with data management? • Ontologies are kind of similar to conceptual schemas (e.g., E-R diagrams / DB designs …) • But really they serve different purposes: • Developed independently of a particular application • Often given in a different language • Inherently more general and definitional • Usually not very good schemas • Designed to help support interchange …
Information Architectures (Diagram labels: Ontology; “use concepts from (explicitly or implicitly)”; Design Artifact; Conceptual Model; Schema; Metadata; Data)
Single-world scenarios … • Ecology • Specific Project (LTER project xyz) • Scientific Question (Plant Productivity) • Experiment Design (Nitrogen Fertilization over specific plots) • Actual Experiments (Different times, plots, samples, etc.) • Data Collected • The data/schemas are generally uniform: same assumptions, same schemas, …
Multi-world scenarios … • Ecology: Project 1, Project 2, Project 3 • Integrating across different experiments (e.g., different treatments, plots, species, methods) requires extra knowledge … • Ontologies can provide the “glue” to resolve differences • To understand the assumptions • To relate the treatments, methods, and so on …
Benefits of ontologies • Ontologies are often developed within a community and are interdisciplinary • Explicitly capture “knowledge” about a domain • Standard vocabulary • Allow “sentences” over that vocabulary • Enable advanced searching techniques (via reasoning) • Enable exchange and integration
Benefits of ontologies Ontologies for metadata keywords {sonoma county, wine} {cabernet sauvignon, sonoma county, …} {medium, red, dry, …}
Benefits of ontologies • Find information about dry California red wines • {sonoma region, wine} {cabernet sauvignon, sonoma region, …} {medium, red, dry, …} • We use the ontology to “expand” the query: cabernet sauvignon is red and dry; sonoma valley is in California
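Query expansion of this kind can be sketched in a few lines: each query term is widened to include narrower terms that imply it, and a resource matches when every widened term hits one of its keywords. The tiny `IMPLIES` table, the resource names, and the simplified keyword sets below are assumptions for illustration (including the added fact that cabernet sauvignon is a wine); a real system would draw these from the ontology and follow implications transitively.

```python
# Hypothetical ontology fragment: each term implies the broader terms listed.
IMPLIES = {
    "cabernet sauvignon": {"red", "dry", "wine"},
    "sonoma county": {"california"},
}

# Hypothetical resources described by metadata keywords.
RESOURCES = {
    "A": {"sonoma county", "wine"},
    "B": {"cabernet sauvignon", "sonoma county"},
    "C": {"medium", "red", "dry"},
}

def expand(term):
    """The term itself plus every term that implies it (one level only)."""
    return {term} | {k for k, broader in IMPLIES.items() if term in broader}

def matches(keywords, query):
    """A resource matches if each query term, after expansion, hits a keyword."""
    return all(expand(term) & keywords for term in query)

query = {"dry", "california", "red", "wine"}
hits = sorted(name for name, kws in RESOURCES.items() if matches(kws, query))
print(hits)
```

Without expansion no resource contains all four query terms; with it, resource B matches because its keywords imply each of them.
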