Semantic Mediation, Ontologies and Scientific Workflows and all the rest (+/– Web Services). Bertram Ludäscher Knowledge-Based Information Systems Lab San Diego Supercomputer Center University of California San Diego. http://seek.ecoinformatics.org. http://www.geongrid.org. Outline.
Outline • Motivation (SEEK, GEON, ..) • Ontologies 101 • Semantic Mediation, Data Registration, … • Application Examples (Stargazing with Kepler…)
Ilkay Altintas SDM Chad Berkley SEEK Shawn Bowers SEEK Jeffrey Grethe BIRN Christopher H. Brooks Ptolemy II Zhengang Cheng SDM Efrat Jaeger GEON Matt Jones SEEK Edward A. Lee Ptolemy II Kai Lin GEON Ashraf Memon GEON Bertram Ludaescher BIRN, GEON, SDM, SEEK Steve Mock NMI Steve Neuendorffer Ptolemy II Mladen Vouk SDM Yang Zhao Ptolemy II … Kepler Team, Projects, Sponsors Ptolemy II
SEEK Science Environment for Ecological Knowledge • EcoGrid • Uniform interfaces to manage environmental data • Kepler • Modeling scientific workflows • Semantic Mediation System • “Smart” data discovery and integration • Knowledge Representation (SEEK-KR) • Classification and Nomenclature (SEEK-TAXON) • Biodiversity and Ecological Analysis and Modeling (SEEK-BEAM)
Building the EcoGrid • LTER Network (24) • Natural History Collections (>> 100) • Organization of Biological Field Stations (180) • UC Natural Reserve System (36) • Partnership for Interdisciplinary Studies of Coastal Oceans (4) • Multi-agency Rocky Intertidal Network (60) [Figure: map of EcoGrid nodes; site labels: LUQ, AND, HBR, VCR, NTL; legend: Metacat node, SRB node, VegBank node, DiGIR node, Xanthoria node, Legacy system]
Heterogeneous Data integration • Requires advanced metadata and processing • Attributes must be semantically typed • Collection protocols must be known • Units and measurement scale must be known • Measurement relationships must be known • e.g., that ArealDensity=Count/Area
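The ArealDensity example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Measurement` class, its field names, and the unit strings are hypothetical and are not part of SEEK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A value labeled with a semantic type and a unit (hypothetical structure)."""
    semantic_type: str  # e.g. "Count", "Area", "ArealDensity"
    value: float
    unit: str


def areal_density(count: Measurement, area: Measurement) -> Measurement:
    """Derive ArealDensity = Count / Area, checking the semantic types first."""
    if count.semantic_type != "Count" or area.semantic_type != "Area":
        raise TypeError("expected a Count and an Area")
    if area.value <= 0:
        raise ValueError("area must be positive")
    return Measurement(
        "ArealDensity",
        count.value / area.value,
        f"{count.unit}/{area.unit}",
    )


# Assumed sample data: 120 individuals counted on a 40 m^2 plot.
density = areal_density(
    Measurement("Count", 120, "individuals"),
    Measurement("Area", 40, "m^2"),
)
print(density)
```

Because the attributes carry semantic types, the derivation can refuse inputs that are not a Count and an Area instead of silently dividing unrelated columns.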
Semantic Mediation • Label data with semantic types • Label inputs and outputs of analytical components with semantic types • Use reasoning engines to generate transformation steps • Beware analytical constraints • Use reasoning engine to discover relevant components Data Ontology Workflow Components
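As a rough sketch of component discovery, the Python below matches a semantically typed dataset against the declared input types of workflow components by walking an is-a hierarchy. The concept names, component names, and the `IS_A` table are all assumptions made up for illustration; a real system would delegate this step to a reasoning engine.

```python
# Hypothetical is-a hierarchy: child concept -> parent concept.
IS_A = {
    "ArealDensity": "Density",
    "Density": "Measurement",
    "Count": "Measurement",
    "Biomass": "Measurement",
}

# Hypothetical workflow components, each labeled with the semantic type it accepts.
COMPONENT_INPUT = {
    "DensityPlot": "Density",
    "GenericSummary": "Measurement",
    "BiomassModel": "Biomass",
}


def is_subtype(concept, ancestor):
    """True if concept equals ancestor or reaches it by following is-a links."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def find_components(data_type):
    """Return the components whose input type subsumes the data's semantic type."""
    return [
        name
        for name, accepted in COMPONENT_INPUT.items()
        if is_subtype(data_type, accepted)
    ]


print(find_components("ArealDensity"))
```

Here data labeled ArealDensity is offered to both the density-specific and the generic component, but not to the biomass model.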
Ecological ontologies • What was measured (e.g., biomass) • Type of measurement (e.g., Energy) • Context of measurement (e.g., Psychotria limonensis) • How it was measured (e.g., dry weight) • SEEK intends to enable community-created ecological ontologies using OWL • Represents a controlled vocabulary for ecological metadata • More about this in Bertram’s talk
Ontologies 101 (based on a tutorial by Shawn Bowers and CSE291) • Ontologies basics • Ontologies and data management • Benefits of ontologies • Constructing ontologies • Breakout Exercises
What are ontologies? It depends on who you ask We focus on the data-management view Generally speaking, an ontology specifies a theory (a model) by … defining and relating …generic concepts representing features of the real or abstract world (a domain of interest) [Bunge]
Concepts, Symbols, and Things • Humans use symbols (e.g., words) to communicate • Words are mapped to things indirectly through concepts that denote (refer to) things Concept “Jaguar” Ogden, C. K. & Richards, I. A. 1923. "The Meaning of Meaning." 8th Ed. New York, Harcourt, Brace & World, Inc [Carole Goble, Nigel Shadbolt]
Concepts, Symbols, and Things Symbols and concepts are not precise • The same symbol can stand for multiple things • The same thing can have multiple symbols • Concepts are usually not well-defined Concept “Jaguar” Ogden, C. K. & Richards, I. A. 1923. "The Meaning of Meaning." 8th Ed. New York, Harcourt, Brace & World, Inc [Carole Goble, Nigel Shadbolt]
Concepts, Symbols, and Things An ontology attempts to define and relate specific concepts for certain sets of things via agreed upon symbols Concept “Jaguar” Ogden, C. K. & Richards, I. A. 1923. "The Meaning of Meaning." 8th Ed. New York, Harcourt, Brace & World, Inc
What are ontologies? Ontologies are typically created to: Commit to a definition (a model) of a domain Explicitly state assumptions concerning the definition Have a wide scope (be general) Support exchange and integration of heterogeneous data sources and applications (more on this later…)
What are ontologies? Ontologies may be expressed Informally using natural language (e.g., in philosophy and sometimes biology) Formally using a mathematical language, e.g., first-order logic We focus on formal ontologies To be precise about what the theory proposes
What are ontologies? Formal ontologies can vary in detail Controlled Vocabulary (list of terms) Simple Thesaurus (synonyms) Thesaurus (broader/narrower terms) Classification (class, instance, is-a, maybe part-of) Classification (value, cardinality constraints) Classification (axioms such as disjoint, union, etc.) Classification (general logic constraints)
What are ontologies? Formal ontologies can vary in detail (listed in order of increasing expressiveness) • Controlled Vocabulary (list of terms) • Simple Thesaurus (synonyms) • Thesaurus (broader/narrower terms) • Classification (class, instance, is-a, maybe part-of) • Classification (value, cardinality constraints) • Classification (axioms such as disjoint, union, etc.) • Classification (general logic constraints)
Class, Instance, and Is-a Animal “Every Jaguar is an Animal” ∀x . Jaguar(x) → Animal(x) is-a Jaguar Set of things (instances) denoted by the class Animal Set of things (instances) denoted by the class Jaguar
Properties and Cardinality Constraints Animal is-a eats Carnivore A cardinality constraint might state that carnivores must eat at least one Animal Question: Must Jaguars eat at least one Animal? is-a Jaguar
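One way to think about the question is that constraints stated on a class apply to all of its subclasses through is-a. The Python sketch below illustrates that reading; the dictionaries and function names are hypothetical and stand in for what an ontology reasoner would do.

```python
# is-a links from the slide: Jaguar is-a Carnivore is-a Animal.
IS_A = {"Jaguar": "Carnivore", "Carnivore": "Animal"}

# Declared constraint: carnivores must eat at least one Animal.
MIN_CARDINALITY = {("Carnivore", "eats"): 1}


def ancestors(cls):
    """Return cls followed by all of its superclasses, nearest first."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = IS_A.get(cls)
    return chain


def min_cardinality(cls, prop):
    """Strongest minimum cardinality declared on cls or inherited from a superclass."""
    return max(MIN_CARDINALITY.get((c, prop), 0) for c in ancestors(cls))


def satisfies(cls, prop, values):
    """Check an instance's property values against the inherited constraint."""
    return len(values) >= min_cardinality(cls, prop)


print(min_cardinality("Jaguar", "eats"))
print(satisfies("Jaguar", "eats", []))
```

Under this reading a Jaguar inherits the constraint from Carnivore, while Animal itself carries no such requirement.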
Value Restrictions Animal is-a eats Carnivore A value restriction for Jaguar might restrict the eats property to the specific animals eaten by Jaguars is-a Jaguar
Value Restrictions Jaguars restrict the eats relationship to Marsh Deer, … Animal eats Carnivore Herbivore eats Marsh Deer Jaguar
Value Restrictions Does anyone see a problem with this choice of representation? Animal eats Carnivore Herbivore eats Marsh Deer Jaguar
Value Restrictions These different representations propose the same basic underlying theory Animal eats Herbivore Carnivore JaguarFood Marsh Deer Peccary Jaguar eats
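A value restriction of the JaguarFood kind can be sketched as a check on the class of each value. In the Python below, placing JaguarFood directly under Animal and adding a Rabbit class are assumptions made only to have something to test; the flattened diagram does not settle either point.

```python
# Assumed is-a hierarchy (Rabbit is a made-up class for the negative case).
IS_A = {
    "MarshDeer": "JaguarFood",
    "Peccary": "JaguarFood",
    "JaguarFood": "Animal",
    "Rabbit": "Herbivore",
    "Herbivore": "Animal",
    "Jaguar": "Carnivore",
    "Carnivore": "Animal",
}

# Value restrictions: (class, property) -> class that every value must belong to.
ALL_VALUES_FROM = {
    ("Carnivore", "eats"): "Animal",
    ("Jaguar", "eats"): "JaguarFood",
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or reaches it by following is-a links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = IS_A.get(cls)
    return False


def allowed(cls, prop, value_cls):
    """True if value_cls meets every restriction cls declares or inherits."""
    current = cls
    while current is not None:
        required = ALL_VALUES_FROM.get((current, prop))
        if required is not None and not is_a(value_cls, required):
            return False
        current = IS_A.get(current)
    return True


print(allowed("Jaguar", "eats", "Peccary"))
print(allowed("Jaguar", "eats", "Rabbit"))
```

The general Carnivore restriction still admits Rabbit, while the narrower Jaguar restriction does not.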
What are ontologies? An (informal) ontology of wine: Wines are potable liquids made by wineries within regions and with specific vintages Wines are characterized by the type of grape they are made with, their color (white, rose, red), their sugar (dry, offdry, or sweet), their body (light, medium, full), and their flavor (delicate, moderate, strong) Sauvignon Blanc, Merlot, Pinot Noir, and Riesling are types of wines [OWL Guide]
Exercise With a partner, take 5 minutes and try to define a “formal” ontology for the wine example • Select two or three classes • Identify some relationships between them • List any constraints (cardinality or value restrictions) that exist between them
What are ontologies? (Philosophy) An ontological theory can answer “ontological” questions • Is Merlot a potable liquid? • Are there wines made of things other than grapes? • How are Pinot Gris and Pinot Noir related? • Are there white wines that are dry, full, and strong made in Napa Valley? We will look at other uses later [Bunge]
Outline • Ontologies basics • Ontologies and data management • Benefits of using ontologies • Constructing ontologies • Breakout Exercises
Ontologies and Data Management Where do ontologies fit within data management architectures? There is no specific answer to this question… However, an ontology is similar to a schema or conceptual model if one exists, but is • Developed independently of a particular application • Probably given in a different language • Inherently more general • Usually not a very good schema (weak structure)
Ontologies and Data Management (watch out for Semantic Data Registration later) Ontology use concepts from (explicitly or implicitly) Design Artifact Conceptual Model Conceptual Model Schema Schema Schema Schema Metadata Data
Outline • Ontologies basics • Ontologies and data management • Benefits of ontologies • Constructing ontologies • Breakout Exercises
Benefits of ontologies Ontologies are often developed within a community and are interdisciplinary Explicitly capture “knowledge” about a domain • Standard terms (symbols) for metadata values and schema design • Enables advanced searching techniques (via reasoning) • Enables exchange and integration
Benefits of ontologies Ontologies for metadata keywords {sonoma county, wine} {cabernet sauvignon, sonoma county, …} {medium, red, dry, …}
Benefits of ontologies Ontologies for metadata keywords Find information about dry California red wines {sonoma region, wine} {cabernet sauvignon, sonoma region, …} {medium, red, dry, …} We use the ontology to “expand” and/or “focus” the query, e.g., that cabernet sauvignon is red and dry; sonoma valley is in california
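A minimal sketch of the "expand" step, assuming the ontology can be flattened into a table of implied keywords. The `IMPLIES` table, the dataset names A, B, C, and the choice to expand the dataset keywords rather than the query are all illustrative assumptions.

```python
# Assumed facts drawn from the slide: cabernet sauvignon is a red, dry wine;
# the sonoma region is in california.
IMPLIES = {
    "cabernet sauvignon": {"wine", "red", "dry"},
    "sonoma region": {"california"},
}

# Keyword sets from the slide (the elided "…" entries are left out).
DATASETS = {
    "A": {"sonoma region", "wine"},
    "B": {"cabernet sauvignon", "sonoma region"},
    "C": {"medium", "red", "dry"},
}


def expand(keywords):
    """Add every keyword implied, directly or transitively, by the given ones."""
    expanded = set(keywords)
    frontier = list(keywords)
    while frontier:
        term = frontier.pop()
        for implied in IMPLIES.get(term, ()):
            if implied not in expanded:
                expanded.add(implied)
                frontier.append(implied)
    return expanded


def search(query):
    """Return datasets whose expanded keywords cover every query term."""
    return [name for name, kw in DATASETS.items() if query <= expand(kw)]


matches = search({"dry", "california", "red", "wine"})
print(matches)
```

Without the expansion none of the three keyword sets contains all four query terms; with it, the dataset tagged with cabernet sauvignon and sonoma region matches.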
Benefits of ontologies Dataset (wines by regions) What regional characteristics produce the best-selling wines? Dataset (wine sales) Integrate Analysis Dataset (region characteristics) Integration can be extremely complex due to structural (schema and values) and semantic (ontological) differences Ontologies can help!
Benefits of ontologies Dataset (wines by regions) What regional characteristics produce the best-selling wines? Dataset (wine sales) Integrate Analysis Dataset (region characteristics) Registering datasets with ontologies Map structure (schema) to concepts Map data to classes/instances (various ways to do this…) Provides a uniform view of disparate sources
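Registration can be pictured as a mapping from each dataset's column names to ontology concepts. The sketch below uses invented column and concept names for the three wine datasets; it shows only the "map structure to concepts" step, not the mapping of data values to classes or instances.

```python
# Hypothetical registrations: dataset -> {column name -> ontology concept}.
REGISTRATION = {
    "wine_sales": {"wine_name": "Wine", "units_sold": "SalesVolume"},
    "wines_by_region": {"label": "Wine", "area": "Region"},
    "region_characteristics": {"region": "Region", "soil": "SoilType"},
}


def uniform_view(dataset, rows):
    """Rename each row's columns to the concepts they are registered to."""
    mapping = REGISTRATION[dataset]
    return [{mapping[col]: val for col, val in row.items()} for row in rows]


def shared_concepts(a, b):
    """Concepts that two registered datasets have in common (join candidates)."""
    return set(REGISTRATION[a].values()) & set(REGISTRATION[b].values())


# Made-up sample row, used only to show the renaming.
view = uniform_view("wines_by_region", [{"label": "Merlot", "area": "Sonoma"}])
print(view)
print(shared_concepts("wine_sales", "wines_by_region"))
```

Once the columns are expressed in shared concepts, the differently named `wine_name` and `label` columns become visible as the same thing, which is what makes the integration step tractable.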
Outline • Ontologies basics • Ontologies and data management • Benefits of ontologies • Constructing ontologies • Breakout Exercises
Constructing ontologies Various Web-based standards are emerging for defining ontologies • XML Schema • Mainly for defining “vocabularies” and less-formal ontologies (term-based is-a, some constraints) • Mainly a structural/schema representation • Topic Maps • For advanced thesauri, subject indexes • RDF/RDFS/OWL • Formal ontologies based on description logics (a variant of first-order logic) and semantic networks (more informal)
Resource Description Framework (RDF) Simple data model that consists of • Resources (uniquely identified via URIs) • Properties • Values (resources or character strings) Data organized into triples (subject, property, value) locatedIn CaliforniaRegion SonomaRegion Property (Resource) Subject (Resource) Value (Resource) locatedIn(SonomaRegion, California)
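The triple model is simple enough to mimic with plain Python tuples. This is a toy sketch, not an RDF library: URIs are abbreviated to short names, and the NapaRegion triple is a made-up addition so that the pattern query returns more than one result.

```python
# Each triple is (subject, property, value).
TRIPLES = {
    ("SonomaRegion", "locatedIn", "CaliforniaRegion"),
    ("NapaRegion", "locatedIn", "CaliforniaRegion"),  # hypothetical extra triple
}


def match(subject=None, prop=None, value=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, v)
        for s, p, v in TRIPLES
        if (subject is None or s == subject)
        and (prop is None or p == prop)
        and (value is None or v == value)
    )


in_california = match(prop="locatedIn", value="CaliforniaRegion")
print([s for s, _, _ in in_california])
```

Every query against such a store reduces to matching a triple pattern with some positions left open.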
RDF Schema Adds a set of pre-defined properties to define classes and properties Allows instances to be connected to classes Sub-class and sub-property (is-a) relationships Region is a class locatedIn is a property locatedIn connects Regions locatedIn Region rdf:type rdf:type locatedIn CaliforniaRegion SonomaRegion
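Reading "locatedIn connects Regions" as a domain and range declaration, the rdf:type links in the diagram can be derived rather than stated. The sketch below applies the RDFS domain and range rules to plain tuples; it assumes at most one domain and one range per property, which real RDFS does not require.

```python
# Schema and data as (subject, property, value) triples with abbreviated names.
TRIPLES = {
    ("locatedIn", "rdfs:domain", "Region"),
    ("locatedIn", "rdfs:range", "Region"),
    ("SonomaRegion", "locatedIn", "CaliforniaRegion"),
}


def infer_types(triples):
    """Derive rdf:type triples from rdfs:domain and rdfs:range declarations."""
    domains = {s: o for s, p, o in triples if p == "rdfs:domain"}
    ranges = {s: o for s, p, o in triples if p == "rdfs:range"}
    inferred = set()
    for s, p, o in triples:
        if p in domains:
            # The subject of a property belongs to the property's domain class.
            inferred.add((s, "rdf:type", domains[p]))
        if p in ranges:
            # The value of a property belongs to the property's range class.
            inferred.add((o, "rdf:type", ranges[p]))
    return inferred


types = infer_types(TRIPLES)
print(sorted(types))
```

Both SonomaRegion and CaliforniaRegion come out typed as Region, matching the two rdf:type arrows on the slide.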
OWL Adds additional pre-defined properties to further constrain an ontology (See http://www.w3.org/TR/owl-guide/) Note, RDF(S) and OWL use XML Some graphic tools exist (e.g., Protégé) A Vintage is a class that is a subclass of an unnamed class whose instances always have one hasVintageYear property. <owl:Class rdf:ID="Vintage"> <rdfs:subClassOf> <owl:Restriction> <owl:onProperty rdf:resource="#hasVintageYear"/> <owl:cardinality>1</owl:cardinality> </owl:Restriction> </rdfs:subClassOf> </owl:Class> Note the uglified XML syntax… The good news: meant for parsers, not humans!
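Since the syntax is meant for parsers, here is a small sketch that reads the Vintage restriction with Python's standard `xml.etree.ElementTree`. The enclosing `rdf:RDF` element and its namespace declarations are added here so the fragment parses on its own; the helper name `cardinality_restrictions` is hypothetical, and a real application would more likely use a dedicated RDF/OWL toolkit.

```python
import xml.etree.ElementTree as ET

OWL_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="Vintage">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasVintageYear"/>
        <owl:cardinality>1</owl:cardinality>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
RDF = "{" + NS["rdf"] + "}"  # ElementTree spells qualified attributes as {uri}name


def cardinality_restrictions(xml_text):
    """Return (class, property, cardinality) for each exact-cardinality restriction."""
    root = ET.fromstring(xml_text)
    found = []
    for cls in root.findall("owl:Class", NS):
        class_name = cls.get(RDF + "ID")
        for restriction in cls.findall("rdfs:subClassOf/owl:Restriction", NS):
            on_property = restriction.find("owl:onProperty", NS)
            cardinality = restriction.find("owl:cardinality", NS)
            if on_property is None or cardinality is None:
                continue  # not an exact-cardinality restriction
            prop = on_property.get(RDF + "resource", "").lstrip("#")
            found.append((class_name, prop, int(cardinality.text)))
    return found


restrictions = cardinality_restrictions(OWL_XML)
print(restrictions)
```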
Description Logic A language and syntax for describing “concept” logics • Concept names C (denote sets of instances) • Class definitions D (denote sets of instances) • Subclass definition C ⊑ D • Equivalence definition C ≡ D • Definition constructors • intersection D ⊓ D • union D ⊔ D • Property existence ∃hasProp.D • Property restriction ∀hasProp.D • Cardinality =1 hasProp.D, >1 hasProp.D, <2 hasProp.D
Description Logic Wine ⊑ PotableLiquid ⊓ ∃hasColor.{Red, Rose, White} The class Wine is a sub-class of PotableLiquids that have at least one (exists one) hasColor property whose values are either Red, Rose, or White WhiteWine ≡ Wine ⊓ ∃hasColor.{White} WhiteWines are exactly Wines whose color is White WhiteBurgundy ⊑ WhiteWine ⊓ Burgundy The set of WhiteBurgundy wines is a subset of the set of WhiteWines intersected with Burgundy wines SauvignonBlanc ⊑ WhiteWine ⊓ =1 madeFromGrape.SauvignonBlancGrape
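Because concepts denote sets of instances, these definitions can be tried out with ordinary Python sets. The individuals `w1`, `w2`, and `juice` are invented for the sketch, and the color restrictions are read as "has at least one color in the given set"; since every wine here has exactly one color, the choice between an existential and a universal reading does not change the result.

```python
# Hypothetical individuals and their hasColor values.
HAS_COLOR = {"w1": {"White"}, "w2": {"Red"}, "juice": set()}

# Assumed extensions of the named classes.
POTABLE_LIQUID = {"w1", "w2", "juice"}
WINE = {"w1", "w2"}


def exists(prop, fillers):
    """Individuals with at least one value of prop inside fillers."""
    return {x for x, values in prop.items() if values & fillers}


# Subclass axiom: Wine must be contained in
# PotableLiquid intersected with "has a color among Red, Rose, White".
wine_axiom_holds = WINE <= (POTABLE_LIQUID & exists(HAS_COLOR, {"Red", "Rose", "White"}))

# Equivalence definition: WhiteWine is computed, not asserted.
WHITE_WINE = WINE & exists(HAS_COLOR, {"White"})

print(wine_axiom_holds)
print(WHITE_WINE)
```

The subclass axiom is something to check against the data, whereas the equivalence definition lets the members of WhiteWine be computed from what is already known.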
Constructing Ontologies In general, creating an ontology is hard • Requires general agreement and understanding of a domain • Requires a clear, concise, and unambiguous definition • May invoke controversy • Is a hard data-modeling problem (complex constraints, broad domain)
Outline • Ontologies basics • Ontologies and data management • Benefits of ontologies • Constructing ontologies • Breakout Exercises
Breakout Exercises Divide into the same groups as yesterday Develop an ontology for the domain you worked on: • Define relevant concepts • Define relationships among concepts • If you have time, work on simple constraints (cardinality, value restrictions) Capture (on paper, or in PPT if you feel ambitious) your ontology in whatever way makes sense to you (e.g., as circle-line drawings or as list of terms and properties). What assumptions did you make in creating your ontology? If you have time, develop a scenario for your ontology in terms of your workflow. For example, to show how your ontology could help integration or query.
Some References Mario Bunge. Treatise on Basic Philosophy, Vol. 3, Ontology I: The Furniture of the World. D. Reidel Publishing Company, 1977. Nicola Guarino. Formal ontology and information systems. In Proc. of Formal Ontology in Information Systems, IOS Press, pp. 3-15, 1998. Thomas R. Gruber. Toward principles for the design of ontologies used for knowledge sharing. In Formal Ontology in Conceptual Analysis and Knowledge Representation, Kluwer Academic Publishers, 1993. Jeffrey Parsons and Yair Wand. Emancipating instances from the tyranny of classes in information modeling. In ACM Transactions on Database Systems, 25(2):228-268, 2000.