An Overview of Ontologies and their Practical Applications. Gianluca Correndo correndo@di.unito.it http://www.di.unito.it/~correndo. What is an Ontology?. Ontology. Semantics – the meaning of meaning.
An Overview of Ontologies and their Practical Applications Gianluca Correndo correndo@di.unito.it http://www.di.unito.it/~correndo
Ontology • Semantics – the meaning of meaning. • Philosophical discipline, branch of philosophy that deals with the nature and the organisation of reality.
In Computer Science … • An ontology is an explicit specification of a conceptualization [Gruber] • Defines • A common vocabulary of terms • Some specification of the meaning of the terms • A shared understanding for people and machines
Why develop an ontology? • To make domain assumptions explicit • Easier to change domain assumptions • Easier to understand and update legacy data • To separate domain knowledge from operational knowledge • Re-use domain and operational knowledge separately • A community reference for applications (standards) • To share a consistent understanding of what information means
Syntax is not enough for machine communication, e.g. B2B
Order information:
<Product>
  <type>Car</type>
  <Name>Daimler 500 SLK</Name>
  <Price>23.000 $</Price>
</Product>
Bestellinformation (German for "order information"):
<Auto>
  <Name>Daimler 500 SLK</Name>
  <Preis>27.000</Preis>
</Auto>
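A minimal sketch of the point made on this slide: both messages parse as XML, but a program can only relate them once it is told that `Auto` and a `Product` of type `Car` denote the same concept, and that `Price` and `Preis` denote the same property. The table `TERM_TO_SHARED` and the function `normalise` are hypothetical names introduced for illustration; they are not part of the original slides.

```python
import xml.etree.ElementTree as ET

ORDER_EN = ("<Product><type>Car</type><Name>Daimler 500 SLK</Name>"
            "<Price>23.000 $</Price></Product>")
ORDER_DE = "<Auto><Name>Daimler 500 SLK</Name><Preis>27.000</Preis></Auto>"

# Assumed shared vocabulary: every local term maps to one agreed term.
TERM_TO_SHARED = {
    "Car": "Car", "Auto": "Car",         # concepts
    "Name": "name",                      # properties
    "Price": "price", "Preis": "price",
}

def normalise(xml_text):
    """Translate one order message into the shared vocabulary."""
    root = ET.fromstring(xml_text)
    type_elem = root.find("type")
    # The concept is named either by a <type> child or by the root tag itself.
    local_concept = type_elem.text.strip() if type_elem is not None else root.tag
    record = {"concept": TERM_TO_SHARED[local_concept]}
    for child in root:
        if child.tag != "type":
            record[TERM_TO_SHARED[child.tag]] = child.text.strip()
    return record

order_en = normalise(ORDER_EN)
order_de = normalise(ORDER_DE)
print(order_en)
print(order_de)
```

After normalisation both records use the same keys, so they can be compared field by field. The prices still differ in value and currency notation, which a syntactic mapping alone cannot resolve.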
A Specification of a Conceptualization
[Figure: example taxonomy with the concepts animal, domestic, vermin, dog, cat, cow, rodent and mouse, and the relationship eats]
• Concepts (class, set, type, predicate)
  • Event, gene, molecule, cat
• Properties of concepts and relationships between them (slot)
  • Taxonomy: generalisation ordering among concepts (isA, partOf, subProcess)
  • Relationship, role or attribute: functionOf, hasActivity, location, eats, size
What is a concept? Different communities have different notions of what a concept means: • Formal concept analysis talks about formal concepts • Description Logics talk about concept labels • ISO 704:2000 (Terminology work) • Often the classical notion of a frame in AI or a class in OO modeling is seen as equivalent to a concept.
An explicit description of a domain
[Figure: example taxonomy with the concepts animal, domestic, vermin, dog, cat, cow, rodent and mouse, and the relationship eats]
• Constraints or axioms on properties and concepts:
  • value: integer
  • domain: cat
  • cardinality: at most 1
  • range: 0 <= X <= 100
  • oligonucleotides < 20 base pairs
  • cows are larger than dogs
  • cats cannot eat only vegetation
  • cats and dogs are disjoint
• Values or concrete domains
  • integer, strings
  • 20, tryptophan
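The kinds of constraints listed on this slide can be sketched as simple checks. This is only an illustration: the function names are hypothetical, and a real ontology language would state these axioms declaratively rather than as procedures.

```python
def within_range(value, low=0, high=100):
    """Range axiom from the slide: an integer X with low <= X <= high."""
    return isinstance(value, int) and low <= value <= high

def at_most(values, limit=1):
    """Cardinality axiom: a property may hold at most `limit` values."""
    return len(values) <= limit

def respects_disjointness(concepts, disjoint_pairs):
    """Disjointness axiom: no individual may belong to both concepts of a pair."""
    return not any(a in concepts and b in concepts for a, b in disjoint_pairs)

# "cats and dogs are disjoint"
DISJOINT = [("cat", "dog")]

print(within_range(42))                                # inside 0..100
print(at_most(["value-1", "value-2"]))                 # two values, limit is 1
print(respects_disjointness({"cat", "dog"}, DISJOINT)) # classified as both
```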
An explicit description of a domain
[Figure: the same taxonomy, extended with the instances felix, tom, mickey and jerry]
• Individuals or Instances
  • sulphur, trpA Gene, felix
• Nominals
  • Concepts that cannot have instances
  • Instances that are used in conceptual definitions
  • ItalianDog = Dog bornInItaly
• An ontology = concepts + properties + axioms + values + nominals
• A knowledge base = ontology + instances
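The distinction between an ontology and a knowledge base can be sketched in a few lines. The layout of the is-a hierarchy and the assignment of felix, tom, mickey and jerry to concepts are assumptions read off the flattened diagram; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Ontology:
    is_a: dict = field(default_factory=dict)       # concept -> parent concept
    relations: set = field(default_factory=set)    # (concept, property, concept)

    def ancestors(self, concept):
        """Every concept that subsumes `concept`, nearest first."""
        found = []
        while concept in self.is_a:
            concept = self.is_a[concept]
            found.append(concept)
        return found

@dataclass
class KnowledgeBase:
    """A knowledge base = ontology + instances."""
    ontology: Ontology
    instance_of: dict = field(default_factory=dict)  # individual -> concept

    def is_instance(self, individual, concept):
        direct = self.instance_of[individual]
        return concept == direct or concept in self.ontology.ancestors(direct)

    def instances(self, concept):
        return sorted(i for i in self.instance_of if self.is_instance(i, concept))

# Assumed reading of the diagram.
ontology = Ontology(
    is_a={"domestic": "animal", "vermin": "animal",
          "dog": "domestic", "cat": "domestic", "cow": "domestic",
          "rodent": "vermin", "mouse": "rodent"},
    relations={("cat", "eats", "mouse")},
)
kb = KnowledgeBase(ontology, {"felix": "cat", "tom": "cat",
                              "mickey": "mouse", "jerry": "mouse"})

print(ontology.ancestors("mouse"))
print(kb.instances("vermin"))
```

Membership in a general concept such as `animal` is never stored; it follows from the is-a links, which is the simplest form of the inference discussed later in the deck.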
Light and Heavy expressivity
A matter of rigour and representational expressivity
• Lightweight
  • Concepts, atomic types
  • Is-a hierarchy
  • Relationships between concepts
• Heavyweight
  • Metaclasses
  • Type constraints on relations
  • Cardinality constraints
  • Taxonomy of relations
  • Reified statements
  • Axioms
  • Semantic entailments
  • Expressiveness
  • Inference systems
Carl von Linné (1707-1778)
• Kingdom Animalia
• Phylum Chordata
• Class Mammalia
• Order Primates
• Family Hominidae
• Genus Homo
• Species sapiens
Aristotle (384 BC – 322 BC)
• Science of Being (Metaphysics, IV, 1)
• What is being?
• What are the features common to all beings?
So what is an ontology?
[Figure: a spectrum of approaches, from least to most expressive]
• Catalog / ID
• Terms / glossary
• Thesauri
• Informal is-a
• Formal is-a
• Formal instance
• Frames (properties)
• Value restrictions
• Disjointness, inverse, part-of
• General logical constraints
…Things in Common • They are approaches to help structure, classify, model, and/or represent the concepts and relationships pertaining to some subject matter of interest to some community. • They are intended to enable a community to come to agreement and to commit to use the same terms in the same way. • The meaning of the terms is specified in some way and to some degree.
Glossary Catalog
Thesauri
• Graph with labelled edges (similar, nt, bt, synonym)
• Fixed set of edge labels (aka relations)
• Use of lexical stem
• No instances
• Well known in library science
• cf. terminologies / classifications (Dewey)
Example: Fruit similarTo Vegetable; Fruit NarrowerTerm Orange; Orange synonymWith Apfelsine (German)
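A thesaurus as described here is a graph whose edges carry one of a fixed set of labels. The sketch below assumes the edge directions of the slide's example (Fruit to Orange for the narrower term); the class `Thesaurus` and the label spellings are hypothetical.

```python
# Fixed set of edge labels: anything else is rejected.
ALLOWED_LABELS = {"similarTo", "narrowerTerm", "broaderTerm", "synonymWith"}

class Thesaurus:
    def __init__(self):
        self.edges = set()   # (source term, label, target term)

    def add(self, source, label, target):
        if label not in ALLOWED_LABELS:
            raise ValueError(f"unknown relation: {label}")
        self.edges.add((source, label, target))

    def related(self, term, label):
        """Terms reachable from `term` over one edge with the given label."""
        return sorted(t for s, l, t in self.edges if s == term and l == label)

thesaurus = Thesaurus()
thesaurus.add("Fruit", "similarTo", "Vegetable")
thesaurus.add("Fruit", "narrowerTerm", "Orange")
thesaurus.add("Orange", "synonymWith", "Apfelsine")

print(thesaurus.related("Fruit", "narrowerTerm"))
```

Because the label set is closed, a domain-specific relation such as `eats` cannot be expressed, which is one of the limits that separates a thesaurus from a richer ontology.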
UMLS (Unified Medical Language System) http://umlsks.nlm.nih.gov/ • National Library of Medicine (NLM) database of medical terminology. Terms from several medical databases (MEDLINE, SNOMED International, MeSH, etc.) are unified so that different terms are identified as the same medical concept. • Metathesaurus provides the concordance of medical concepts: 730.000 concepts, 1.5 million concept names in different source vocabularies • Specialist Lexicon provides word synonyms, derivations, lexical variants, and grammatical forms of words used in MetaThesaurus terms: 130.000 entries. • Semantic Network codifies the relationships (e.g. causality, "is a", etc.) among medical terms: 134 semantic types, 54 relationships. • Used for: patient data creation, curriculum analysis, natural language processing, and information retrieval
[Figure: UMLS Metathesaurus, information system, DB]
[Figure: UMLS Metathesaurus, information system 1, information system 2]
Frames, SDM, OO models • Frames • Rich set of language constructs: frames, slots, facets, defaults • Impose restrictive constraints on how they are combined or used to define a class • All frames asserted into taxonomy by hand • All concepts are primitive • Ocelot/GKB, Protégé, OCML, Ontolingua • OKBC (Open Knowledge Base Connectivity) • OKBC-Lite • OO / Semantic Data Models (EER, UML) • Taxonomy/inheritance semantics • Intuitive, lots of tools, widely used
Frame Data Model • Frames • Classes: genes, reactions • Instances: lr10 • Relationships • Slots: chromosome, map-position, citations, reactants, products, Keq • Facets: chromosome is single-valued, instance of class chromosomes; Citations is multiple valued, set of strings
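The frame data model can be sketched as classes with slots, where each slot carries facets that constrain its values. The names `Slot`, `FrameClass`, `FrameInstance` and `Chromosome` are hypothetical, and treating `lr10` as an instance of `genes` is an assumption; the slide does not say which class it belongs to.

```python
from dataclasses import dataclass, field

class Chromosome:
    """Stand-in for the class 'chromosomes' named in the facet."""
    def __init__(self, label):
        self.label = label

@dataclass
class Slot:
    name: str
    value_type: type = str       # facet: type of each value
    multi_valued: bool = False   # facet: single value or collection

@dataclass
class FrameClass:
    name: str
    slots: dict = field(default_factory=dict)

    def add_slot(self, slot):
        self.slots[slot.name] = slot

@dataclass
class FrameInstance:
    frame: FrameClass
    name: str
    values: dict = field(default_factory=dict)

    def assign(self, slot_name, value):
        slot = self.frame.slots[slot_name]   # KeyError for undeclared slots
        if slot.multi_valued and not isinstance(value, (list, set, tuple)):
            raise TypeError(f"{slot_name} is multi-valued")
        items = value if slot.multi_valued else [value]
        for item in items:
            if not isinstance(item, slot.value_type):
                raise TypeError(f"{slot_name} expects {slot.value_type.__name__}")
        self.values[slot_name] = value

genes = FrameClass("genes")
genes.add_slot(Slot("chromosome", value_type=Chromosome))          # single-valued
genes.add_slot(Slot("citations", value_type=str, multi_valued=True))  # set of strings

lr10 = FrameInstance(genes, "lr10")   # assumed to be a gene
lr10.assign("chromosome", Chromosome("placeholder"))
lr10.assign("citations", {"placeholder citation"})
```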
Description Logics • A family of logic based knowledge representation formalisms • Descendants of semantic networks and KL-ONE • Describe domain in terms of concepts (set of individuals), roles (relationships) and individuals • Distinguished by: • Formal semantics (typically model theoretic) • Decidable fragments of FOL • Closely related to propositional modal & dynamic logics • Provision of inference services • Sound and complete decision procedures for key problems • Implemented systems (highly optimised)
Description Logic Family • DLs are a family of logic based KR formalisms • Particular languages mainly characterised by: • Set of constructors for building complex concepts and roles from simpler ones • Set of axioms for asserting facts about concepts, roles and individuals • ALC is the smallest DL that is propositionally closed • Constructors include booleans (and, or, not), and • Restrictions on role successors • E.g., the concept describing “happy fathers” could be written: Man ⊓ ∃hasChild.Female ⊓ ∃hasChild.Male ⊓ ∀hasChild.(Rich ⊔ Happy)
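To make the constructors concrete, the sketch below evaluates ALC concepts over one small, finite interpretation. It assumes the "happy fathers" concept is meant as Man ⊓ ∃hasChild.Female ⊓ ∃hasChild.Male ⊓ ∀hasChild.(Rich ⊔ Happy). The individuals and the tuple encoding of concepts are invented for illustration. Note that this is model checking against a single interpretation, not the satisfiability reasoning a DL reasoner performs.

```python
def extension(concept, domain, concepts, roles):
    """Set of individuals in `domain` that belong to `concept`."""
    kind = concept[0]
    if kind == "atom":
        return set(concepts.get(concept[1], set()))
    if kind == "not":
        return domain - extension(concept[1], domain, concepts, roles)
    if kind == "and":
        result = set(domain)
        for part in concept[1:]:
            result &= extension(part, domain, concepts, roles)
        return result
    if kind == "or":
        result = set()
        for part in concept[1:]:
            result |= extension(part, domain, concepts, roles)
        return result
    if kind in ("some", "all"):
        pairs = roles.get(concept[1], set())
        filler = extension(concept[2], domain, concepts, roles)

        def successors(x):
            return {y for (a, y) in pairs if a == x}

        if kind == "some":   # at least one role successor is in the filler
            return {x for x in domain if successors(x) & filler}
        # "all": every role successor is in the filler (vacuously true if none)
        return {x for x in domain if successors(x) <= filler}
    raise ValueError(f"unknown constructor: {kind}")

# Invented interpretation.
DOMAIN = {"bob", "alice", "carl", "dave", "eve"}
CONCEPTS = {"Man": {"bob", "carl", "dave"}, "Male": {"bob", "carl", "dave"},
            "Female": {"alice", "eve"}, "Rich": {"alice", "carl"}, "Happy": {"carl"}}
ROLES = {"hasChild": {("bob", "alice"), ("bob", "carl"), ("dave", "eve")}}

HAPPY_FATHER = ("and", ("atom", "Man"),
                ("some", "hasChild", ("atom", "Female")),
                ("some", "hasChild", ("atom", "Male")),
                ("all", "hasChild", ("or", ("atom", "Rich"), ("atom", "Happy"))))

happy_fathers = extension(HAPPY_FATHER, DOMAIN, CONCEPTS, ROLES)
print(happy_fathers)
```

Here `dave` is excluded because he has no male child, and `carl` because he has no children at all, so the existential restrictions fail even though the universal one holds vacuously.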
DL Concept and Role Constructors • Range of other constructors found in DLs, including: • Number restrictions (cardinality constraints) on roles, e.g., ≥3 hasChild, ≤1 hasMother • Qualified number restrictions, e.g., ≥2 hasChild.Female, ≤1 hasParent.Male • Nominals (singleton concepts), e.g., {Italy} • Concrete domains (datatypes), e.g., hasAge.(≥21), earns spends.< • Inverse roles, e.g., hasChild⁻ (hasParent) • Transitive roles, e.g., hasChild* (descendant) • Role composition, e.g., hasParent o hasBrother (uncle)
What’s in a “Logic based ontology”?
• Primitive concepts, in a hierarchy
  • Described but not defined
• Properties: relations between concepts, also in a hierarchy
• Constructors on concepts and properties
  • “Some”, “only”, “at least”, “at most”, and, or, not
• Defined concepts
  • Made from primitive concepts, constructors and descriptors
  • Enzyme = protein and catalyses reaction
  • Reason that enzyme is a kind of protein
• “Is-kind-of” = “implies”
  • “Dog is a kind of wolf” means “all dogs are wolves”
• Axioms
  • Disjointness, further description of defined concepts
• A reasoner
  • To organise it for you: consistency and taxonomy for defined concepts established through logical reasoning
Reasoning support in DL
• Consistency: check if knowledge is meaningful
• Subsumption: structure knowledge, compute taxonomy
• Equivalence: check if two classes denote the same set of instances
• Instantiation: check if individual i is an instance of class C
• Retrieval: retrieve the set of individuals that instantiate C
All of these problems are reducible to consistency (satisfiability).
Implemented reasoners: FaCT, RACER, Cerebra
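The reductions to consistency can be written out as follows. This is the standard formulation for a DL with full concept negation (such as ALC), with a knowledge base written as K, concepts C and D, and an individual i; the notation is introduced here and does not come from the slides.

```latex
\begin{align*}
C \text{ is satisfiable w.r.t. } \mathcal{K}
  &\iff \mathcal{K} \cup \{C(a)\} \text{ is consistent, for a fresh individual } a \\
\mathcal{K} \models C \sqsubseteq D
  &\iff C \sqcap \neg D \text{ is unsatisfiable w.r.t. } \mathcal{K} \\
\mathcal{K} \models C \equiv D
  &\iff \mathcal{K} \models C \sqsubseteq D \text{ and } \mathcal{K} \models D \sqsubseteq C \\
\mathcal{K} \models C(i)
  &\iff \mathcal{K} \cup \{\neg C(i)\} \text{ is inconsistent} \\
\mathrm{retrieve}(C)
  &= \{\, i \mid \mathcal{K} \models C(i) \,\}
\end{align*}
```

A single sound and complete consistency procedure is therefore enough to provide all five services, although retrieval done naively needs one instantiation check per individual.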
Formal Ontology Applications • Ontology engineering support • Semantic web • Intelligent information retrieval • E-Commerce • Intelligent web-services • Agent technologies
Problems with Information Retrieval • Working with the Web is currently done at a very low level: • Clicking on links and using keyword search for links is the main (if not only) navigation technique • Keyword-based search engines • (Alta Vista, Infoseek, Yahoo, MetaCrawler, Google)
Problems with Information Retrieval • Main burden of information retrieval is that it is only information retrieval. • It helps to retrieve information sources but the human user has to manually extract and interpret the information. • Information presentation and maintenance is not supported.
Semantic Web Vision • Explicitly express a high-level description of the resources accessible via the Web • More processable data available • Information more directly available • Enabling intelligent Web features
DAML-S: Ontology language • Builds upon the well-defined semantics of DAML+OIL • Is expected to provide a common understanding of the semantics of a web service • By specifying an “Upper Ontology for Services”
An Upper Ontology for Services • Three essential types of knowledge about a service, each characterized by the question it answers: • What does the service require of the user(s), and provide for them? • How does it work? • How is it used?
Ontology for data interoperability
[Figure: a global ontology and three databases (DB)]
• Ontology-based Information Integration (TAMBIS)
• Spread a query over different, heterogeneous data sources
• Widely used in gene ontology applications, among others
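The idea of spreading one query over heterogeneous sources can be sketched as follows. This is not how TAMBIS is implemented; it only illustrates the principle that each source maps its local column names onto terms of a global ontology. The sources, their rows, and every name in the code are invented toy data.

```python
# Terms of the (assumed) global ontology that queries may use.
GLOBAL_TERMS = {"gene_name", "organism"}

class Source:
    def __init__(self, name, rows, mapping):
        self.name = name
        self.rows = rows          # list of dicts keyed by local column names
        self.mapping = mapping    # global term -> local column name

    def query(self, **criteria):
        """Answer a query phrased in global terms, returning global terms."""
        local = {self.mapping[term]: value for term, value in criteria.items()}
        to_global = {column: term for term, column in self.mapping.items()}
        for row in self.rows:
            if all(row.get(column) == value for column, value in local.items()):
                yield {to_global[c]: v for c, v in row.items() if c in to_global}

def federated_query(sources, **criteria):
    """Send the same global query to every source and merge the answers."""
    unknown = set(criteria) - GLOBAL_TERMS
    if unknown:
        raise KeyError(f"not in the global ontology: {sorted(unknown)}")
    return [{"source": source.name, **record}
            for source in sources
            for record in source.query(**criteria)]

source_a = Source("A",
                  [{"symbol": "trpA", "species": "E. coli"},
                   {"symbol": "lacZ", "species": "E. coli"}],
                  {"gene_name": "symbol", "organism": "species"})
source_b = Source("B",
                  [{"gene": "trpA", "taxon": "E. coli"}],
                  {"gene_name": "gene", "organism": "taxon"})

hits = federated_query([source_a, source_b], gene_name="trpA")
print(hits)
```

The caller never sees `symbol`, `gene`, `species` or `taxon`; the global ontology is the only vocabulary exposed, which is what makes the sources interchangeable.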
Thesauri & Classification • UNSPSC: United Nations Standard Products and Services Code • Provides structure and a unique identification of terms • Thesauri act as a good starting point for developing an ontology