
OntoClean Methodology

OntoClean. OntoClean Methodology. As presented at AOS Workshop by Aldo Gangemi CNR-IP, Ontology and Conceptual Modelling Group. Credits. CNR - Ontology and Conceptual Modelling Groups Nicola Guarino , Claudio Masolo , Alessandro Oltramari


Presentation Transcript


  1. OntoClean OntoClean Methodology As presented at AOS Workshop by Aldo Gangemi CNR-IP, Ontology and Conceptual Modelling Group

  2. Credits CNR - Ontology and Conceptual Modelling Groups Nicola Guarino, Claudio Masolo, Alessandro Oltramari {Nicola.Guarino,claudio.masolo,alessandro.oltramari}@ladseb.pd.cnr.it Aldo Gangemi, Domenico Pisanelli, Geri Steve {gangemi,pisanelli,steve}@itbm.rm.cnr.it Vassar College Chris Welty weltyc@vassar.edu • Publications are retrievable on-line at the following URLs: http://www.ladseb.pd.cnr.it/infor/ontology/ontology.html http://saussure.irmkant.rm.cnr.it/onto/index.html

  3. Some Applications so far • ON9 ontology library • UMLS Metathesaurus semantic mining • Medical terminologies integration • Integration of Clinical Guidelines Standards • Ontological upgrading of Wordnet • Ontological Web Agents • Product ontologies • OntoClean Top-Level • Legal ontologies and norm dynamics • Portal directories maintenance and subject ontology • Content standards for the semantic web

  4. Some Projects so far • GALEN (EU AIM Project, academic/industrial project for a medical terminology server) • SOLMC (Ontological and Conceptual Modelling Tools, CNR Special Project) • Arianna Catalog (industrial pilot project to build and maintain portal directories) • IKF, Intelligent Knowledge Fusion (Eureka Project E!2235, academic/industrial project for information integration) • IIDEAS, Integration of Industrial Data for Exchange, Access, and Sharing • IEEE Standard Upper Ontology Study Group • OntoWeb: Ontology-based information exchange for knowledge management and electronic commerce (EU Network of Excellence) • OntoWeb SIG on Content Standards • TICCA (Italian Project, Cognitive technologies for artificial agents)

  5. OntoClean Antecedents • Guarino and Welty’s theoretical tools for the ontological refinement of taxonomies • ONIONS techniques for domain ontology development and large-scale terminology integration

  6. OntoClean Components • Formal Criteria • Top-Level Ontology • Ontology of Universals • Domain-Level Development Guidelines • Applications

  7. OntoClean Components • Formal Criteria (brief explanation) • Top-Level Ontology • Ontology of Universals • Domain-Level Development Guidelines • Applications

  8. Individuals and Concepts • The term "meta-property" adopted here is based on a fundamental distinction within the domain of discourse: • individuals or particulars vs. • concepts or universals • Meta-level properties induce distinctions among concepts, while object-level properties induce distinctions among individuals

  9. Rigidity • A property is essential to an individual iff it necessarily holds for that individual • A property is rigid (+R) iff, necessarily, it is essential to all its instances. A property is non-rigid (-R) iff it is not essential to some of its instances, and anti-rigid (~R) iff it is not essential to all its instances • Person vs Student
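
The rigidity definitions above can be illustrated with a small sketch. It assumes a finite set of possible worlds and represents a property by its extension in each world; the function name `rigidity`, the world labels, and the individuals are hypothetical and not part of the original slides. Existence of individuals across worlds is not modelled.

```python
def rigidity(extension_by_world):
    """Return '+R', '~R' or '-R' for a property, given its extension per world.

    Assumption: a property is essential to x iff x is in the extension
    in every world of the (finite, hypothetical) model.
    """
    worlds = list(extension_by_world.values())
    instances = set().union(*worlds)  # anything that is an instance somewhere
    essential_to = {x for x in instances if all(x in ext for ext in worlds)}
    if essential_to == instances:
        return "+R"   # essential to all instances
    if not essential_to:
        return "~R"   # essential to no instance (anti-rigid)
    return "-R"       # not essential to some instances only (non-rigid)


# Toy data (assumed): being a person holds in every world, being a student does not.
person = {"w1": {"ann", "bob"}, "w2": {"ann", "bob"}}
student = {"w1": {"ann"}, "w2": {"bob"}}
mixed = {"w1": {"ann", "bob"}, "w2": {"ann"}}

print(rigidity(person), rigidity(student), rigidity(mixed))
```

Note that, following the slide, every anti-rigid property is also non-rigid; the sketch simply reports the most specific label.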

  10. Identity • A property carries an identity criterion (+I) iff all its instances can be (re)identified by means of a suitable sameness relation. A property supplies an identity criterion iff such criterion is not inherited by any subsuming property • Person vs. Student

  11. Dependence • An individual x is constantly dependent on y iff, at any time, x can't be present unless y is fully present, and y is not part of x. Ex: Hole/Host • A property P is constantly dependent (+D) iff, for all its instances, there exists something they are constantly dependent on. • Here Dependent = Constantly Dependent

  12. Types vs. Roles • A rigid property that supplies an identity criterion and is not (notionally) dependent is called a type. • An anti-rigid property that is notionally dependent is called a role. It is a material role if it carries (but not supplies) an identity criterion, and a formal role otherwise. • Person vs. Student vs. Part
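
As a rough illustration of the type/role distinction above, the following sketch classifies a property from its meta-properties. The `MetaProps` record and the `classify` function are hypothetical names, and the meta-property assignments given to Person, Student, and Part are assumptions chosen to match the slide's example, not values stated in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaProps:
    rigidity: str            # '+R', '-R' or '~R'
    carries_identity: bool   # +I
    supplies_identity: bool  # identity criterion not inherited from a subsumer
    dependent: bool          # +D


def classify(p):
    """Classify a property as a type, a material role, a formal role, or other."""
    if p.rigidity == "+R" and p.supplies_identity and not p.dependent:
        return "type"
    if p.rigidity == "~R" and p.dependent:
        if p.carries_identity and not p.supplies_identity:
            return "material role"
        return "formal role"
    return "other"


# Assumed assignments for the slide's three examples.
person = MetaProps("+R", carries_identity=True, supplies_identity=True, dependent=False)
student = MetaProps("~R", carries_identity=True, supplies_identity=False, dependent=True)
part = MetaProps("~R", carries_identity=False, supplies_identity=False, dependent=True)

for name, props in [("Person", person), ("Student", student), ("Part", part)]:
    print(name, "->", classify(props))
```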

  13. Extensionality • An individual is said to be extensional iff, necessarily, everything that has the same proper parts is identical to it • A property is extensional (+E) iff, necessarily, all its instances are extensional • A property is anti-extensional (~E) iff, necessarily, all its instances are non-extensional, so that they can possibly change some parts while keeping their identity

  14. Concreteness • An individual is concrete iff it has a physical location. A property whose instances are necessarily concrete will be marked with the meta-property +C • Note that an individual can be concrete without being necessarily real, or actual: Peter Pan is not real but is concrete • This meta-property is a bit less formal (in the ontological sense) than the previous ones, since it makes an ontological commitment towards the existence of physical (spatial, temporal or spatio-temporal) locations. We see physical locations as primitive qualities that individuals can have

  15. Unity • An individual is unified by a (suitably constrained) relation R iff it is a mereological sum of entities that are bound together by R. Ex. the relation having the same boss may unify a group of employees in a company • An individual w is a whole under R iff it is maximally unified by R, in the sense that R is internal to w, and no part of w is linked by R to something that is not part of w • A property P is said to carry unity (+U) if there is a common unifying relation R such that all the instances of P are essential wholes under R. A property carries anti-unity (~U) if all its instances can possibly be non-wholes. If every instance of P is an essential whole, but there is no unifying relation common to all instances of P, then we mark P with the property *U

  16. Singularity and Plurality • An individual is a singular whole iff its unifying relation is the transitive closure of the relation "strong connection", like that existing between two 3D regions that have a surface in common. Topological wholes of this kind have a special cognitive relevance, which accounts for the natural language distinction between singular and plural • A plural individual is a sum of singular wholes that is not itself a singular whole. Plural individuals may be wholes themselves or not. In the former case they will be called collections; in the latter case pluralities • A piece of coal is a singular whole. A lump of coal is a topological whole, but not a singular whole, since the pieces of coal merely touch each other, with no material connection. It is therefore a plural whole
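
The idea of unification under the transitive closure of "strong connection" can be sketched as a connectivity check. This is a simplified illustration: parts are plain labels, `strong_links` is an assumed list of strongly connected pairs among those parts, and the maximality condition on wholes (no part linked to something outside) is not modelled.

```python
def components(parts, strong_links):
    """Group parts into classes of the transitive closure of strong connection."""
    adjacent = {p: set() for p in parts}
    for a, b in strong_links:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen, result = set(), []
    for p in parts:
        if p in seen:
            continue
        comp, stack = set(), [p]
        while stack:
            q = stack.pop()
            if q not in comp:
                comp.add(q)
                stack.extend(adjacent[q] - comp)
        seen |= comp
        result.append(comp)
    return result


def is_singular_whole(parts, strong_links):
    """True iff all parts are bound together by chains of strong connection."""
    return len(components(parts, strong_links)) == 1


# A piece of coal: its regions share surfaces (hypothetical region labels).
piece = is_singular_whole(["r1", "r2", "r3"], [("r1", "r2"), ("r2", "r3")])
# A lump of coal: the pieces merely touch, so no strong connection links them.
lump = is_singular_whole(["piece1", "piece2"], [])
print(piece, lump)
```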

  17. Applying Formal Properties • If a property holds necessarily for all the instances of a certain concept, of course its negation cannot hold necessarily for all the instances of a subsumed concept. • Then, if F is a certain formal property, anti-F cannot subsume F: anti-rigidity cannot subsume rigidity, anti-unity cannot subsume unity, and anti-extensionality cannot subsume extensionality. • After labeling every concept in a taxonomy with its formal properties, we can easily check its ontological consistency
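
The consistency check described above can be sketched as follows. The data layout is an assumption made for illustration: `parents` maps each concept to its direct subsumers, and `labels` maps each concept to a set of meta-property tags such as `"+R"` or `"~U"`. The function reports every case where an anti-F concept subsumes a +F concept, for F among rigidity, unity, and extensionality.

```python
CHECKED = ("R", "U", "E")  # rigidity, unity, extensionality


def ancestors(concept, parents):
    """All concepts that subsume `concept`, directly or transitively."""
    seen = set()
    stack = list(parents.get(concept, ()))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents.get(p, ()))
    return seen


def violations(parents, labels):
    """Return (subsumer, subsumed, F) triples where anti-F subsumes +F."""
    found = []
    for child in sorted(labels):
        for anc in sorted(ancestors(child, parents)):
            for f in CHECKED:
                if "~" + f in labels.get(anc, set()) and "+" + f in labels[child]:
                    found.append((anc, child, f))
    return found


# Hypothetical taxonomy: Person wrongly placed under Student.
parents = {"Person": ["Student"], "Employee": ["Person"]}
labels = {"Student": {"~R"}, "Person": {"+R"}, "Employee": {"~R"}}
print(violations(parents, labels))
```

In this toy taxonomy only the Student/Person link is flagged; a rigid concept subsuming an anti-rigid one (Person over Employee) is allowed.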

  18. OntoClean Components • Formal Criteria • Top-Level Ontology • Ontology of Universals • Domain-Level Development Guidelines • Applications

  19. The OntoClean Top-Level Ontology: an Overview

  20. Basic Design Guidelines for the OntoClean Top-Level • Introduction of ontological categories lying behind Natural Language and Human Commonsense • Use of formal properties (as general and neutral as possible) to characterize the ontological categories • Rigidity, Identity, Dependence, Unity, Extensionality, Singularity, Concreteness (see next slide for references) • Refinement of top-distinctions by further analysis (taking into account philosophy, cognitive sciences, linguistics, …) IMPORTANT: All top-concepts are considered to be rigid, as they are assumed to reflect essential properties of their instances

  21. Brand New Essential Bibliography • Guarino and Welty [2001], • Supporting Ontological Analysis of Taxonomic Relationships (“Data and Knowledge Engineering” - in press) • Identity and Subsumption (In R.Green, C. Bean and S.Myaeng [eds.], The Semantics of Relationships: an Interdisciplinary Perspective. Kluwer - in press) • Gangemi, Guarino, Masolo, Oltramari [2001], • Understanding Top-Level Ontological Distinctions ( Proceedings of IJCAI 2001 workshop on Ontologies and Information Sharing) • Gangemi, Guarino, Oltramari [2001], • Conceptual Analysis of Lexical Taxonomies: The Case of WordNet Top-Level (Proceedings of FOIS 2001)

  22. (1) The Top-Level: Unique Beginners, Direct Hyponyms, Some Synsets from WordNet 1.6 • Aggregate (~D, ~U) [Amount of matter (+E); Arbitrary collections] • Object (~D, *U) [Extensional Body (+E); Ordinary Object (~E)] • Event (+D, +E) • Feature (+D, *U, -E) [Relevant part; Dependent Region] • Example synsets (in figure order): body substance, mixture#1, mass#5; universe#1, elementary particle; artifact, land#4, (unitary) collection#1; phenomenon, act#2, state#4; edge#3, skin#1, paring#2; opening#10, excavation#3

  23. (2) The Top-Level: Unique Beginners, Direct Hyponyms, Some Relevant Synsets from WordNet 1.6 • Abstraction (~C) [Abstract entity: Proposition, Set, ...; Quality space: Color space, Shape space, …] • Quality (+D,+E,+U) [Color; Shape; ...] • Example synsets (in figure order): conclusion#5, lemma#1; union#7, singleton#2; = chromatic color; = shape#2

  24. (1) Aggregate vs. Object • What distinguishes an object from an aggregate is that the former is an essential whole, namely it has a unity criterion, while the latter is not. For example, John can “make up” a snowman (object) starting from the scattered snow (amount of matter) covering his courtyard, adding a hat, a carrot, two pieces of deadwood, etc. In general, amounts of matter are mass-nouns (you can’t say a snow, a water, ...), while objects are count-nouns (such as a snowman, five glasses of water, and so on).

  25. (2) Aggregate vs. Object • Arbitrary collections are mere sums of wholes which are not themselves essential wholes (such as the collection of goods in a bazaar). In this sense, they are kinds of aggregate. On the other hand, there are collections which are themselves essential wholes, such as a library. In our top-level these unitary collections are to be conceived as a specialization of the object category.

  26. (3) Aggregate vs. Object • When an object changes some of its parts, it may or may not keep its identity. If it keeps its identity, we call it Ordinary Object (~E); if it does not, Extensional Object (+E). My car will continue to be the same even if I replace one of its wheels. On the contrary, if I consider the universe and remove a single elementary particle, I won’t have the universe any more, but a different entity. • Regarding aggregates, we can say that amounts of matter are clearly +E, while arbitrary collections can be considered as pseudo-extensional (changes in the parts of a member of a collection may be allowed).

  27. Event • Events occur in time. They are assumed to be dependent (+D) on those objects (~D) that are their participants. • The penalty kick by Roberto Baggio (main participant) • Participants are not parts of events. Parts of events can be: • temporal (the first movement of a symphony) • spatial (the strings playing within a symphony) • Parts of events are always essential, which means that events are extensional (+E). Our taxonomy of events needs to be improved and populated. A comparison with EuroWordNet and SIMPLE, in this sense, may be useful.

  28. Feature • Features are “parasitic” (+D) entities, which exist insofar as their host exists. Features may be relevant parts of their host, like a bump in a road, or dependent regions, such as a hole in a piece of cheese, the underneath of a table, or the shadow of a tree (which are not parts of their hosts). All features are essential wholes, but no common unity criterion may exist for all of them (*U). Some features can change parts while keeping their identity, while others cannot: for this reason, we use -E as the common formal property (+E and ~E are both subsumed by -E).

  29. Abstraction • Abstractions are entities that are not concrete, that is, they do not have a physical location (~C). Quality spaces are the first examples of abstractions: time, geometric space, length, color, are all conceptual spaces, with different topological structure. Terms like red, long, sweet, old, recent etc. correspond to regions in a quality space. We can therefore describe the structure of a quality space with a first-order theory, using topological notions: for instance, we can say that “red is adjacent to brown”. Other examples of Abstraction are propositions, sets, symbols, etc.

  30. Quality (1) • Qualities are always “qualities of something”: in this sense, they are dependent (+D). Qualities are individual, i.e. they are inherent to a unique entity (the color of this rose is red). We call quality-type every homogeneous group of individual qualities, such as color, shape, volume, etc. In the OntoClean top-level qualities are structured in strict relationship with quality spaces: every quality-type “corresponds” to a quality space in the branch of Abstractions. A region in a quality space corresponds to an individual quality of an entity in our conceptualization of the world.

  31. Quality (2) • Following our approach, the red of the rose in the figure on the right is represented as located in a certain region of the color-quality-space. In the same way, the spherical shape of the ball below is represented as located in a certain region of the shape-quality-space. In principle, time and space could be treated as qualities too. We are currently studying the ontological commitments and the formal properties concerning these options.

  32. Appendix: An Alternative View of the OntoClean Top-Level (1) • Since the agreement on the meaning of general categories is not always easy, in this short presentation we preferred to make clear first the most relevant top-level concepts, leaving aside the various ways they can be presented in a hierarchy. For example, we could have introduced the OntoClean top-level by considering the general distinction between concrete and abstract entities as the complete partition of “what there is” in the world. Then, within concrete entities we could have distinguished independent from dependent entities. The overall taxonomy would have been like this:

  33. Appendix: An Alternative View of the OntoClean Top-Level (2) • Quality (+E,+U) [Color; Shape; …] • Abstraction (~C) [Abstract entity: Proposition, Set, ...; Quality space: Color space, Shape space, …] • Concrete (+C), Independent (~D): Aggregate (~U) [Amount of matter (+E); Arbitrary collections]; Object (*U) [Extensional Body (+E); Ordinary Object (~E)] • Concrete (+C), Dependent (+D): Event (+E); Feature (*U, -E) [Relevant part; Dependent Region]

  34. OntoClean Components • Formal Criteria • Top-Level Ontology • Ontology of Universals (list of main kinds) • Domain-Level Development Guidelines • Applications

  35. Which part are you talking about? • If my liver is part of my digestive system, and that system is part of me, is my liver part of me? • If my liver is a part of me and I am part of the CNR, is my liver part of the CNR? • My liver is a component of my digestive system, while I am a member of CNR. No rule for composing component and member relations • Moreover, I am a body, but I am also a person. A living person depends on a body. Nevertheless, a living person can be a member of CNR, but a body cannot
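
The liver example can be made concrete with a small sketch in which part-of facts carry a kind. It assumes, for illustration only, that the component relation composes with itself, while no rule combines component with member; the function `infer_parts` and the fact encoding are hypothetical.

```python
def infer_parts(facts):
    """Close a set of (part, kind, whole) facts under component-component chaining.

    Assumption: only 'component' composes with itself; 'member' never
    chains with 'component', mirroring the slide's "no rule" remark.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, k1, b) in list(derived):
            for (c, k2, d) in list(derived):
                if b == c and k1 == k2 == "component":
                    new_fact = (a, "component", d)
                    if new_fact not in derived:
                        derived.add(new_fact)
                        changed = True
    return derived


facts = {
    ("liver", "component", "digestive system"),
    ("digestive system", "component", "me"),
    ("me", "member", "CNR"),
}
closure = infer_parts(facts)
print(("liver", "component", "me") in closure)            # derived
print(any(p == "liver" and w == "CNR" for p, _, w in closure))  # not derived
```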

  36. Ontology of Universals - Main Relation Kinds - • Intracategorial • Mereological (entity, entity) • Topological (entity, entity) • Intercategorial • Localization (region, entity) • Participation (event, entity) • Representation (*sign, entity) • Entrenched axiomatic characterisation

  37. OntoClean Components • Formal Criteria • Top-Level Ontology • Ontology of Universals • Domain-Level Development Guidelines (hints) • Applications

  38. Kinds of Terminological Ontology Sources • Catalog of normalized terms, e.g. a list of terms used in the reports from a laboratory: no taxonomy, no axioms, and no glosses • Glossed catalog, e.g. a dictionary: a catalog with glosses. • Thesaurus, e.g. many parts of the UMLS Metathesaurus, GEMET: a hierarchical collection of terms; the hierarchical link is usually polysemous • Taxonomy, e.g. the ICD10: a collection of classes with a partial order induced by inclusion (classification) • Axiomatized taxonomy, e.g. the GALEN Core Model: a taxonomy with axioms • Ontology library, e.g. the Ontolingua repository: a set of axiomatized taxonomies with relations among them. Each element of the library is a module, which can be included in another one. Also, a single concept from a module can be used in another one. Ontology modules can be considered subdivisions of the namespace of a model

  39. Impairments in Traditional Terminologies • Lack of hierarchies • Ambiguous hierarchies • Informality • Lack of modularity or cyclic taxonomical dependencies between modules • Polysemy of various sorts • Uncertain semantics • Ontological opaqueness • Lack of a (minimal) set of axioms • 'Remainder' partitions • 'Exception' partitions • Terminological cycles • Meta-level soup (individuals mixed up with universals or even higher order concepts) • Low maintenance capabilities

  40. Ontologies: some desiderata • An explicit taxonomy with subsumption among concepts • Semantic explicitness of relations • Rigorous modularity of namespace • A stratified design of the modules • Absence of polysemy within a module • Disjointness of rigid concepts within a module and within the top-level • A proper interface between the ontology namespace and one or more sets of lexical realizations • Linguistically meaningful naming policy (cognitive transparency) • Rich documentation • Some (minimal) axiomatization to detail the difference among sibling concepts • Explicit linkage to concepts and relations from generic theories • Meta-level assignments to distinguish among the formal primitives assigned to concepts • Languages and implementations that support the previous needs as well as the possibility of collaborative modeling

  41. ONtologic Integration Of Naïve Sources

  42. Lexical and Linguistic Analysis • Morpho-semantic analysis (extraction of semantically meaningful units) • Functional informational structures (extraction of head/modifier syntagmatic structures) • Conceptual polysemy treatment (templates for systematic ambiguity resolution)

  43. Conceptual Issues in Ontology Integration • Ontology integration is, generally speaking, the construction of an ontology C that formally specifies the union of the vocabularies of two other ontologies A and B • To be sure that A and B can be integrated at some level, C has to commit to both A's and B's conceptualizations. In other words, the intension of the concepts in A and B should be mapped to the intension of C's concepts • Unfortunately, this cannot be realized using only the conceptual relations specified in A and B for local tasks (for a specific context). The methodological principle adopted here is that generic ontologies reused from the philosophical, linguistic, mathematical, and AI literature must ground the comparison of different intensions. Our approach may be called principled conceptual integration

  44. Ontology Library Architecture

  45. Aspects of integration • Three aspects of an ontology are taken into account: • the intended models of the conceptualizations of its vocabulary • the domain of interest of such models, i.e. the 'topic' of the ontology • the namespace of the ontology • The most interesting case is when A and B are supposed to commit to the conceptualization of the same domain of interest or of two overlapping domains. In particular, A and B may be:

  46. The main steps (I) • 0. Semantically opaque hierarchies and lists are pre-processed in order to create ‘clean’ taxonomies • 1. All concepts, relations, templates, rules, and axioms from a source ontology are represented in the ONIONS formalisms, currently Loom, Ontolingua, and OKBC • 2. When available, plain text descriptions are analyzed and axiomatized (text formalization) • 3. The union of such products is integrated by means of a set of generic ontologies. This is the most characteristic activity in ONIONS, which can be briefly described as follows:

  47. II • 3.1. For any set of sibling concepts in a taxonomy, the conceptual difference between each of them is inferred, and such difference is formalized by axioms that reuse the relations and concepts already in the library. If no concept is available to represent the difference, new concepts are added to the library • 3.2. For any set of polysemous senses of a term, different concepts are stated and placed within the library according to their topic and to the available modules. (Polysemy occurs when two concepts with overlapping or disjoint intended models have the same name.) • 3.3. Often, polysemous senses of a term - as well as different 'alternative' concepts - are metonymically related. For example: process/outcome (as in inflammation), region/object (as in body region), etc. Alternatives must be properly defined by making explicit the relationship between them: e.g. "has-product" for inflammation, "location" for body-region • 3.4. When stating new concepts, the relations necessary to maintain consistency with the existing concepts are instantiated. If conflicts arise with existing theories, a more general and more comprehensive theory is sought. If this is impracticable, an alternative theory is created

  48. III • 3.5. Relevant integration cases. Since ONIONS requires the use of generic theories to axiomatize alternative theories, the integration of a concept C from an ontology O is performed by comparing C with the concepts D1,…,Dn already present in the evolving ontology library L, whose ontology set M1,…,Mn contains at least a significant subset of generic ontologies and the set of domain ontologies at that state in the evolution of L. The following cases appear relevant to the methodology: • 3.5.1. C's name is polysemous in O (internal polysemy). Iterate steps 3.2 to 3.4 • 3.5.2. C's name is a homonym of the name of a Di. (Homonymy occurs when both the intended models and the domains of two concepts with the same name are disjoint.) Homonyms must be differentiated by modifying the name, or by preventing the homonyms from being included in the same module namespace • 3.5.3. C's name is a synonym of the name of a Di. (Synonymy is the converse of homonymy and occurs when two concepts with different names have both the same intended model and the same domain.) Synonyms must be preserved, or included in the set of lexical realizations related to the concept • 3.5.4. C is subsumed by some Di in L, but it has no total mapping on any Dj in L. The gap in L must be filled by adding C as a subconcept of Di

  49. IV • 3.5.5. C is an intersection between two concepts Di and Dj in L. Solved by distinguishing types and roles, or different defining elements • 3.5.6. C has an alternative concept Di in L (same domain, but overlapping or disjoint intended models): • 3.5.6.1. If C metonymically depends on Di, C is properly related to Di • 3.5.6.2. If C and Di are different viewpoints on the same domain of interest, both concepts are kept; if appropriate, they are included in separate modules • 3.5.6.3. If the intended model of C is finer than Di's, Di is substituted with C • 3.5.6.4. If the intended model of C is coarser than Di's, C is ignored (but track of it is kept for mapping between sources)
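
The definitions of polysemy, homonymy, and synonymy given in steps 3.2, 3.5.2, and 3.5.3 can be sketched as a comparison of two concepts. This is a strong simplification: intended models and domains are represented as finite sets, and the `Concept` record, the `name_relation` function, and all example data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    model: frozenset   # stand-in for the intended model
    domain: frozenset  # stand-in for the domain of interest


def name_relation(c, d):
    """Classify the lexical relation between two concepts (toy approximation)."""
    if c.name == d.name:
        if c.model.isdisjoint(d.model) and c.domain.isdisjoint(d.domain):
            return "homonymy"       # disjoint models and disjoint domains
        if c.model != d.model:
            return "polysemy"       # overlapping or disjoint models
        return "same concept"
    if c.model == d.model and c.domain == d.domain:
        return "synonymy"           # different names, same model and domain
    return "distinct"


process = Concept("inflammation", frozenset({"p1"}), frozenset({"pathology"}))
outcome = Concept("inflammation", frozenset({"o1"}), frozenset({"pathology"}))
organ = Concept("organ", frozenset({"liver"}), frozenset({"anatomy"}))
instrument = Concept("organ", frozenset({"pipe organ"}), frozenset({"music"}))
mi = Concept("myocardial infarction", frozenset({"e1"}), frozenset({"cardiology"}))
attack = Concept("heart attack", frozenset({"e1"}), frozenset({"cardiology"}))

print(name_relation(process, outcome))
print(name_relation(organ, instrument))
print(name_relation(mi, attack))
```

Under these definitions homonymy is a special case of sharing a name with disjoint models, so the sketch tests it before polysemy.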

  50. V • 4. The library of generic, intermediate, and domain ontologies should be stratified, i.e. domain modules should include intermediate modules, which in turn should include generic modules, so that each set of modules can be plugged into or unplugged from its more general set without affecting the coherence of the entire library • 5. The source ontologies are explicitly mapped to the integrated ontology, in order to allow interoperability. The only admitted mappings are equivalent and coarser equivalent. Formally: for any source ontology SO and an ontology IO that is supposed to result (also) from the integration of SO, for any concept $C_i$ in SO, there is a $D_i$ in IO such that $C_i^I = D_i^I$ (equivalence of possible interpretations), or there is a disjunctive concept (or $D_i$ $D_j$) in IO such that $C_i^I = D_i^I \cup D_j^I$ (equivalence of possible interpretations to a disjunction of concepts, i.e. to a union of finer concepts) • 5.1. Partial mappings must have been already resolved through the methodology: if any remain, some step in the integration procedure must be iterated
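
The two admitted mappings in step 5 can be sketched as a lookup over interpretations. The sketch assumes interpretations are finite sets and only searches disjunctions of two concepts, as in the formal statement; `find_mapping` and the example concepts are hypothetical.

```python
from itertools import combinations


def find_mapping(source_interp, integrated):
    """Return ('equivalent', (D,)) or ('coarser equivalent', (Di, Dj)), else None.

    `integrated` maps concept names in the integrated ontology to their
    interpretations (sets). None signals a partial mapping still to resolve.
    """
    for name, interp in integrated.items():
        if interp == source_interp:
            return ("equivalent", (name,))
    for (n1, i1), (n2, i2) in combinations(integrated.items(), 2):
        if i1 | i2 == source_interp:
            return ("coarser equivalent", (n1, n2))
    return None


# Toy integrated ontology with two finer concepts (assumed data).
integrated = {
    "ViralHepatitis": frozenset({"h1", "h2"}),
    "ToxicHepatitis": frozenset({"h3"}),
}

print(find_mapping(frozenset({"h1", "h2"}), integrated))
print(find_mapping(frozenset({"h1", "h2", "h3"}), integrated))
print(find_mapping(frozenset({"h1"}), integrated))
```

A `None` result corresponds to the partial mappings of step 5.1, which the methodology requires to be resolved by iterating an earlier integration step.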
