Ontologies Piek Vossen VU University Amsterdam
Overview • Ontologies versus lexicons • Ontological starting points • Comparison of available ontologies • Identity criteria • Basic Formal Ontology
Why ontologies? • Lexicons of the future will depend on ontologies; • Semantic data in the lexicon partially reflect world knowledge; • World knowledge is stored externally, for example in the Open Data Cloud, a network of RDF data resources; • Lexicons contain linguistic knowledge that is not found in an encyclopedia
World knowledge in Wordnet
• POS: v | ID: ENG20-02177556-v | BCS: 1 | Synonyms: sell:1 | Definition: exchange or deliver for money or its equivalent | Domain: commerce | SUMO/MILO: Selling | -> [hypernym] exchange:1, change:7, interchange:1 transfer:5
• POS: v | ID: ENG20-02143689-v | BCS: 2 | Synonyms: buy:1, purchase:1 | Definition: obtain by purchase; acquire by means of a financial transaction | Domain: commerce | SUMO/MILO: Buying | -> [hypernym] get:1, acquire:1
SUMO
• Selling: (documentation Selling EnglishLanguage "A FinancialTransaction in which an instance of Physical is exchanged for an instance of CurrencyMeasure.")
• Buying: (documentation Buying EnglishLanguage "A FinancialTransaction in which an instance of CurrencyMeasure is exchanged for an instance of Physical.")
• FinancialTransaction: (documentation FinancialTransaction EnglishLanguage "A Transaction where an instance of Currency is exchanged for something else.")
Lexicon ontology mapping
Lexicon:
• sell: subj(x), direct obj(z), indirect obj(y)
• buy: subj(y), direct obj(z), indirect obj(x)
Ontology: (and (instance x Human) (instance y Human) (instance z Entity) (instance e FinancialTransaction) (source x e) (destination y e) (patient z e))
The same process is described from a different perspective, depending on which participant is realized as subject and which as object. Other examples: Russian has two verbs for "marry"; French apprendre can mean both "teach" and "learn".
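The mapping on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the `LEXICON` table, the `to_event` function and the example fillers (Mary, John, book) are all hypothetical names introduced here. Only the role assignments (subject of sell = source, subject of buy = destination, direct object = patient) come from the slide.

```python
# Hypothetical lexicon: each verb maps its grammatical functions onto the
# participant roles of one and the same FinancialTransaction event.
LEXICON = {
    "sell": {"subj": "source", "direct_obj": "patient", "indirect_obj": "destination"},
    "buy": {"subj": "destination", "direct_obj": "patient", "indirect_obj": "source"},
}


def to_event(verb, subj, direct_obj, indirect_obj):
    """Return a perspective-neutral event description for one clause."""
    frame = LEXICON[verb]
    fillers = {"subj": subj, "direct_obj": direct_obj, "indirect_obj": indirect_obj}
    event = {"type": "FinancialTransaction"}
    for function, filler in fillers.items():
        # Store each filler under its ontological role, not its grammatical function.
        event[frame[function]] = filler
    return event


# "Mary sells the book to John" and "John buys the book from Mary"
sold = to_event("sell", subj="Mary", direct_obj="book", indirect_obj="John")
bought = to_event("buy", subj="John", direct_obj="book", indirect_obj="Mary")
print(sold == bought)
```

Both clauses yield the same event record, which is the point of the slide: the lexicon encodes the perspective, the ontology encodes the process.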
Linking Open Data http://richard.cyganiak.de/2007/10/lod/
Knowledge pyramid [Figure: pyramid with layers labelled social computer & human networks, social computer networks, RDF databases, social networks, web, Google index]
Ontologies versus Lexicons • Lexicons contain the knowledge about words and expressions that is necessary to communicate effectively in a language; • The lexicon interacts with the grammar and the discourse model; • Lexical knowledge is part of general knowledge of the world; • Lexical knowledge is subconscious knowledge (like playing the piano), whereas our knowledge of the world is of a higher level (like the theory of harmony)
Ontologies versus lexicons • Language is an instrument for communication: • utterances are never completely descriptive • Minimal & sufficient information for a communicative effect (Gricean maxims)
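Newspaper headings & captions: Vrij Nederland, “Geknipt voor u” (“Clipped for you”)
• Veel vrouwen verdienen minimumloon (Many women earn / deserve minimum wage)
• Herder bijt schaap (Shepherd / shepherd dog bites sheep)
• Zwembad loopt leeg (Swimming pool empties: of water or of people)
• Dames lopen uit (Ladies pull ahead / run over time)
• Winkelende vrouw raakt geld kwijt (Shopping woman loses money)
• Dode zwemmer (Dead swimmer)
• Vrouw draagt kruis paus (Woman carries / wears the pope's cross)
• Eieren gooien terug op braderie (Egg throwing returns to the fair / Eggs throw back at the fair)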
Ontologies versus lexicons • Speakers/writers make assumptions about the addressee: • Knowledge of the world (Schank ('70): grammar does not exist, conceptual dependencies) • Knowledge of language • Knowledge about the communicative settings
Ontologies versus lexicons • A multilingual perspective sheds light on the delineation of lexical and world knowledge: • water = substance & mass noun • sand = substance & mass noun, but granular • grass = substance & mass noun, but granular • rice, bran (Dutch plural: zemelen), chives (Dutch uncountable: bieslook), oats (Dutch: haver, havervlokken, havermeel) = substance? & mass noun or plural • forest = group noun: one, two forests (Dutch bos = group and mass: een, twee bossen 'one, two forests'; veel bos 'much forest') • Linguistic variation around border cases: • limited forms -> symbolic • infinite & analogue reality
Autonomous & Language-Specific
[Figure: two hierarchies side by side]
• Wordnet1.5: object; artifact, artefact (a man-made object); natural object (an object occurring naturally); block; instrumentality; body; box; spoon; bag; device; implement; container; tool; instrument
• Dutch Wordnet: voorwerp {object}; blok {block}; lichaam {body}; werktuig {tool}; bak {box}; lepel {spoon}; tas {bag}
Linguistic versus Artificial Ontologies • Artificial ontology: • aims at better control or performance, or a more compact and coherent structure; • introduces artificial levels for concepts that are not lexicalized in a language (e.g. instrumentality, hand tool); • neglects levels that are lexicalized but not relevant for the purpose of the ontology (e.g. tableware, silverware, merchandise). • What properties can we infer for spoons? • spoon -> container; artifact; hand tool; object; made of metal or plastic; for eating, pouring or cooking
Linguistic versus Artificial Ontologies • Linguistic ontology: • Exactly reflects the relations between all the lexicalized words and expressions in a language. • Captures valuable information about the lexical capacity of languages: the available fund of words and expressions in a language. • What words can be used to name spoons? • spoon -> object, tableware, silverware, merchandise, cutlery
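The contrast between the two kinds of hierarchy can be made concrete with a small Python sketch. The two tables hold only the classes listed for "spoon" on the slides; the table and function names are hypothetical, and the flat sets are a simplifying assumption (the slides do not give the full tree structure).

```python
# Classes listed for "spoon" in an artificial ontology (used for inference)
# and in a linguistic ontology (used for naming and substitution).
ARTIFICIAL = {"spoon": {"container", "artifact", "hand tool", "object"}}
LINGUISTIC = {"spoon": {"object", "tableware", "silverware", "merchandise", "cutlery"}}


def shared(word):
    """Classes that both hierarchies assign to the word."""
    return ARTIFICIAL[word] & LINGUISTIC[word]


def lexical_only(word):
    """Lexicalized classes that the artificial ontology leaves out."""
    return LINGUISTIC[word] - ARTIFICIAL[word]


def artificial_only(word):
    """Artificial levels that are not part of the linguistic hierarchy."""
    return ARTIFICIAL[word] - LINGUISTIC[word]


print(sorted(shared("spoon")))           # ['object']
print(sorted(lexical_only("spoon")))     # ['cutlery', 'merchandise', 'silverware', 'tableware']
print(sorted(artificial_only("spoon")))  # ['artifact', 'container', 'hand tool']
```

In this toy data the two hierarchies agree only on "object", which shows how little they need to overlap when they serve different purposes.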
Wordnets versus ontologies • Wordnets: • autonomous language-specific lexicalization patterns in a relational network. • Usage: to predict substitution in text for information retrieval, text generation, machine translation, word-sense disambiguation. • Ontologies: • data structure with formally defined concepts. • Usage: making semantic inferences.
Ontological starting points • What is being defined: realists versus conceptualists • scientific definition of the world • cognitive, cultural perception and interpretation • How much room for different perspectives? • Engineering point of view: what is required by applications? • Top level ontologies versus domain ontologies • Principles for ontology design • Sharing, re-use, interoperability
Comparing available ontologies • Mascardi, Cordì, and Rosso (2008) • 7 different Upper Ontologies: BFO, Cyc, DOLCE, GFO, PROTON, Sowa’s ontology, and SUMO, • software engineering criteria: • Number of Dimensions. • Implementation language(s) • Modularity. • Use in Applications. • Alignment with WordNet. • Licensing.
OntoClean (Guarino & Welty) • Methodology for designing and building ontologies that ease re-use and integration • Based on intuitions about how we, as cognitive agents, interact with the world (sensory system, cognition & culture) • Purpose: to design ontologies for information systems
Basic Notions • Identity through an essential (intrinsic) property, e.g. DNA, a person’s brain • Which properties can change while identity is maintained? • Other ways of establishing identity: • Being a member of a class: does not keep the individual members apart • Globally unique IDs: a hack that does not explain how two descriptions can be the same
Identity criteria (Guarino and Welty) • Rigidity: to what extent are properties of an entity true in all or most worlds? E.g., a man is always a person but may bear a Role like student only temporarily. Thus manhood is a rigid property while studenthood is anti-rigid • Essence: which properties of entities are essential? For example, “shape” is an essential property of “vase” but not an essential property of the clay it is made of. • Unicity: which entities represent a whole and which entities are parts of these wholes? An “ocean” or “river” represents a whole but the “water” it contains does not.
Individuals and Concepts • The term "meta-property" adopted here is based on a fundamental distinction within the domain of discourse: • individuals or particulars vs. • concepts or universals • Meta-level properties induce distinctions among concepts, while object-level properties induce distinctions among individuals
Rigidity • A property is essential to an individual iff it necessarily holds for that individual • A property is rigid (+R) iff, necessarily, it is essential to all its instances. A property is non-rigid (-R) iff it is not essential to some of its instances, and anti-rigid (~R) iff it is essential to none of its instances • Ex: Person (rigid) vs. Student (anti-rigid)
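The three rigidity definitions can be written in quantified modal logic. The rendering below is a sketch under two assumptions: time indices are left out, and non-rigid is stated as the plain negation of rigid. Here φ stands for the property being classified, and the box and diamond symbols (which need the amssymb package) read "necessarily" and "possibly".

```latex
\begin{align*}
\text{rigid } (+R):           &\quad \Box\,\forall x\,\bigl(\phi(x) \rightarrow \Box\,\phi(x)\bigr) \\
\text{non-rigid } (-R):       &\quad \Diamond\,\exists x\,\bigl(\phi(x) \wedge \neg\Box\,\phi(x)\bigr) \\
\text{anti-rigid } (\sim R):  &\quad \Box\,\forall x\,\bigl(\phi(x) \rightarrow \neg\Box\,\phi(x)\bigr)
\end{align*}
```

Read this way, every anti-rigid property that can have instances is also non-rigid, which matches the note "non = not essential to some, anti = not essential to all" in the typology.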
Identity • A property carries an identity criterion (+I) iff all its instances can be (re)identified by means of a suitable sameness relation. A property supplies an identity criterion (+O) iff that criterion is not inherited from any subsuming property • Ex: Person vs. Student
Dependence • An individual x is constantly dependent on y iff, at any time, x can't be present unless y is fully present, and y is not part of x. Ex: Hole/Host • A property P is constantly dependent (+D) iff, for all its instances, there exists something they are constantly dependent on. • Here Dependent = Constantly Dependent
Types vs. Roles • A rigid property that supplies an identity criterion and is not (notionally) dependent is called a type. • An anti-rigid property that is notionally dependent is called a role. It is a material role if it carries (but does not supply) an identity criterion, and a formal role otherwise. • Ex: Person vs. Student vs. Part
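The definitions of type, material role and formal role translate directly into a decision procedure. The Python sketch below is illustrative only: the `MetaProps` record, the `classify` function and the meta-property values assigned to Person, Student and Part are assumptions chosen to match the usual OntoClean reading of these examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaProps:
    rigidity: str            # "+R", "-R" or "~R"
    carries_identity: bool   # +I
    supplies_identity: bool  # +O
    dependent: bool          # +D


def classify(p):
    """Apply the type/role definitions; anything else is left as 'other'."""
    if p.rigidity == "+R" and p.supplies_identity and not p.dependent:
        return "type"
    if p.rigidity == "~R" and p.dependent:
        if p.carries_identity and not p.supplies_identity:
            return "material role"
        return "formal role"
    return "other"


# Assumed meta-property assignments for the three slide examples.
person = MetaProps("+R", carries_identity=True, supplies_identity=True, dependent=False)
student = MetaProps("~R", carries_identity=True, supplies_identity=False, dependent=True)
part = MetaProps("~R", carries_identity=False, supplies_identity=False, dependent=True)

print(classify(person))   # type
print(classify(student))  # material role
print(classify(part))     # formal role
```

Properties that fit neither definition, such as categories or attributes, fall through to "other" in this sketch.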
Typology of meta properties O = carries its own identity I = carries an identity condition, possibly inherited
Typology of meta properties
• Formal Property (-I)
• Category: -I,+R (entity, location)
• Attribute: -I,-R,-D (red, male)
• Formal role: -I,~R,+D (part, patient)
• Sortal (+I)
• Material role: +I,+D,~R (student, food)
• Phase sortal: +I,-D,~R (caterpillar)
• Type & Attribute: +I,-D,-R (red apple)
• Type: +I,+R (apple, person)
• Merely essential sortal: +I,+R (invertebrate, mammals)
• Role (~R,+D): formal role and material role
• Essential: +R; Non-Essential: -R; Anti-Essential: ~R
• non = not essential to some; anti = not essential to all
Extensionality • An individual is said to be extensional iff, necessarily, everything that has the same proper parts is identical to it. Ex: an amount of matter • A property is extensional (+E) iff, necessarily, all its instances are extensional • A property is anti-extensional (~E) iff, necessarily, all its instances are non-extensional, so that they can possibly change some parts while keeping their identity. Ex: persons and their bodies
Unity • An individual is unified by a (suitably constrained) relation R iff it is a mereological sum of entities that are bound together by R. Ex: the relation "having the same boss" may unify a group of employees in a company -> establishes a group • An individual w is a whole under R iff it is maximally unified by R, in the sense that R is internal to w, and no part of w is linked by R to something that is not part of w • A property P is said to carry unity (+U) if there is a common unifying relation R such that all the instances of P are essential wholes under R. A property carries anti-unity (~U) if all its instances can possibly be non-wholes. If every instance of P is an essential whole, but there is no unifying relation common to all instances of P, then we mark P with the property *U
Singularity and Plurality • An individual is a singular whole iff its unifying relation is the transitive closure of the relation "strong connection", like that existing between two 3D regions that have a surface in common. Topological wholes of this kind have a special cognitive relevance, which accounts for the natural language distinction between singular and plural -> countability • A plural individual is a sum of singular wholes that is not itself a singular whole. Plural individuals may be wholes themselves or not. In the former case they will be called collections; in the latter case pluralities • A piece of coal is a singular whole. A lump of coal is a topological whole, but not a singular whole, since the pieces of coal merely touch each other, with no material connection. It is therefore a plural whole
Messy taxonomy [Figure: taxonomy tree rooted at entity (-I-U-D+R). Nodes: Group, Red, Agent, Location, Amount of matter, Group of people, Physical Object, Social entity, Living being, Food, Legal entity, Fruit, Animal, Organization, Apple, Vertebrate, Country, Caterpillar, Red Apple, Butterfly, Person]
Methodology • Analyse each property according to meta-properties • Remove all properties except for categories and essential sortals • Remove subsumption between incompatible identity conditions • Add Phasal sortals • Add attributes, roles and mixed types
Some conflicts • car -> physical object + amount of matter • animal -> living being + physical object • organization -> group of people (+ME) + social entity (-ME) + legal agent
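Checking a taxonomy for such conflicts can be automated once every property is tagged. The Python sketch below encodes subsumption constraints as they are usually stated in the OntoClean literature (they are not spelled out on the slides, so treat the rule list as an assumption): an anti-rigid property cannot subsume a rigid one, +I cannot subsume -I, +U cannot subsume a property lacking unity, ~U cannot subsume +U, and +D cannot subsume -D. The tag dictionaries for Group, Organization, Living being and Animal are illustrative values.

```python
# Each rule: (meta-property, parent value, forbidden child values, message).
RULES = [
    ("R", "~", {"+"}, "anti-rigid (~R) cannot subsume rigid (+R)"),
    ("I", "+", {"-"}, "identity (+I) cannot subsume no identity (-I)"),
    ("U", "+", {"-", "~"}, "unity (+U) cannot subsume lack of unity (-U, ~U)"),
    ("U", "~", {"+"}, "anti-unity (~U) cannot subsume unity (+U)"),
    ("D", "+", {"-"}, "dependent (+D) cannot subsume independent (-D)"),
]


def violations(parent, child):
    """List the constraints broken if `parent` subsumes `child`."""
    return [
        message
        for key, parent_value, forbidden, message in RULES
        if parent[key] == parent_value and child[key] in forbidden
    ]


# Illustrative tags (assumed values, one character per meta-property).
group = {"R": "+", "I": "+", "U": "~", "D": "-"}
organization = {"R": "+", "I": "+", "U": "+", "D": "-"}
living_being = {"R": "+", "I": "+", "U": "+", "D": "-"}
animal = {"R": "+", "I": "+", "U": "+", "D": "-"}

print(violations(group, organization))  # one violation: ~U over +U
print(violations(living_being, animal))  # no violations
```

A link that returns a non-empty list is a candidate for removal in the methodology's third step; the modeller still has to decide how the child should be re-attached.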
Cleaner taxonomy [Figure: taxonomy tree; nodes listed with their meta-property labels]
• entity: -I-U-D+R
• Group: +O~U-D+R
• Location: +O-U-D+R
• Amount of matter: +O-U-D+R
• Group of people: +I
• Living being: +O+U-D+R
• Social entity: -I+U-D+R
• Physical Object: +O+U-D+R
• Animal
• Organization: +O+U-D+R
• Fruit
• Vertebrate: +I
• Apple
• Person
Clean taxonomy [Figure: taxonomy tree rooted at entity (-I-U-D+R). Nodes: Group, Group of people, Location, Amount of matter, Red, Agent, Social entity, Physical Object, Living being, Food, Legal entity, Fruit, Country, Animal, Apple, Lepidopteran, Organization, Vertebrate, Region, Red Apple, Person, Caterpillar, Butterfly. Meta-property labels shown in the figure: +i-o~u-d+r, +o+u-d+r, +i-o~u+d~r, +o+u-d+r, +o+u-d+r, +o-u-d+r, +i+o+u-d-r, +o+u-d+r, +i+u-d~r, +i+u-d~r]
Basic Formal Ontology • Realist approach to ontology, based on science: • independent of our linguistic, conceptual, theoretical, cultural representations • reality existed before humans • Perspectivalism: • there are many different representations that are equally good -> different levels of granularity (atoms, molecules, organisms, ecosystems, galaxies) • Fallibilism: science can be wrong • Adequatism: given the domain, choose the adequate granularity
Substances and processes exist in time in different ways [Figure: time axis, with a process and a substance]
[Figure: the same time axis; the substance is captured by a snapshot ontology, the process by a video ontology]
SNAP vs SPAN • Objects vs. events • Continuants vs. occurrents • Nouns vs. verbs • In preparing an inventory of reality, we keep track of these two different kinds of entities in two different ways
SNAP and SPAN • anatomy and physiology