
How to build an ontology 2



Presentation Transcript


  1. How to build an ontology 2 • Barry Smith • http://ontology.buffalo.edu/smith

  2. The 3-level Distinction • Level 1: • everything that exists (things, processes, data …); • Level 2: • ideas in people’s minds (diagnoses, thoughts, images in your head, expectations, beliefs, fears …) • Level 3: • publicly available (published, written down, drawn, recorded, saved) versions of level 2 entities (ontologies, databases, journal articles, newspaper reports, diaries …)

  3. The 3-level Distinction • Level 1: • #120: an incident that happened; • Level 2: • #213: the interpretation by some cognitive agent that #120 is a security breach; • #31: the expectation by some cognitive agent that similar incidents might happen in the future; • Level 3: • #402: an entry in an information system concerning #120; • #1503: an entry in some other information system about #31 for mitigation or prevention purposes.

  4. How do we know which general terms designate universals? • Roughly: terms used by scientists to designate entities about which we have a plurality of different kinds of testable proposition • (cell, electron ...)

  5. More precisely: terms which designate universals are: • General • Used in current scientific textbooks to express laws of nature • Logically non-compound (‘non-rabbit’, ‘rabbit or violin’ do not designate universals) • Free of parts designating particulars (‘cat in Leipzig’, ‘Finnish spy’ do not designate universals)

  6. Class =def • a maximal collection of particulars determined by a general term • (‘cell’, ‘electron’, but also: ‘restaurant in Palo Alto’, ‘Italian’) • the class A • = the collection of all particulars x for which ‘x is A’ is true

  7. universals vs. their extensions • universals • {a,b,c,...} collections of particulars

  8. Extension =def • The extension of a universal A is the class: instance of the universal A • (it is the class of A’s instances) • (the class of all entities to which the term ‘A’ applies)
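The definitions on slides 6 and 8 can be sketched in Python. This is only an illustration, assuming that particulars can be modelled as simple records and a general term as a predicate over them; `Particular`, `extension`, and the sample data are hypothetical names, not part of the presentation.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable


@dataclass(frozen=True)
class Particular:
    """A hypothetical record standing in for one particular in reality."""
    name: str
    kind: str
    city: str = ""


def extension(term: Callable[[Particular], bool],
              domain: Iterable[Particular]) -> FrozenSet[Particular]:
    """Return the class of all particulars x for which 'x is A' is true."""
    return frozenset(x for x in domain if term(x))


# Invented sample domain, for illustration only.
DOMAIN = [
    Particular("c1", "cell"),
    Particular("e1", "electron"),
    Particular("r1", "restaurant", "Palo Alto"),
    Particular("r2", "restaurant", "Leipzig"),
]


def is_cell(x: Particular) -> bool:
    # General term that designates a universal.
    return x.kind == "cell"


def is_restaurant_in_palo_alto(x: Particular) -> bool:
    # General term that mentions a particular place, so it only
    # determines a class, not a universal.
    return x.kind == "restaurant" and x.city == "Palo Alto"


cells = extension(is_cell, DOMAIN)
palo_alto_restaurants = extension(is_restaurant_in_palo_alto, DOMAIN)
```

Both terms determine a class in the same way; whether a term also designates a universal is decided by the criteria on slide 5, not by anything in the code.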

  9. Problem • The same general term can be used to refer both to universals and to collections of particulars. Consider: • HIV is an infectious retrovirus • HIV is spreading very rapidly through Asia

  10. universals vs. classes • universals • {c,d,e,...} classes

  11. universals vs. classes • universals • defined classes

  12. universals vs. classes • universals • populations, ...

  13. Defined class =def • a class defined by a general term which does not designate a universal • the class of all diabetic patients in Leipzig on 4 June 1952

  14. OWL is a good representation of defined classes • sibling of Finnish spy • member of Abba aged > 50 years

  15. Terminology =def. • a representational artifact whose representational units are natural language terms (with IDs, synonyms, comments, etc.) which are intended to designate universals together with defined classes.

  16. ? universals, classes, concepts • universals • defined classes • ‘concepts’

  17. universals < defined classes < ‘concepts’ • ‘concepts’ which do not correspond to defined classes: • ‘Surgical or other procedure not carried out because of patient's decision’ • ‘Congenital absent nipple’ • because they do not correspond to anything

  18. (Scientific) Ontology =def. • a representational artifact whose representational units (which may be drawn from a natural or from some formalized language) are intended to represent • 1. universals in reality • 2. those relations between these universals which obtain universally (= for all instances) • lung is_a anatomical structure • lobe of lung part_of lung

  19. Part II: How to Build an Ontology

  20. How to build an ontology • work with scientists to create an initial top-level classification • find the ~50 most commonly used terms corresponding to universals in reality • arrange these terms into an informal is_a hierarchy according to this Universality principle: • A is_a B =def every instance of A is an instance of B • fill in missing terms to give a complete hierarchy • (leave it to domain scientists to populate the lower levels of the hierarchy)
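The Universality principle on slide 20 can be checked against sample data. The following is a minimal sketch, assuming each term comes with a set of known instance identifiers; `INSTANCES`, `counterexamples`, and `check_hierarchy` are hypothetical names. A finite sample can only refute an is_a assertion, never prove it.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical instance data: term -> identifiers of its known instances.
INSTANCES: Dict[str, Set[str]] = {
    "anatomical structure": {"lung_1", "lung_2", "heart_1"},
    "lung": {"lung_1", "lung_2"},
}


def counterexamples(child: str, parent: str,
                    instances: Dict[str, Set[str]]) -> Set[str]:
    """Instances of `child` that are not instances of `parent`."""
    return instances.get(child, set()) - instances.get(parent, set())


def check_hierarchy(is_a_edges: Iterable[Tuple[str, str]],
                    instances: Dict[str, Set[str]]
                    ) -> Dict[Tuple[str, str], Set[str]]:
    """Map each violated 'child is_a parent' assertion to its counterexamples."""
    report = {}
    for child, parent in is_a_edges:
        bad = counterexamples(child, parent, instances)
        if bad:
            report[(child, parent)] = bad
    return report


ok = check_hierarchy([("lung", "anatomical structure")], INSTANCES)
wrong_way = check_hierarchy([("anatomical structure", "lung")], INSTANCES)
```

Here `ok` is empty, while `wrong_way` reports `heart_1` as an anatomical structure that is not a lung.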

  21. Principle of Low Hanging Fruit • Include even absolutely trivial assertions (assertions you know to be universally true) • pneumococcal virus is_a virus • Computers need to be led by the hand

  22. Goal: Each term in an ontology represents exactly one universal • there are universals also of collectivities: • population • complex of cells

  23. the use-mention confusion • swimming is healthy and has eight letters

  24. Principle • Avoid confusing words with things • Avoid confusing concepts in our minds with entities in reality • Recommendation: avoid the word ‘concept’ entirely

  25. Principle • For the sake of interoperability with other ontologies, do not give special meanings to terms with established general meanings • (Don’t use ‘cell’ when you mean ‘plant cell’)

  26. Principle • Supply definitions wherever possible • (both human-understandable natural language definitions, and equivalent formal definitions)

  27. Principle • Each term should have at most one definition • which may have both natural-language and formal versions

  28. The Problem of Circularity • A Person = def. A person with an identity document • cell = def. plant cell, consisting of protoplast and cell wall; ...

  29. Principle • Avoid circular definitions • (The term defined should not appear in its own definition)
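The rule on slide 29 lends itself to a simple automated check. This is a rough sketch, assuming definitions are stored as plain strings; `is_circular` is a hypothetical helper, and the whole-word match is deliberately naive (it will miss plurals and other inflected forms).

```python
import re


def is_circular(term: str, definition: str) -> bool:
    """True if the defined term appears as a whole word or phrase
    inside its own definition (case-insensitive)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, definition, flags=re.IGNORECASE) is not None


# The two circular examples from slide 28, plus the definition from slide 31.
person = is_circular("person", "A person with an identity document")
cell = is_circular("cell", "plant cell, consisting of protoplast and cell wall")
human = is_circular("human being", "an animal which is rational")
```

The first two calls return `True` and the third returns `False`.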

  30. Principle • A definition should use terms which are easier to understand than the term defined

  31. Principle • Use Aristotelian definitions • An A is a B which C’s. • A human being is an animal which is rational

  32. Principle • Do not seek to define everything

  33. In every ontology • some terms and some relations are primitive = they cannot be defined (on pain of infinite regress) • Examples of primitive relations: • identity • instance_of

  34. Rules for formatting terms • Avoid abbreviations even when it is clear in context what they mean (‘breast’ for ‘breast tumor’) • Avoid acronyms • Avoid mass terms (‘tissue’, ‘brain mapping’, ‘clinical research’ ...) • Treat each term ‘A’ in an ontology as shorthand for a term of the form ‘the universal A’

  35. Univocity • Terms should have the same meanings on every occasion of use. • (= They should refer to the same universals) • Basic ontological relations such as is_a and part_of should be used in the same way by all ontologies

  36. Universality • Ontologies are made of relational assertions • They should include only those which hold universally • pneumococcal virus causes pneumonia

  37. Universality • Often, order will matter: • We can assert • adult transformation_of child • but not • child transforms_into adult

  38. Universality • viral pneumonia caused by virus • but not • virus causes pneumonia • pneumococcal virus causes pneumonia

  39. Universality • results analysis later_than protocol-design • BUT NOT • protocol-design earlier_than results analysis
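Why order matters on slides 37 to 39 can be illustrated with a small check. The sketch assumes an "all-some" reading of universal assertions (every instance of the first term stands in the relation to some instance of the second), which the slides do not spell out; `holds_universally` and the sample data are hypothetical.

```python
from typing import Iterable, Set, Tuple


def holds_universally(subjects: Iterable[str],
                      relation: Set[Tuple[str, str]],
                      objects: Iterable[str]) -> bool:
    """True if every subject instance is related to at least one object instance."""
    object_list = list(objects)
    return all(any((s, o) in relation for o in object_list) for s in subjects)


# Invented sample: Ann grew up; Bob did not reach adulthood.
CHILDREN = {"child_ann", "child_bob"}
ADULTS = {"adult_ann"}

TRANSFORMATION_OF = {("adult_ann", "child_ann")}
TRANSFORMS_INTO = {(child, adult) for adult, child in TRANSFORMATION_OF}

adult_from_child = holds_universally(ADULTS, TRANSFORMATION_OF, CHILDREN)
child_to_adult = holds_universally(CHILDREN, TRANSFORMS_INTO, ADULTS)
```

Under this reading, `adult transformation_of child` holds for every adult in the sample, while `child transforms_into adult` fails because one child never becomes an adult.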

  40. Positivity • Complements of universals are not themselves universals. • Terms such as • non-mammal • non-membrane • other metalworker in New Zealand • do not designate universals in reality

  41. Positivity • What about non-smoker?

  42. Objectivity • Which universals exist in reality is not a function of our knowledge. • Terms such as • unknown • unclassified • unlocalized • arthropathies not otherwise specified • do not designate universals in reality.
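The Positivity and Objectivity principles on slides 40 and 42 suggest a simple term lint. This is a heuristic sketch, not a rule from the presentation: the pattern lists are assumptions drawn from the slide examples, and `flag_term` is a hypothetical helper. Flagged terms still need human review, as the ‘non-smoker’ question on slide 41 shows.

```python
import re
from typing import Dict, List

# Assumed patterns, derived only from the example terms on the slides.
SUSPECT_PATTERNS: Dict[str, List[str]] = {
    "positivity": [r"^non-", r"^other\b"],
    "objectivity": [
        r"\bunknown\b",
        r"\bunclassified\b",
        r"\bunlocalized\b",
        r"\bnot otherwise specified\b",
    ],
}


def flag_term(term: str) -> List[str]:
    """Return the principles a term appears to violate (possibly none)."""
    lowered = term.strip().lower()
    return [
        principle
        for principle, patterns in SUSPECT_PATTERNS.items()
        if any(re.search(p, lowered) for p in patterns)
    ]


flags = {
    term: flag_term(term)
    for term in [
        "non-mammal",
        "other metalworker in New Zealand",
        "arthropathies not otherwise specified",
        "lung",
    ]
}
```

The first two terms are flagged under positivity, the third under objectivity, and ‘lung’ passes unflagged.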

  43. Keep Epistemology Separate from Ontology • If you want to say that • We do not know where A’s are located • do not invent a new class of • A’s with unknown locations • (A well-constructed ontology should grow linearly; it should not need to delete classes or relations because of increases in knowledge)

  44. Keep Sentences Separate from Terms • If you want to say • I surmise that this is a case of pneumonia • do not invent a new class of surmised pneumonias • Confusion of ‘findings’ in medical terminologies

  45. Single Inheritance • No kind in a classificatory hierarchy should have more than one is_a parent on the immediate higher level

  46. Multiple Inheritance • [Diagram: ‘thing’ at the top, with ‘car’ and ‘blue thing’ beneath it; ‘blue car’ is linked by is_a to both ‘car’ and ‘blue thing’]
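The single-inheritance rule on slide 45 can be checked mechanically. A minimal sketch, assuming the hierarchy is given as (child, parent) is_a pairs; `multiple_parents` is a hypothetical helper, and the edge list encodes the blue-car example from slide 46.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def multiple_parents(is_a_edges: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return every term that has more than one is_a parent, with its parents sorted."""
    parents: Dict[str, Set[str]] = defaultdict(set)
    for child, parent in is_a_edges:
        parents[child].add(parent)
    return {child: sorted(ps) for child, ps in parents.items() if len(ps) > 1}


# is_a edges for the blue-car example.
EDGES = [
    ("car", "thing"),
    ("blue thing", "thing"),
    ("blue car", "car"),
    ("blue car", "blue thing"),
]

violations = multiple_parents(EDGES)
```

Only ‘blue car’ is reported, with ‘blue thing’ and ‘car’ as its two parents.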

  47. Multiple Inheritance • is a source of errors • encourages laziness • serves as obstacle to integration with neighboring ontologies • hampers use of Aristotelian methodology for defining terms • hampers use of statistical search tools

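  48. Multiple Inheritance • [Diagram: ‘thing’ at the top, with ‘blue thing’ and ‘car’ beneath it; ‘blue car’ is linked to its two parents by two differently labelled relations, is_a1 and is_a2]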

  49. is_a Overloading • The success of ontology alignment demands that ontological relations (is_a, part_of, ...) have the same meanings in the different ontologies to be aligned.
