Ontology Engineering with OntoClean Chris Welty IBM Watson Research Center
Acknowledgements • People: Nicola Guarino, Claudio Masolo, Aldo Gangemi, Alessandro Oltramari, Bill Andersen • Organizations: IBM Research; Vassar College, USA; LADSEB-CNR, Padova; CNR Cognitive Science Institute, Trento; OntologyWorks, Inc.
Which one is better? ThinkPad Model ThinkPad T Series model T-Series Thinkpad
Computer Part Disk Drive Memory Which one is better? Computer Computer has-part has-part Disk Drive Memory Computer Part Disk Part Memory Part Due to: Guizzardi, et al, 2004.
Formal Ontology of Relations • Subsumption • Instantiation • Part/Whole • Constitution • Spatial (Cohn) • Temporal (Allen)
Subsumption • The most pervasive relationship in ontologies • Influence of taxonomies and OO • AKA: is-a, a-kind-of, specialization-of, subclass (Brachman, 1983) • “horse is a mammal” • Capitalizes on general knowledge • Helps deal with complexity, structure • Reduces requirement to acquire and represent redundant specifics • What does it mean? $\Box\,\forall x\,(\phi(x) \rightarrow \rho(x))$, where $\phi$ is the subclass and $\rho$ the superclass: every instance of the subclass is necessarily an instance of the superclass
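The reading "every instance of the subclass is necessarily an instance of the superclass" can be checked extensionally: in every possible state of the world, the subclass's instances must be a subset of the superclass's instances. A minimal sketch, assuming a toy list of worlds; the function name `subsumes` and the individuals are hypothetical illustration data, not part of the slides:

```python
# Each "world" maps a class name to the set of its instances in that world.
# The worlds and the individuals in them are made-up illustration data.
worlds = [
    {"Horse": {"secretariat"}, "Mammal": {"secretariat", "flipper"}},
    {"Horse": {"secretariat", "trigger"}, "Mammal": {"secretariat", "trigger"}},
]

def subsumes(superclass, subclass, worlds):
    """True if, in every world, each instance of subclass is also an instance of superclass."""
    return all(
        world.get(subclass, set()) <= world.get(superclass, set())
        for world in worlds
    )

print(subsumes("Mammal", "Horse", worlds))  # True: horses are mammals in every world
print(subsumes("Horse", "Mammal", worlds))  # False: flipper is a mammal but not a horse
```

Quantifying over all worlds, rather than checking a single snapshot, is what stands in for the "necessarily" in the definition.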
Overloading Subsumption Common modeling pitfalls • Instantiation • Constitution • Composition • Disjunction • Polysemy • Temporality • Spatial/Containment
Instantiation Pitfall Does this ontology mean that My ThinkPad is a ThinkPad Model? ThinkPad Model T21 Oops… My ThinkPad (s# xx123) Question: What ThinkPad models do you sell? Answer should NOT include My ThinkPad -- nor yours.
model Instantiation Notebook Computer ThinkPad Model T Series T 21 My ThinkPad (s# xx123)
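The difference between the pitfall and the repaired diagram can be sketched as a query. This is one possible reading of the diagrams, not the authors' formalization: it assumes that in the repair T Series and T21 are classes of computers and are themselves instances of ThinkPad Model. The helper `instances_of` and the dictionary layout are hypothetical.

```python
def instances_of(cls, subclass_of, instance_of):
    """Individuals whose direct class is cls or a (transitive) subclass of cls."""
    def is_a(c):
        while c is not None:
            if c == cls:
                return True
            c = subclass_of.get(c)  # climb to the direct superclass, if any
        return False
    return {x for x, c in instance_of.items() if is_a(c)}

# Pitfall: T21 is a subclass of ThinkPad Model, so its instances are "models" too.
pitfall_subclass_of = {"T21": "ThinkPad Model"}
pitfall_instance_of = {"My ThinkPad": "T21"}

# Repair (assumed reading): T21 and T Series are kinds of computer,
# and each of them is an *instance* of ThinkPad Model.
fixed_subclass_of = {"T21": "T Series", "T Series": "Notebook Computer"}
fixed_instance_of = {
    "My ThinkPad": "T21",
    "T21": "ThinkPad Model",
    "T Series": "ThinkPad Model",
}

# "What ThinkPad models do you sell?"
print(instances_of("ThinkPad Model", pitfall_subclass_of, pitfall_instance_of))
# {'My ThinkPad'}
print(sorted(instances_of("ThinkPad Model", fixed_subclass_of, fixed_instance_of)))
# ['T Series', 'T21']
```

In the repaired version My ThinkPad is still found when asking for instances of Notebook Computer, which is the answer one would want.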
Composition Pitfall Computer Disk Drive Memory Micro Drive Question: What Computers do you sell? Answer should NOT include Disk Drives or Memory.
Composition Computer part-of Disk Drive Memory Micro Drive
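A small sketch of why the relation label matters for the question "What computers do you sell?". It assumes the diagrams can be read as labeled edges, with Micro Drive a kind of Disk Drive in both versions; the triple layout and the function `kinds_of` are hypothetical.

```python
# Edges are (source, relation, target) triples; the reading of the diagrams is assumed.
edges_pitfall = [
    ("Disk Drive", "subclass-of", "Computer"),
    ("Memory", "subclass-of", "Computer"),
    ("Micro Drive", "subclass-of", "Disk Drive"),
]
edges_fixed = [
    ("Disk Drive", "part-of", "Computer"),
    ("Memory", "part-of", "Computer"),
    ("Micro Drive", "subclass-of", "Disk Drive"),
]

def kinds_of(cls, edges):
    """All classes below cls, following only subclass-of edges."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for src, rel, dst in edges:
            if rel == "subclass-of" and dst == current and src not in found:
                found.add(src)
                frontier.append(src)
    return found

print(sorted(kinds_of("Computer", edges_pitfall)))  # ['Disk Drive', 'Memory', 'Micro Drive']
print(sorted(kinds_of("Computer", edges_fixed)))    # []
print(sorted(kinds_of("Disk Drive", edges_fixed)))  # ['Micro Drive']
```

With part-of kept as its own relation, the taxonomy query no longer returns components as if they were computers.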
Disjunction Pitfall has-part Computer Computer Part Disk Drive Memory Micro Drive has-part Flashcard-110 Camera-15 Unintended model: flashcard-110 is a computer-part
Disjunction Computer has-part Disk Drive Memory …
Polysemy Pitfall (Mikrokosmos) Physical Object Abstract Entity Book ….. Question: How many books do you have on Hemingway? Answer: 5,000
Polysemy (WordNet) Physical Object Abstract Entity Book Sense 1 Book Sense 2 Biography of Hemingway …..
Constitution Pitfall (WordNet) Entity Amount of Matter Physical Object Clay Metal Computer Question: What types of matter will conduct electricity? Answer should NOT include computers.
Constitution Entity Physical Object Amount of Matter constituted Computer Metal Clay
Temporality Pitfall (Wikipedia) 1960s 1964 1963 Chris
Temporality Pitfall (Wikipedia) 1960s births 1964 births 1963 births Chris
Temporality Decade 1960s contains 1964 1963 bornIn Chris Year Person
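The repaired temporality diagram can be sketched with explicit relations instead of subclass links. The assignment of Chris to 1963 is an assumption read off the diagram labels, and the names `contains`, `born_in`, and `born_in_decade` are hypothetical.

```python
# Every individual has exactly one natural class; time is handled by relations.
instance_of = {"1960s": "Decade", "1963": "Year", "1964": "Year", "Chris": "Person"}
contains = {"1960s": {"1963", "1964"}}  # Decade contains Year
born_in = {"Chris": "1963"}             # Person bornIn Year (birth year assumed)

def born_in_decade(decade):
    """People whose birth year is contained in the given decade."""
    years = contains.get(decade, set())
    return {person for person, year in born_in.items() if year in years}

print(born_in_decade("1960s"))  # {'Chris'}
print(instance_of["Chris"])     # Person
```

Chris is still retrievable by decade, yet is never classified as a year or a decade.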
Spatial/Containment Pitfall (OWL Guide) French Region Loire Region Alsace Region
Spatial/Containment Country France contains Loire Alsace Region
It’s about the instances • For every class, think about what an instance of it is • What is an instance of “Loire Region”? • Classes do not describe their subclasses • “Regions by Country” is a class of classes • Criteria for individuation must remain constant within a taxonomy • Instance of a class is also an instance of every superclass • Thus “Chris” is not an instance of “1963 births” • Explore the “boundary conditions” • E.g. Changes in existence, distinctions with similar classes • “Leaf Nodes” of a hierarchy have no special significance • Don’t switch to instances
Common Pitfalls • Composition (part of): Arm subclass body • Constitution: Statue subclass marble • Disjunction: (class Car partial (all hasPart CarPart)) (Engine subclass CarPart) (Tire subclass CarPart) • Spatial: NewYork subclass US • Polysemy: Book subClass PhysicalObject, Book subClass ConceptualCreation • Arbitrary organizational nodes: FictionalBookbyLatinAmericanAuthor subClass FictionalBook • Instance: PinotNoir instanceof Grape • Temporality: YoungElvis instanceOf Elvis
The linguistic tests • If P subClassOf Q, you should be able to say “P is a kind of Q” • If a instanceOf P, you should be able to say “a is a P” • If a instanceOf P and P subClassOf Q, you should be able to say “a is a Q” • For every instance, there should be a class it is (rigidly) an instance of that is its natural label • If P subClassOf Q, you should not find it natural to say “P has Q”, “P might be Q”, “P was Q”, “P is in Q”, or “P is part of Q”
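The first two tests are mechanical enough to generate the sentences for a human reviewer to read aloud. A minimal sketch, assuming axioms are stored as (subject, relation, object) triples; the function `readable_sentences` is hypothetical, and judging whether a sentence sounds natural remains a human task.

```python
def article(noun):
    """Pick 'a' or 'an' by the first letter (a rough heuristic, good enough for review)."""
    return "an" if noun[0].lower() in "aeiou" else "a"

def readable_sentences(axioms):
    """Turn subClassOf / instanceOf axioms into the sentences the tests ask you to say."""
    for subject, relation, obj in axioms:
        if relation == "subClassOf":
            yield f"{subject} is a kind of {obj}"
        elif relation == "instanceOf":
            yield f"{subject} is {article(obj)} {obj}"

axioms = [
    ("Horse", "subClassOf", "Mammal"),
    ("Arm", "subClassOf", "Body"),         # composition pitfall
    ("YoungElvis", "instanceOf", "Elvis"), # temporality pitfall
]

for sentence in readable_sentences(axioms):
    print(sentence)
# Horse is a kind of Mammal
# Arm is a kind of Body
# YoungElvis is an Elvis
```

The second and third sentences sound wrong, which is how the tests flag the corresponding axioms as modeling errors.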
What’s in a name • Don’t argue about what specific terms mean • Common software architecture argument: “What is a bridge?” • Try to find the distinctions that matter • Assign them labels later • Avoid “-ish”, “-thing”, and “other-” classes • Find good names that will avoid meaning creep • “Other-” classes create a maintenance nightmare • Classes describe their instances • Remember the linguistic tests • The superclass is not part of the name • So don’t assume it is (e.g. Best_Practices subClassOf Document)