Toward Using Ontologies to Reason About Disagreeing Taxonomic Experts Dave Thau UC Davis thau@learningsite.com
Why Did The Chicken Cross The Road? • To get to the other side. • To boldly go where no chicken has gone before. • To prove it could never reach the other side. (Zeno of Elea) • Chickens, over great periods of time, have been naturally selected so that they are now predisposed to cross roads. NeSC RDF Workshop June 8, 2006
Why did the taxonomists cross the road? So they could properly identify the chicken.
Overview • Quick primer on taxonomy • Some types of disagreements between experts • Problems this causes • Using an ontology to represent taxonomic opinions • Using the ontology to compare experts’ theories
Linnaean Taxonomy Basics • Ranks: kingdom, phylum, class, order, family, genus, species, variety (and others!) • Family rank: Canidae • Genus rank: Vulpes, Canis, Nyctereutes • Species rank: Canis lupus, Canis latrans, Canis familiaris
Things you may not know • There is no big list of all the known species in the world • This is partly because people don’t agree on the definitions of the species, genera, etc. • Estimates are that 6% of the known taxa are changed every year • This has been going on since Linnaeus published his classification scheme in 1735
Types of Disagreement: The Basics • [Figure: the five basic relations between two taxa A and B: congruent (≡), A includes B, B includes A, overlap, disjoint] • Benson, 1948 and FNA-03, 1997: Ranunculus aquatilis, with varieties R.a. var calvescens, R.a. var capillaceus, R.a. var aquatilis, R.a. var diffusus, R.a. var hispidulus • This results in 5^12 (more than 240 million) possible sets of relationships.
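The count on this slide can be checked with one line of arithmetic. The sketch below assumes that the exponent 12 is the number of taxon pairs compared across the two classifications and that each pair can stand in any one of the five basic relations; the relation names are labels chosen here for illustration.

```python
# Five basic relations between two taxa (labels are illustrative).
RELATIONS = ["congruent", "includes", "is included in", "overlaps", "disjoint"]

# Assumption: 12 pairs of taxa are compared across the two classifications.
NUM_PAIRS = 12

# Each pair independently takes one of the five relations.
possible_assignments = len(RELATIONS) ** NUM_PAIRS
print(f"{possible_assignments:,}")  # 244,140,625
```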
Types of Disagreement: Splitting and Lumping • Benson, 1948: Ranunculus flammula with R.f. var filiformis, R.f. var genuinus, R.f. var ovalis • Kartesz, 2004: Ranunculus flammula with R.f. var filiformis, R.f. var flammula • Peet, 2005: B.1948:R.flammula is congruent to K.2004:R.flammula; B.1948:R.f.genuinus is included in K.2004:R.f.flammula; B.1948:R.f.ovalis is included in K.2004:R.flammula; B.1948:R.f.filiformis is congruent to K.2004:R.f.filiformis
Types of Disagreement: Differing Extents • Benson, 1948: Ranunculus glaberrimus with R.g. var reconditus, R.g. var ellipticus, R.g. var typicus • Kartesz, 2004: Ranunculus glaberrimus with R.g. var ellipticus, R.g. var glaberrimus • Peet, 2005: B.1948:R. glaberrimus contains K.2004:R. glaberrimus; B.1948:R.g.ellipticus is congruent to K.2004:R.g.ellipticus; B.1948:R.g.typicus is congruent to K.2004:R.g.glaberrimus; B.1948:R.g.reconditus is congruent to K.2004:R. triternatus
Impact on Data Analysis • Can’t find data • If A ≡ B, a search on A should retrieve B • Can’t aggregate data • If B ⊂ A, you should be able to combine data from B into A
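A minimal sketch of the two operations on this slide, assuming the correspondences are already known. The taxon names, the correspondence lists, and the observation counts are all hypothetical, as are the helper names `search_names` and `aggregate_names`.

```python
# Hypothetical correspondences between taxon concepts (illustrative only).
CONGRUENT = [("A", "B")]      # A is congruent to B
INCLUDED_IN = [("B1", "A")]   # B1 is included in A

# Hypothetical observation counts recorded under each name.
OBSERVATIONS = {"A": 10, "B": 4, "B1": 3}


def search_names(name):
    """Return every name a search on `name` should retrieve (congruent taxa)."""
    names = {name}
    changed = True
    while changed:  # follow congruence links in both directions to a fixpoint
        changed = False
        for left, right in CONGRUENT:
            for known, other in ((left, right), (right, left)):
                if known in names and other not in names:
                    names.add(other)
                    changed = True
    return names


def aggregate_names(name):
    """Return every name whose data may be combined into `name`."""
    names = search_names(name)
    changed = True
    while changed:  # pull in included taxa, and taxa congruent to them
        changed = False
        for child, parent in INCLUDED_IN:
            if parent in names and child not in names:
                names |= search_names(child)
                changed = True
    return names


found = search_names("A")
total = sum(OBSERVATIONS.get(n, 0) for n in aggregate_names("A"))
print(sorted(found), total)
```

Searching expands only along congruence, while aggregation also follows inclusion downward, which mirrors the two bullet pairs above.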
What to do in case of conflicting experts? • Just listen to one expert you like • Pick an expert you like and everyone who agrees with this expert (and each other) • Choose experts who form the largest set of agreeing experts • Choose experts whose opinions encompass the smallest or largest number of taxa
How can we find out which experts agree? • Represent taxonomy using logic • Use the logic to determine relations between expert opinions (theories) • Two theories may conflict • Two theories may be equivalent • One theory may encompass another
Representation Details • Based on the Taxon Concept Schema (TCS) • Represented using Description Logic (OWL DL)
Example Ontology • Taxon description: things in the species Ranunculus glaberrimus; taxon: Ranunculus glaberrimus (Kartesz, 2004) • Taxon description: things in the genus Ranunculus; taxon: Ranunculus (Kartesz, 2004) • Properties: hasSpecies, hasGenus • Specimen
Fundamental Assumptions • Each taxon class has at least one instance • Each taxon class is defined as the union of its subclasses • A class’s subclasses are defined to be mutually disjoint
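In description-logic notation, the three assumptions could be written roughly as follows for a taxon class C with direct subclasses C_1, ..., C_n. This is one possible formalization, offered as a sketch; the symbols C, C_i, and a_C are introduced here and are not taken from the talk's ontology.

```latex
\begin{align*}
& C(a_C) \ \text{for some individual } a_C
  && \text{(each taxon class has at least one instance)} \\
& C \equiv C_1 \sqcup C_2 \sqcup \dots \sqcup C_n
  && \text{(a class is the union of its subclasses)} \\
& C_i \sqcap C_j \sqsubseteq \bot \quad \text{for } i \neq j
  && \text{(subclasses are mutually disjoint)}
\end{align*}
```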
Questions Ontology Can Answer • Find the subclasses of a class • Make sure the taxonomy is consistent • See if two classes are equivalent • Can also use it to compare expert opinions
Compatible Theories • A theory is one expert’s set of classes and relations and all they imply. • A set of theories is compatible if • Each theory is consistent and • The correspondences between classes in the theories do not cause inconsistency.
Example Incompatibility • Benson, 1948: Ranunculus hydrocharoides with R.h. var natans, R.h. var stolonifer, R.h. var typicus • Kartesz, 2004: Ranunculus hydrocharoides with R.h. var stolonifer, R.h. var typicus • Peet, 2005: B.1948:R.h.stolonifer is congruent to K.2004:R.h.stolonifer; B.1948:R.h.typicus is congruent to K.2004:R.h.typicus; B.1948:R. hydrocharoides is congruent to K.2004:R. hydrocharoides
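Why these correspondences clash with the fundamental assumptions can be illustrated with a bounded brute-force model search. The sketch below is not the OWL DL reasoning used in the talk: it simply tries every assignment of specimen sets over a small hypothetical universe and finds that none satisfies all constraints. The universe and the function names are assumptions made for illustration.

```python
from itertools import combinations, product

UNIVERSE = ("x", "y", "z")  # small set of hypothetical specimens


def all_subsets(items):
    """Return every subset of `items` as a frozenset."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_model(hyd, natans, stolonifer, typicus):
    """Check one assignment of extensions against all stated constraints.

    The three congruences let a single set stand for both the Benson and
    the Kartesz version of hydrocharoides, stolonifer, and typicus.
    """
    varieties = [natans, stolonifer, typicus]
    non_empty = all(varieties) and bool(hyd)
    disjoint = all(not (a & b) for a, b in combinations(varieties, 2))
    benson_cover = hyd == natans | stolonifer | typicus   # Benson, 1948
    kartesz_cover = hyd == stolonifer | typicus           # Kartesz, 2004
    return non_empty and disjoint and benson_cover and kartesz_cover


subsets = all_subsets(UNIVERSE)
models = [m for m in product(subsets, repeat=4) if is_model(*m)]
print(len(models))  # 0: no assignment satisfies every constraint
```

The search is only evidence for a bounded universe, but the reason generalizes: if the species equals the union of stolonifer and typicus, then natans, which is disjoint from both, would have to be empty.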
Example Incompatibility • Benson, 1948: Ranunculus with Ranunculus macranthus, Ranunculus petiolaris, … • Kartesz, 2004: Ranunculus with Ranunculus petiolaris, … • Peet, 2005: B.1948:R. macranthus contains K.2004:R. petiolaris; B.1948:R. petiolaris is contained by K.2004:R. petiolaris • B.48:R. petiolaris ⊂ K.04:R. petiolaris ⊂ B.48:R. macranthus contradicts the assumption that B.48:R. macranthus and B.48:R. petiolaris are disjoint.
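The contradiction on this slide follows from chaining the two inclusions. A minimal sketch, assuming inclusion is transitive and every class is non-empty; the function names are hypothetical, and this stands in for what a description-logic reasoner would derive.

```python
from itertools import product

# Proper-inclusion statements from the correspondences: (smaller, larger).
INCLUSIONS = {
    ("B.1948:R. petiolaris", "K.2004:R. petiolaris"),
    ("K.2004:R. petiolaris", "B.1948:R. macranthus"),
}

# Sibling classes within one theory are assumed to be mutually disjoint.
DISJOINT = {("B.1948:R. macranthus", "B.1948:R. petiolaris")}


def transitive_closure(pairs):
    """Return all (a, c) where a is included in c directly or via a chain."""
    closure = set(pairs)
    while True:
        new_pairs = {(a, d) for (a, b), (c, d) in product(closure, repeat=2)
                     if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs


def find_conflicts(inclusions, disjoint):
    """List disjoint pairs where one class is inferred to lie inside the other.

    Because every class is assumed to be non-empty, a class cannot be both
    included in and disjoint from another class.
    """
    closure = transitive_closure(inclusions)
    return [(a, b) for a, b in disjoint
            if (a, b) in closure or (b, a) in closure]


conflicts = find_conflicts(INCLUSIONS, DISJOINT)
print(conflicts)
```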
Inferring Unstated Correspondences • Benson, 1948: Ranunculus arizonicus with R.a. var chihuahua, R.a. var typicus • Kartesz, 2004: Ranunculus arizonicus • Peet, 2005: B.1948:R.a.typicus is included in K.2004:R. arizonicus; B.1948:R. arizonicus is congruent to K.2004:R. arizonicus
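One way to see which correspondence is implied but unstated is to enumerate every assignment of specimen sets that satisfies the stated facts and record how B.1948:R.a.chihuahua relates to K.2004:R. arizonicus in each. This is a bounded sketch over a small hypothetical universe, not the reasoner used in the talk; the relation labels and function names are chosen here for illustration.

```python
from itertools import combinations, product

UNIVERSE = ("x", "y", "z")  # small set of hypothetical specimens


def all_subsets(items):
    """Return every subset of `items` as a frozenset."""
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def relation(a, b):
    """Name the basic relation between two non-empty sets."""
    if a == b:
        return "is congruent to"
    if a < b:
        return "is included in"
    if a > b:
        return "contains"
    if a & b:
        return "overlaps"
    return "is disjoint from"


def is_model(b_ariz, b_chi, b_typ, k_ariz):
    """Check the fundamental assumptions plus the two stated correspondences."""
    non_empty = all([b_ariz, b_chi, b_typ, k_ariz])
    disjoint = not (b_chi & b_typ)
    cover = b_ariz == b_chi | b_typ
    stated = b_typ < k_ariz and b_ariz == k_ariz
    return non_empty and disjoint and cover and stated


subsets = all_subsets(UNIVERSE)
models = [m for m in product(subsets, repeat=4) if is_model(*m)]
# Relation of B.1948:R.a.chihuahua to K.2004:R. arizonicus in every model.
inferred = {relation(m[1], m[3]) for m in models}
print(inferred)  # {'is included in'}
```

Under these assumptions the only relation that survives in every model is inclusion, so the sketch derives that B.1948:R.a.chihuahua is included in K.2004:R. arizonicus.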
Comparing Theories • Given two compatible theories, T and T’: • The theories are equivalent if each class in theory T is equivalent to one class in T’ (and vice versa). • T is smaller than T’ if each class in T either equals or is contained by a class in T’.
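The two definitions can be sketched directly as predicates over a table of class correspondences. The theories T1, T2, T3, their class names, and the correspondence table below are hypothetical, and the sketch takes the table as given rather than inferring further correspondences.

```python
# Hypothetical theories: each is a set of class names.
T1 = {"A1", "B1", "C1"}
T2 = {"A2", "B2", "C2"}
T3 = {"A3", "B3"}

# Hypothetical correspondences: (class, other class) -> relation.
CORRESPONDENCES = {
    ("A1", "A2"): "equals",
    ("B1", "B2"): "equals",
    ("C1", "C2"): "equals",
    ("A3", "A1"): "equals",
    ("B3", "B1"): "contained_by",
}


def equals(a, b):
    """Equality is symmetric, so look the pair up in both directions."""
    return (CORRESPONDENCES.get((a, b)) == "equals"
            or CORRESPONDENCES.get((b, a)) == "equals")


def contained_by(a, b):
    """True if class a is stated to be contained by class b."""
    return CORRESPONDENCES.get((a, b)) == "contained_by"


def is_equivalent(t, t_prime):
    """Each class in t equals exactly one class in t_prime, and vice versa."""
    def one_match(src, dst):
        return all(sum(equals(c, d) for d in dst) == 1 for c in src)
    return one_match(t, t_prime) and one_match(t_prime, t)


def is_smaller(t, t_prime):
    """Each class in t equals or is contained by some class in t_prime."""
    return all(any(equals(c, d) or contained_by(c, d) for d in t_prime)
               for c in t)


print(is_equivalent(T1, T2), is_smaller(T3, T1), is_smaller(T1, T3))
```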
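Example of Theory Ordering • [Figure: three theories T1, T2, and T3, each rooted at class A, with subclasses drawn from B, C, D, and E and congruences (≡) between corresponding classes, followed by the resulting ordering of T1, T2, and T3]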
Whom to believe? • Just listen to one expert you like • Easy! Don’t need any reasoning • Pick an expert you like and everyone who can agree with this expert • Choose all experts with theories equivalent to the expert you like • Choose experts who form the largest set of agreeing experts • Find largest equivalence class • Choose experts whose opinions form the smallest or largest number of taxa • Bigger theories account for more taxa
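Finding the largest equivalence class can be sketched with a small union-find, assuming the pairwise equivalence of theories has already been established by reasoning. The expert names and the list of equivalent pairs are hypothetical.

```python
from collections import defaultdict

# Hypothetical experts and pairs whose theories were found equivalent.
EXPERTS = ["E1", "E2", "E3", "E4", "E5"]
EQUIVALENT_PAIRS = [("E1", "E2"), ("E2", "E3"), ("E4", "E5")]


def equivalence_classes(experts, pairs):
    """Group experts into classes of mutually equivalent theories."""
    parent = {e: e for e in experts}

    def find(e):
        # Walk up to the representative, halving the path as we go.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for a, b in pairs:
        parent[find(a)] = find(b)  # merge the two classes

    groups = defaultdict(set)
    for e in experts:
        groups[find(e)].add(e)
    return list(groups.values())


classes = equivalence_classes(EXPERTS, EQUIVALENT_PAIRS)
largest = max(classes, key=len)
print(sorted(largest))  # ['E1', 'E2', 'E3']
```

The second strategy on the slide, choosing everyone equivalent to a preferred expert, is then just the class that contains that expert.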
Future Work • Vetting the ontology • Adding ‘intelligence’ to tools which build correspondences • Implementing authority picker in a workflow system • Efficient algorithm for determining theory hierarchy
Thanks! Questions? • I’d like to acknowledge: • Bertram Ludäscher, Shawn Bowers, Serguei Krivov, Richard Waldinger for many discussions on this topic • Jessie Kennedy, Robert Kukla, Trevor Patterson, Martin Graham for their work on the Taxon Concept Schema • Bob Peet for the Ranunculus data set • Kirsten Menger-Anderson for the chicken drawing • NSF, under SEEK awards 0225676, 0225665, 0225635, and 0533368
Where In Greece Can I Find Ranunculus aquatilis? • [Figure labels: R. aquatilis, R. trichophyllus]
Beginnings of Biological Taxonomy • Egypt, 1500 BC: Ebers medical papyrus, classification of medicinal plants • Greece, 300 BC: Aristotle and Theophrastus • China, 200 BC: Erh-ya dictionary (second century BC)