Towards Common Upper Ontology Barry Smith http://ontology.buffalo.edu/smith September 25, 2009
Overview • The Rise of Applied Ontology • The OBO Foundry • Basic Formal Ontology • How to Build an Ontology • What is a Disease?
[Chart: number of abstracts per year; values shown: 2260, 2968, 3236]
How to do biology across the genome? [Slide shows several screens of raw amino-acid sequence, beginning MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPI…]
what cellular component? what molecular function? what biological process?
GO used to tag database entries • Databases: GlyProt, MouseEcotope, DiabetInGene, GluChem • Example GO terms: sphingolipid transporter activity; Holliday junction helicase complex
GO used in curation of literature what cellular component? what molecular function? what biological process?
A new kind of scientific publishing • Biologist curators annotate experimental observations reported in the biomedical literature to link gene products (such as proteins) with GO terms • International Society of Biocurators http://www.biocurator.org/
Converting journal articles into algorithmically processable artifacts [Figure: Clark et al., 2005; relation shown: part_of]
The logic of GO • OBO Format: http://oboedit.org/ • OWL DL: http://www.co-ode.org/resources/papers/OBO2OWL.pdf • Common Logic: http://www.berkeleybop.org/people/cjm/Mungall-bib.html#mungall_experiences_2009
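To give a feel for the first of these formats, here is a minimal sketch that reads one hand-written OBO-style stanza. The `[Term]` header and `tag: value` lines follow OBO Format; the identifiers `EX:0000001` and `EX:0000002` and the helper `parse_stanza` are made up for illustration and are not real ontology IDs or part of any OBO tooling.

```python
# Minimal sketch: read one OBO-style [Term] stanza into a dict.
# The EX: identifiers are invented for illustration, not real ontology IDs.
STANZA = """\
[Term]
id: EX:0000001
name: cell membrane
relationship: part_of EX:0000002 ! cell
"""


def parse_stanza(text):
    """Return {tag: [values]} for a single stanza, dropping '! comment' tails."""
    term = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(": ")
        value = value.split(" ! ")[0].strip()  # remove the trailing comment
        term.setdefault(tag, []).append(value)
    return term


term = parse_stanza(STANZA)
relation, target = term["relationship"][0].split()
print(term["name"][0], relation, target)
```

A real ontology file holds many stanzas and more tags than this; the sketch only shows that each term carries an identifier, a name, and typed links to other terms.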
$100 million invested in literature curation using GO • Over 11 million annotations relating gene products described in UniProt, Ensembl and other databases to terms in the GO
GO provides a controlled system of representations for use in annotating data and literature • multi-species • multi-disciplinary • multi-granularity, from molecules to population
Example of use of the GO • A study of 11 breast and 11 colorectal cancers found 13,023 genes. • The GO tells you what is standard functioning for each of these genes. • By searching for deviations from this standard in the sample, 189 genes were identified as being mutated at significant frequencies and thus as providing targets for diagnostic and therapeutic intervention. Sjöblom T, et al. Science. 2006;314:268-74.
This kind of research only works if we have a common ontology • Data is retrievable • Data is comparable • Data is integratable only to the degree that it is annotated using a common controlled vocabulary (compare the role of seconds, meters, kilograms …)
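The point about a common controlled vocabulary can be sketched in a few lines. The database names and term labels below come from the earlier slides; the entry identifiers and the helper `index_by_term` are invented for illustration. Once every database tags its entries with the same terms, retrieval across databases reduces to a lookup.

```python
from collections import defaultdict

# Invented records: (database, entry, controlled-vocabulary term).
annotations = [
    ("GlyProt", "entry-1", "sphingolipid transporter activity"),
    ("MouseEcotope", "entry-7", "sphingolipid transporter activity"),
    ("DiabetInGene", "entry-3", "Holliday junction helicase complex"),
    ("GluChem", "entry-9", "sphingolipid transporter activity"),
]


def index_by_term(records):
    """Group entries from all databases under the shared term they carry."""
    index = defaultdict(list)
    for database, entry, term in records:
        index[term].append((database, entry))
    return index


index = index_by_term(annotations)
hits = index["sphingolipid transporter activity"]
print(sorted(database for database, _ in hits))
```

If each database used its own label for the same function, the lookup would return only one database's entries and the rest would stay in their silos.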
Overview • The Rise of Applied Ontology • The OBO Foundry • Basic Formal Ontology • How to Build an Ontology • What is a Disease?
GO is amazingly successful in overcoming data silo problems • but it covers only • cellular components • molecular functions • biological processes
The OBO Foundry – to extend the GO to enable intelligent integration of gigantic bodies of heterogeneous data across the entire domain of the life sciences, including clinical medicine – to create an evolving, map-like, computable representation of the entire domain of biological and medical reality Barry Smith, et al., “The OBO Foundry: Coordinated Evolution of Ontologies to Support Biomedical Data Integration”, Nature Biotechnology, 25 (11), 2007
Overview • The Rise of Applied Ontology • The OBO Foundry • Basic Formal Ontology • How to Build an Ontology • What is a Disease?
Rationale of OBO Foundry coverage [Table: ontologies arranged along two axes, relation to time and granularity]
Basic Formal Ontology (BFO) • Continuant (Independent Continuant, Dependent Continuant) • Occurrent (Process, Event) • http://ontology.buffalo.edu/bfo/
BFO • A simple top-level ontology to support information integration in scientific research • No abstracta • Nothing propositional • No overlap with domain ontologies (for society, for information, …) – built by populating downwards
Three Fundamental Dichotomies • Continuant vs. occurrent • Dependent vs. independent • Type vs. instance
Continuant: thing, quality, … • Occurrent: process, event
depends_on • Continuant (Independent Continuant: thing; Dependent Continuant: quality) • Occurrent (process, event) • A quality depends_on its bearer
instance_of • Types: Continuant (Independent Continuant: thing; Dependent Continuant: quality), Occurrent (process, event) • Instances are linked to these types by instance_of
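One loose way to picture the three dichotomies is to let Python classes stand in for types and Python objects for instances. This is only an analogy, not how BFO is implemented: `Human` and `Temperature` are hypothetical domain types added for the example, and Python's `isinstance` has no notion of time.

```python
# Sketch only: classes stand in for BFO types, objects for instances.
class Continuant: ...


class Occurrent: ...  # process, event


class IndependentContinuant(Continuant): ...  # thing


class DependentContinuant(Continuant):  # quality
    def __init__(self, bearer):
        # A dependent continuant cannot exist without a bearer.
        if not isinstance(bearer, IndependentContinuant):
            raise TypeError("a quality depends_on an independent continuant")
        self.bearer = bearer


class Human(IndependentContinuant): ...  # hypothetical domain type


class Temperature(DependentContinuant): ...  # hypothetical domain type


mary = Human()
marys_temperature = Temperature(bearer=mary)

print(issubclass(Human, Continuant))     # is_a: a relation between types
print(isinstance(mary, Human))           # instance_of: instance to type
print(marys_temperature.bearer is mary)  # depends_on: quality to bearer
```

The constructor check mirrors the dependence claim: trying to create a `Temperature` without a thing to bear it raises an error.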
3 kinds of (binary) relations • Between types • human is_a mammal • human heart part_of human • Between an instance and a type • this human instance_of the type human • this human allergic_to the type tamiflu • Between instances • Mary’s heart part_of Mary • Mary’s aorta connected_to Mary’s heart
depends_on • Continuant (Independent Continuant: thing; Dependent Continuant: quality) • Occurrent (process) • Example: a temperature depends_on its bearer
depends_on • Continuant (Independent Continuant: thing; Dependent Continuant: quality, …) • Occurrent (process, event) • Example: an event depends_on its participant
Definitions of relations • is_a • part_of [Figure: Clark et al., 2005] Barry Smith, et al., “Relations in Biomedical Ontologies”, Genome Biology 2005, 6 (5), R46.
Type-level relations presuppose the underlying instance-level relations • A is_a B =def. A and B are types and all instances of A are instances of B • A part_of B =def. All instances of A are instance-level parts of some instance of B
human testis part_of adult human being • but not • human being has_part human testis • and not even • male human being has_part human testis
The assertions linking terms in ontologies must hold universally. Hence type-level relations such as part_of are provided with All-Some definitions.
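The All-Some pattern, and the reason has_part fails for the testis example, can be checked on a toy dataset. This is a sketch under stated assumptions: the individuals (`adam`, `eve`, `carl`, `testis_1`) and the helper functions are invented, and a real ontology assertion is a claim about all instances, not about a finite list.

```python
# Toy instance data, invented for illustration.
instance_of = {
    "testis_1": {"human testis"},
    "adam": {"male human being", "adult human being", "human being"},
    "eve": {"adult human being", "human being"},
    # carl: a male with no testis, for example after surgery
    "carl": {"male human being", "adult human being", "human being"},
}
instance_part_of = {("testis_1", "adam")}  # (part, whole) pairs


def instances(type_name):
    return {x for x, types in instance_of.items() if type_name in types}


def is_a(a, b):
    """All instances of A are instances of B."""
    return instances(a) <= instances(b)


def part_of(a, b):
    """Every instance of A is an instance-level part of some instance of B."""
    return all(
        any((x, y) in instance_part_of for y in instances(b))
        for x in instances(a)
    )


def has_part(a, b):
    """Every instance of A has some instance of B as an instance-level part."""
    return all(
        any((y, x) in instance_part_of for y in instances(b))
        for x in instances(a)
    )


print(is_a("male human being", "human being"))
print(part_of("human testis", "adult human being"))
print(has_part("human being", "human testis"))
print(has_part("male human being", "human testis"))
```

The first two checks hold and the last two fail, because a single human being without a testis is enough to break the universal claim. Note that `all()` over an empty set of instances is vacuously true, so a type with no recorded instances would pass any of these checks.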
part_of for continuant types • A part_of B =def. • For all x, t: if x instance_of A at t, then there is some y such that y instance_of B at t and x instance_level_part_of y at t • Example: cell membrane part_of cell
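The time index in the continuant definition can be made concrete with a small check over invented observations. The instance names, the integer time points, and the function `continuant_part_of` are assumptions for illustration only.

```python
# Toy, invented observations: (instance, type, time) and (part, whole, time).
instance_of_at = {
    ("membrane_1", "cell membrane", 1), ("membrane_1", "cell membrane", 2),
    ("cell_1", "cell", 1), ("cell_1", "cell", 2),
}
part_of_at = {("membrane_1", "cell_1", 1), ("membrane_1", "cell_1", 2)}


def continuant_part_of(a, b, inst=instance_of_at, parts=part_of_at):
    """For all x, t: x instance_of A at t implies there is some y with
    y instance_of B at t and x instance_level_part_of y at t."""
    for x, type_x, t in inst:
        if type_x != a:
            continue
        if not any(
            type_y == b and t_y == t and (x, y, t) in parts
            for y, type_y, t_y in inst
        ):
            return False
    return True


print(continuant_part_of("cell membrane", "cell"))
# Drop the parthood fact at t=2: the membrane still exists then,
# but is no longer recorded as part of any cell at that time.
print(continuant_part_of("cell membrane", "cell",
                         parts={("membrane_1", "cell_1", 1)}))
```

The second call fails because the definition must hold at every time at which an instance of A exists, not just at some time.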
part_of for occurrent types • A part_of B =def. • For all x: if x instance_of A, then there is some y such that y instance_of B and x instance_level_part_of y • EVERY A IS PART OF SOME B
Instances vs. types • Instance-level relations and type-level relations have logically distinct properties • What is symmetric on the level of instances need not be symmetric on the level of types
• seminal vesicle adjacent_to urinary bladder • Not: urinary bladder adjacent_to seminal vesicle • nucleus adjacent_to cytoplasm • Not: cytoplasm adjacent_to nucleus
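A small sketch shows how a relation that is symmetric between instances can fail to be symmetric between types under an All-Some reading. The individuals are invented; the assumption that one bladder has no adjacent seminal vesicle (here, one in a female body) is added for the example and is not stated on the slide.

```python
# Toy instance data, invented for illustration.
instance_of = {
    "sv_1": "seminal vesicle",
    "bladder_m": "urinary bladder",  # bladder in a male body
    "bladder_f": "urinary bladder",  # bladder in a female body
}
# Instance-level adjacency is stored as unordered pairs, so it is symmetric.
adjacent = {frozenset(("sv_1", "bladder_m"))}


def adjacent_instances(x, y):
    return frozenset((x, y)) in adjacent


def adjacent_to(a, b):
    """Type level: every instance of A is adjacent to some instance of B."""
    xs = [i for i, t in instance_of.items() if t == a]
    ys = [i for i, t in instance_of.items() if t == b]
    return all(any(adjacent_instances(x, y) for y in ys) for x in xs)


print(adjacent_instances("sv_1", "bladder_m"),
      adjacent_instances("bladder_m", "sv_1"))
print(adjacent_to("seminal vesicle", "urinary bladder"))
print(adjacent_to("urinary bladder", "seminal vesicle"))
```

Adjacency holds in both directions for the instance pair, yet only the first type-level assertion holds, since one bladder with no adjacent seminal vesicle breaks the universal claim in the other direction.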
Overview • The Rise of Applied Ontology • The OBO Foundry • Basic Formal Ontology • How to Build an Ontology • What is a Disease?
Blinding Flash of the Obvious • Continuant (Independent Continuant, Dependent Continuant) • Occurrent (Process, Event) • How to create an ontology from the top down