Enhance biomedical research by utilizing ontology for data classification, integration, and querying. Explore the methodology of annotations and the future of shared ontologies in healthcare and clinical research.
Ontology and the Future of Biomedical Research • Barry Smith • http://ifomis.org
Institute for Formal Ontology and Medical Information Science • Saarland University
From chromosome to disease
Problem: • how to reason with data deriving from different sources, each of which uses its own system of classification?
Solution: Ontology!
Examples of current needs for ontologies in biomedicine • to enforce semantic consistency within a database • to enable data sharing and re-use • to enable data integration (bridging across data at multiple granularities) • to allow querying
What is needed • strong general purpose classification hierarchies created by domain specialists • clear, rigorous definitions • thoroughly tested in real use cases • updated in light of scientific advance
The actuality (too often) • myriad special purpose ‘light’ ontologies, prepared by ontology engineers and deposited in internet ‘repositories’ or ‘registries’
General trend • on the part of NIH, FDA and other bodies to consolidate ontology-based standards for the communication and processing of biomedical data.
Responses to this trend • Old: UMLS (Unified Medical Language System) – rooted in faithfulness to the ways language is used by different medical communities
U M L S SNOMED DEMONS
U M L S • congenital absent nipple is_a nipple • cancer documentation is_a cancer • disease prevention is_a disease – repair and maintenance of wheelchair is_a disease – water is_a nursing phenomenon – part-whole =def. a nursing phenomenon with topology part-whole
MeSH • MeSH Descriptors Index Medicus Descriptor Anthropology, Education, Sociology and Social Phenomena (MeSH Category) Social Sciences • Political Systems National Socialism
MeSH • National Socialism is_a Political Systems • National Socialism is_a Anthropology ... • National Socialism is_a Social Sciences • National Socialism is_a MeSH Descriptors
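The chain of is_a statements above is what results when every step of the MeSH tree is read as is_a and the relation is treated as transitive. A minimal Python sketch, assuming the hierarchy is stored as a child-to-parents mapping and reading the slide's listing as a single path from root to leaf (the `PARENTS` dictionary and the `ancestors` helper are illustrative names, not part of any MeSH tooling):

```python
# Hypothetical child -> parents mapping, transcribed from the MeSH path on the slide.
# Reading the listing as one root-to-leaf path is an assumption.
PARENTS = {
    "National Socialism": ["Political Systems"],
    "Political Systems": ["Social Sciences"],
    "Social Sciences": ["Anthropology, Education, Sociology and Social Phenomena"],
    "Anthropology, Education, Sociology and Social Phenomena": ["Index Medicus Descriptor"],
    "Index Medicus Descriptor": ["MeSH Descriptors"],
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following is_a links upward."""
    seen = []
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path
        seen.append(current)
        stack.extend(parents.get(current, []))
    return seen


# Every ancestor becomes an is_a parent once the links are treated as transitive.
for ancestor in ancestors("National Socialism", PARENTS):
    print(f"National Socialism is_a {ancestor}")
```

If the links in a hierarchy really mean "is filed under" rather than is_a, a closure of this kind produces exactly the odd conclusions listed on the slide.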
New: Semantic Web deposits • Pet Profile Ontology • Review Vocabulary • Band Description Vocabulary • Musical Baton Vocabulary • MusicBrainz Metadata Vocabulary • Kissology
http://www.w3.org/ • Beer Ontology • all instances of hops that have ever existed are necessarily ingredients of beer.
OWL-based ontologies … • some nice computational resources, • but low expressivity • and few genuinely scientific demonstration cases
OWL’s syntactic regimentation is not enough to ensure high-quality ontologies • the use of a common syntax and logical machinery and the careful separating out of ontologies into namespaces does not solve the problem of ontology integration
Both UMLS- and OWL-type responses involve ad hoc creation of new terminologies by each community • Many of these terminologies remain as torsos, gather dust, poison the wells, ...
How to do better? • How to create the conditions for a step-by-step evolution towards high quality ontologies in the biomedical domain • which will serve as stable attractors for clinical and biomedical researchers in the future?
A basic distinction • type vs. instance • science text vs. clinical document • dog vs. Fido
Instances are not represented in an ontology built for scientific purposes • It is the generalizations that are important • (but instances must still be taken into account)
Ontology = A Representation of Types • Each node of an ontology consists of: • preferred term (aka term) • term identifier (TUI, aka CUI) • synonyms • definition, glosses, comments
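The node structure listed above can be sketched as a small record type. This is a minimal Python illustration, not the data model of any particular ontology tool; the class name `OntologyNode`, the `EX:` identifier, and the sample definition text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OntologyNode:
    """One node of an ontology: a representation of exactly one type."""

    identifier: str          # term identifier, unique within the ontology
    preferred_term: str      # the preferred term (a singular noun)
    definition: str          # textual definition
    synonyms: tuple = ()     # alternative terms for the same type
    comments: tuple = ()     # glosses and comments


# Illustrative node for the type "dog" (identifier and wording are made up).
dog = OntologyNode(
    identifier="EX:0000001",
    preferred_term="dog",
    definition="Illustrative definition text for the type dog.",
    synonyms=("domestic dog",),
)

print(dog.identifier, dog.preferred_term, dog.synonyms)
```

Fido, being an instance rather than a type, would not get a node of this kind.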
Each term in an ontology represents exactly one type • hence ontology terms should be singular nouns • counterexample (plural parent term): National Socialism is_a Political Systems
An ontology is a representation of types • We learn about types in reality from looking at the results of scientific experiments in the form of scientific theories – which describe not what is particular in reality but rather what is general • Ontologies need to exploit the evolutionary path to convergence created by science
High quality shared ontologies build communities • NIH, FDA trend to consolidate ontology-based standards for the communication and processing of biomedical data. • caBIG / NECTAR / BIRN / BRIDG ...
The Methodology of Annotations • GO employs scientific curators, who use experimental observations reported in the biomedical literature to link gene products with GO terms in annotations. • This gene product exercises this function, in this part of the cell, leading to these biological processes
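An annotation of the kind described above can be pictured as a record linking a gene product to an ontology term, together with the literature source that supports the link. The sketch below is a simplified Python illustration under stated assumptions: the `Annotation` class, the `terms_for` helper, the `EX:` identifiers, the gene product name, and the reference strings are hypothetical, and real GO annotation files carry more fields than shown here.

```python
from dataclasses import dataclass

# The three aspects named on the slide: function, part of the cell, process.
ASPECTS = ("molecular_function", "cellular_component", "biological_process")


@dataclass(frozen=True)
class Annotation:
    gene_product: str   # the gene product being annotated
    term_id: str        # identifier of the ontology term
    aspect: str         # one of ASPECTS
    reference: str      # literature source supporting the link

    def __post_init__(self):
        if self.aspect not in ASPECTS:
            raise ValueError(f"unknown aspect: {self.aspect!r}")


def terms_for(gene_product, annotations):
    """Group the term identifiers annotated to one gene product by aspect."""
    grouped = {aspect: [] for aspect in ASPECTS}
    for annotation in annotations:
        if annotation.gene_product == gene_product:
            grouped[annotation.aspect].append(annotation.term_id)
    return grouped


# Hypothetical annotations made by a curator from two papers.
ANNOTATIONS = [
    Annotation("geneX", "EX:F1", "molecular_function", "paper-1"),
    Annotation("geneX", "EX:C1", "cellular_component", "paper-1"),
    Annotation("geneX", "EX:P1", "biological_process", "paper-2"),
    Annotation("geneY", "EX:P1", "biological_process", "paper-2"),
]

print(terms_for("geneX", ANNOTATIONS))
```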
The Methodology of Annotations • This process of annotating literature leads to improvements and extensions of the ontology, which in turn leads to better annotations • This institutes a virtuous cycle of improvement in the quality and reach of both future annotations and the ontology itself. • Annotations + ontology taken together yield a slowly growing computer-interpretable map of biological reality.
The OBO Foundry • A subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles designed to ensure • intelligibility to biologists (curators, annotators, users) • formal robustness • stability • compatibility • interoperability • support for logic-based reasoning
The OBO Foundry Custodians • Michael Ashburner (Cambridge) • Suzanna Lewis (Berkeley) • Barry Smith (Buffalo/Saarbrücken)
The OBO Foundry • A collaborative experiment • participants have agreed in advance to a growing set of principles specifying best practices in ontology development • designed to guarantee interoperability of ontologies from the very start
The OBO Foundry The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement. They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single reference ontology.
The OBO Foundry Initial Candidate Members of the OBO Foundry • GO Gene Ontology • CL Cell Ontology • SO Sequence Ontology • ChEBI Chemical Ontology • PATO Phenotype Ontology • FuGO Functional Genomics Investigation Ontology • FMA Foundational Model of Anatomy • RO Relation Ontology
The OBO Foundry Under development • Disease Ontology • NCI Thesaurus • Mammalian Phenotype Ontology • OBO-UBO / Ontology of Biomedical Reality • Organism (Species) Ontology • Plant Trait Ontology • Protein Ontology • RnaO RNA Ontology
The OBO Foundry Considered for development • Environment Ontology • Behavior Ontology • Biomedical Image Ontology • Clinical Trial Ontology
The OBO Foundry CRITERIA • The ontology is open and available to be used by all. • The developers of the ontology agree in advance to collaborate with developers of other OBO Foundry ontologies where domains overlap. • The ontology is in, or can be instantiated in, a common formal language.
The OBO Foundry CRITERIA • The ontology possesses a unique identifier space within OBO. • The ontology provider has procedures for identifying distinct successive versions. • The ontology includes textual definitions for all terms.
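Two of the criteria above, a unique identifier space and textual definitions for all terms, lend themselves to mechanical checking. The following Python sketch shows one way such a check might look; it is not an OBO Foundry tool, and the `check_terms` function, the dictionary layout, and the `EX` prefix are assumptions made for illustration.

```python
def check_terms(terms, prefix):
    """Report terms that break the identifier-space or definition criteria.

    `terms` is a list of dicts with "id" and "definition" keys (an assumed
    layout); `prefix` is the ontology's identifier space, e.g. "EX".
    """
    problems = []
    seen = set()
    for term in terms:
        ident = term.get("id", "")
        if not ident.startswith(prefix + ":"):
            problems.append(f"{ident}: outside identifier space {prefix}")
        if ident in seen:
            problems.append(f"{ident}: duplicate identifier")
        seen.add(ident)
        if not term.get("definition", "").strip():
            problems.append(f"{ident}: missing textual definition")
    return problems


# Hypothetical terms: the second lacks a definition, the third reuses an
# identifier, and the fourth lies outside the "EX" identifier space.
TERMS = [
    {"id": "EX:0000001", "definition": "Illustrative definition."},
    {"id": "EX:0000002", "definition": ""},
    {"id": "EX:0000001", "definition": "Another illustrative definition."},
    {"id": "OTHER:0000003", "definition": "Illustrative definition."},
]

for problem in check_terms(TERMS, "EX"):
    print(problem)
```

The remaining criteria (openness, documentation, a plurality of users, collaboration) are organizational and cannot be checked this way.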
The OBO Foundry CRITERIA • The ontology has a clearly specified and clearly delineated content. • The ontology is well-documented. • The ontology has a plurality of independent users.
The OBO Foundry CRITERIA • The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.* • *Genome Biology 2005, 6:R46
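The pattern referred to above defines each relation between types in terms of relations among their instances at times. As an illustration, the definitions of is_a and part_of can be rendered roughly as follows; the notation is a paraphrase with upper-case letters for types, lower-case letters for instances, and t for times, and the cited paper should be consulted for the authoritative wording.

```latex
% C, D range over types; c, d over instances; t over times.
% The right-hand sides use the instance-level relations instance_of and part_of.
\begin{align*}
C \;\mathit{is\_a}\; D \;&=_{\mathrm{def}}\;
  \forall c\,\forall t\,\bigl(\mathit{instance\_of}(c, C, t)
  \rightarrow \mathit{instance\_of}(c, D, t)\bigr) \\
C \;\mathit{part\_of}\; D \;&=_{\mathrm{def}}\;
  \forall c\,\forall t\,\bigl(\mathit{instance\_of}(c, C, t)
  \rightarrow \exists d\,(\mathit{instance\_of}(d, D, t)
  \wedge \mathit{part\_of}(c, d, t))\bigr)
\end{align*}
```

On this reading, "congenital absent nipple is_a nipple" fails at once, since no instance of an absent nipple is an instance of nipple.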
The OBO Foundry CRITERIA • Further criteria will be added over time in order to bring about a gradual improvement in the quality of the ontologies in the Foundry