Goal and Status of the OBO Foundry Barry Smith
How to create broad-coverage semantic annotation systems for biomedicine? • Semantic Web, Moby, wikis, crowd sourcing, NLP, etc. • let a million flowers (and weeds) bloom • to create integration, rely on (automatically generated?) post hoc mappings • The result is noisy
for science Foundry alternative: prospective standardization • develop high quality annotation resources in a collaborative, community effort • creating an evolutionary path towards improvement of terminologies of the sort we find elsewhere in science
The methodology of annotations • science basis of the GO: trained experts curating peer-reviewed literature • different model organism databases employ scientific curators who use the experimental observations reported in the biomedical literature to associate GO terms with gene products in a coordinated way
A set of standardized textual descriptions of • cellular locations • molecular functions • biological processes • used to annotate the entities represented in the major biochemical databases • thereby creating integration across these databases and making them available to semantic search
and also • need to extend the GO by engaging ever broader community support for the addition of new terms and for the correction of errors • need to extend the methodology to other domains, including clinical domains
this requires that we • establish common rules governing best practices for creating ontologies and for using these in annotations • apply these rules to create a complete suite of orthogonal interoperable biomedical reference ontologies
2003 • shared portal + • low regimentation • http://obo.sourceforge.net NCBO BioPortal
2006 The OBO Foundry http://obofoundry.org/
A prospective standard designed to guarantee interoperability of ontologies from the very start (contrast to: post hoc mapping) established March 2006 12 initial candidate OBO ontologies – focused primarily on basic science domains several being constructed ab initio by influential consortia who have the authority to impose their use on large parts of the relevant communities.
OBO Foundry = a subset of OBO ontologies, whose developers have agreed in advance to accept a common set of principles reflecting best practice in ontology development designed to ensure • tight connection to the biomedical basic sciences • compatibility • interoperability, common relations • formal robustness • support for logic-based reasoning
CRITERIA • The ontology is OPEN and available to be used by all. • The ontology is in, or can be instantiated in, a COMMON FORMAL LANGUAGE. • The developers of the ontology agree in advance to COLLABORATE with developers of other OBO Foundry ontologies where domains overlap.
CRITERIA • UPDATE: The developers of each ontology commit to its maintenance in light of scientific advance, and to soliciting community feedback for its improvement. • ORTHOGONALITY: They commit to working with other Foundry members to ensure that, for any particular domain, there is community convergence on a single controlled vocabulary.
for science orthogonality of ontologies implies additivity of annotations • if we annotate a database or body of literature with one high-quality biomedical ontology, we should be able to add annotations from a second such ontology without conflicts • AND WITHOUT THE NEED FOR MAPPINGS
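A minimal sketch of what additivity could look like in practice, assuming (this is not stated on the slide) that each ontology owns exactly one identifier prefix and that annotations are stored as sets of term identifiers per record. The function name `add_annotations` and the record names are hypothetical; the term identifiers are only illustrative.

```python
from collections import defaultdict


def prefix(term_id):
    """Return the ontology prefix of an identifier such as 'GO:0005634'."""
    return term_id.split(":", 1)[0]


def add_annotations(*annotation_sets):
    """Union several annotation sets without any term-to-term mapping.

    Assumption: orthogonality is modelled as "each prefix is contributed
    by only one annotation set". A prefix seen in two different sets is
    treated as a conflict and rejected.
    """
    owner = {}                 # prefix -> index of the set that introduced it
    merged = defaultdict(set)  # record -> all term identifiers attached to it
    for index, annotations in enumerate(annotation_sets):
        for record, terms in annotations.items():
            for term in terms:
                if owner.setdefault(prefix(term), index) != index:
                    raise ValueError(f"prefix {prefix(term)!r} is not orthogonal")
                merged[record].add(term)
    return dict(merged)


# Illustrative annotations from two ontologies with distinct prefixes.
go_annotations = {"record-1": {"GO:0005634"}, "record-2": {"GO:0008150"}}
chebi_annotations = {"record-1": {"CHEBI:15377"}}

combined = add_annotations(go_annotations, chebi_annotations)
```

Because the two sources never describe the same domain, combining them is a plain set union per record; no lookup table between vocabularies is consulted at any point.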
CRITERIA • IDENTIFIERS: The ontology possesses a unique identifier space within OBO. • VERSIONING: The ontology provider has procedures for identifying distinct successive versions to ensure BACKWARDS COMPATIBILITY with annotation resources already in common use. • The ontology includes TEXTUAL DEFINITIONS and where possible equivalent formal definitions of its terms.
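The IDENTIFIERS criterion can be illustrated with a short sketch. It assumes identifiers of the shape `PREFIX:digits` and a small hypothetical registry of prefixes; neither the pattern nor the names `ID_PATTERN`, `REGISTERED_PREFIXES`, and `parse_identifier` come from the slides.

```python
import re

# Assumed identifier shape: an alphabetic prefix, a colon, then digits.
ID_PATTERN = re.compile(r"^([A-Za-z]+):(\d+)$")

# Hypothetical registry: one prefix per ontology, so identifier spaces are disjoint.
REGISTERED_PREFIXES = {"GO", "CHEBI"}


def parse_identifier(term_id):
    """Split an identifier into (prefix, local id), rejecting unknown spaces."""
    match = ID_PATTERN.match(term_id)
    if match is None:
        raise ValueError(f"malformed identifier: {term_id!r}")
    prefix, local_id = match.groups()
    if prefix not in REGISTERED_PREFIXES:
        raise ValueError(f"unregistered identifier space: {prefix!r}")
    return prefix, local_id


parsed = parse_identifier("GO:0008150")
```

Keeping the local part as a string preserves leading zeros, so an identifier can be rebuilt exactly as it was written.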
CRITERIA • CLEARLY BOUNDED: The ontology has a clearly specified and clearly delineated content. • DOCUMENTATION: The ontology is well-documented. • USERS: The ontology has a plurality of independent users.
CRITERIA • COMMON ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the OBO Relation Ontology.* • * Smith et al., Genome Biology 2005, 6:R46
top level mid-level domain level Basic Formal Ontology (BFO) Extension Strategy – Downward Population
OGMS Downward Population + Hub-Spokes Strategy
OGMS Cardiovascular Disease Ontology Genetic Disease Ontology Cancer Disease Ontology Genetic Disease Ontology Immune Disease Ontology Environmental Disease Ontology Oral Disease Ontology Infectious Disease Ontology …
BFO, OGMS, and IDO • BFO: Material Entity, Disposition, Process • OGMS: Disorder, Disease, Disease Course • IDO: Infection, Infectious Disease, Infectious Disease Course
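Downward population can be sketched as a chain of is_a links from domain terms up to the top level. The pairing below assumes that the nine terms on the slide line up in three parallel rows (for example Infectious Disease under Disease under Disposition); that alignment is an interpretation of the slide layout, not something the slide states, and the names `IS_A` and `ancestors` are hypothetical.

```python
# Child -> parent links, read off the slide under the row-alignment assumption.
IS_A = {
    # OGMS terms placed under BFO terms
    "Disorder": "Material Entity",
    "Disease": "Disposition",
    "Disease Course": "Process",
    # IDO terms placed under OGMS terms (assumed pairing)
    "Infection": "Disorder",
    "Infectious Disease": "Disease",
    "Infectious Disease Course": "Disease Course",
}


def ancestors(term):
    """Follow is_a links upward and return the parents, nearest first."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


lineage = ancestors("Infectious Disease")
```

A more specific spoke ontology would extend the same table with further child-to-parent entries, leaving the existing links untouched.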
OGMS Cardiovascular Disease Ontology Genetic Disease Ontology Cancer Disease Ontology Genetic Disease Ontology Immune Disease Ontology Environmental Disease Ontology Oral Disease Ontology Infectious Disease Ontology IDO Staph Aureus IDO MRSA IDO Australian MRSA IDO Australian Hospital MRSA …
How IDO evolves IDOMAL IDOCore IDOHIV CORE and SPOKES: Domain ontologies IDOFLU IDORatSa IDORatStrep IDOStrep IDOSa SEMI-LATTICE: By subject matter experts in different communities of interest. IDOMRSA IDOAntibioticResistant IDOHumanSa IDOHumanStrep IDOHumanBacterial
Status Successes New ontologies being added to the OBO library Advance in cross-product methodology
Environment (EnvO, EO) Population-level ontologies
Successes The OBO Foundry strategy for ontology collaboration and reuse is being replicated in major grant-funded projects
Successes Huge and continuing expansion in the awareness of the need for re-using ontologies Huge and continuing expansion in ontology software created to support Foundry efforts (Ontobee, Mireot, …)
Current status Coordinating editors: Michael Ashburner Chris Mungall Suzanna Lewis Alan Ruttenberg Richard Scheuermann Barry Smith
New operations committee • https://code.google.com/p/obo-foundry-operations-committee/wiki/OutreachWG Mathias Brochhausen Melanie Courtot Melissa Haendel Janna Hastings Chris Mungall Alan Ruttenberg Ramona Walls
Ontologies admitted to full membership after the first phase of reviews • CHEBI: Chemical Entities of Biological Interest • GO: Gene Ontology • PATO: Phenotypic Quality Ontology • PRO: Protein Ontology • XAO: Xenopus Anatomy Ontology • ZFA: Zebrafish Anatomy Ontology
Current status Next round of candidates for review OGMS: Ontology for General Medical Science OBI: Ontology for Biomedical Investigations CL: Cell Ontology IDO: Infectious Disease Ontology
Ontology for General Medical Science Jobst Landgrebe (former Co-Chair of the HL7 Vocabulary Group, now Head of Datamining at Allianz Healthcare): • “the best ontology effort in the whole biomedical domain by far”