Experiences from the NCBO OBO-to-OWL Mapping Effort
Dilvan A. Moreira, University of São Paulo
Mark A. Musen, Stanford University
OBO Format
• Evolved from DAG representation initially created for Gene Ontology
• Adopted by dozens of biomedical ontologies stored in the OBO repository
• Used by most GO-based data analysis tools
• Designed for
  • human readability
  • extensibility
  • minimal redundancy
OWL – Web Ontology Language
• W3C “recommendation”
• Offers important advantages
  • A well defined, standardized syntax and semantics
  • Interoperability with ontologies created in other domains of science
  • Growing support from open-source and commercial tools
    • Editors
    • Parsers
    • Classifiers
    • Reasoning systems
Why map from OBO to OWL?
• Many Bio-Ontologies are modeled in OWL (e.g., NCI Thesaurus, BioPAX, SNOMED-CT)
• OBO format has not been adopted outside the Bio-ontology community, where OWL is the recognized standard
• Bio-Ontologies need to interoperate with other ontologies used throughout e-science
• The standardization of OWL is leading to many commercial ontology-oriented tools that biologists might wish to use
Requirements for mapping OBO to OWL
• Map all OBO format constructs in use to OWL
• Make no assumptions other than those written in the OBO format specification itself
• Ensure no loss of information
• Enable “round trip” conversion (OBO to OWL and back again)
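The "no loss of information" and "round trip" requirements can be made concrete with a small comparison routine. The sketch below is only an illustration: `parse_obo` and `same_content` are hypothetical helpers, not part of any OBO tool, and the OWL conversion step itself is not shown. It assumes one `tag: value` pair per line, treats anything after ` ! ` as a trailing comment, and ignores the order of stanzas and of tags within a stanza.

```python
def parse_obo(text):
    """Parse OBO text into a sorted list of (stanza type, sorted tag-value pairs)."""
    stanzas = []
    current = ("header", [])  # lines before the first [Stanza] belong to the header
    for raw in text.splitlines():
        # Assumption: " ! " starts a trailing comment that carries no information.
        line = raw.split(" ! ", 1)[0].strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            stanzas.append(current)
            current = (line[1:-1], [])
        else:
            # Split on the first colon only, so ids such as CL:0000000 stay intact.
            tag, _, value = line.partition(":")
            current[1].append((tag.strip(), value.strip()))
    stanzas.append(current)
    return sorted((kind, tuple(sorted(pairs))) for kind, pairs in stanzas)


def same_content(original, round_tripped):
    """True if both OBO texts carry the same stanzas and tag-value pairs."""
    return parse_obo(original) == parse_obo(round_tripped)


before = "[Term]\nid: CL:0000003\nname: cell in vivo\nis_a: CL:0000000 ! cell\n"
after = "[Term]\nname: cell in vivo\nid: CL:0000003\nis_a: CL:0000000\n"
print(same_content(before, after))
```

A real comparison would also have to deal with escaped characters, quoted strings that contain ` ! `, and constructs whose equality is not purely textual.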
Mapping clarifies the OBO specification
• OBO syntax is not uniformly documented
  • No complete BNF exists to define the grammar
  • Some new constructs lack clear definitions and have not yet been used by the community
• OBO semantics sometimes need to be clarified using textual style guides that do not adopt formal representations
Both languages support two types of elements
• Semantic information defines classes and relationships about which computers can reason automatically
• Textual properties offer the intended meanings of ontology elements for human consumption; such entries include names, textual definitions, descriptions, usage notes, and so on
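As a rough illustration of this split, the sketch below sorts the tags of one OBO stanza into the two groups. The grouping is an assumption made for this example, based only on the tags that appear in the deck's cell ontology example; it is not taken from the OBO specification, and `split_tags` is a hypothetical helper.

```python
# Assumed grouping, for illustration only.
SEMANTIC_TAGS = {"is_a", "relationship", "is_transitive"}
TEXTUAL_TAGS = {"name", "def", "related_synonym", "remark"}


def split_tags(pairs):
    """Group (tag, value) pairs into semantic, textual, and other entries."""
    groups = {"semantic": [], "textual": [], "other": []}
    for tag, value in pairs:
        if tag in SEMANTIC_TAGS:
            key = "semantic"
        elif tag in TEXTUAL_TAGS:
            key = "textual"
        else:
            key = "other"  # tags this sketch does not classify, such as id
        groups[key].append((tag, value))
    return groups


nurse_cell = [
    ("id", "CL:0000026"),
    ("name", "nurse cell"),
    ("is_a", "CL:0000003"),
    ("relationship", "develops_from CL:0000000"),
]
groups = split_tags(nurse_cell)
print(groups["semantic"])
print(groups["textual"])
```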
Example of Mapping OBO to OWL

format-version: 1.0
date: 28:11:2006 23:22
saved-by: dilvan
auto-generated-by: OBO-Edit 1.001
default-namespace: test ontology
remark: Modified snipet of the cell ontology

[Term]
id: CL:0000000
name: cell
def: "Minute protoplasmic masses that make up organized tissue." [MESH:A.11]

[Term]
id: CL:0000003
name: cell in vivo
is_a: CL:0000000 ! Cell

[Term]
id: CL:0000026
name: nurse cell
related_synonym: "nurse cell" []
xref_analog: FBbt:00004878
is_a: CL:0000003 ! cell in vivo
relationship: develops_from CL:0000000 ! Cell
…
[Typedef]
id: develops_from
name: develops_from
is_transitive: true

<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [ … ]>
<rdf:RDF … >
  <owl:Ontology rdf:about="">
    <rdfs:comment …> Modified snipet of the cell ontology </…>
    <oboInOwl:hasDate …> 2006-11-28T23:22:00 </…>
    <oboInOwl:savedBy …> dilvan </…>
    <oboInOwl:hasDefaultNamespace …> test ontology </…>
  </owl:Ontology>
  <owl:Class rdf:ID="CL_0000000">
    <rdfs:label …> cell </…>
    <oboInOwl:hasDefinition>
      <rdf:Description>
        <rdf:type rdf:resource="&oboInOwl;Definition"/>
        <rdfs:label …> Minute protoplasmic … tissue.</…>
        <oboInOwl:hasDbXref>
          <rdf:Description>
            <rdf:type rdf:resource="&oboInOwl;DbXref"/>
            <rdfs:label …>MESH:A.11</…>
          </rdf:Description>
        </oboInOwl:hasDbXref>
      </rdf:Description>
    </oboInOwl:hasDefinition>
  </owl:Class>
  <owl:Class rdf:ID="CL_0000003">
    <rdfs:subClassOf rdf:resource="#CL_0000000"/>
    <rdfs:label …> cell in vivo </rdfs:label>
  </owl:Class>
  …
  <owl:ObjectProperty rdf:ID="UNDEFINED_develops_from">
    <rdf:type rdf:resource="&owl;TransitiveProperty"/>
    <rdfs:label …> develops_from </…>
  </owl:ObjectProperty>
  …
</rdf:RDF>
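Two of the rewrites visible in this example are mechanical: the term id `CL:0000000` becomes the `rdf:ID` value `CL_0000000`, and the header date `28:11:2006 23:22` becomes `2006-11-28T23:22:00`. The sketch below reproduces just those two steps. The function names are hypothetical, the rules are inferred from this one example rather than from the translator's source, and the reverse id mapping assumes the prefix itself contains no underscore.

```python
from datetime import datetime


def obo_id_to_owl_id(obo_id):
    """Replace the first colon of an OBO id with an underscore."""
    prefix, sep, local = obo_id.partition(":")
    if not sep:
        raise ValueError(f"no prefix separator in {obo_id!r}")
    return f"{prefix}_{local}"


def owl_id_to_obo_id(owl_id):
    """Reverse mapping; assumes the prefix contains no underscore."""
    prefix, sep, local = owl_id.partition("_")
    if not sep:
        raise ValueError(f"no prefix separator in {owl_id!r}")
    return f"{prefix}:{local}"


def obo_date_to_iso(obo_date):
    """Convert an OBO header date (dd:MM:yyyy HH:mm) to ISO 8601 form."""
    return datetime.strptime(obo_date, "%d:%m:%Y %H:%M").isoformat()


print(obo_id_to_owl_id("CL:0000000"))
print(owl_id_to_obo_id("CL_0000000"))
print(obo_date_to_iso("28:11:2006 23:22"))
```

The typedef id in the example (`develops_from` becoming `UNDEFINED_develops_from`) follows a different rule that this sketch does not attempt to cover.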
Results
• We converted all 40 OBO Format files in the OBO repository (November 2006) to OWL with no errors detected by Protégé
• Of these 40 files, 30 were converted back to OBO Format without loss of information, as evidenced by running the obodiff tool
• Of the remaining 10 files that could not make the round trip
  • Five had illegal characters in some of their OBO ids
  • Three used new OBO 1.2 constructs that did not pass obodiff criteria for equality
  • Two had syntax errors in their original form
Conclusions
• A translator enables interconversion of OBO format and standard OWL
• Underspecification of OBO format has made the development of OBO-to-OWL translators difficult in the past
• Even when the formal specification is incomplete, languages such as OBO format often have de facto specifications based on usage patterns in the community
• Work with the OBO community enabled us to clarify and document underspecified elements of OBO format, enabling implementation of a robust translation system
Acknowledgments
• Stuart Aitkin
• John Day-Richter
• Suzanna Lewis
• Chris Mungall
• Nigam Shah
http://www.bioontology.org/wiki/index.php/OboInOwl:Main_Page