Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned
C. Torniai, M. Brush, N. Vasilevsky, E. Segerdell, M. Wilson, T. Johnson, K. Corday, C. Shaffer and M. Haendel
ICBO 2011
Outline
• eagle-i project
• Aims
• Ontology role
• eagle-i ontology
• Requirements
• Implementation
• Implementation choices
• Challenges
eagle-i
NIH-funded pilot project working to make scientific resources more visible via a federated network of nine institutional repositories
• Index invisible resources: reagents, protocols, techniques, instruments, expertise, organisms, software, training, human studies, biological specimens, etc.
• Ontology-driven approach to research resource annotation and discovery
• Facilitate development of shared semantic entities that can be referenced in publications, databases, experiments, etc.
Ontology development drivers
• Represent collected resource information
• Use the set of ontologies to control the user interface (UI) and logic of the data collection and search applications
• Build a set of ontologies that are reusable and interoperable with other ontologies and existing efforts for representing biomedical entities
Ontology role in eagle-i architecture
[Figure: architecture diagram. Labeled components: NIF, PubMed, Entrez Gene; Search Application; Federated Network; eagle-i ontologies; Repositories (RDF); Data Collection Application; Resource information collection]
Implementation
Ontology layers
Goal: to decouple the representation of research resources from the information used for application appearance and behavior
• Application-specific module: classes, annotation properties and individuals required to drive the UIs
• eagle-i core ontology: classes and properties used to represent information about biomedical research resources
• MIREOT files: externally sourced classes and properties
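The layering above can be sketched as separate stores that are merged only when the application model is built. This is a minimal illustration, not eagle-i's implementation: the `Layer` type, the `build_application_model` function, and all class identifiers are hypothetical, and real layers would be OWL files rather than dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One ontology layer: class IDs mapped to their asserted parent (or None)."""
    name: str
    parents: dict = field(default_factory=dict)


# Hypothetical contents for the two resource-describing layers.
mireot = Layer("mireot", {"obi:instrument": None})
core = Layer("core", {"ero:microscope": "obi:instrument"})

# The application layer only annotates classes that are defined elsewhere.
ui_annotations = {"ero:microscope": {"preferred_label": "Microscope"}}


def build_application_model(layers, annotations):
    """Merge the resource layers, then attach UI annotations on top."""
    model = {}
    for layer in layers:
        for class_id, parent in layer.parents.items():
            model[class_id] = {"parent": parent, "source": layer.name, "ui": {}}
    for class_id, values in annotations.items():
        if class_id not in model:
            # An annotation that points at a missing class is out of date.
            raise KeyError(f"annotation on unknown class: {class_id}")
        model[class_id]["ui"] = dict(values)
    return model


model = build_application_model([mireot, core], ui_annotations)
```

Because the UI annotations live in their own store, the core and MIREOT layers can be published or reused without any application-specific content.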
eagle-i core and MIREOTed sources
eagle-i core ontology: 1283 classes, 56 object properties, and 61 data properties.
Application-specific module
Contains properties and classes required to drive the UIs of the data collection and search applications
• UI Annotation Definition file: definition of UI annotation properties and sets of values for these properties
• UI Annotations file: holds annotations made on eagle-i core and MIREOTed classes and properties
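One way to picture the split between the two files is a definitions table that lists each UI annotation property with its allowed values, and a separate list of annotations checked against it. The sketch below is an assumption-laden illustration: the property names `ui:classGroup` and `ui:propertyGroup`, the grouping of values under them, and the triples are all hypothetical.

```python
# Hypothetical "definition file": annotation properties and their allowed values.
ANNOTATION_DEFINITIONS = {
    "ui:classGroup": {"primary resource type", "embedded class", "referenced taxonomy"},
    "ui:propertyGroup": {"primary property"},
}

# Hypothetical "annotations file": (subject, annotation property, value) triples.
ANNOTATIONS = [
    ("ero:instrument", "ui:classGroup", "primary resource type"),
    ("ero:construct_insert", "ui:classGroup", "embeded class"),  # deliberate typo
    ("ero:technique", "ui:displayHint", "referenced taxonomy"),  # undefined property
]


def validate_annotations(annotations, definitions=ANNOTATION_DEFINITIONS):
    """Return one message per annotation that the definitions do not allow."""
    errors = []
    for subject, prop, value in annotations:
        allowed = definitions.get(prop)
        if allowed is None:
            errors.append(f"{subject}: undefined annotation property {prop}")
        elif value not in allowed:
            errors.append(f"{subject}: {value!r} is not an allowed value for {prop}")
    return errors


errors = validate_annotations(ANNOTATIONS)
```

Keeping the definitions apart from the annotations means the set of permitted values can be reviewed in one place.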
Examples of annotation values and use
Data Collection Application
• ‘eagle-i preferred definition’ is used for tooltips
• Classes annotated with ‘primary resource type’
• ‘eagle-i preferred label’ is used for the display name
• Property annotated as ‘primary property’
• Construct insert is an example of a resource annotated as an ‘embedded class’
• Technique is annotated as ‘referenced taxonomy’
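How a UI might consume these annotations can be sketched as simple lookups. The annotation names are the ones quoted on the slide; the record layout, the example definition text, and the fallback to the plain ontology label when no preferred value exists are assumptions made for illustration.

```python
# Hypothetical term records keyed by the annotation names quoted on the slide.
TERMS = [
    {
        "label": "instrument",
        "definition": "A device used in research.",
        "eagle-i preferred label": "Instrument",
        "primary resource type": True,
    },
    {"label": "construct insert", "embedded class": True},
    {"label": "technique", "referenced taxonomy": True},
]


def display_name(term):
    """Use the application-specific label if present, else the ontology label."""
    return term.get("eagle-i preferred label") or term["label"]


def tooltip(term):
    """Use the application-specific definition if present, else the ontology one."""
    return term.get("eagle-i preferred definition") or term.get("definition", "")


def primary_resource_types(terms):
    """Display names of the classes flagged as primary resource types."""
    return sorted(display_name(t) for t in terms if t.get("primary resource type"))
```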
Challenges and benefits
• Reuse of existing ontologies
• Ontology layers
• Application-specific module
• Community coordination and alignment
• Best practices and tools
Reuse of existing ontologies
• BFO and the Relation Ontology (RO)
• OBO Foundry orthogonality principle
Advantages
• Integration with other ontologies
• Eases the design process
• Data integration and publication (Linked Open Data)
Challenges
• Need to exclude some classes (continuant, occurrent) from UI visualization after the inferred module has been computed
• Domain and range in RO are unspecified or not specific enough for an application
• Not all relevant ontologies are built using BFO and RO
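The first challenge, hiding upper-level classes from the UI, can be sketched as a filter applied to the class hierarchy once inference is done. The hierarchy below is deliberately simplified and hypothetical (it is not the real BFO tree), and the function names are illustrative.

```python
def visible_parent(class_id, parents, hidden):
    """Walk up the hierarchy until reaching a class that is not hidden."""
    parent = parents.get(class_id)
    while parent is not None and parent in hidden:
        parent = parents.get(parent)
    return parent


def ui_hierarchy(parents, hidden):
    """Drop hidden classes and reattach their children to the nearest visible ancestor."""
    return {
        class_id: visible_parent(class_id, parents, hidden)
        for class_id in parents
        if class_id not in hidden
    }


# Simplified, hypothetical hierarchy: child -> parent.
PARENTS = {
    "entity": None,
    "continuant": "entity",
    "occurrent": "entity",
    "instrument": "continuant",
    "technique": "occurrent",
    "microscope": "instrument",
}
HIDDEN = {"entity", "continuant", "occurrent"}

tree = ui_hierarchy(PARENTS, HIDDEN)
```

In this sketch the hidden classes stay in the ontology and only the view handed to the UI changes, so interoperability with BFO-based ontologies is unaffected.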
Ontology layers
Advantages
• Effective means to drive an application UI while maintaining interoperability with external ontologies and data sources
• Facilitates parallel development
Challenges
• Keeping the annotations current with the core module
• Risk of excessive proliferation of annotation properties as a quick way to reduce application development complexity
Application-specific module
Requirements for bridging the gap between an application and domain-specific ontologies
• Application-specific labels and definitions
• Exclusion of sets of classes and properties from the model used by the application
• Restriction of domain and range for some imported properties
• Definition of display order of object and data properties at class level
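The last two requirements, narrowing the range of an imported property and fixing a per-class display order, can be sketched as an override table consulted when a form is built. Every class, property, and range name below is hypothetical, and the table-based approach is an assumption rather than the mechanism eagle-i uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyView:
    """How one property appears on a class's form in the application."""
    prop: str
    range_: Optional[str]
    order: int


# Ranges as imported from a reference ontology; None means "not specified".
IMPORTED_RANGES = {"located_in": None, "has_manufacturer": "Organization"}

# Application-level overrides, keyed by (class, property).
APP_OVERRIDES = {
    ("Instrument", "located_in"): {"range": "Laboratory", "order": 2},
    ("Instrument", "has_manufacturer"): {"order": 1},
}


def form_fields(class_id):
    """Return the class's properties in display order, with any narrowed ranges."""
    views = []
    for (cls, prop), override in APP_OVERRIDES.items():
        if cls != class_id:
            continue
        range_ = override.get("range", IMPORTED_RANGES.get(prop))
        views.append(PropertyView(prop, range_, override["order"]))
    return sorted(views, key=lambda view: view.order)


fields = form_fields("Instrument")
```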
Community coordination
Commitment to collaboration with similar efforts aimed at resource modeling
• Aligned high-level models with NIF, RDS, VIVO
• Service, instrument (device) implemented in OBI and reused by NIF and eagle-i
• Coordinated representation of reagents, biospecimens, and genotype information (in progress)
Challenges
• The process is time-consuming and requires extra implementation effort: implement and import back from reference ontologies
• Application ontologies have peculiar requirements. Example: the service hierarchy in eagle-i is based on the type of process rather than on the input and output of the process (OBI)
Best practices and tools
• Reusing/referencing existing ontologies: Ontofox, OWL module extractor, NCBO extractor service
• Have tools integrated in ontology editors (Protégé)
• Effective methods for managing and syncing MIREOTed terms
• Have several “community views” or “slims” that could be directly imported with different levels of complexity
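The need for syncing MIREOTed terms can be illustrated with a small staleness check. MIREOT records a term's source ontology, its URI, and the local parent it is placed under; the sketch below compares such local records with a snapshot of the source. The URIs, labels, and the `find_stale_terms` function are hypothetical, and a real check would obtain the source labels from the reference ontology itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MireotTerm:
    """Minimum information kept for an externally sourced term, plus its label."""
    source_ontology: str
    term_uri: str
    target_parent: str
    label: str


def find_stale_terms(local_terms, source_labels):
    """Return (term_uri, reason) pairs for terms that no longer match the source."""
    stale = []
    for term in local_terms:
        current = source_labels.get(term.term_uri)
        if current is None:
            stale.append((term.term_uri, "missing from source"))
        elif current != term.label:
            stale.append((term.term_uri, f"label changed to {current!r}"))
    return stale


# Hypothetical local records and a hypothetical snapshot of the source ontology.
LOCAL = [
    MireotTerm("http://example.org/src.owl", "http://example.org/SRC_0000001",
               "http://example.org/core#Resource", "device"),
    MireotTerm("http://example.org/src.owl", "http://example.org/SRC_0000002",
               "http://example.org/core#Resource", "assay"),
    MireotTerm("http://example.org/src.owl", "http://example.org/SRC_0000003",
               "http://example.org/core#Resource", "service"),
]
SOURCE_LABELS = {
    "http://example.org/SRC_0000001": "device",
    "http://example.org/SRC_0000002": "planned assay",
}

stale = find_stale_terms(LOCAL, SOURCE_LABELS)
```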
Conclusion
• Developing an ontology-driven application has been an important benchmark for the usage of biomedical ontologies
• We have designed a layered set of ontologies, consisting of a broadly applicable core ontology and an application-specific module
• Requirements and principles to inform a general design pattern
Future steps
• Refine, document and share requirements and lessons learned
• Engage in efforts addressing the issues we have experienced
Thank you
eagle-i core module: http://code.google.com/p/eagle-i/
eagle-i search: http://eagle-i.net (username: ohsu-guest, password: eagle-i-ohsu)
Carlo Torniai, torniai@ohsu.edu
Acknowledgments: Ted Bashor, Rob Frost, Larry Stone and Daniela Bourges
Project funded through NIH/NCRR ARRA award #U24RR029825