The BioInvestigation Index – Standards and Infrastructure for Omics Data

The European Bioinformatics Institute is an Outstation of the European Molecular Biology Laboratory. The BioInvestigation Index – Standards and Infrastructure for Omics Data. Philippe Rocca-Serra, Marco Brandizi, Nataliya Sklyar, Eamonn Maguire, Chris Taylor, Gabriella Rustici and Susanna-Assunta Sansone.


The standards scenario - Introduction

1. Growing complexity of datasets

The marriage of traditional approaches with genomics, transcriptomics, proteomics and metabol/nomics technologies (hereafter referred to as ‘omics’) has created not only opportunities, but also substantial new informatics challenges. For example, consider the reporting of a complex multi-assay study looking at the effect, on a number of subjects, of a compound inducing liver damage by characterizing the urine metabolic profile (by mass spectrometry), measuring liver protein and gene expression (by mass spectrometry and DNA microarrays, respectively), and conducting conventional histopathological analysis.

It is pivotal that such complex metadata (i.e., sample characteristics, study design, assay execution, sample-data relationships) are reported in a standard manner, so that the final results (or data) that they contextualize can be interpreted correctly.

Figure 1. Example of a multi-assay study.

2. Fragmentation of reporting standards

Many groups have risen to this challenge; standards for collecting, describing, formatting, submitting and exchanging both metadata and data are either under development or have been released. Several standards initiatives addressing particular technologies or defined domains of application (e.g., genomics, microarray, proteomics, metabol/nomics and systems biology models) have emerged from the academic community, in many cases with the support of commercial organizations such as instrument vendors. Such initiatives are focused on supporting tool interoperability and data exchange among public and proprietary systems, by developing three kinds of (de facto) reporting standards: minimal information specifications (checklists), semantics (ontologies) and syntax (file formats).

However, being focused on particular communities’ interests, be they individual ‘omics’ technologies or specific biological/biomedical disciplines, leads to duplication of effort and, more seriously, to the development of (largely arbitrarily) different standards. This fragmentation severely hinders the interoperability of databases, tools and reporting standards, and ultimately the integration of datasets. ArrayExpress and PRIDE, the EBI production systems for microarray and proteomics data respectively, illustrate such a scenario.

Figure 2. Independent databases, with different submission/exchange formats, diverse representations of the metadata, and use of different terminologies.

3. Towards interoperable reporting standards

Fortunately, several synergistic activities foster the harmonization of the three kinds of standards being developed. Over 22 groups participate in the MIBBI project, which offers a one-stop shop for those exploring the range of extant ‘minimum information’ checklists, and which fosters collaborative and integrative development [1]. More than 60 groups participate in the OBO Foundry [2] to coordinate the development of orthogonal, interoperable ontologies, such as OBI [3], to support data integration. Several groups participate in the FuGE project to develop a generic data model to underpin a variety of XML-based file formats [4]. And recently, a growing number of communities have started to work collaboratively on ISA-TAB, a tabular framework for presenting metadata [5], which can also serve as a user-friendly presentation layer for XML-based formats (via an XSLT).

Interoperable reporting standards facilitate the development of standards-compliant products by academic and commercial software developers, instrument vendors, etc. They do so by limiting the range and variability of standards for such parties to consider.
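As a rough illustration of why a tabular layout suits sample-data relationships, the sketch below reads a small tab-delimited table for a study like the one in Figure 1 and groups the data files by subject and assay. The column names, the file names and the `files_by_subject` helper are hypothetical; they are assumptions made for this example and are not taken from the ISA-TAB specification.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited metadata table, one row per data file.
# Column names and file names are illustrative assumptions; they are
# not taken from the ISA-TAB specification.
TABLE = "\n".join([
    "subject\tsample\tassay\tdata_file",
    "subject1\turine\tmetabolic profile\ts1_urine_ms.dat",
    "subject1\tliver\tprotein expression\ts1_liver_ms.dat",
    "subject1\tliver\tgene expression\ts1_liver_array.dat",
    "subject1\tliver\thistopathology\ts1_liver_histo.dat",
    "subject2\turine\tmetabolic profile\ts2_urine_ms.dat",
    "subject2\tliver\tgene expression\ts2_liver_array.dat",
])


def files_by_subject(table_text):
    """Group data files by subject and then by assay."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    index = defaultdict(lambda: defaultdict(list))
    for row in reader:
        index[row["subject"]][row["assay"]].append(row["data_file"])
    # Convert the nested defaultdicts to plain dicts for predictable lookups.
    return {subject: dict(assays) for subject, assays in index.items()}


if __name__ == "__main__":
    for subject, assays in files_by_subject(TABLE).items():
        print(subject)
        for assay, files in assays.items():
            print(f"  {assay}: {', '.join(files)}")
```

Because every row carries the subject, sample, assay and file together, a single pass over the table is enough to recover which data belong to which subject, whichever technology produced them.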
BioInvestigation Index – Overview

The BioInvestigation Index infrastructure aims to create a common structured representation and storage mechanism for metadata and sample-data relationships in biological, biomedical and environmental studies, which commonly range from simple single-assay studies to complex multi-assay studies, as illustrated in Figure 1. The infrastructure relies on existing production systems, such as ArrayExpress and PRIDE, but avoids fragmentation by leveraging the synergistic reporting standards described in section 3. The infrastructure’s main components and their use of the synergistic reporting standards are shown in the accompanying figure.

Prototype launch in Fall/Winter 2008: www.ebi.ac.uk/bioinvindex. Information and announcements at: www.ebi.ac.uk/net-project.

References

[1] Taylor CF, Field D, Sansone SA,… Rocca-Serra P et al. (2008) The MIBBI Project. Nat Biotechnol. Aug;26(8):889-896. http://www.mibbi.org
[2] Smith B, Ashburner M, Rosse C,… Rocca-Serra P, …Sansone SA et al. (2007) The OBO Foundry. Nat Biotechnol. Nov;25(11):1251-5. http://www.obofoundry.org
[3] Ontology for Biomedical Investigations (OBI): http://obi-ontology.org
[4] Jones AR, Miller M, Aebersold R,… Sansone SA et al. (2007) The Functional Genomics Experiment model (FuGE). Nat Biotechnol. Oct;25(10):1127-1133. http://fuge.sf.net
[5] Sansone SA, Rocca-Serra P, Brandizi M,… Taylor CF et al. (2008) The First MGED RSBI (ISA-TAB) Workshop. OMICS. Jun;12(2):143-9. http://isatab.sf.net

Acknowledgements

The authors acknowledge the EU integrated project CarcinoGENOMICS (http://www.carcinogenomics.eu, LSHB-2006-037712), the EU Network of Excellence NuGO (http://www.nugo.org, NoE-503630), BBSRC grants (workshop on standards and ontology, BB/E025080/1, and MIBBI, BB/G000638/1), the UK NERC Bioinformatics Centre partnership fund and the EMBL-EBI. The authors also acknowledge the contributions of the ArrayExpress and PRIDE teams, and of the MIBBI, OBO Foundry, OBI, FuGE and ISA-TAB communities.
