SysMO-SEEK: Sharing Data and Models in Systems Biology




Presentation Transcript


  1. SysMO-SEEK: Sharing Data and Models in Systems Biology. Katy Wolstencroft, Stuart Owen, Jacky Snoep. University of Manchester

  2. SysMO-DB Project. A data access, model handling and data integration platform for Systems Biology: • To support and manage the diversity of data, models and experimental protocols from a consortium • Web based • Standards compliant

  3. Systems Biology of Microorganisms, http://www.sysmo.net • Pan-European collaboration • 13 individual projects, >100 institutes • Different research outcomes • A cross-section of microorganisms, incl. bacteria, archaea and yeast • Record and describe the dynamic molecular processes occurring in microorganisms in a comprehensive way • Present these processes in the form of computerized mathematical models • Pool research capacities and know-how • Running since April 2007 • Runs for 3-5 years • This year, 2 new projects join and 6 leave

  4. Types of data • Multiple omics • genomics, transcriptomics • proteomics, metabolomics • fluxomics, reactomics • Images • Molecular biology • Reaction Kinetics • Models • Metabolic, gene network, kinetic • Relationships between data sets/experiments • Procedures, experiments, data, results and models • Analysis of data

  5. Challenges • Heterogeneous data and models • Distributed groups of researchers • Modellers and experimentalists have different skills, training and experience • Scientists want to remain in control • Scientists are reluctant to share • Social and technical challenges

  6. SysMO-DB Dev Team • University of Manchester, UK: Carole Goble, Katy Wolstencroft, Stuart Owen, Sergejs Aleksejevs, Finn Bacall • Heidelberg Institute for Theoretical Studies, Germany: Wolfgang Müller, Olga Krebs • University of Stellenbosch, South Africa and University of Manchester, UK: Jacky Snoep, Franco du Preez

  7. Social Challenge: Focus Group, SysMO PALs • Show what is there • Suggest what is possible • Ask for requirements • Double check • Transmit • Disseminate • Give requirements • Tell priorities • Rate outcomes • Suggest improvements • Collect answers • (Diagram labels: DB team, Focus Group, Projects)

  8. Technical Challenge • Rapid and incremental development • Driven by the PALs • Just enough and just in time, not just in case • No reinvention • Sustainable and extensible • Migrate to standards • Fitting in with normal lab practices

  9. What do we share • All SysMO Assets: Methods + Models + Data + Results • Protocols for Models: Protocol Title, Authors, Keywords, Description, Assumptions, Equations, Numerical Methods/Algorithms, Computational Tools, Parameter Estimation Techniques, Limitations, References

  10. A Tree View of Assets • ISA (Investigation, Studies, Assay) infrastructure provides a directory structure for experiments, http://isatab.sourceforge.net/ • (Diagram labels: SOP, SOP, SOP, Construction, Validation)
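The Investigation, Study, Assay hierarchy named on this slide can be pictured as a small nested data structure. The following is only a minimal sketch under stated assumptions: the class names, field names and example titles are hypothetical and are not taken from SEEK or from the ISA-Tab tooling.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    title: str
    sops: List[str] = field(default_factory=list)        # protocols followed (hypothetical field)
    data_files: List[str] = field(default_factory=list)  # result files (hypothetical field)


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)


def render_tree(investigation: Investigation) -> List[str]:
    """Return the hierarchy as indented lines, one asset per line."""
    lines = [f"Investigation: {investigation.title}"]
    for study in investigation.studies:
        lines.append(f"  Study: {study.title}")
        for assay in study.assays:
            lines.append(f"    Assay: {assay.title}")
            lines.extend(f"      SOP: {sop}" for sop in assay.sops)
            lines.extend(f"      Data: {name}" for name in assay.data_files)
    return lines


# Hypothetical example content, not taken from any SysMO project.
example = Investigation(
    title="Glucose metabolism in yeast",
    studies=[
        Study(
            title="Model validation",
            assays=[
                Assay(
                    title="Enzyme activity assay",
                    sops=["enzyme_assay_sop.doc"],
                    data_files=["activity_results.xls"],
                )
            ],
        )
    ],
)

print("\n".join(render_tree(example)))
```

Keeping SOPs and data files attached to the assay that produced them is what lets a tree view show how each asset relates to the experiment it belongs to.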

  11. Incentives for sharing • Safe haven for data • Credit and attribution • Help with exporting to public repositories (e.g. One-click export to ArrayExpress, PRIDE etc) • A repository for “supplementary materials” in publications • Linking publications and data • Access other resources through a SEEK gateway

  12. Just Enough Sharing Access Permissions ...we don’t talk about security

  13. Just Enough Sharing • Fetch on Request from project data stores: SysMOLab Wiki, COSMIC Alfresco, MOSES Wiki, another data store • Direct Upload • (Diagram labels: JERM, SOP)

  14. How do we share: “Just Enough Results Model” (JERM) • What type of data is it? Microarray, growth curve, enzyme activity… • What was measured? Gene expression, OD, metabolite concentration… • What do the values in the datasets mean? Units, time series, repeats… • Based on: minimum information models, e.g. MIAME, MIAPE, MIRIAM; biological ontologies, e.g. Gene Ontology, MGED, SBO • BioPortal web service used in SysMO-SEEK for concept lookup and visualisation
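The three JERM questions (what type of data, what was measured, what the values mean) can be read as a small set of required metadata fields. The sketch below illustrates that reading only. The field names and the `missing_fields` helper are hypothetical and do not reflect the actual JERM schema or any SEEK API.

```python
# Hypothetical minimal field set, one per question on the slide.
REQUIRED_FIELDS = ("data_type", "measured_quantity", "units")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in a metadata record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# A record that answers all three questions.
complete = {
    "data_type": "growth curve",
    "measured_quantity": "OD",
    "units": "absorbance at 600 nm",
    "repeats": 3,  # extra fields are allowed; only the minimum is checked
}

# A record that says what kind of data it is but not what the values mean.
incomplete = {"data_type": "microarray", "units": "  "}

print(missing_fields(complete))
print(missing_fields(incomplete))
```

Checking only a short required set, and ignoring anything extra, matches the "just enough" idea: a dataset is usable once the minimum is present, and richer annotation can follow later.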

  15. How do we share • Share JERM templates developed by SysMO-DB, PALs and consortium • Spreadsheet templates • Database Schemas • Encourage uptake throughout SysMO • transcriptomics • metabolomics • proteomics etc….

  16. RightField: Annotation by Stealth

  17. Identifying Biological Objects • What do you have in your data? Proteins/enzymes, genes/expression levels, metabolites • Where/how do these objects interact? Pathways, flux, experimental conditions • What models describe these interactions? • Possible when using common frameworks, naming schemes and controlled vocabularies

  18. Following Standards • We recommend formats but we do not enforce them • Protocols and SOPs: Nature Protocols • Data: JERM models and community minimum information models • Models: SBML and related standards • Publications: PubMed and DOI • If you follow the prescribed formats, you get more out, but if you don’t, you can still participate • Lowering the adoption barrier

  19. SEEK, the eLaboratory • A dynamic resource for analysis as well as browsing • Automatic comparison of data from inside files • Understanding where and how data and models are linked • Running simulations with new experimental data • Running analyses and workflows over the data and models

  20. Workflows from myExperiment • Data preparation, annotation and analysis • Systems Biology workflow Pack on myExperiment • Microarray analysis and text mining: created by Afsaneh Maleki-Dizaji from SUMO, University of Sheffield, based on previous work by Paul Fisher, University of Manchester, http://www.myexperiment.org/workflows/187

  21. SEEK as a data analysis and meta analysis service • SBML model construction and population • Calibration workflow • Data requirements • Parameterised SBML model • Experimental data • Metabolite concentrations from key results database • Calibration by COPASI web service (Peter Li)
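Calibration here means adjusting model parameters until simulated values match measured metabolite concentrations. The sketch below illustrates that idea only. It is not the COPASI web service or its API. It assumes a hypothetical one-parameter model, first-order decay S(t) = S0·exp(−k·t), and fits k by a golden-section search on the sum of squared errors. All names and numbers are illustrative.

```python
import math


def simulate(k, s0, times):
    """Concentration at each time point for first-order decay (assumed toy model)."""
    return [s0 * math.exp(-k * t) for t in times]


def sum_squared_error(k, s0, times, observed):
    """Sum of squared differences between simulated and observed concentrations."""
    return sum((m - o) ** 2 for m, o in zip(simulate(k, s0, times), observed))


def calibrate(s0, times, observed, lo=0.0, hi=5.0, tol=1e-8):
    """Golden-section search for the rate constant k in [lo, hi] that minimises the error.

    Assumes the error has a single minimum on the interval.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sum_squared_error(c, s0, times, observed) < sum_squared_error(d, s0, times, observed):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2


# Synthetic "experimental" data generated from a known rate constant,
# so the fit can be checked against it.
times = [0.0, 1.0, 2.0, 4.0, 8.0]
true_k = 0.3
observed = simulate(true_k, 10.0, times)

fitted_k = calibrate(10.0, times, observed)
print(f"fitted k = {fitted_k:.4f}")
```

A real calibration would load the parameterised SBML model and the measured concentrations from their stores and would usually fit several parameters at once. The single-parameter search is used here only to keep the example self-contained.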

  22. Data analysis and meta analysis • SEEK Analysis Service with pre-cooked analysis tools • Calibration workflow • Data requirements • Parameterised SBML model • Experimental data • Metabolite concentrations from key results database • Calibration by COPASI web service (Peter Li) • (Screenshot labels: Load model, Load data, GO)

  23. Why it works for us • A solution that fits in with current practices • Start simple, show benefits, add more • Engage with the people actually doing the work • PhD students, post-docs • Build to the PALs’ requirements • Respect publication cycles • Respect cultural differences • Scientists stay in control

  24. SysMO Methods Spreading • Virtual Liver • Mueller, via HITS • Lungsys • SBCancer • EraSysBio+ • Eukaryotic organisms • Interactions between host and pathogen • Human disease • Multi scale modelling

  25. Acknowledgements • SysMO-DB Team • SysMO-PALs • myGrid, HITS and JWS Online • EMBL-EBI, MCISB • http://www.sysmo-db.org
