Taverna Workflows for Systems Biology
Katy Wolstencroft, School of Computer Science, University of Manchester
What is a Taverna Workflow?
• Workflow management system
• Sophisticated analysis pipelines
• A set of services to analyse or manage data (either local or remote)
• Data flow through services
• Control of service invocation
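
The idea of data flowing through a set of services, with the workflow controlling when each service is invoked, can be sketched in a few lines of Python. This is an illustration only and not Taverna code: the two services, the port names and the `run_workflow` helper are all hypothetical, and a real workflow would typically call remote web services instead of local functions.

```python
from typing import Callable, Dict, List, Tuple

# A "service" maps named input ports to named output ports.
Ports = Dict[str, object]
Service = Callable[[Ports], Ports]


def reverse_complement(inputs: Ports) -> Ports:
    """Hypothetical local service: reverse-complement a DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    seq = str(inputs["sequence"])
    return {"revcomp": "".join(pairs[base] for base in reversed(seq))}


def gc_content(inputs: Ports) -> Ports:
    """Hypothetical local service: fraction of G and C bases."""
    seq = str(inputs["revcomp"])
    return {"gc": (seq.count("G") + seq.count("C")) / len(seq)}


def run_workflow(steps: List[Tuple[Service, List[str]]], inputs: Ports) -> Ports:
    """Invoke each service in order, feeding it the ports it asks for."""
    data = dict(inputs)
    for service, wanted_ports in steps:
        outputs = service({port: data[port] for port in wanted_ports})
        data.update(outputs)  # outputs become available to later services
    return data


# The workflow is just the ordered list of services and the ports linking them.
WORKFLOW = [(reverse_complement, ["sequence"]), (gc_content, ["revcomp"])]
result = run_workflow(WORKFLOW, {"sequence": "ATGC"})
```

Here `result` holds the workflow input together with every intermediate and final output, which is roughly what a workflow engine records when it runs a pipeline.
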
Taverna Workflows
• Interoperability, Integration and Collaboration
• Access to distributed and local resources
• Iteration over data sets
• Automation of data flow
• Agile methods development
• Extensible
• Experimental protocols
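
Iteration over data sets can be pictured as follows: when a service that expects single values is handed lists, it is invoked once per combination of items. The sketch below assumes two ways of combining lists, an all-against-all "cross" product and a pairwise "dot" product. The `iterate` helper is hypothetical, and it returns a flat list for simplicity, which may differ from how a workflow engine nests its results.

```python
from itertools import product


def iterate(service, *inputs, strategy="cross"):
    """Call `service` once per combination of input items (illustrative only)."""
    # Treat a single value as a one-item list so every input can be iterated.
    lists = [value if isinstance(value, list) else [value] for value in inputs]
    if strategy == "cross":
        combinations = product(*lists)  # every item against every other item
    elif strategy == "dot":
        if len({len(items) for items in lists}) != 1:
            raise ValueError("dot product needs lists of equal length")
        combinations = zip(*lists)  # first with first, second with second
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [service(*args) for args in combinations]


def label(sample, probe):
    """Hypothetical service that works on one sample and one probe."""
    return f"{sample}:{probe}"


all_pairs = iterate(label, ["s1", "s2"], ["p1", "p2"], strategy="cross")
matched = iterate(label, ["s1", "s2"], ["p1", "p2"], strategy="dot")
```

The service itself contains no loop; the iteration is applied around it, which is why the same service can be reused on one item or on a whole data set.
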
Workflows are ideal for…
• High throughput analysis: transcriptomics, proteomics, Next Gen sequencing, etc.
• Data integration, data interoperation
• Data management
• Model construction
• Data format manipulation
• Systems Biology
Taverna Workbench
• Workflow engine to run workflows
• Web Services, e.g. KEGG
• Scripts, e.g. beanshell, R
• Programming libraries, e.g. libSBML
• List of services
• Construct and visualise workflows
Taverna Workbench
• Freely available, open source
• Current version: 2.2
• 70,000+ downloads across versions
• Part of the myGrid Toolkit
• Hull D, Wolstencroft K, Stevens R, Goble C, Pocock MR, Li P, Oinn T. Taverna: a tool for building and running workflows of services. Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W729-32.
myGrid Open Suite of Tools
• Workflow Repository
• Client User Interfaces
• Workflow GUI Workbench
• Third Party Tools
• Service Catalogue
• Provenance Store
• Workflow Server
• Web Portal
• Activity and Service Plug-in Manager
• Open Provenance Model
• Programming and APIs
• Secure Service Access
High Throughput Experiments
• Using gene-expression patterns associated with two lymphoma types to predict the type of an unknown sample
• Escherichia coli: From cDNA Microarray Raw Data to Pathways and Published Abstracts
• Wei Tan, Univ. Chicago, CABIG
• Identify differentially expressed genes using t-test with R
• Peter Li, MCISB
• SysMO SUMO: Systems Understanding of Microbial Oxygen responses
• Afsaneh Maleki-Dizaji, University of Sheffield
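
As a rough illustration of what "identify differentially expressed genes using t-test" involves, the sketch below ranks genes by the size of a two-sample t statistic. It is written in Python with the standard library rather than R, it uses Welch's unequal-variance form (the slide does not say which variant the workflow uses), and the gene names and expression values are made up purely for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of expression values.

    Assumes each group has at least two values and non-zero variance.
    """
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


def rank_genes(expression):
    """Order genes from most to least differentially expressed by |t|."""
    scores = {gene: welch_t(a, b) for gene, (a, b) in expression.items()}
    return sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)


# Made-up values: (condition A replicates, condition B replicates) per gene.
EXAMPLE = {
    "gene_1": ([2.0, 3.0, 4.0], [1.0, 2.0, 3.0]),
    "gene_2": ([9.0, 10.0, 11.0], [1.0, 2.0, 3.0]),
}
ranking = rank_genes(EXAMPLE)
```

A real analysis would also convert each statistic to a p-value and correct for multiple testing, which this sketch leaves out.
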
Workflows for Model Building
• Results from experiments in systems biology -> related to mathematical models in SBML
• Workflows can link data and models
• Workflows can create models
• SBML: location of components, species, reactions
Model construction workflow
• Input: list of ORFs
• 1. Get reaction info
• 2. Create compartments
• 3. Create species
• Get annotations
• 4. Create reactions
• Output: SBML file
Peter Li et al, MCISB, myGrid
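
The numbered steps above can be sketched as a single function that goes from a list of ORFs to an SBML-style document. This is a minimal sketch under several assumptions: it uses Python's standard `xml.etree.ElementTree` instead of libSBML, the `REACTION_INFO` table is a hypothetical stand-in for the lookup in step 1, the ORF and species identifiers are placeholders, the "Get annotations" step is left out, and the output is not checked against the SBML specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical reaction info per ORF; the real workflow retrieves this in step 1.
REACTION_INFO = {
    "ORF_A": {"compartment": "cytosol", "substrates": ["S1"], "products": ["S2"]},
    "ORF_B": {"compartment": "cytosol", "substrates": ["S2"], "products": ["S3"]},
}


def build_sbml(orfs):
    """Return an SBML-style XML string for the given ORFs (illustrative only)."""
    # 1. Get reaction info for each ORF.
    info = {orf: REACTION_INFO[orf] for orf in orfs}

    sbml = ET.Element(
        "sbml",
        {"xmlns": "http://www.sbml.org/sbml/level2/version4",
         "level": "2", "version": "4"},
    )
    model = ET.SubElement(sbml, "model", {"id": "sketch_model"})

    # 2. Create one compartment per distinct location.
    compartments = ET.SubElement(model, "listOfCompartments")
    for name in sorted({entry["compartment"] for entry in info.values()}):
        ET.SubElement(compartments, "compartment", {"id": name})

    # 3. Create each species once, in the compartment where it first appears.
    located = {}
    for entry in info.values():
        for species_id in entry["substrates"] + entry["products"]:
            located.setdefault(species_id, entry["compartment"])
    species_list = ET.SubElement(model, "listOfSpecies")
    for species_id, compartment in located.items():
        ET.SubElement(species_list, "species",
                      {"id": species_id, "compartment": compartment})

    # 4. Create one reaction per ORF, linking substrates to products.
    reactions = ET.SubElement(model, "listOfReactions")
    for orf, entry in info.items():
        reaction = ET.SubElement(reactions, "reaction", {"id": f"R_{orf}"})
        reactants = ET.SubElement(reaction, "listOfReactants")
        for species_id in entry["substrates"]:
            ET.SubElement(reactants, "speciesReference", {"species": species_id})
        products = ET.SubElement(reaction, "listOfProducts")
        for species_id in entry["products"]:
            ET.SubElement(products, "speciesReference", {"species": species_id})

    # Output: the SBML document as text.
    return ET.tostring(sbml, encoding="unicode")


sbml_text = build_sbml(["ORF_A", "ORF_B"])
```

Each numbered comment corresponds to one step on the slide, so in a workflow each of them would be a separate service with the model passed along between them.
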
• Integrating libSBML into Taverna
Peter Li et al, MCISB, myGrid
Workflows for Data Integration
Mapping transcriptomic data onto SBML models
• Read enzyme names from SBML
• Query maxd database using enzyme names
• Calculate colours based on gene expression level
• Create new SBML model with new colour nodes
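
The colour calculation in the third step could look something like the sketch below, which maps each enzyme's expression level onto a blue-to-red scale. The colour scheme, the function names and the example values are assumptions made for illustration; the slide does not say how the workflow computes colours, and the dictionary stands in for the result of the maxd database query.

```python
def expression_to_colour(level, low, high):
    """Map an expression level to a hex colour from blue (low) to red (high)."""
    if high == low:
        fraction = 0.5  # no spread in the data, so use the midpoint colour
    else:
        # Clamp so values outside [low, high] still give a valid colour.
        fraction = min(1.0, max(0.0, (level - low) / (high - low)))
    red = round(255 * fraction)
    blue = 255 - red
    return f"#{red:02x}00{blue:02x}"


def colour_enzymes(enzyme_names, expression_levels):
    """Return a colour for each enzyme that has an expression value."""
    known = [name for name in enzyme_names if name in expression_levels]
    values = [expression_levels[name] for name in known]
    low, high = min(values), max(values)
    return {
        name: expression_to_colour(expression_levels[name], low, high)
        for name in known
    }


# Made-up expression levels standing in for the database query result.
colours = colour_enzymes(
    ["enzyme_1", "enzyme_2", "enzyme_3"],
    {"enzyme_1": 0.0, "enzyme_2": 10.0},
)
```

Enzymes without expression data are simply left out of the result here; the final step would then write the colours for the remaining enzymes into the new SBML model.
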
Reuse, Recycle, Repurpose Workflows
• SUMO, HUMAN
• Microarray CEL file to candidate pathways
• From cDNA Microarray Raw Data to Pathways and Published Abstracts
Reuse, Recycle, Replay Workflows
• Workflows through web interface
• Metware: Workflows for metabolomics, Netherlands/Germany
• Steffen Neumann, Leibniz Institute of Plant Biochemistry
Workflows in e-Laboratories
SysMO SEEK
• e-Laboratory for interlinking and sharing data, models, SOPs and workflows for Systems Biology in Europe
• Workflows for data analysis
Summary
• Informatics in Systems Biology relies on data integration and large-scale data analysis
• Taverna workflows are a mechanism for linking together resources and analyses
• myExperiment allows you to reuse workflows and benefit from others' work
More information
• Taverna: http://www.taverna.org.uk
• myExperiment: http://www.myexperiment.org, http://wiki.myexperiment.org
• BioCatalogue: http://www.biocatalogue.org
• SysMO-SEEK: http://www.sysmo-db.org