Join us for a workshop on functional modeling of biological data, covering topics such as gene annotation, pathway analysis, and network analysis. Learn how to add functional annotation to your own data and gain insights into gene functions and biological processes.
1. Introduction & Workshop Goals COST Functional Modeling Workshop 22-24 April, Helsinki
1. Goals of the Workshop • Enable participants to functionally model their own data. • Different data types – arrays, RNA-Seq, proteomics, etc. • GO enrichment, pathways, network analysis. • Add additional functional annotation as required? • Provide ongoing support to assist with functional modeling. • Add/supplement existing functional annotation • New tools for analysis • Advice, help with biocomputing needs • Request feedback from participants about what resources are required to support their research.
2. Resources for the Workshop • All workshop materials are available online at AgBase • Presentations • Examples • Links and references http://www.agbase.msstate.edu/Education/COSTWorkshop
Additional Resources! • ELIXIR http://www.elixir-europe.org/ • Galaxy http://galaxyproject.org/ • iPlant http://www.iplantcollaborative.org/
3. Why Functional Modeling? • The typical output from a functional genomics experiment is a list of (differentially) expressed genes/RNAs/proteins. • This list does not represent our current understanding of the system being studied. • Functional modeling is the process we use to view our gene list as: • Biological processes • Pathways • Molecular interactions • Subcellular localization • Examining data in the context of what we know about gene function enables us to group similar products and visualize how the function relates to the phenotypes we observe. data ≠ knowledge
Functional Modeling • Instead of looking at individual genes/gene products, we look at biological functions. • pathways, physiological processes, common molecular functions • moving from molecular back to phenotype • Functional modeling requires: • Data – annotation data (what do the gene products do?) • Tools – software for doing data analysis using annotation data • Biological knowledge – the computer can do the analysis, but the biologist’s brain must provide context & integration of results.
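To make the "Tools" requirement concrete, here is a minimal sketch of one common analysis, an over-representation (enrichment) test based on the hypergeometric distribution. It is an illustration only, not the workshop's own software: the function names, gene names, and term labels are all hypothetical, the background is assumed to be every annotated gene, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """Probability of seeing at least k annotated genes in a list of n genes,
    when K of the N background genes carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def term_enrichment(gene_list, annotations):
    """Return a p-value per term.

    annotations maps gene -> set of term labels; its keys are taken as the
    background (an assumption made for this sketch)."""
    background = set(annotations)
    study = set(gene_list) & background
    N, n = len(background), len(study)
    terms = set().union(*annotations.values()) if annotations else set()
    results = {}
    for term in terms:
        K = sum(1 for gene in background if term in annotations[gene])
        k = sum(1 for gene in study if term in annotations[gene])
        results[term] = hypergeom_tail(k, n, K, N)
    return results


# Hypothetical toy data: four genes, two made-up terms.
toy_annotations = {
    "gene1": {"TERM_A"},
    "gene2": {"TERM_A"},
    "gene3": {"TERM_B"},
    "gene4": {"TERM_B"},
}
toy_results = term_enrichment(["gene1", "gene2"], toy_annotations)
```

A small p-value for a term suggests the gene list contains more genes with that annotation than expected by chance; interpreting what that means for the system under study is still the biologist's job.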
Genome sequence by itself does not explain what is happening in biological systems.
Annotation • ANNOTATE: to denote or demarcate • Genome annotation is the process of attaching biological information to genomic sequences. It consists of two main steps: • identifying functional elements in the genome: “structural annotation” • attaching biological information to these elements: “functional annotation” Annotation needs to be distributed amongst databases and resources – need to be able to share this data.
Genomic Annotation • structural annotation • functional annotation
Genomic Annotation • Structural annotation, including Sequence Ontology • Other annotations using other bio-ontologies, e.g. Anatomy Ontology • Nomenclature (species’ genome nomenclature committees) • Functional annotation using Gene Ontology
Identifying functional elements in the genome does not explain what is happening in biological systems.
A list of genes/RNAs/proteins does not explain what is happening in biological systems.
Genomic Annotation • Genome annotation is the process of attaching biological information to genomic sequences. It consists of two key steps: • identifying functional elements in the genome: “structural annotation” • attaching biological information to these elements: “functional annotation” • Genomic annotation is distributed across different databases. • Databases: have different file formats, annotation pipelines, etc. • Biologists: don’t care (just want their data…) • Problem: how to exchange/share genomic annotations from different databases?
How does annotation add value for biologists? • Annotation enables biologists to move from data to knowledge. • Identify (and include) all genetic components in our analysis (structural annotation) • Model what is happening in our system (functional annotation) • identify the important elements/processes • predict what happens when we make a change to the system • prevent disease, improve agriculture…
Bio-ontologies • Ontologies first used in biology to enable databases to share & exchange data. • Bio-ontologies are used to capture biological information in a way that can be read by both humans and computers • annotate data in a consistent way • allows data sharing across databases • These same features allow computational analysis of high-throughput “omics” datasets using ontologies.
What Are Ontologies? “An ontology is a controlled vocabulary of well-defined terms with specified relationships between those terms, capable of interpretation by both humans and computers.” • Bio-ontologies are used to capture biological information in a way that can be read by both humans and computers • annotate data in a consistent way • allows data sharing across databases • allows computational analysis of high-throughput “omics” datasets • Objects in an ontology (e.g. genes, cell types, tissue types, stages of development) are well defined. • The ontology shows how the objects relate to each other.
Gene Function & Ontologies • Many different ways to describe gene function used by different databases… • How do we standardize this (data sharing, comparative biology)? • Use of ontologies in biology to promote data sharing and analysis.
Ontologies • relationships between terms • digital identifier (computers) • description (humans)
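The three ingredients on this slide (an identifier for computers, a description for humans, and relationships between terms) can be sketched as a small data structure. This is a minimal illustration under stated assumptions: the `Term` class, the `ancestors` helper, and the `EX:` identifiers are hypothetical and are not taken from any real ontology, and only a single "is_a" style parent relationship is modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    term_id: str                                  # digital identifier (for computers)
    name: str                                     # description (for humans)
    parents: list = field(default_factory=list)   # ids of more general terms


def ancestors(term_id, ontology):
    """Collect every more general term reachable by following parent links."""
    seen = set()
    stack = list(ontology[term_id].parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ontology[current].parents)
    return seen


# Hypothetical three-term ontology, from general to specific.
example_ontology = {
    "EX:0001": Term("EX:0001", "process"),
    "EX:0002": Term("EX:0002", "cell death", parents=["EX:0001"]),
    "EX:0003": Term("EX:0003", "programmed cell death", parents=["EX:0002"]),
}
example_ancestors = ancestors("EX:0003", example_ontology)
```

Because the relationships are explicit, a gene annotated to the most specific term can also be counted under each of its ancestors, which is what lets software group gene products at different levels of detail.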
How is annotation done? • Annotation is done mostly by biocurators. • Biologists trained to enter data into databases (interpretation) • Biologists – experts! • Biocuration: the activity of organizing, representing and making biological information accessible to both humans and computers. • Biocurators are typically biologists who have cross-trained in bioinformatics.
The Annotation Dilemma 1. If you do it right: • it seems easy to do • data is seamlessly integrated with databases & tools • researchers ignore the curation process 2. If you do it wrong: • it seems easy to do • information is wrong/hard to find • researchers complain about the curation process (adapted from Ewan Birney)
The Annotation Dilemma • Exponential increase in biological data • More important than ever to provide annotation for this data • How to keep up?
A Combined Annotation Strategy • Manual biocuration of experimental data • Many species have a body of published, experimental data • Detailed, species-specific annotation: ‘depth’ • Will give information about the most commonly studied genes. • Computational sequence analysis • Gives ‘breadth’ of coverage across the genome • Annotations are general: serve as a ‘first pass’ prediction of function (to generate testable experimental hypotheses)
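One way to picture the combined strategy is as a merge of two annotation sets in which each term keeps a record of where it came from. The sketch below is an assumption about how such a merge could look, not a description of any database's pipeline: the function name, gene names, and term labels are hypothetical, and the choice to list manual annotations first and drop computational duplicates is made purely for illustration.

```python
def combine_annotations(manual, computational):
    """Merge two gene -> set-of-terms mappings, tagging each term with its source.

    Manually curated terms are listed first; a computational prediction is
    added only if curation has not already supplied the same term."""
    combined = {}
    for gene in sorted(set(manual) | set(computational)):
        curated = set(manual.get(gene, ()))
        predicted = set(computational.get(gene, ())) - curated
        combined[gene] = (
            [(term, "manual") for term in sorted(curated)]
            + [(term, "computational") for term in sorted(predicted)]
        )
    return combined


# Hypothetical example: geneA is well studied, geneB has only predictions.
manual_example = {"geneA": {"TERM_1"}}
computational_example = {"geneA": {"TERM_1", "TERM_2"}, "geneB": {"TERM_3"}}
combined_example = combine_annotations(manual_example, computational_example)
```

In this toy case the curated gene keeps its detailed annotation ("depth") while the uncurated gene still receives a first-pass prediction ("breadth").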
Goals of the Workshop • Enable participants to functionally model their own data. • Provide ongoing support to assist with functional modeling. • Request feedback from participants about what resources are required to support their research.