What is Annotation? Microbial Genome Annotation Network Training Workshop (supported by NSF RCN-UBE grant)
Some Basic Definitions
• We all feel comfortable with genes and genomes, but what is annotation?
• Genome annotation is the process of interpreting sequence data using biological information
• IMG-ACT focuses on two specific types of annotation
Types of Annotation: Gene annotation
• Gene product
• Protein structure
• Protein properties
• Functional role
• Phylogenetic origin
Types of Annotation: Pathway/Structure
• Pathway components present
• Alternate pathway steps
• Corrected gene annotation for steps
• Metabolic/physiological capabilities of the microbe
Why Annotate with Students?
• Automated genome annotations are often wrong (35% by the figure given here)
• Automated annotations miss things (especially in GEBA microbes)
• The logic of bioinformatic algorithms illustrates key principles of biology
Why Annotate More Genomes?
Most of what we know comes from a relatively small subset of life’s diversity. Does this subset adequately reflect genomic diversity?
[Figure with panels labeled Bacteria and Archaea. Figure courtesy of Phil Hugenholtz]
GEBA Genomes
Genomic Encyclopedia of Bacteria & Archaea (GEBA) is a massive JGI genome sequencing effort to fill in many of the missing or under-sampled branches of the Bacteria & Archaea trees.
* T.P. Curtis, W.T. Sloan, and J.W. Scannell. 2002. Estimating prokaryotic diversity and its limits. Proc Natl Acad Sci USA 99: 10494-10499.
GEBA Genomes
The first 56 GEBA genomes* filled in several missing or under-sampled branches of the Bacteria & Archaea trees and showed that a lot of genomic diversity remains to be discovered.
* D. Wu, P. Hugenholtz, K. Mavromatis, et al. 2009. A phylogeny-driven genomic encyclopedia of Bacteria and Archaea. Nature 462: 1056-1060.
Genomes to Metagenomes
[Diagram labels: Isolate, Community, sequencing, Genomics, Metagenomics]
Annotation & Course Goals
Develop and strengthen basic scientific research skills such as:
• Reading and evaluating primary literature
• Developing hypotheses and interpreting data
• Drawing conclusions from a collection of evidence
• Working collaboratively
• Maintaining a laboratory notebook
• Working with real data
• Presenting results orally and/or in writing
Annotation & Course Goals
Develop and strengthen genome annotation skills such as:
• Using computer programs to analyze sequence data
• Gathering and evaluating information from Web-based community-accessible sequence databases
• Constructing phylogenetic trees
• Evaluating automated gene calls
• Reconstructing a metabolic pathway
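As a small illustration of "using computer programs to analyze sequence data" and of what an automated gene call starts from, here is a minimal sketch of an open reading frame (ORF) scan. It is an assumption-laden teaching example, not the gene caller used by IMG: the function name `find_orfs` and the `min_codons` parameter are hypothetical, only the forward strand is scanned, and only ATG is treated as a start codon.

```python
# Minimal forward-strand ORF scan (illustrative sketch; not the IMG gene caller).
START = "ATG"                  # assumption: only the canonical start codon
STOPS = {"TAA", "TAG", "TGA"}  # the three standard stop codons


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


print(find_orfs("CCATGAAATTTTAGCC"))  # [(2, 14, 2)]
```

A real gene caller also scans the reverse strand, allows alternative start codons, and weighs signals such as ribosome binding sites, which is part of why automated calls are worth evaluating by hand.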
Annotation & Course Goals
• Produce quality annotations for incorporation into the Integrated Microbial Genomes Database
• Build and strengthen a conceptual understanding of:
  • Evolutionary relationships among genomes
  • Gene-protein-pathway interactions
  • Genome organization
  • Metabolic diversity
  • Operon structure
  • Power and limitations of bioinformatics
  • Protein structure and function
  • Transcriptional and translational signals
IMG-ACT Modular Annotation
• Streamlines annotation
• Emphasizes the biological root of bioinformatics
• More easily compatible with education
• Emphasizes complementarity of tools
• Allows addition and removal of modules to match student level
IMG-ACT Modules
• Basic Information
• Sequence-based Similarity Data
• Cellular Localization
• Alternative Open Reading Frame
• Structure-based Evidence
• Enzymatic Function
• Gene Duplication and Degradation
• Horizontal Gene Transfer
• RNA
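To give a feel for the kind of evidence a module such as Horizontal Gene Transfer can draw on, here is a minimal sketch comparing the GC content of a gene with that of its genome. Unusual GC content is one commonly used hint of foreign origin, but this is only an illustration and not necessarily what the IMG-ACT module does; the function names `gc_content` and `gc_deviation` are hypothetical.

```python
# GC-content comparison (illustrative sketch; not the IMG-ACT module itself).

def gc_content(seq):
    """Fraction of G and C among the unambiguous A/C/G/T bases in seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    # Sequences with no unambiguous bases (empty, or all N) score 0.0.
    return gc / total if total else 0.0


def gc_deviation(gene_seq, genome_seq):
    """Difference between a gene's GC fraction and the genome-wide GC fraction."""
    return gc_content(gene_seq) - gc_content(genome_seq)
```

A large positive or negative deviation is only a prompt for a closer look; it would normally be weighed alongside phylogenetic and sequence-similarity evidence.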
Module Concepts Basic Information
For More Information
IMG-ACT (JGI):
Cheryl Kerfeld (ckerfeld@lbl.gov)
Seth Axen (saxen@lbl.gov)
www.jgi.doe.gov/education/annotation_tools.html
Microbial Genome Annotation Network (NSF RCN-UBE):
Lori Scott, PI (LoriScott@augustana.edu)
mgan.jgi-psf.org