Genome Annotation of the Marine Bacterium Cellulophaga lytica Joanna Klein, Ph.D. Northwestern Scholarship Symposium May 4, 2012
What are Bacteria? • Single-celled microorganisms • Friend or Foe? • Friend: health, environment, industry • Foe: cause a variety of infectious diseases
Cellulophaga lytica • Marine bacterium • Isolated from beach mud near Limon, Costa Rica in 1969 http://travel.yahoo.com/p-travelguide-482616-limon_vacations-i http://www.infoplease.com/atlas/centralamerica.html
Cellulophaga lytica • Gram negative • Filamentous • Yellow pigmentation • Exhibits gliding motility
Cellulophaga lytica • Member of the Cytophaga-Flavobacterium-Bacteroides (CFB) group of bacteria • Poorly characterized branch
Phylogenetic tree of Bacteria • Proteobacteria • E. coli, Salmonella, Bordetella, Helicobacter, Vibrio • Firmicutes • Staphylococcus, Streptococcus, Lactobacillus, Clostridium
Cellulophaga lytica • Target organism in the Genomic Encyclopedia of Bacteria and Archaea (GEBA) Research Program of the Department of Energy/Joint Genome Institute • GEBA organisms • 100 representative organisms from each of the branches • Organisms with potential energy applications
Biofuel production • C. lytica produces a variety of enzymes that may have applications in biotechnology and biofuel production
Deconstruction by C. lytica • C. lytica contains many polysaccharide-degrading enzymes • Polysaccharides • Large molecules that store energy or provide structure • Carbohydrates/starches • Cellulose in plant cell walls • Enzymes break down polysaccharides into simple sugars that can be fermented to produce energy • Polysaccharide-degrading enzymes • 3 cellulases • 3 fucoidases • 1 xylosidase
Ethanol production • Ethanol produced as a byproduct of starch degradation and subsequent fermentation • Well developed technology • Enzymes digest starch into simple sugars which are readily fermented by known microorganisms to produce ethanol • Issues…
Cellulosic ethanol production • Goal is to use the cellulose biomass found in plant cell walls of leaves and wood to produce ethanol • Problems to overcome: • Lignin, also found in the cell wall, hinders digestion of cellulose from wood • Enzymes that digest cellulose into simple sugars are poorly understood • Organisms that ferment these simple sugars to produce ethanol are poorly understood • Can C. lytica help achieve this goal?
Why study C. lytica? • Model organism to understand the CFB group better • Contribute to biofuel research and applications
Genome Annotation of Cellulophaga lytica • One way to understand more about the life processes of C. lytica is through a study of its genome. • Genome • All of the genetic material, DNA, of an organism • DNA is made up of 4 smaller molecules known as the bases A, C, G & T
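Because a genome is ultimately a long string over the four bases, even simple questions about it can be answered by counting. The following is a minimal sketch, not part of the original presentation: the function names `base_composition` and `gc_content` are hypothetical, and the code assumes the sequence is given as a plain string.

```python
from collections import Counter


def base_composition(dna: str) -> dict:
    """Return the fraction of each base (A, C, G, T) in a DNA string.

    Characters other than A, C, G, T (for example N for an unknown
    base) are ignored when computing the fractions.
    """
    counts = Counter(dna.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: counts[base] / total for base in "ACGT"}


def gc_content(dna: str) -> float:
    """Return the combined fraction of G and C bases."""
    composition = base_composition(dna)
    return composition["G"] + composition["C"]
```

For example, `gc_content("ATGC")` counts one G and one C among four bases.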
Sequencing genomes • We can easily determine the entire DNA sequence of an organism: its genome. • DNA sequencing technology has developed rapidly since the Human Genome Project, completed in 2003 • Took 13 years to complete, involved hundreds of researchers around the globe, and cost a total of $2.7 billion • Entire 3 billion base-pair sequence is available in a public database
Genome projects • Currently, there are more than 3000 complete or nearly complete genome sequences of microbes available. • Over 1200 genome sequencing projects in higher organisms (plants, animals, fungi, protists) • The complete genome of Cellulophaga lytica was sequenced by the DOE and published in 2011 • 3,765,936 bases
Computer annotation of C. lytica • Number of genes and predicted function of each gene product.
TATCAAAGAGATGATTGAGAACTGGTACGGAGGGAGTCGAGCCGGGCTCACTTAAGGGCTACGACTTAAC GGGCCGCGTCACTCAATGGCGCGGACACGCCTCTTTGCCCGGGCAGAGGCATGTACAGCGCATGCCCACA ACGGCGGAGGCCGCCGGGTTCCCTGACGTGCCAGTCAGGCCTTCTCCTTTTCCGCAGACCGTGTGTTTCT TTACCGCTCTCCCCCGAGACCTTTTAAGGGTTGTTTGGAGTGTAAGTGGAGGAATATACGTAGTGTTGTC TTAATGGTACCGTTAACTAAGTAAGGAAGCCACTTAATTTAAAATTATGTATGCAGAACATGCGAAGTTA AAAGATGTATAAAAGCTTAAGATGGGGAGAAAAACCTTTTTTCAGAGGGTACTGTGTTACTGTTTTCTTG CTTTTCATTCATTCCAGAAATCATCTGTTCACATCCAAAGGCACAATTCATTTTGAGTTTCTTTCAAAAC AAATCGTTTGTAGTTTTAGGACAGGCTGATGCACTTTGGGCTTGACTTCTGATTACCCTATTGTTAAATT AGTGACCCCTCTTAGTGTTTTCCTGTCCTTTATTTCGGAGGACGCACTTCGAAGATACCAGATTTTATGG GTCATCCTTGGATTTTGAAGCTTATAACTGTGACAAAAAATGTGAAGGGAAGAGATTTGAAACATGTGGA AGGAAAAGTGAGTGCAGACTATAAACTTCCAAAAAGACAAGCCCAAAATACACCTAAACGTTATGTCAGA TTATTTTGTTAAAATCAGTTGTTAGTGACGTCCGTACGTTAATAGAAAAAAGAATGCTTCAGTTTGGAGT GGTAGGTTTCTAGAGGGATTTATTGTGAAAGTATAAACTATTCAGGGCAATGGGACTGAGAGAACAGTGG GTAGAAAGGACCACTGAAGGAAAGGAAGAGAATTGGAAGGTAGATGAAAGAAGGAGCAAGAACCTGGGGTGTTTTTTCCTTTTCACTTGTAATAGTAGTAACAGAAGCAATGGCAGACTGGCTTTTGTTTCTACTGTGT TAGAATGAATTGACAGGACAACTGGGCCTATTATTGTACTGTGCCAGAATACTGTAAAACAAAACTAAAC ATACTAGCTTGGTGGCTTGTAATTAATTACTTAAGTGGAGATTTTTATTTTTTTTTTATTTTTTTTTTAG ACGGAGTCTCACTTTGTCACCCAGGCTGGAGTGCAGTGGCGCGATCTCAGCTGACTGCAACCTCCTCCTC Cellulase
Process of annotation • Automatic annotation - done automatically using computer software • 35% of computer-generated annotations are wrong or are missing information due to limitations of computer algorithms • Manual annotation - humans analyze the information generated by computers and make corrections as necessary. • Labor-intensive and time-consuming • Solution: Train students to participate in the process
IMG-ACT is a toolkit of online gene and genome analysis programs. • Using IMG-ACT, students annotate genomes • provide human expertise necessary for accurate, up-to-date, reliable annotation • Students contribute to the scientific community and learn biological concepts through participating in original research
JGI Genome Annotation Workshop Walnut Creek, CA January 2011
Genome annotation of C. lytica at NWC • 39 NWC students have participated in this research endeavor • Science Research Institute, Summer 2011 • Genetics, Fall 2011 • Microbiology, Spring 2012 • 15 genes have been fully annotated • 10 genes have been partially annotated
Restriction endonuclease type I • What is the amino acid sequence of the protein encoded by this gene? • Used Integrated Microbial Genomes (IMG) database • Amy Knight and Allison Lothe
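The question on this slide, going from a gene to the amino acid sequence it encodes, comes down to reading the DNA three bases at a time and looking each codon up in the genetic code. The sketch below illustrates that step only; it is not how the IMG database works internally. It assumes the standard codon assignments, and `translate` and `CODON_TABLE` are hypothetical names chosen for illustration.

```python
# Standard genetic code, written in the conventional T, C, A, G order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons starting with T
    "LLLLPPPPHHQQRRRR"  # codons starting with C
    "IIIMTTTTNNKKSSRR"  # codons starting with A
    "VVVVAAAADDEEGGGG"  # codons starting with G
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, stopping at the first stop codon.

    Reading starts at the first base. Codons containing characters
    other than A, C, G, T are written as "X"; trailing bases that do
    not fill a whole codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

Calling `translate("ATGGCCTAA")` reads the codons ATG, GCC and TAA in turn and stops at TAA.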
DNA topoisomerase III • How does this protein compare to the sequence of other proteins? • Used BLAST program • Libby Nelson and Chelsey Fiecke
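One number commonly reported when comparing proteins is percent identity: the share of aligned positions where both sequences have the same residue. The sketch below computes it for two sequences that have already been aligned. It is not the BLAST algorithm, which also has to find the alignment; the name `percent_identity` and the use of "-" for gaps are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped aligned positions with identical residues.

    Both inputs must be rows of the same alignment, so they must have
    equal length. Positions where either row has a gap ("-") are
    skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)
```

Other definitions divide by the full alignment length or by the length of the shorter sequence, so values from different tools are not always directly comparable.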
RNA polymerase sigma subunit 24 • What are key functional amino acid residues in the protein? • Used WebLogo program • Silas Baalke and Laura Torgerson
DNA Replication Protein A • What enzymatic pathway is the protein involved in? • Used KEGG Pathway database • Marie Abeler and Gabe Jefferson
β-galactosidase • What pathway is this enzyme found in? • Used KEGG database • Daniel Plack, Michael Lowry
Prolyl-tRNA synthetase • What is the 3D structure of similar proteins? • Used Protein Data Bank (PDB) • Sarah Ivanca and Victoria Hanson
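A sequence logo highlights conserved residues by giving each alignment column a height equal to its information content in bits: the maximum possible entropy minus the observed entropy of the residues in that column. The following is a rough sketch of that calculation, not the WebLogo implementation. It leaves out any small-sample correction, and `column_information` and `logo_heights` are hypothetical names.

```python
import math
from collections import Counter


def column_information(column, alphabet_size: int = 20) -> float:
    """Information content (bits) of one alignment column.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    residues observed in the column. Gap characters ("-") are ignored.
    The default alphabet size of 20 is for proteins; use 4 for DNA.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    n = len(residues)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(aligned_seqs, alphabet_size: int = 20):
    """Information content of every column in a list of aligned sequences."""
    return [column_information(col, alphabet_size) for col in zip(*aligned_seqs)]
```

Columns where every sequence carries the same residue reach the maximum height, which is why strongly conserved positions stand out as candidates for functionally important residues.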
NusA, B, G anti-termination factors • Where is the gene in relation to other genes? • Used Gene neighborhood feature of IMG • Matt Takata and Zach Fredman
RNase H • What reaction does the enzyme catalyze? • Used MetaCyc database • Chelsey Fiecke
Elongation Factor Ts • How closely related is this protein to proteins in other bacteria? • Used Phylogeny.fr program • Ellen Chae, Holly Tomaz
Cytochrome C oxidase subunit 3 • Where is this protein located in the cell? • Used TMHMM algorithm • Alannah Pratt, Michael Lowry, SRI high school students
MutS • Are there paralogs of this gene? • Used IMG database query • Ryan Bradbury and Luke Delain
RNA polymerase sigma-70 factor • Was this gene named properly? • Multiple lines of evidence used to change name to RNA Polymerase anti-sigma 70 factor • Camaren Terrill and Ben Sorenson
Future work • 3,348 genes left to annotate! • Special interest in: • Polysaccharide degrading enzymes • Motility proteins • Proteins with unknown function • Study the function of interesting genes in the lab
Acknowledgements • NWC students who have participated in this research. • Genetics, Microbiology and SRI courses • Research students Steven Erickson and Andy Jaeger • Northwestern College for providing the opportunity and support for the sabbatical during which this project was initiated. • Additional funding received from a 2010-2011 Faculty Development Grant