GEBA
• A genomic encyclopedia of bacteria and archaea
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution: Really Fill in the Tree
Eisen & Ward, PIs
[Figure: rRNA tree of bacterial phyla. Branch labels: TM6, Proteobacteria, Nitrospira, OS-K, Termite Group, OP8, Chlorobi, Marine Group A, WS3, Synergistes, Firmicutes, Deferribacteres, Chrysiogenetes, Fusobacteria, OP9, Actinobacteria, Cyanobacteria, NKB19, Coprothermobacter, OP3, Thermomicrobia, Chlamydia, Spirochaetes, Dictyoglomus, OP10, Thermodesulfobacteria, TM7, Deinococcus-Thermus, Aquificae, OP1, Thermotogae, OP11, Acidobacteria, Bacteroides, Fibrobacteres, Gemmimonas, Verrucomicrobia, Planctomycetes, Chloroflexi]
GEBA Pilot Project Overview
• Identify major branches in the rRNA tree for which no genomes are available
• Identify those with a cultured representative in DSMZ
• DSMZ grew > 200 of these and prepped DNA
• Sequence and finish 100+ (covering the breadth of bacterial/archaeal diversity)
• Annotate, analyze, release data
• Assess benefits of tree-guided sequencing
• 1st paper: Wu et al., Nature, Dec 2009
GEBA Phylogenomic Lessons
* The rRNA Tree of Life is a useful tool for identifying phylogenetically novel genomes
* Phylogeny-driven genome selection helps discover new genetic diversity
* Phylogeny-driven genome selection (and phylogenetics in general) improves genome annotation
* Improves analysis of genome data from uncultured organisms (though not by much)
Organism Selection Method I
MaxPD: Select organisms so that phylogenetic diversity is maximized on a 16S rRNA tree
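The MaxPD idea can be illustrated with a small greedy sketch. This is not the GEBA project's own code: the tree encoding (`parent` and `length` dictionaries), the function name `greedy_max_pd`, and the toy tree are all hypothetical, and phylogenetic diversity is assumed here to mean the total branch length on the paths connecting the chosen leaves to the root.

```python
def greedy_max_pd(parent, length, leaves, k):
    """Greedily pick up to k leaves, each time adding the most new branch length.

    parent: node -> parent node (the root maps to None)
    length: node -> length of the branch joining the node to its parent
    Both structures are hypothetical stand-ins for a real 16S rRNA tree.
    """
    covered = set()   # nodes whose branch to the parent is already counted
    chosen = []
    total = 0.0
    candidates = sorted(leaves)
    for _ in range(min(k, len(candidates))):
        best_leaf, best_gain = None, -1.0
        for leaf in candidates:
            if leaf in chosen:
                continue
            # Sum branch lengths on the path to the root that are not yet covered.
            gain, node = 0.0, leaf
            while node is not None and node not in covered:
                gain += length.get(node, 0.0)
                node = parent[node]
            if gain > best_gain:
                best_leaf, best_gain = leaf, gain
        # Mark the winning leaf's path as covered.
        node = best_leaf
        while node is not None and node not in covered:
            covered.add(node)
            node = parent[node]
        chosen.append(best_leaf)
        total += best_gain
    return chosen, total


# Toy tree: root -> A (1.0); root -> X (0.5); X -> B (0.25); X -> C (0.75)
parent = {"root": None, "A": "root", "X": "root", "B": "X", "C": "X"}
length = {"A": 1.0, "X": 0.5, "B": 0.25, "C": 0.75}
chosen, total = greedy_max_pd(parent, length, ["A", "B", "C"], k=2)
print(chosen, total)
```

On the toy tree, the sketch first takes the leaf on the longest root path and then the leaf that adds the most branch length not already covered, which is the sense in which selection is tree-guided.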
Organism Selection Method II
MCL Clustering: divide the organisms in a phylogenetic group into subgroups

CLUSTER_56 number of sequences=3 genome representatives=0 (10934,10867,237295)
Desulfobotulus sp. str. BG14
Desulfocella halophila str. GSL-But2 DSMZ:DSM11763 TYPE STRAIN
Desulfocella sp. str. DSM 2056 DSMZ:DSM2056

CLUSTER_57 number of sequences=3 genome representatives=1 (10775,10774,71864)
Desulfoarculus sp. str. BG74
Desulfovibrio baarsii str. 2st14
Desulfovibrio baarsii str. DSM 2075 DSMZ:DSM2075 Gi03014 TYPE STRAIN
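Cluster listings like the one above could be screened for subgroups that still lack a sequenced genome. The following is a minimal sketch that assumes the header layout shown on the slide (`CLUSTER_<n> number of sequences=<n> genome representatives=<n> (<ids>)`); the function names and the dictionary fields are hypothetical, and the real output format may differ.

```python
import re

# Pattern for a cluster header line, assuming the layout shown on the slide.
HEADER = re.compile(
    r"CLUSTER_(?P<id>\d+)\s+number of sequences=(?P<nseq>\d+)\s+"
    r"genome representatives=(?P<ngen>\d+)\s+\((?P<ids>[\d,]+)\)"
)


def parse_cluster_headers(text):
    """Return one record per cluster header found in the text."""
    clusters = []
    for m in HEADER.finditer(text):
        clusters.append({
            "cluster": int(m.group("id")),
            "n_sequences": int(m.group("nseq")),
            "n_genomes": int(m.group("ngen")),
            "sequence_ids": [int(x) for x in m.group("ids").split(",")],
        })
    return clusters


def clusters_without_genomes(clusters):
    """Cluster numbers with no genome representative: candidates for sequencing."""
    return [c["cluster"] for c in clusters if c["n_genomes"] == 0]


SAMPLE = (
    "CLUSTER_56 number of sequences=3 genome representatives=0 (10934,10867,237295)\n"
    "CLUSTER_57 number of sequences=3 genome representatives=1 (10775,10774,71864)\n"
)

clusters = parse_cluster_headers(SAMPLE)
print(clusters_without_genomes(clusters))
```

For the two headers from the slide, only CLUSTER_56 has no genome representative, so it would be the one flagged as a sequencing candidate.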