
Sequencing All of Microbial Life: Challenges and Opportunities


Presentation Transcript


  1. Sequencing All of Microbial Life: Challenges and Opportunities. Rob Edwards, Argonne National Laboratory and San Diego State University

  2. How much has been sequenced (chart: number of known sequences by year; annotations: first bacterial genome, 100 bacterial genomes, 1,000 bacterial genomes, environmental sequencing)

  3. How much will be sequenced (chart annotations: everybody in North America; everybody in Toronto; one genome from every species; 100 people; most major microbial environments; all cultured Bacteria)

  4. Rank Abundance Curves, Papers vs Genomes • Microbial publications vs Genomes by Family

  5. 16S Abundance -- Human Intestine

  6. 16S Abundance -- Upland Pasture Soil

  7. Environmental Genomics -- Wisconsin Soil

  8. Line Island Metagenomics Transect

  9. Environmental Genomics -- Whale Fall

  10. There are big gaps in sequence space
      • 6,400 total taxa
      • About 380 are human, animal or plant pathogens
      • 360 complete prokaryotic genomes published
      • 56 archaeal and 940 bacterial genomes in progress
      • ~400 are pathogens
      • Approximately 5,000 prokaryotes not yet in play
      • We estimate about 4,800 non-pathogen taxa
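The "not yet in play" figure on this slide can be roughly reproduced from the other counts. The sketch below is only a sanity check: it assumes each published or in-progress genome corresponds to a distinct taxon, which the slide does not state, and it does not attempt to reproduce the 4,800 non-pathogen estimate.

```python
# Counts taken from the slide.
TOTAL_TAXA = 6400
COMPLETE_GENOMES = 360
ARCHAEAL_IN_PROGRESS = 56
BACTERIAL_IN_PROGRESS = 940

# Assumption (not stated on the slide): every published or in-progress
# genome covers a different taxon, so the counts can simply be subtracted.
in_progress = ARCHAEAL_IN_PROGRESS + BACTERIAL_IN_PROGRESS
not_in_play = TOTAL_TAXA - COMPLETE_GENOMES - in_progress

print(f"Genomes in progress: {in_progress}")
print(f"Taxa not yet in play: {not_in_play}")
```

Under that assumption the remainder comes out close to the slide's "approximately 5,000".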

  11. The Bergey’s Manual David H. Bergey

  12. Strain Distribution in Collections

      Collection                                                    Strains
      US Collections / BRCs
        American Type Culture Collection (ATCC)                        4027
        USDA ARS Collection (NRRL)                                      223
      European Collections
        Deutsche Sammlung von Mikroorganismen (DSMZ)                   1302
        Culture Collection, University of Göteborg (CCUG)               183
        Pasteur Institute (CIP)                                         170
        Laboratory for Microbiology, Gent (LMG)                         101
        National Collection of Industrial and Marine Bacteria            25
        French Collection of Phytopathogens (CFPB)                       15
        National Collection of Type Cultures (NCTC)                      12
        National Collection of Phytopathogenic Bacteria                  11
      Asia
        Japan Collection of Microorganisms (JCM)                        185
        Institute of Fermentation, Osaka (IFO)                           34
        Korean Collection of Type Cultures (KCTC)                        28
        Institute of Applied Microbiology, Tokyo (IAM)                   26
        National Institute of Technology and Evaluation (NBRC)           24
        All-Russian Collection of Microorganisms (VKM)                   13

  13. Estimated Sequencing Rates (pipeline diagram: Target Selection, Type Culture Material, Sequencing, Assembly, Rapid Annotation (24 Hours), Phenotype Microarrays, Metabolic Reconstruction)

  14. Target Selection http://www.sequencingbergeys.org
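The FAQ slide at the end of the talk says that at each cycle genomes are chosen (e.g. via 16S) to minimize the diversity gaps. The slides do not say how this is done. One possible reading is a greedy farthest-first selection: repeatedly pick the candidate whose nearest already-sequenced relative is most distant. The sketch below illustrates that idea only; the function name `select_targets`, the taxa labels and the distances are all hypothetical.

```python
def select_targets(dist, sequenced, candidates, n_picks):
    """Greedy farthest-first selection (an illustrative assumption,
    not the project's documented method).

    dist[a][b] is a pairwise distance, e.g. derived from 16S sequences.
    `sequenced` must contain at least one taxon.
    """
    covered = list(sequenced)
    remaining = set(candidates) - set(covered)
    picks = []
    for _ in range(min(n_picks, len(remaining))):
        # For each candidate, find the distance to its closest covered
        # taxon, then take the candidate for which that gap is largest.
        best = max(
            sorted(remaining),  # sorted for a deterministic tie-break
            key=lambda c: min(dist[c][s] for s in covered),
        )
        picks.append(best)
        covered.append(best)
        remaining.remove(best)
    return picks


# Toy data: four hypothetical taxa placed on a line, distance = gap.
positions = {"A": 0.0, "B": 0.02, "C": 0.10, "D": 0.25}
dist = {a: {b: abs(pa - pb) for b, pb in positions.items()}
        for a, pa in positions.items()}

print(select_targets(dist, ["A"], list(positions), 2))
```

With taxon A already sequenced, the sketch picks D first (the largest gap), then C, and leaves B, which sits right next to A, for later.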

  15. Microbial Idol

  16. Culturing by ATCC
      • >2,000 different media
      • Physical Conditions:
        • Temperature (4° - 120°C)
        • pH (1.0 - 11.0)
        • Salt (0 - 30%)
        • Light (obligate phototrophs)
        • Pressure (few obligate piezophiles)
      • Redox:
        • Strict anaerobes
        • Facultative
        • Microaerobes
        • Aerobes

  17. Phenotyping by Biolog (figure labels: Carbon Pathways; Nitrogen Pathways; P; N; S; Biosynth. Pathways; pH Effects; Osmotic & Ion Effects; Sensitivity to Chemicals)

  18. Sequencing by JGI
      • FY 06: # Instruments: Sanger: 107, 454: 1; 35.4 Gb
      • FY 07: # Instruments: Sanger: 107, 454: 2; 45 Gb goal

  19. Rapid Annotation Using Subsystems Technology http://www.nmpdr.org/anno-server/index48.cgi
      • Automated process consisting of:
        • Gene calling
        • Initial annotation of function
        • Initial metabolic reconstruction
      • Process takes 1-7 hours depending on size and complexity of the genome
      • ~20 genomes per day
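As a rough check, the quoted annotation rate can be compared with the project scale given on the FAQ slide (about 5,000 genomes, up to 2,000 taxa per year at the end). The sketch below assumes the ~20 genomes per day rate is sustained every day of the year, which the slides do not claim.

```python
# Figures from the slides.
GENOMES_PER_DAY = 20        # annotation rate quoted on this slide
PROJECT_GENOMES = 5000      # "About 5000" from the FAQ slide
PEAK_TAXA_PER_YEAR = 2000   # end-of-project DNA production rate (FAQ slide)

# Assumption: the annotation server runs at the quoted rate 365 days a year.
days_to_annotate_all = PROJECT_GENOMES / GENOMES_PER_DAY
peak_daily_demand = PEAK_TAXA_PER_YEAR / 365
headroom = GENOMES_PER_DAY / peak_daily_demand

print(f"Days to annotate all genomes: {days_to_annotate_all:.0f}")
print(f"Peak demand: {peak_daily_demand:.1f} genomes/day")
print(f"Capacity / peak demand: {headroom:.2f}x")
```

On these numbers annotation would not be the bottleneck: the stated rate is several times the peak rate at which new genomes would arrive.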

  20. Evaluation / Viewing

  21. Feedback (cycle diagram: Target Selection, Phenotyping, Sequencing, Metabolic Reconstruction, Annotation)

  22. Status
      • 100 organism pilot - GEBA underway
      • Requesting funding/approval for remainder
      • Target selection about to go live

  23. People
      • MSU: Jim Cole, George Garrity
      • U GA: Barney Whitman
      • UIUC: Gary Olsen
      • ATCC: David Emmerson, Tim Lilburn
      • Biolog: Stacy Montgomery, John Groat
      • JGI: Jim Bristow, Jonathan Eisen, Phil Hugenholtz, Nikos Kyrpides, Paul Richardson, David Bruce
      • ANL: Rick Stevens, Folker Meyer, Ross Overbeek, Veronika Vonstein
      • Hope: Matt DeJongh

  24. Technical Feasibility FAQ
      • How many genomes would the project propose to sequence?
        • About 5000
      • Who would produce the biomass needed for DNA extraction?
        • Type culture centers
      • Will the biomass/DNA be available for distribution?
        • Yes, both the DNA and the libraries could be stored for distribution
      • What throughput is needed for DNA production?
        • ~300 taxa per year at the beginning of the project, rising to 2000 per year at the end
      • What throughput is needed for sequencing?
        • 1.2 Gb/yr to 8 Gb/yr finished sequence
      • What combinations of sequencing technologies need to be employed?
        • Sanger and pyrosequencing initially
      • What throughput is needed for annotation?
        • 24 hour turnaround from assembled sequence to initial availability
      • Is it possible to have a standard set of phenotype assays given the broad spectrum of organisms and conditions?
        • We are considering Biolog as a model, but it is too limited
      • How would the genomes be selected and prioritized?
        • At each cycle we choose genomes (e.g. via 16S) to minimize the diversity gaps
      • Is it necessary to “close” the genomes?
        • We think no.
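The DNA production and sequencing throughput figures in this FAQ can be cross-checked against each other. The sketch below divides finished sequence per year by taxa per year to get the average genome size the plan implies; treating the two figures as directly comparable in this way is an assumption, as is applying that average to all 5,000 genomes.

```python
# Figures from the FAQ slide, expressed in megabases (1 Gb = 1000 Mb).
START_TAXA_PER_YEAR = 300
END_TAXA_PER_YEAR = 2000
START_FINISHED_MB_PER_YEAR = 1200   # 1.2 Gb/yr
END_FINISHED_MB_PER_YEAR = 8000     # 8 Gb/yr
PROJECT_GENOMES = 5000

# Implied average finished genome size at the start and end of the project.
start_mb_per_genome = START_FINISHED_MB_PER_YEAR / START_TAXA_PER_YEAR
end_mb_per_genome = END_FINISHED_MB_PER_YEAR / END_TAXA_PER_YEAR

# Assumption: the same average size holds across the whole project.
total_finished_gb = PROJECT_GENOMES * end_mb_per_genome / 1000

print(f"Implied genome size at start: {start_mb_per_genome} Mb")
print(f"Implied genome size at end:   {end_mb_per_genome} Mb")
print(f"Total finished sequence:      {total_finished_gb} Gb")
```

Both ends of the ramp imply the same average of about 4 Mb per genome, so the two throughput figures are mutually consistent.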
