Kayo Arima

Cyber Metagenomics; Challenge to See The Unseen Majority in The Ocean. Kayo Arima, California Institute for Telecommunications and Information Technology (Calit2), University of California, San Diego Division. Looking Back Nearly 4 Billion Years in the Evolution of Microbe Genomics.

Presentation Transcript


  1. Cyber Metagenomics; Challenge to See The Unseen Majority in The Ocean. Kayo Arima, California Institute for Telecommunications and Information Technology (Calit2), University of California, San Diego Division

  2. Looking Back Nearly 4 Billion Years in the Evolution of Microbe Genomics. Eukaryotes have nuclei. Prokaryotes have genes but no nuclear membrane. Source: Falkowski and Vargas, Science 304 (5667): 58

  3. Evolution is the Principle of Biological Systems: Most of Evolutionary Time Was in the Microbial World. Figure labels: "You Are Here"; "Much of Genome Work Has Occurred in Animals". Source: Carl Woese, et al.

  4. Two completely different approaches to get microbial genomic information. Source: Karin Remington, J. Craig Venter Institute
     Microbial whole genomics: Environmental sample → Culture (grow) in lab → Isolate the colony → Culture the isolated colony → DNA extraction → Enzymatic digestion → Shotgun sequencing → Gene assembly
     Metagenomics: Environmental sample → DNA extraction → Enzymatic digestion → Shotgun sequencing → Scaffold assembly
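The difference between the two routes on this slide can be illustrated with a toy sketch. It is not the pipeline used by any of the projects named here: it only models the shotgun step as random substring sampling, and it skips DNA extraction, digestion, and assembly entirely. The function names (`shotgun_reads`, `whole_genome_reads`, `metagenome_reads`, `random_genome`), the organism names, and all sizes are hypothetical, chosen for illustration.

```python
import random


def shotgun_reads(genome, read_len, n_reads, rng):
    """Sample n_reads random substrings of length read_len from one genome."""
    starts = [rng.randrange(len(genome) - read_len + 1) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]


def whole_genome_reads(isolate_genome, read_len, n_reads, rng):
    """Whole-genome route: the colony was isolated first, so every read
    is known to come from the same organism."""
    return shotgun_reads(isolate_genome, read_len, n_reads, rng)


def metagenome_reads(community, read_len, reads_per_genome, rng):
    """Metagenomic route: DNA from the whole sample is sequenced together,
    so reads are pooled and the organism of origin is not recorded."""
    pooled = []
    for genome in community.values():
        pooled.extend(shotgun_reads(genome, read_len, reads_per_genome, rng))
    rng.shuffle(pooled)
    return pooled


def random_genome(length, rng):
    """Build a made-up DNA string; a stand-in for a real genome."""
    return "".join(rng.choice("ACGT") for _ in range(length))


# Toy community of three hypothetical organisms (made-up sequences).
rng = random.Random(0)
community = {
    name: random_genome(200, rng)
    for name in ("organism_a", "organism_b", "organism_c")
}

isolate = whole_genome_reads(community["organism_a"], 25, 10, rng)
mixed = metagenome_reads(community, 25, 10, rng)
```

In the pooled list, nothing ties a read back to `organism_a`, `organism_b`, or `organism_c`. That is the toy counterpart of the fragmentary, origin-free data that a metagenomic sample yields.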

  5. Down Side of Metagenomics: Often fragmentary; Often highly divergent; Rarely any known activity; No chromosomal placement; No organism of origin; Ab initio ORF predictions; Huge data

  6. Genomic Data Is Growing Rapidly, But Metagenomics Will Vastly Increase The Scale… GenBank: 100 Billion Bases! (www.ncbi.nlm.nih.gov/Genbank). Protein Data Bank: 35,000 Structures (www.rcsb.org/pdb/holdings.html). Total Data < 1 TB

  7. Full Genome Sequencing is Exploding: Most Sequenced Genomes are Bacterial. Chart labels: First Genome 1995; 6 Genomes/Year; 2000; Ongoing Genomes; Completed Genomes; 90 Metagenomes; Total 422; Total 1665. Source: www.genomesonline.org

  8. Marine Metagenomics • Microbes account for more than 90% of ocean biomass, mediate all biochemical cycles in the oceans and are responsible for 98% of primary production in the sea. • Metagenomics is a breakthrough sequencing approach to examine the open-space microbial species without the need for isolation and lab cultivation of individual species.

  9. PI: Larry Smarr. Executive Director: Paul Gilna.

  10. Marine Genome Sequencing Project: Measuring the Genetic Diversity of Ocean Microbes. Sorcerer II data from this area has already reached 10% of GenBank. The Entire Data Will Double Number of Proteins in GenBank!

  11. Sample Metadata from GOS
      • Site Metadata
        • Location (lat/long, water depth)
        • Site characterization (finite list of types plus “other”)
        • Site description (free text)
        • Country
      • Sampling Metadata
        • Sample collection date/time
        • Sampling depth
        • Conditions at time of sampling (e.g., stormy, surface temperature)
        • Sample physical/chemical measurements (T (°C), S (ppt), chl a (mg m⁻³), etc.)
        • “author”
      • Experimental Parameters
        • Filter size
        • Insert size
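The three metadata groups on this slide map naturally onto a small record type. The sketch below is one possible layout, not the schema actually used for GOS. The class names, field names, the `SITE_TYPES` list, and the units for filter size (µm) and insert size (kb) are all assumptions, and the example values are made up.

```python
from dataclasses import asdict, dataclass
from datetime import datetime
from typing import Optional

# Hypothetical finite list of site types; the slide only says
# "finite list of types plus 'other'".
SITE_TYPES = {"open ocean", "coastal", "estuary", "other"}


@dataclass
class SiteMetadata:
    latitude: float        # decimal degrees
    longitude: float       # decimal degrees
    water_depth_m: float
    site_type: str         # must be one of SITE_TYPES
    description: str = ""  # free text
    country: str = ""

    def __post_init__(self):
        # Reject coordinates outside the valid range and unknown site types.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")
        if self.site_type not in SITE_TYPES:
            raise ValueError("unknown site type; use 'other'")


@dataclass
class SamplingMetadata:
    collected_at: datetime
    sampling_depth_m: float
    conditions: str = ""                         # e.g. "stormy"
    temperature_c: Optional[float] = None        # T (°C)
    salinity_ppt: Optional[float] = None         # S (ppt)
    chlorophyll_a_mg_m3: Optional[float] = None  # chl a (mg m^-3)
    author: str = ""


@dataclass
class ExperimentalParameters:
    filter_size_um: float  # unit is an assumption
    insert_size_kb: float  # unit is an assumption


@dataclass
class SampleRecord:
    site: SiteMetadata
    sampling: SamplingMetadata
    experiment: ExperimentalParameters


# Made-up example record, for illustration only.
record = SampleRecord(
    site=SiteMetadata(32.0, -64.0, 4200.0, "open ocean", country="Bermuda"),
    sampling=SamplingMetadata(datetime(2003, 2, 26, 10, 0), 5.0,
                              conditions="calm", temperature_c=20.0),
    experiment=ExperimentalParameters(filter_size_um=0.1, insert_size_kb=2.0),
)
flat = asdict(record)  # nested dict, ready to serialize
```

Keeping the measurement fields optional reflects the "etc." on the slide: not every sample will carry every measurement, and a missing value stays `None` rather than being guessed.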

  12. Calit2’s Direct Access Core Architecture Will Create Next Generation Metagenomics Server
      Data sources: Sargasso Sea Data; Sorcerer II Expedition (GOS); JGI Community Sequencing Project; Moore Marine Microbial Project; NASA Goddard Satellite Data; Community Microbial Metagenomics Data
      Core: Dedicated Compute Farm (1000 CPUs); Database Farm; Flat File Server Farm; 10 GigE Fabric
      Access: Traditional User (Request/Response through Web Portal + Web Services); Local Environment and Local Cluster (Direct Access Lambda Connections); Web (other service)
      TeraGrid: Cyberinfrastructure Backplane (scheduled activities, e.g. all by all comparison) (10000s of CPUs)
      Source: Phil Papadopoulos, SDSC, Calit2

  13. Marine Metagenomics: Metabolic pathway discovery; Drug discovery; Microbial genetic survey; Environmental survey; Symbiosis; Who is there?; Evolution study; Endosymbiosis; Organism discovery; Bioenergy discovery; Microbial genomic survey; Biogeochemistry mapping; Marine conservation
