Metagenomics
Prof: Rui Alves, ralves@cmb.udl.es, 973702406
Dept Ciencies Mediques Basiques, 1st Floor, Room 1.08
Website of the Course: http://web.udl.es/usuaris/pg193845/Courses/Bioinformatics_2007/
Course: http://10.100.14.36/Student_Server/
License: http://creativecommons.org/licenses/by-sa/2.0/
Studying an organism
[Figure: genome sequence (…ACTG…) with annotated genes (>Dna MAACTG…, >DNA Pol MTC…); the organism is put under stress and its response is measured]
Find signatures for physiological dynamics in genomic data
Diversity of Life on Earth
• Described species: ~1.5 million
• Predicted to exist: >30 million
• Cultivated in the lab: ~thousands
• How do we know the genome of the species we cannot cultivate?
• How can we know if the genes that are expressed in nature follow the same patterns as those in the lab?
Metagenomics • Metagenomics (also Environmental Genomics, Ecogenomics or Community Genomics) is the study of genetic material recovered directly from environmental samples.
Sampling in Metagenomics
• Take a sample from the environment
• Isolate and amplify DNA/mRNA
• Sequence it
Shotgun Sequencing
[Figure: restriction enzymes cut the DNA into fragments (ACT…GTC, CTA…ATC, …, …GGGG), which are then put back together by computer assembly]
• How do we know which genes belong to which genome?
• How do we assemble them?
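To make the "computer assembly" step concrete, here is a minimal sketch of greedy overlap assembly in Python. It is a toy illustration, not the method used by any real assembler: the function names `overlap` and `greedy_assemble`, the `min_len` threshold, and the example reads are all assumptions made for this sketch, and sequencing errors, reverse strands and repeats are ignored.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 when the best overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Stops when no pair overlaps by at least min_len bases and returns
    the remaining contigs (one per genome only in the best case).
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads that tile one short made-up sequence
reads = ["ATGGCGTAC", "CGTACGTTA", "CGTTAGC"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

Reads that share no sufficiently long overlap are simply left as separate contigs, which is where the question of which fragment belongs to which genome comes from.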
The Best Case Scenario
• Coverage is enough to assemble independent genomes
What normally happens
• Coverage is not enough and assembly is fragmentary
• Worst Case Scenario: Some fragments cannot be assigned
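A rough way to see why coverage is often insufficient is the standard Lander-Waterman estimate: with N reads of length L from a genome of length G, the mean coverage is c = N·L/G, and the expected fraction of bases covered at least once is 1 − e^(−c). The sketch below applies this to a mixed sample, assuming reads are drawn uniformly at random from the pooled DNA. The function names and the two-genome community are hypothetical, chosen only for illustration.

```python
import math


def coverage(n_reads: int, read_len: int, genome_len: int) -> float:
    """Mean coverage c = N * L / G for a single genome."""
    return n_reads * read_len / genome_len


def expected_fraction_covered(c: float) -> float:
    """Lander-Waterman estimate of the fraction of bases hit at least once."""
    return 1.0 - math.exp(-c)


def community_coverage(n_reads: int, read_len: int, community: dict) -> dict:
    """Per-genome coverage in a mixed sample.

    community maps a name to (genome_len, relative cell abundance).
    Assumption: each genome contributes DNA in proportion to
    genome_len * abundance and reads are sampled uniformly from the pool.
    """
    total_dna = sum(g * a for g, a in community.values())
    sequenced = n_reads * read_len
    return {name: sequenced * a / total_dna
            for name, (g, a) in community.items()}


# Hypothetical community: one abundant and one rare organism
community = {"abundant": (1_000_000, 0.9), "rare": (1_000_000, 0.1)}
for name, c in community_coverage(10_000, 500, community).items():
    print(name, round(c, 2), round(expected_fraction_covered(c), 3))
```

Under these assumptions the abundant genome is covered almost completely while well under half of the rare genome is sampled at all, so its assembly would be fragmentary.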
Downside of Metagenomics
• Often fragmentary
• Often highly divergent
• Rarely any known activity
• No chromosomal placement
• No organism of origin
• Ab initio ORF predictions
• Huge data
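As an illustration of what "ab initio ORF prediction" means at its simplest, the following sketch scans all six reading frames of a fragment for stretches that run from an ATG start codon to the first in-frame stop codon. It is a deliberately naive example: real gene finders use statistical models, and the function names, the `min_codons` threshold and the test sequence are assumptions made here.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """Return (strand, start, end, orf) tuples for ATG..stop ORFs.

    All six reading frames are scanned. Coordinates are 0-based, end is
    exclusive, and for the '-' strand they refer to the reverse
    complement. min_codons counts the start and stop codons too.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


# Made-up fragment containing a single short ORF
print(find_orfs("CCATGAAATTTTAGCC"))  # [('+', 2, 14, 'ATGAAATTTTAG')]
```

On fragmentary metagenomic reads, many true genes are cut off before their start or stop codon, so a scan like this misses them, which is part of why fragmentary data is listed as a downside.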
Marine Metagenomics
• Microbes account for more than 90% of ocean biomass, mediate all biochemical cycles in the oceans and are responsible for 98% of primary production in the sea.
• Metagenomics is a breakthrough sequencing approach to examine the open-space microbial species without the need for isolation and lab cultivation of individual species.
PI: Larry Smarr. Ex. Dir.: Paul Gilna.
Marine Genome Sequencing Project: Measuring the Genetic Diversity of Ocean Microbes
Sorcerer II
• Data from this area has already reached 10% of GenBank.
• The entire data set will double the number of proteins in GenBank!
Sample Metadata from GOS
• Site Metadata
  • Location (lat/long, water depth)
  • Site characterization (finite list of types plus “other”)
  • Site description (free text)
  • Country
• Sampling Metadata
  • Sample collection date/time
  • Sampling depth
  • Conditions at time of sampling (e.g., stormy, surface temperature)
  • Sample physical/chemical measurements (T (°C), S (ppt), chl a (mg m⁻³), etc.)
  • “author”
• Experimental Parameters
  • Filter size
  • Insert size
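The metadata fields listed above map naturally onto a small record type. The sketch below shows one possible layout using Python dataclasses. The class and field names, the units encoded in the field names (meters, micrometers, base pairs) and the example values are assumptions for illustration only; they are not the actual GOS schema or real GOS data.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict


@dataclass
class Site:
    latitude: float           # decimal degrees (assumed convention)
    longitude: float          # decimal degrees (assumed convention)
    water_depth_m: float      # unit is an assumption
    site_type: str            # one of a finite list of types, or "other"
    description: str = ""     # free text
    country: str = ""

    def __post_init__(self):
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude must be within [-90, 90]")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude must be within [-180, 180]")


@dataclass
class SamplingEvent:
    collected_at: datetime
    sampling_depth_m: float   # unit is an assumption
    conditions: str = ""      # e.g. "stormy"
    # Keys such as "T_degC", "S_ppt", "chl_a_mg_m3" are hypothetical names
    measurements: Dict[str, float] = field(default_factory=dict)
    author: str = ""


@dataclass
class ExperimentalParameters:
    filter_size_um: float     # unit is an assumption
    insert_size_bp: int       # unit is an assumption


@dataclass
class SampleRecord:
    site: Site
    sampling: SamplingEvent
    experiment: ExperimentalParameters


# Illustrative placeholder values, not real GOS data
record = SampleRecord(
    site=Site(10.0, -60.0, 4000.0, "open ocean"),
    sampling=SamplingEvent(datetime(2004, 1, 1, 12, 0), 5.0,
                           measurements={"T_degC": 25.0, "S_ppt": 35.0}),
    experiment=ExperimentalParameters(filter_size_um=0.1, insert_size_bp=2000),
)
print(record.site.site_type, record.sampling.measurements["T_degC"])
```

Keeping site, sampling and experimental parameters in separate types mirrors the three groups on the slide and lets several samples share one site description.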
Calit2’s Direct Access Core Architecture Will Create Next Generation Metagenomics Server
[Figure: architecture diagram. Labeled components: Web Portal; Dedicated Compute Farm (1000 CPUs); Database Farm; Flat File Server Farm; 10 GigE Fabric; Direct Access Lambda Cnxns; Local Environment; Local Cluster; Web (other service); TeraGrid: Cyberinfrastructure Backplane (scheduled activities, e.g. all by all comparison) (10000s of CPUs); Traditional User; Request; Response; + Web Services]
• Sargasso Sea Data
• Sorcerer II Expedition (GOS)
• JGI Community Sequencing Project
• Moore Marine Microbial Project
• NASA Goddard Satellite Data
• Community Microbial Metagenomics Data
Source: Phil Papadopoulos, SDSC, Calit2
Marine Metagenomics
• Metabolic pathway discovery
• Drug discovery
• Microbial genetic survey
• Environmental survey
• Symbiosis
• Who is there?
• Evolution study
• Endosymbiosis
• Organism discovery
• Bioenergy discovery
• Microbial genomic survey
• Biogeochemistry mapping
• Marine conservation