High Throughput Profiling of Prokaryotic Species

Joachim De Schrijver (joachim.deschrijver@ugent.be), Vakgroep Wiskundige Modellering, Statistiek en Bio-informatica.

Presentation Transcript


  1. High Throughput Profiling of Prokaryotic Species • Joachim De Schrijver • joachim.deschrijver@ugent.be • Vakgroep Wiskundige Modellering, Statistiek en Bio-informatica

  2. Overview • Sequencing technology • Roche/454 GS-FLX (‘454’) • Illumina • Prokaryotic profiling • De novo genome sequencing • Metagenomics • SNP profiling • Species quantification • Viral profiling • De novo genome sequencing

  3. Sequencing technology Classic chain-terminator sequencing Dye chain-terminator sequencing Next-generation sequencing

  4. Sequencing technology • Next-gen sequencing principle • Massively parallel • Add ACTGs • Catch a signal

  5. Sequencing technology • Roche/454 GS-FLX+ (‘454’) • Pyrosequencing • Problems with homopolymers (e.g. AAAAAA) • Long-read sequencing: 500-1000 bp • Variable sequencing length • 1 million reads/run • 1 Gb/run • Sequencing speed: ~1 day/run • Next-next generation: IonTorrent PGM/Proton

  6. Sequencing technology • Illumina • Sequencing by synthesis • Short-read sequencing: 36, 72, …, 150 bp • Fixed sequencing length • 1 billion reads/run • 100 Gb/run (= 33 x human genome!) • Sequencing speed: 3-10 days, depending on read length • SOLiD • Short-read sequencing (similar to Illumina)

  7. Sequencing technology • 454 • Illumina

  8. Sequencing technology • Price per run: $10,000/run • Price per machine: $200,000-500,000 • Supporting IT hardware • Peripheral devices such as fragmentation instrument, PCR equipment … • Negotiating power… • Use service centers! • Nxtgnt (BE), GATC (EU), Baseclear (NL), BGI … • No overhead cost, no maintenance etc. • Cheaper

  9. Sequencing technology • Next-generation sequencing has become 2nd generation sequencing • Next-next-generation sequencing is almost there: 3rd generation sequencing • Helicos: True Single Molecule Sequencing • IonTorrent/Life: Cheap and fast • Nanopore: Unlimited read size • …

  10. Sequencing technology • Evolution of sequencing technology goes hand in hand with evolution of • IT infrastructure/hardware • Analysis software • Hardware • 1 Illumina run ~ 100 Gb text file ~ 5-million-page book • Processing power/storage are an issue! • Software • Mapping to a human genome: ‘couple of hours’

  11. Overview • Sequencing technology • Roche/454 GS-FLX (‘454’) • Illumina • Prokaryotic profiling • De novo genome sequencing • Metagenomics • SNP profiling • Species quantification • Viral profiling • De novo genome sequencing

  12. Prokaryotic profiling • Prokaryotic genomics 101 • Prokaryotes = bacteria + archaea • Prokaryotic genomes • Large circular genome (0.5-10 Mb) ‘chromosome’ • Small plasmids (1-1000 kb) (virulence factors, antibiotic resistance …) • (Almost) no introns • Easy ORF annotation

  13. Overview • Sequencing technology • Roche/454 GS-FLX (‘454’) • Illumina • Prokaryotic profiling • De novo genome sequencing • Metagenomics • SNP profiling • Species quantification • Viral profiling • De novo genome sequencing

  14. Prokaryotic profiling: de novo genome sequencing • 1953: Watson/Crick discover the DNA double helix • 1977: First complete genome: bacteriophage φX174 • 1995: First genome of a free-living organism: H. influenzae • 2001: First draft of the human genome • 2006: >200 complete bacterial genomes • 2012: An uncountable number of bacterial genomes have been sequenced using next-gen sequencing

  15. Prokaryotic profiling: de novo genome sequencing • Complete bacterial genomes used to be • Expensive • Difficult to obtain • ‘Nature’ or ‘Science’ work • Remained complex until the invention of next-generation sequencing

  16. Prokaryotic profiling: de novo genome sequencing • Using next-generation sequencing, de novo sequencing has become • Relatively easy • Relatively cheap • Routine research • Already >10 complete bacterial genomes published in 2012 • More than just an assembly!

  17. Prokaryotic profiling: de novo genome sequencing • Practical • Get some DNA from an isolated species of interest • Sequence: long or short reads (1-10 days) • Obtain your sequences • Assemble (1h) • Pure de novo assembly • Guided assembly • Annotate the genome (days-weeks)

  18. Prokaryotic profiling: de novo genome sequencing • Assembly: Multiple ‘short’ reads → 1 long sequence • Existing software • Velvet • SSAKE • Newbler • … • Source: Nature 2009, MacLean et al.
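
The idea of turning multiple short reads into one long sequence can be illustrated with a toy greedy overlap merger. This is only a sketch under strong assumptions (error-free reads, all from the same strand, no repeats); the function names `overlap` and `greedy_assemble` are hypothetical and this is not the algorithm of any package named on the slide, which use more elaborate methods and tolerate sequencing errors.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i == j:
                    continue
                n = overlap(a, b, min_len)
                if n > best_len:
                    best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        # Append the non-overlapping tail of the second sequence to the first.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Three made-up reads that overlap by four bases each.
example_reads = ["ATGGCGT", "GCGTACG", "TACGTTA"]
contigs = greedy_assemble(example_reads)
print(contigs)
```

With real data the result is usually several contigs rather than one sequence, which is one reason assembly is followed by a separate finishing and annotation phase.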

  19. Prokaryotic profiling: de novo genome sequencing • Relatively cheap • Sequencing cost: depending on coverage • Illumina, 30x, 5 Mb genome: $10-$100 • 454, 30x, 5 Mb genome: $1000-$5000 • Equipment • IT infrastructure, sequencing equipment, people … • Relatively easy • Need for IT support • No out-of-the-box standard solution for everything • Several different software packages for assembly
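
The order of magnitude of these sequencing costs can be checked with a back-of-the-envelope calculation. The sketch below assumes a 5 Mb bacterial genome, the run yields quoted earlier in the talk (100 Gb per Illumina run, 1 Gb per 454 run), a price of $10,000 per run for both platforms, and a purely pro-rata share of the run (no library preparation, no pooling losses). The function names are hypothetical.

```python
def bases_needed(genome_size_bp: float, coverage: float) -> float:
    """Total bases to sequence for a target average coverage."""
    return genome_size_bp * coverage


def pro_rata_cost(genome_size_bp: float, coverage: float,
                  run_yield_bp: float, run_price_usd: float) -> float:
    """Share of one run's price consumed by this genome (assumes ideal pooling)."""
    return bases_needed(genome_size_bp, coverage) / run_yield_bp * run_price_usd


GENOME_BP = 5e6     # assumed 5 Mb bacterial genome
COVERAGE = 30       # 30x, as on the slide
RUN_PRICE = 10_000  # USD per run, as quoted earlier in the talk

illumina_cost = pro_rata_cost(GENOME_BP, COVERAGE, 100e9, RUN_PRICE)
roche454_cost = pro_rata_cost(GENOME_BP, COVERAGE, 1e9, RUN_PRICE)
print(f"Illumina: ~${illumina_cost:.0f}, 454: ~${roche454_cost:.0f}")
```

Under these assumptions the estimates come out at roughly $15 and $1,500, inside the ranges given on the slide.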

  20. Overview • Sequencing technology • Roche/454 GS-FLX (‘454’) • Illumina • Prokaryotic profiling • De novo genome sequencing • Metagenomics • SNP profiling • Species quantification • Viral profiling • De novo genome sequencing

  21. Prokaryotic profiling: Metagenomics • De novo genome assembly • Study of 1 single species • Need for species isolation • Metagenomics analysis • Study of a community of species • No need for isolation (culturing bias!) • Study the collective gene pool and function of the community/ecology • No need for individual functions

  22. Prokaryotic profiling: Metagenomics • Practical • Get bacterial DNA or RNA from a sample • Soil • Gut/Fecal • Ocean water (e.g. Craig Venter) • … • Sequence: long or short reads (1-10 days) • Obtain your sequences • Map on a database of known genes (1 day) • Annotate/analyse the community (weeks)

  23. Prokaryotic profiling: Metagenomics

  24. Prokaryotic profiling: Metagenomics • 2010: Giant panda genome (2nd carnivore) • No umami taste receptor -> no meat affinity • The panda is more a dog than a bear • The panda is a carnivore eating bamboo!

  25. Prokaryotic profiling: Metagenomics • Still 2010!: Panda ‘microbiome’ • Gut microbiome of the panda reveals the presence of bamboo/cellulose degrading pathways

  26. Prokaryotic profiling: Metagenomics

  27. Prokaryotic profiling: Metagenomics • A clinical example: gut microbiome can predict diabetes and malnourishment • PLoS ONE (2011), Brown et al. • PLoS ONE (2010), Valladares et al. • Gut Pathogens (2011), Gupta et al.

  28. Overview • Sequencing technology • Roche/454 GS-FLX (‘454’) • Illumina • Prokaryotic profiling • De novo genome sequencing • Metagenomics • SNP profiling • Species quantification • Viral profiling • De novo genome sequencing

  29. Prokaryotic profiling: SNP profiling • Classical SNP analysis - practical • Design PCR primers • Generate amplicons • Re-sequence using long read sequencing • Conserve ‘SNP blocks’ • Detect SNPs • Correlate SNPs to drug resistance, severity of symptoms …
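
The "Detect SNPs" step can be sketched as a position-by-position comparison of a sample sequence with a reference. This is a minimal illustration that assumes the two sequences are already aligned and of equal length; the function name `find_snps` is hypothetical, and real pipelines additionally weigh read depth and base quality before calling a variant.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes both sequences are pre-aligned, gap-free and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Made-up amplicon with a single substitution at position 4.
snps = find_snps("ACGTACGT", "ACGAACGT")
print(snps)
```

The resulting list of positions is what would then be correlated with phenotypes such as drug resistance.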

  30. Prokaryotic profiling: SNP profiling • Amplicon resequencing is the same for human, prokaryotic, viral analyses • Many standardized out-of-the-box solutions available • Very simple analysis • Watch out for the overkill… • Don’t use a bazooka to kill a fly! • Throughput can be too high

  31. Prokaryotic profiling: SNP profiling

  32. Prokaryotic/Viral profiling: SNP profiling • Profile the coding region of hepatitis C (Lauck et al. 2012)

  33. Prokaryotic profiling: SNP profiling • Use next-generation sequencing to predict the optimal HIV therapy (Thielen et al. 2012)

  34. Overview • Sequencing technology • Roche/454 GS-FLX (‘454’) • Illumina • Prokaryotic profiling • De novo genome sequencing • Metagenomics • SNP profiling • Species quantification • Viral profiling • De novo genome sequencing

  35. Prokaryotic profiling: Species quantification • Imagine the following research questions • Which (known) species/groups are present in a certain sample? • Does this composition alter given a certain treatment, change of conditions, patients etc.? • No need for de novo genome sequencing • No metagenomics: species instead of functions

  36. Prokaryotic profiling: Species quantification • Prokaryotes have the gene 16S rDNA, coding for ribosomal RNA • The 16S rDNA region is 1.5 kb long • 16S rDNA is specific for each species/strain • Theoretical: 4^1500 ≈ 10^903 possibilities • In practice: 16S rDNA sequence known for millions of species
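
The theoretical number of distinct 1.5 kb sequences follows from four possible bases at each of 1,500 positions. The short check below converts 4^1500 to a power of ten; the variable names are illustrative only.

```python
import math

SEQUENCE_LENGTH = 1500  # 16S rDNA length in bases, as on the slide
ALPHABET_SIZE = 4       # A, C, G, T

# 4^1500 = 10^(1500 * log10(4))
exponent = SEQUENCE_LENGTH * math.log10(ALPHABET_SIZE)

# Cross-check with exact integer arithmetic: count the decimal digits.
digits = len(str(ALPHABET_SIZE ** SEQUENCE_LENGTH))

print(f"4^{SEQUENCE_LENGTH} is about 10^{exponent:.2f} ({digits} decimal digits)")
```

The exponent comes out just above 903, so 10^903 is a slight underestimate of the exact value.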

  37. Prokaryotic profiling: Species quantification • 16S rDNA can be isolated in different species using universal PCR primers • Isolate/amplify different regions using the same primers • Compare the isolated sequences against a database of known sequences

  38. Prokaryotic profiling: Species quantification • Practical procedure • Sample an environment and isolate DNA • Do a universal PCR amplification • Sequence using long read sequencing: the longer the better! • Obtain sequences • Map sequences against a reference database • Annotate the data

  39. Prokaryotic profiling: Species quantification • Example: The Antarctica project • Which parameters determine the composition of bacterial communities in Antarctic lakes? • 20 different samples/lakes • Sequence 16S rDNA genes • 1 x 454 run (1 million 500 bp sequences) • Map all sequences back to the RDP database

  40. Prokaryotic profiling: Species quantification • Analyse the data using computing power • Compare different locations • Is species A present in location 1, location 2, …? • Assess the distribution in a single location • How dominant is the most dominant species in location 1? • How many species are in location 1? • … • Visualize!
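
Once each read has been assigned to a taxon, the questions on this slide reduce to counting. The sketch below assumes the mapping step has already produced, per location, a list with one taxon label per read; the data, the labels and the function names `summarize` and `presence` are made up for illustration.

```python
from collections import Counter


def summarize(assignments: dict[str, list[str]]) -> dict[str, dict]:
    """Per location: number of taxa, most abundant taxon and its read fraction."""
    summary = {}
    for location, taxa in assignments.items():
        if not taxa:
            # No reads assigned: nothing to rank.
            summary[location] = {"richness": 0, "dominant": None,
                                 "dominant_fraction": 0.0}
            continue
        counts = Counter(taxa)
        top_taxon, top_count = counts.most_common(1)[0]
        summary[location] = {
            "richness": len(counts),
            "dominant": top_taxon,
            "dominant_fraction": top_count / len(taxa),
        }
    return summary


def presence(assignments: dict[str, list[str]], taxon: str) -> dict[str, bool]:
    """Is the given taxon present in each location?"""
    return {location: taxon in taxa for location, taxa in assignments.items()}


# Hypothetical per-read assignments for two locations.
reads_by_location = {
    "location_1": ["species_A"] * 6 + ["species_B"] * 3 + ["species_C"],
    "location_2": ["species_B"] * 7 + ["species_C"] * 3,
}
report = summarize(reads_by_location)
species_a_present = presence(reads_by_location, "species_A")
print(report)
print(species_a_present)
```

The same counts, grouped at a higher taxonomic level instead of by species, would feed the visualizations mentioned in the last bullet.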

  41. Prokaryotic profiling: Species quantification • Analyse different samples on different taxonomic levels • Include taxonomic tree of life of bacteria • Use a ‘taxonomy browser’

  42. Prokaryotic profiling: Species quantification • Analyse a single location

  43. Prokaryotic profiling: Species quantification • Compare different locations

  44. Overview

  45. Overview • Sequencing technology • Roche/454 GS-FLX (‘454’) • Illumina • Prokaryotic profiling • De novo genome sequencing • Metagenomics • SNP profiling • Species quantification • Viral profiling • De novo genome sequencing

  46. Viral profiling • Viral profiling = prokaryotic profiling, but… • Cheaper • Faster • Easier • De novo genome sequencing = OK • Don’t spend $10,000 on a 100 kb genome! • Multiplexing/pooling capacity is limited!

  47. Viral profiling • Watch out for the overkill • An Illumina run can be split into 8 lanes • >20 samples per lane can be combined • Still >100 Mb per sample…
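
The overkill can be quantified by dividing the run yield over lanes and pooled samples. The sketch below assumes the 100 Gb Illumina run yield quoted earlier, exactly 20 samples per lane, perfectly even pooling, and the 100 kb viral genome size mentioned on the previous slide; the function names are hypothetical.

```python
def per_sample_yield(run_yield_bp: float, lanes: int, samples_per_lane: int) -> float:
    """Bases available per sample if the run is shared evenly."""
    return run_yield_bp / (lanes * samples_per_lane)


def average_coverage(yield_bp: float, genome_size_bp: float) -> float:
    """Average depth obtained when `yield_bp` bases cover one genome."""
    return yield_bp / genome_size_bp


sample_yield = per_sample_yield(100e9, lanes=8, samples_per_lane=20)
viral_coverage = average_coverage(sample_yield, 100e3)
print(f"{sample_yield / 1e6:.0f} Mb per sample, about {viral_coverage:.0f}x coverage")
```

Under these assumptions each sample still receives 625 Mb, several thousand times the size of a 100 kb genome, far more than a de novo assembly needs.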

  48. Thanks for your attention!

  49. Questions? joachim.deschrijver@ugent.be