Web Valley 2014
16S sequencing for microbiome studies
Nicola Segata and Nick Loman
Principal Investigator, Laboratory of Computational Metagenomics
Centre for Integrative Biology, University of Trento, Italy
The human microbiome
• 10x more microbial than human cells
• 1M times as many microbes inside each of us as there are humans on Earth
• 100x more microbial than human genes
Nature 486(7402)
Who’s there? What are they doing? (Scientific American, May 2012)
Metagenomics:
• Study of uncultured microorganisms from the environment, which can include humans or other living hosts
• Focus on the taxonomic and functional characteristics of the total collection of microorganisms within a community
• Main experimental tool is high-throughput sequencing: ~10M short (~100 nt) reads per dataset
16S sequencing
Liu, Bo, et al. "Accurate and fast estimation of taxonomic profiles from metagenomic shotgun sequences." BMC Genomics 12.Suppl 2 (2011): S4.
PROS:
• Cost-effective
• Avoids non-bacterial contamination
• The resulting dataset is reasonable in size and complexity
• Mature analysis software available
• Can potentially catch low-abundance bacteria
CONS:
• Not genome-wide (so no metabolic potential)
• Limited taxonomic resolution
• Not effective for pathogen profiling
• Cannot catch viruses and eukaryotes
• Several (usually underestimated) biases
• Cross-study comparisons are almost impossible
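16S-based “metagenomics”
PCR to amplify the single 16S rRNA marker gene
[Figure: 16S gene with variable regions V2 and V6 labelled; each sequence is classified to a microbe, giving a table of counts with samples on one axis and microbes on the other. Image credit: George Rice, Montana State University]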
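The samples-by-microbes count table on this slide can be sketched in a few lines. This is only an illustration: the sample names and the per-read classifications below are invented, and `build_count_table` is a hypothetical helper, not part of any 16S package.

```python
from collections import Counter

# Hypothetical classifier output: one (sample, microbe) pair per read.
classified_reads = [
    ("sample_A", "E. coli"),
    ("sample_A", "E. coli"),
    ("sample_A", "S. aureus"),
    ("sample_B", "S. aureus"),
    ("sample_B", "S. pneumoniae"),
    ("sample_B", "S. pneumoniae"),
]


def build_count_table(reads):
    """Return {sample: {microbe: count}} with a zero for every absent microbe."""
    counts = Counter(reads)
    samples = sorted({sample for sample, _ in reads})
    microbes = sorted({microbe for _, microbe in reads})
    return {s: {m: counts[(s, m)] for m in microbes} for s in samples}


count_table = build_count_table(classified_reads)
for sample, row in count_table.items():
    print(sample, row)
```

Filling in explicit zeros keeps every sample on the same set of columns, which is what downstream diversity measures usually expect.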
The ribosome
Ribosomes are the universal machinery that translates the genetic code into proteins.
The ribosomal machinery is composed of:
• Two subunits
• Several proteins
• mRNAs
• tRNAs
• rRNA (5S, 16S, 23S)
The 16S rRNA
[Figure. Image credit: Center for Molecular Biology of RNA, University of California]
The 16S rRNA gene (1/3)
This annotation has been performed on a representative E. coli 16S sequence
Baker, G. C., J. J. Smith, and Donald A. Cowan. Journal of Microbiological Methods 55.3 (2003): 541-555.
The 16S rRNA
[Figure: 16S rRNA with the variable regions V1 to V9 labelled. Image credit: Center for Molecular Biology of RNA, University of California]
The 16S gene: statistical view of the variable regions
Variability within the 16S rRNA gene (Andersson, Anders F., et al. PLoS ONE 3.7 (2008))
[Figure: variability along the 16S rRNA gene, with the variable regions V1 to V9 labelled]
Which HTM would you choose?
• 454 is historically well suited (~400 nt reads, covering about 3 regions), with a good cost/throughput trade-off
• Illumina (HiSeq) is not optimal (shorter reads, unnecessarily high throughput)
• Illumina MiSeq and IonTorrent can be a nice compromise
Claesson, Marcus J., et al. Nucleic Acids Research 38.22 (2010)
Multiple variable regions can be targeted simultaneously (if you have long enough reads!)
Which HTM would you choose?
Throughput:
• Very low (~1 seq / sample)
• Medium (~3k seqs / sample)
• High (~50k seqs / sample)
One of the challenges: which technology? http://flxlexblog.files.wordpress.com/
One of the challenges: which technology? Mol Ecol Resour. 2011 Sep;11(5):759-69
In silico primer validation/testing
The idea: use the available (taxonomically labeled) 16S sequences to check which organisms are targeted by the primers
• http://www.arb-silva.de/search/testprobe (to test single probes)
• http://www.arb-silva.de/search/testprime (to test primer pairs, below)
An example on “universal” primers
Fw: CCTACGGGRSGCAGCAG
Rev: ATTACCGCGGCTGCT
(our primers)
An example on “universal” primers
• Archaea, 49.2% matches
• Bacteria, 94.7% matches
• Proteobacteria, 97.1% matches
• WS6 candidate division, 2.9% matches
BE AWARE: universal primers do not exist, and the choice of primers is going to bias your study no matter what!
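The idea behind in silico primer testing can be sketched as follows: expand the degenerate IUPAC bases of each primer (R = A/G, S = C/G, ...) into a regular expression, search every labelled reference sequence, and report the matching fraction per taxon. This is a minimal sketch, not how the SILVA tools are implemented. It assumes exact matching with no mismatches allowed and DNA-alphabet references, and the four "reference" sequences are invented toy strings built around the primer sites, not real 16S genes.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

FW = "CCTACGGGRSGCAGCAG"   # forward primer from the slide
REV = "ATTACCGCGGCTGCT"    # reverse primer from the slide


def reverse_complement(seq):
    """Reverse-complement a (possibly degenerate) DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Turn a degenerate primer into a regex that matches every variant."""
    return "".join(IUPAC[base] for base in primer.upper())


def amplicon(seq, fw, rev):
    """Return the region spanned by the primer pair (primers included), or None."""
    pattern = re.compile(
        primer_regex(fw) + "[ACGT]*?" + primer_regex(reverse_complement(rev))
    )
    match = pattern.search(seq.upper())
    return match.group(0) if match else None


def coverage_by_taxon(reference, fw, rev):
    """Fraction of reference sequences per taxon that both primers match."""
    hits, totals = {}, {}
    for taxon, seq in reference:
        totals[taxon] = totals.get(taxon, 0) + 1
        if amplicon(seq, fw, rev) is not None:
            hits[taxon] = hits.get(taxon, 0) + 1
    return {taxon: hits.get(taxon, 0) / totals[taxon] for taxon in totals}


# Invented toy references (taxon label, sequence); NOT real 16S sequences.
rev_site = "AGCAGCCGCGGTAAT"  # reverse complement of REV
reference = [
    ("Bacteria", "AA" + "CCTACGGGAGGCAGCAG" + "TGGGGAATATT" + rev_site + "AC"),
    ("Bacteria", "GT" + "CCTACGGGGCGCAGCAG" + "TAAGGAATCTT" + rev_site + "TG"),
    ("Archaea", "AA" + "CCTACGGGGCGCAGCAG" + "GCGCGAAAACT" + rev_site + "AC"),
    # T at the degenerate S position (C/G only): the forward primer fails here.
    ("Archaea", "AA" + "CCTACGGGGTGCAGCAG" + "GCGCGAAAACT" + rev_site + "AC"),
]

coverage = coverage_by_taxon(reference, FW, REV)
print(coverage)
```

Even in this toy setting a single base outside the degenerate set removes a sequence from the amplified pool, which is the mechanism behind the per-taxon differences reported on the slide.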
Validation of hypervariable regions using a mock community
Ward, Doyle V., et al. PLoS ONE 7.6 (2012): e39315.
A high-level 16S analysis workflow
Hamady, Micah, and Rob Knight. Genome Research 19.7 (2009): 1141-1152.
Schematic 16S analysis workflow
Input dataset (one sample):
CAAGCCGAAUGCAGCUAUUC
CAAGCCUGAUGCAGCCAUGC
CAUGCCUGAGACAGCCUUGC
CAAGCCUGAUGCAGCCAUGC
CAAGCCGAAUGCAGCUAUCC
CAAGGCUGAGACAGCCUUGC
CAAGCCUGAUGCUGCCAUGC
CAAGCCGAAUGCAGCUAUGC
CAAGCCGGAGACAGCCUUGC
AAAGCCUGAUGCAGCCAUGC
Multiple-sequence alignment:
CAAGCCGAAUGCAGCUAUUC
CAAGCCUGAUGCAGCCAUGC
CAUGCCUGAGACAGCCUUGC
CAAGCCUGAUGCAGCCAUGC
CAAGCCGAAUGCAGCUAUCC
CAAGGCUGAGACAGCCUUGC
CAAGCCUGAUGCUGCCAUGC
CAAGCCGAAUGCAGCUAUGC
CAAGCCGGAGACAGCCUUGC
CAAGCCUGAUGCAGCCAUGC
Operational taxonomic unit (OTU) definition:
OTU_1 (30%):
CAAGCCGAAUGCAGCUAUUC
CAAGCCGAAUGCAGCUAUCC
CAAGCCGAAUGCAGCUAUGC
OTU_2 (30%):
CAUGCCUGAGACAGCCUUGC
CAAGGCUGAGACAGCCUUGC
CAAGCCGGAGACAGCCUUGC
OTU_3 (40%):
CAAGCCUGAUGCAGCCAUGC
CAAGCCUGAUGCAGCCAUGC
CAAGCCUGAUGCUGCCAUGC
AAAGCCUGAUGCAGCCAUGC
16S DB with taxonomic information:
OTU_1: E. coli
OTU_2: S. aureus
OTU_3: S. pneumoniae
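The workflow above can be sketched on the ten reads of the slide. This is a minimal illustration, not the method of any particular pipeline. It assumes the reads are already aligned and of equal length (so identity is a simple per-position comparison), uses greedy first-match clustering against the first read seen in each cluster, and sets a 90% threshold because the toy reads are only 20 nt long (97% is the threshold commonly used on real amplicons). The `reference_db` is a stand-in for a 16S database: its sequences are simply taken from the slide's reads and follow the slide's read-to-species mapping.

```python
from collections import Counter

# The ten reads of the slide's input dataset.
reads = [
    "CAAGCCGAAUGCAGCUAUUC", "CAAGCCUGAUGCAGCCAUGC", "CAUGCCUGAGACAGCCUUGC",
    "CAAGCCUGAUGCAGCCAUGC", "CAAGCCGAAUGCAGCUAUCC", "CAAGGCUGAGACAGCCUUGC",
    "CAAGCCUGAUGCUGCCAUGC", "CAAGCCGAAUGCAGCUAUGC", "CAAGCCGGAGACAGCCUUGC",
    "AAAGCCUGAUGCAGCCAUGC",
]

# Hypothetical stand-in for a 16S database with taxonomic labels.
reference_db = {
    "CAAGCCGAAUGCAGCUAUUC": "E. coli",
    "CAUGCCUGAGACAGCCUUGC": "S. aureus",
    "CAAGCCUGAUGCAGCCAUGC": "S. pneumoniae",
}


def identity(a, b):
    """Fraction of identical positions between two pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("reads must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, threshold):
    """Assign each read to the first centroid it matches, else open a new OTU."""
    centroids, labels = [], []
    for read in reads:
        for index, centroid in enumerate(centroids):
            if identity(read, centroid) >= threshold:
                labels.append(index)
                break
        else:
            centroids.append(read)
            labels.append(len(centroids) - 1)
    return centroids, labels


def relative_abundance(labels):
    """Share of reads falling in each OTU."""
    counts = Counter(labels)
    return {otu: n / len(labels) for otu, n in counts.items()}


def assign_taxonomy(centroids, reference_db):
    """Label each OTU with the taxon of its most similar reference sequence."""
    return [
        reference_db[max(reference_db, key=lambda ref: identity(c, ref))]
        for c in centroids
    ]


centroids, labels = greedy_otus(reads, threshold=0.90)
abundance = relative_abundance(labels)
taxa = assign_taxonomy(centroids, reference_db)
for otu, name in enumerate(taxa):
    print(f"OTU index {otu}: {name}, {abundance[otu]:.0%} of reads")
```

OTU indices here follow the order in which clusters are discovered, so they need not line up with the OTU numbers drawn on the slide; the groups and their 30/30/40 split are the same.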
Intro to diversity analysis
Alpha-diversity
• A measure of how diverse (complex) a microbial community is
• “Within sample” diversity
• Species richness (i.e. the number of species) is a widely used alpha-diversity index
Beta-diversity
• A measure of how different two microbial communities are
• “Between sample” diversity
• The inverse of the number of shared species is one way to estimate beta-diversity
Jurasinski, G., Retzer, V., & Beierkuhnlein, C. (2009). Oecologia, 159(1), 15-26.
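Both notions are easy to compute from per-sample species counts. The sketch below implements the two measures named on the slide (species richness, and the inverse of the number of shared species) and, for comparison, two standard indices that the slide does not mention: the Shannon index for alpha-diversity and the Jaccard distance for beta-diversity. The two samples and their counts are invented.

```python
import math

# Invented species counts for two samples.
sample_a = {"E. coli": 30, "S. aureus": 30, "S. pneumoniae": 40}
sample_b = {"E. coli": 50, "S. pneumoniae": 50}


def present(counts):
    """Set of species observed at least once."""
    return {species for species, n in counts.items() if n > 0}


def richness(counts):
    """Alpha-diversity as the number of observed species."""
    return len(present(counts))


def shannon(counts):
    """Alpha-diversity as the Shannon index (natural log); not from the slide."""
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log(n / total) for n in counts.values() if n > 0
    )


def inverse_shared_species(a, b):
    """Beta-diversity as 1 / (number of shared species), as on the slide."""
    shared = len(present(a) & present(b))
    return math.inf if shared == 0 else 1 / shared


def jaccard_distance(a, b):
    """Beta-diversity as 1 - |shared| / |union|; not from the slide."""
    species_a, species_b = present(a), present(b)
    union = species_a | species_b
    return 0.0 if not union else 1 - len(species_a & species_b) / len(union)


print("richness:", richness(sample_a), richness(sample_b))
print("Shannon:", shannon(sample_a), shannon(sample_b))
print("inverse shared species:", inverse_shared_species(sample_a, sample_b))
print("Jaccard distance:", jaccard_distance(sample_a, sample_b))
```

The inverse-shared-species measure is unbounded when two samples share nothing (returned here as infinity), whereas the Jaccard distance stays between 0 and 1, which is one reason normalized indices are often preferred.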
Practical tutorial time http://nickloman.github.io