PacBio Meets the Microbiome
George Weinstock
PacBio Users Group Meeting
September 18, 2013
Diverse interest in medical metagenomics
• Acne
• Antibiotics, gut microbiome, and obesity
• Antibiotic resistance
• Asthma, allergies
• Acute RSV infection
• Vitamin D
• Bacterial vaginosis
• Cancer microbiomes
• Conjunctiva – trachoma microbiome
• Crohn's disease
• Cystic fibrosis
• Diabetes
• Oral microbiome
• Skin microbiome
• Dietary effects on gut microbiome
• Fecal transplant
• HIV and lung microbiome
• Infection control
• C. difficile
• VRE
• MRSA
• E. coli O157:H7
• NICU bacteremia
• Intestinal fat uptake
• Necrotizing enterocolitis
• Non-Alcoholic Fatty Liver Disease
• Oral microbiome
• Periodontitis
• Caries
• Parasitic infection and the microbiome
• Post-transplant Lymphoproliferative Disorders
• Pre-term birth
• Maternal microbiome
• Vitamin D
• Respiratory microbiome
• Influenza infection
• Pre-term babies
• Childhood vaccination
• Sepsis
• ICU
• NICU
• Short-bowel syndromes
• Urethritis
• Virus discovery
• Kawasaki Disease
• Fever of unknown origin in children
• Transplantation: CMV, BK
• Immuno-suppression/-compromised
Approaches to study the microbiome
• Microbial community: bacteria, viruses, eukaryotes
• Targeted sequencing (16S rRNA): bacterial census; taxa and abundances
• Shotgun sequencing: all microbes; taxa and genes
• Other diagram labels: bacteria, viruses, fungi, yeasts, protists, enzymes
• Describe communities in many samples: "average" community and variations
Studying communities - 16S rRNA genes
Major "enterotypes" of the stool biomes
[Figure: histograms of genera in each sample; each row is a different sample. Genus labels: Ruminococcus, Prevotella, Bacteroides. Sample annotations: sex (women, men); site (St. Louis, Houston); ethnicity (not Hispanic/Latino/Spanish, Hispanic/Latino/Spanish); BMI (BMI >= 30, 25 <= BMI < 30, BMI < 25, NA).]
Some Metagenomic Effects
• Community structure, e.g. content; ecological parameters (biodiversity)
• Specific organism, e.g. C. difficile
• Multiple specific organisms: beneficial ↓, detrimental ↑
• Genes or pathways, e.g. lactic acid
Metagenomic pathogen detection in clinical samples
• Patient samples: patients with/without hospital-acquired diarrhea
• Hospital microbiology lab
• Metagenomic sequencing
• Compare results
Alexis Elward, David Haslam, Greg Storch, Rana Elfeghaly, Yanjiao Zhou, Kristine Wylie
16S analysis of clinical samples for C. difficile
[Figure: samples labeled with diagnostic lab results. Label categories: C. diff + high TcdB; C. diff + average TcdB; C. diff + low TcdB; NC; NC (SE meds); IBD; Campy; Salmonella.]
NC: various negative controls
Pathogen relative abundance in clinical samples: 16S read abundance
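Relative abundance is read here as the fraction of a sample's 16S reads assigned to a given taxon. The following is a minimal Python sketch under that assumption; the function name, taxon names, and read counts are made up for illustration and are not data from the talk.

```python
def relative_abundance(read_counts):
    """Return each taxon's share of the total 16S reads in one sample."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("sample has no classified reads")
    return {taxon: count / total for taxon, count in read_counts.items()}


# Hypothetical read counts for one stool sample (illustration only).
sample = {"Clostridium difficile": 250, "Bacteroides": 3000, "Prevotella": 1750}
abundance = relative_abundance(sample)
print(f"{abundance['Clostridium difficile']:.1%}")  # prints 5.0%
```

Normalizing by the per-sample total makes samples with different sequencing depths comparable, at the cost of making the abundances of taxa depend on one another.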
The bacterial 16S rRNA gene (ssu)
Evaluation of 16S rDNA-based community profiling for human microbiome research. Jumpstart Consortium Human Microbiome Project Data Generation Working Group. PLoS One. 2012;7(6):e39315.
Trends in 16S rRNA gene sequencing
• Full-length Sanger sequencing
  • PCR => clone => sequence
  • All 9 hypervariable regions
  • Expensive
  • Time-consuming
  • Accurate taxa ID
• Full-length PB
  • PCR => sequence
  • 9 hypervariable regions
• 1/3-length 454 sequencing
  • PCR 500bp regions => sequence
  • 2-4 hypervariable regions
  • Inexpensive
  • High-throughput
  • Less accurate taxa ID
• 1/10-length Illumina sequencing
  • PCR 500bp regions => sequence
  • 1 hypervariable region
  • Very cheap
  • Very high-throughput
  • Less accurate taxa ID
Large-scale single isolate typing
• Have ~8000 isolates (microtiter plates) from hospital
• Looking for unsequenced species from humans
• Need FL 16S in order to make a species call for typing
• 400 base reads from 454 do not give enough specificity
• Each well has one strain
• Sanger seq'ing of FL PCR products => single sequence w/o cloning
• Can PacBio compete: cheaper, higher throughput?
• Goal:
  • Find what species these isolates are
  • Choose novel isolates
  • Perform WG sequencing
Large-scale single isolate typing with PacBio
• Sanger: do not see alleles of multiple 16S genes/strain
• PacBio: can see different alleles since single molecule
• Hospital isolates (82):
  • 70 samples agree between Sanger and PacBio
  • 4 samples have minor species seen with both platforms
  • 5 samples have strain differences seen with PacBio, not with Sanger
  • 2 samples failed with Sanger, not with PacBio
  • 1 sample disagreement
• 7 DNA sample controls agree between platforms
• 4 known culture sample controls agree between platforms
• PacBio: can see low level contaminants
• 99% agreement between Sanger and PacBio (90/91)
• Only 1 disagreement between the platforms
• More information from PacBio
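The slide does not spell out how 90/91 is obtained. One reading that reproduces it is sketched below in Python: the two Sanger failures are excluded because there is no Sanger call to compare, and the minor-species and strain-difference samples are counted as agreeing. Both groupings are assumptions; only the counts come from the slide.

```python
# Counts as reported on the slide.
hospital = {
    "agree": 70,
    "minor_species_both": 4,
    "strain_differences_pacbio_only": 5,
    "failed_sanger_only": 2,
    "disagree": 1,
}
dna_controls = 7
culture_controls = 4

total = sum(hospital.values()) + dna_controls + culture_controls
# Assumption: the 2 Sanger failures have no Sanger call to compare against.
comparable = total - hospital["failed_sanger_only"]
# Assumption: minor-species and strain-difference samples count as agreement.
agreeing = comparable - hospital["disagree"]

print(total, comparable, agreeing)     # prints 93 91 90
print(f"{agreeing / comparable:.1%}")  # prints 98.9%
```

Under this reading the reported 99% is 98.9% rounded to the nearest percent.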
Cost is an issue
• With 96 samples/1 SMRT cell, the fully loaded cost of PacBio is about 2x Sanger.
  • SMRT cell
  • Sequencing reagents
  • Library kit and labor
  • Instrument
  • Computation (storage, labor, cpu)
• Would need to pool more samples/SMRT cell
• Need more bar codes
Sequencing communities of microbes en masse
• 16S rRNA gene sequencing for community profiling
  • Full-length gives species-level definition
  • 454 500bp reads give genus-level definition
• Shotgun sequencing
  • Longer reads give better assembly (of unknown uncultured)
  • Bacteria, viruses, fungi and other eukaryotes described
Simulated community 16S sequencing
• A mock community of 24 species
• Only 22 amplified with the primers used
• Organisms range over 300-fold in abundance
• Make 4 different batches
• Aim for 5000 sequences/sample (454 protocol)
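To get a feel for what a 300-fold abundance range means at 5000 sequences/sample, the Python sketch below computes expected read counts per organism. The slide gives only the range, the number of amplified species, and the target depth; the even log-scale spacing of abundances and the function name are assumptions made purely for illustration.

```python
def expected_reads(n_organisms, fold_range, depth):
    """Expected reads per organism, rarest first, for log-spaced abundances."""
    # Assumption: abundances are evenly spaced on a log scale (hypothetical).
    weights = [fold_range ** (i / (n_organisms - 1)) for i in range(n_organisms)]
    total = sum(weights)
    return [depth * w / total for w in weights]


reads = expected_reads(n_organisms=22, fold_range=300, depth=5000)
print(f"rarest: {reads[0]:.1f} reads, most abundant: {reads[-1]:.1f} reads")
```

Under this assumed spacing the rarest organism is expected to contribute only a few reads per sample, so detecting it consistently across replicates depends heavily on sequencing depth.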
Mock community analysis with Sanger, 454
Evaluation of 16S rDNA-based community profiling for human microbiome research. Jumpstart Consortium Human Microbiome Project Data Generation Working Group. PLoS One. 2012;7(6):e39315.
Consistent recognition of an organism in the pool for 4 replicate 16S amplifications: 300-fold difference in prevalence of 16S genes for separate organisms in the pool.
Methanobrevibacter (an archaeon) and Collinsella do not amplify with the 16S primers utilized.
INFECTION: Replace culture-based analysis with metagenomic analysis
[Diagram. Traditional culture-based analysis: sample, culture single species. Metagenomic analysis (culture-independent). Sequencing paths: WGS, 16S, WGS. Processing steps: assembly, annotation, alignment. Outputs: strains/subspecies based on gene content; species present; genes of interest; strains/subspecies based on SNP/indel content; variants of a species; strains/subspecies.]
Acknowledgments
• Washington University Genome Institute:
  • Makedonka Mitreva
  • Erica Sodergren
  • Sahar Abubucker
  • Karthik Kota
  • John Martin
  • Bruce Rosa
  • Yanjiao Zhou
  • Kristine Wylie
  • Kathie Mihindukulasuriya
  • Hongyu Gao
  • Bill Shannon
  • Patricio La Rosa
  • Great Production & Informatics Teams
• Peer Bork Group
  • Siegfried Schloissnig
  • Manimozhiyan Arumugam
  • Shinichi Sunagawa
  • Julien Tap
  • Ana Zhu
  • Alison S. Waller
  • Daniel R. Mende
  • Shamil R. Sunyaev
• Clinical
  • Greg Storch, WU
  • Susan Haake, UCLA
  • Phil Tarr, WU
  • Martin Blaser, NYU
  • Barb Warner, WU
  • Richard Hotchkiss, WU
  • J. Dennis Fortenberry, Indiana U
  • Scott Weiss, Harvard
  • Ellen Li, SUNY-Stony Brook
  • Katherine Gregory, Harvard
  • Huiying Li, UCLA
  • Catherine O'Brien, Toronto
  • Brad Warner, WU
  • Homer Twigg, Indiana U
  • Many others
Funding: NIH, Gates Foundation
Thank you to the subjects and their families