This article discusses the concept of mobilome and the challenges in analyzing its components. It highlights the importance of studying the mobilome in understanding microbial pathogenesis and epidemiological context. The article also explores different approaches for analyzing the mobilome and emphasizes the significance of MGEs in enhancing pathogenicity.
The Challenges of the Mobilome Alicia Vachon MMIC 7050 October 1st, 2019
Summary • What is a mobilome? • Analysis of the mobilome • Challenges in analysis • Importance of the mobilome in analysis: • Microbial pathogenesis understanding • Epidemiological context • Take home messages (Stallard, https://www.mskcc.org/blog/jumping-genes-and-dark-genome-msk-researchers-gain-new-insight-childhood)
What is a mobilome? • Consists of all mobile genetic elements (MGEs) in a cell, e.g. genomic islands (GIs), pathogenicity islands and resistance islands • 3 classes of MGEs in the mobilome: • Transposons and integrons • Plasmids • Phages • Transfer of MGEs allows acquisition of accessory genes, which can augment the pathogenicity of an organism.
Analysis of the mobilome (1) • Why do we study the mobilome? • To explain why some members of a species are pathogenic and others are not • To explain why some strains can survive in harsh living conditions • To understand functionalities of bacteria and genome evolution
Analysis of the mobilome (2) • Genomic island associated features: • Containing mobility genes (integrases, transposases) • Containing virulence genes • Containing phage-related genes • Flanking tRNA genes • Containing insertion sequences • Flanking direct repeat sequences • Containing hypothetical proteins • Having their own sequence signature • G+C content and GC-skew • k-mer frequency • Codon usage
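Two of the signature features listed above, G+C content and GC-skew, are simple to compute over sliding windows. The following is a minimal sketch, not the method of any particular tool: the function names, the toy sequence and the window sizes are illustrative assumptions, and GC-skew is taken as (G - C) / (G + C).

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq):
    """GC-skew, defined here as (G - C) / (G + C); 0.0 if there is no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def window_stats(seq, window=8, step=4):
    """Return (start, gc_content, gc_skew) for each full window along seq."""
    stats = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        stats.append((start, gc_content(chunk), gc_skew(chunk)))
    return stats


# Toy "genome": an AT-rich backbone with a GC-rich insert in the middle.
genome = "ATATATAT" + "GGGCGGGC" + "ATATATAT"
stats = window_stats(genome, window=8, step=8)
for start, gc, skew in stats:
    print(start, gc, skew)
```

A window whose values deviate strongly from the genome-wide average is only a candidate island; the other features on this slide (mobility genes, flanking tRNAs, repeats) would still need to be checked.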
Analysis of the mobilome (3) • 2 sequence-based approaches: • Sequence composition approach • Comparative genomics analysis N.B. A metagenomic approach applies these sequence-based approaches to a community of organisms.
1. Sequence composition approach (1) • Rests on the theory that all genetic elements native to a host share the same sequence signature, so horizontally acquired DNA stands out. Let’s get technical… • Hidden Markov Model (HMM): • Considers a sequence as the result of a random process that progresses through states hidden from the observer (Przytycka & Zheng) • Each hidden state emits a symbol (nucleotide); the emissions are observed, the transitions between states are hidden • Used to predict a change point in the genome • Also used for codon usage prediction • Interpolated variable order motifs (IVOM) and finding k-mers: • For k-mers of size k, there are 4^k possible k-mers. • k-mers of too high an order will give frequencies of zero. • IVOM uses variable orders of k-mers where 1 ≤ k ≤ 8. • Signatures used: G+C content and GC-skew, k-mer frequency, codon usage
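The k-mer point above can be made concrete: there are 4^k possible k-mers, so in any finite sequence most high-order k-mers are never observed. The sketch below only counts k-mers and reports the unseen fraction for 1 ≤ k ≤ 8; it does not implement the IVOM weighting itself, and the function names and toy sequence are assumptions made for illustration.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count overlapping k-mers in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def unseen_fraction(seq, k):
    """Fraction of the 4**k possible DNA k-mers that never occur in seq."""
    possible = 4 ** k
    observed = sum(1 for kmer in count_kmers(seq, k) if set(kmer) <= set("ACGT"))
    return 1 - observed / possible


# Toy 64-nucleotide sequence; real genomes are far longer but show the same trend.
seq = "ACGTACGTTGCAAGCT" * 4
for k in range(1, 9):
    print(k, 4 ** k, round(unseen_fraction(seq, k), 3))
```

Low orders are well sampled but carry little information, while high orders are informative but sparse, which is the motivation for combining several orders rather than fixing one k.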
1. Sequence composition approach (2) Programs using the sequence composition approach: • AlienHunter: predicts frequency of k-mers with interpolated variable order motifs (IVOM); uses an HMM to predict the change point (genome vs. alien) • SIGI-HMM: pattern recognition on codon usage using an HMM; uses a score system to reflect codon usage and relate a gene to the island donor, where f(G2(codon)) is the frequency of a codon in a species Advantages: Does not require a database, high sensitivity Disadvantages: High false positive rate, amelioration of MGEs can mislead the analysis
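The change-point idea behind the HMM-based programs above can be sketched with a two-state model decoded by the Viterbi algorithm. This is a toy illustration, not the model used by AlienHunter or SIGI-HMM: the state names, the start, transition and emission probabilities, and the single-nucleotide emissions are all hypothetical values chosen so that the example is easy to follow.

```python
import math

# Hypothetical two-state model: "host" is AT-rich, "alien" is GC-rich.
STATES = ("host", "alien")
START = {"host": 0.9, "alien": 0.1}
TRANS = {
    "host": {"host": 0.95, "alien": 0.05},
    "alien": {"host": 0.05, "alien": 0.95},
}
EMIT = {
    "host": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "alien": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def viterbi(seq):
    """Return the most probable hidden-state path for seq (log-space Viterbi)."""
    seq = seq.upper()
    if not seq:
        return []
    scores = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for symbol in seq[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best previous state to have come from when entering state s.
            best_prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (scores[best_prev] + math.log(TRANS[best_prev][s])
                             + math.log(EMIT[s][symbol]))
            pointers[s] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def change_points(path):
    """Positions where the decoded state differs from the previous position."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


# AT-rich backbone with a GC-rich block inserted at positions 24..47.
seq = "ATATTAAT" * 3 + "GCGGCCGC" * 3 + "TAATATTA" * 3
path = viterbi(seq)
print(change_points(path))
```

The low switching probability is what keeps the decoded path from flipping state on every stray G or C; it also illustrates the amelioration problem noted above, since an island whose composition has drifted toward the host emits symbols the "host" state explains almost as well.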
2. Comparative genomics approach (1) • 3 steps: • Collecting genomes related to the query genome • Aligning the genomes • Comparing the genomes to find gene segments that are present in the query genome and not in the others (= island regions) (Che et al., 2014)
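The comparison step can be sketched on a heavily simplified representation. The example below assumes each genome has already been reduced to an ordered list of gene identifiers, so the alignment step is skipped entirely; the gene names, the `min_genes` cut-off and the function name are hypothetical and serve only to illustrate "present in the query, absent from the others".

```python
def candidate_islands(query_genes, reference_genomes, min_genes=2):
    """Group consecutive query genes that are absent from every reference genome.

    query_genes: gene identifiers in chromosomal order.
    reference_genomes: iterable of gene-identifier collections, one per genome.
    min_genes: hypothetical cut-off; shorter runs are ignored.
    """
    present_elsewhere = set().union(*reference_genomes)
    islands, current = [], []
    for gene in query_genes:
        if gene not in present_elsewhere:
            current.append(gene)
        else:
            if len(current) >= min_genes:
                islands.append(current)
            current = []
    if len(current) >= min_genes:
        islands.append(current)
    return islands


# Hypothetical gene orders for a query strain and two related reference strains.
query = ["dnaA", "gyrB", "int", "vir1", "vir2", "tnpA", "recA", "hypX", "rpoB"]
references = [
    ["dnaA", "gyrB", "recA", "rpoB"],
    ["dnaA", "gyrB", "recA", "hypX", "rpoB"],
]
print(candidate_islands(query, references))
```

Real pipelines work on aligned nucleotide sequence rather than gene names, so this sketch shows only the set logic of the final step.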
2. Comparative genomics approach (2) • IslandPick: • CVTree: uses a distance function to measure relatedness of reference genomes • Calculates frequencies of overlapping oligos and subtracts random background using a Markov chain • All possible sequences of length k are ordered in a composition vector and the correlation between 2 vectors (species A vs. species B) is calculated. • Construction of a Neighbor-Joining tree to find the best reference genome • Mauve: pairwise whole genome alignment to determine island regions Advantages: It is easy to see differences in closely related genomes and identify island regions, CVTree does not use multiple sequence alignment Disadvantages: A database of related genomes is needed, computational tools doing multiple sequence alignment may need manual adjustments
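The composition-vector idea described above can be sketched as follows. This is an illustrative reading of the slide, not the CVTree implementation: it works on short DNA strings with k = 3, estimates the background for each k-mer from its (k-1)- and (k-2)-mer frequencies, and converts the cosine correlation C into a distance (1 - C) / 2. The function names and toy sequences are assumptions.

```python
import math
from collections import Counter
from itertools import product


def frequencies(seq, k):
    """Relative frequencies of overlapping k-mers in seq."""
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: n / total for kmer, n in counts.items()}


def composition_vector(seq, k=3, alphabet="ACGT"):
    """Background-subtracted k-mer vector, ordered over all possible k-mers (k >= 3)."""
    seq = seq.upper()
    f_k = frequencies(seq, k)
    f_k1 = frequencies(seq, k - 1)
    f_k2 = frequencies(seq, k - 2)
    vector = []
    for letters in product(alphabet, repeat=k):
        kmer = "".join(letters)
        middle = f_k2.get(kmer[1:-1], 0.0)
        # Markov-chain estimate of the frequency expected by chance.
        expected = (f_k1.get(kmer[:-1], 0.0) * f_k1.get(kmer[1:], 0.0) / middle
                    if middle else 0.0)
        observed = f_k.get(kmer, 0.0)
        vector.append((observed - expected) / expected if expected else 0.0)
    return vector


def correlation(u, v):
    """Cosine of the angle between two composition vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cv_distance(seq_a, seq_b, k=3):
    """Distance in [0, 1]: 0 for identical composition vectors."""
    c = correlation(composition_vector(seq_a, k), composition_vector(seq_b, k))
    return (1 - c) / 2


species_a = "ACGTTGCAAGCTTAGGCATCGA"
species_b = "ATATTAGCGCGCATTATAGCGC"
print(cv_distance(species_a, species_a), cv_distance(species_a, species_b))
```

A matrix of such pairwise distances is what a Neighbor-Joining step would take as input when choosing reference genomes; no alignment is needed to compute it.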
Challenges in analysis • No systematic evaluation of all these computational methods • Sequence-based approaches can’t catch all MGEs • PCR-based methods can cause recombination • Short-read sequencing may not capture the whole picture • Sequence-based analysis does not infer functionality
Mobilome importance in microbial pathogenesis • Pathogenicity islands: regions of the genome in pathogenic bacteria that are horizontally transferred. (Che et al.) • MGEs are the main vehicle for antimicrobial resistance (AMR) genes and can improve fitness. • Characterization of numerous MGEs helps in the discovery of new gene combinations in emerging pathogens. • Example: Legionella pathogenicity and diversity are due to the high number of MGEs in the genome. (Gomez-Valero et al.) • HOWEVER, “pathogenicity cannot yet be deduced algorithmically from genomic data.” –Dr. M. Graham
Mobilome importance in epidemiology • Discovery and characterization of the mobilome allows better prediction of gene movement in emerging pathogens. • Example: Monitoring of carbapenem-resistant Enterobacteriaceae to direct hospitals (Pecora et al.) • Maximum-likelihood tree built to analyze similarity between strains • Mobilome provides a timeline of entry of AMR genes into hospitals • SNV analysis allows us to know if a plasmid was transferred or maintained in a clonal strain line. (Pecora et al., 2015)
Take home messages • The mobilome is an important part of the genome and its analysis is useful for the discovery of emerging pathological traits and explaining pathogenicity as well as epidemiological surveillance and response. • The mobilome can be analyzed by sequence-based methods using a sequence composition approach or a comparative genomics approach. • Current sequencing technologies limit our abilities to analyze and use mobilomic data. • Mobilomics is a promising and beneficial field!
References
(1) Siefert, J. L. Defining the Mobilome. Methods Mol. Biol. 2009, 532, 13–27.
(2) Dahlberg, C.; Bergström, M.; Andreasen, M.; Christensen, B. B.; Molin, S.; Hermansson, M. Interspecies Bacterial Conjugation by Plasmids from Marine Environments Visualized by Gfp Expression. Mol. Biol. Evol. 1998, 15 (4), 385–390.
(3) Che, D.; Hasan, M. S.; Chen, B. Identifying Pathogenicity Islands in Bacterial Pathogenomics Using Computational Approaches. Pathogens 2014, 3, 36–56.
(4) Jørgensen, T. S.; Kiil, A. S.; Hansen, M. A.; Sørensen, S. J.; Hansen, L. H. Current Strategies for Mobilome Research. Front. Microbiol. 2015, 5 (January), 1–6.
(5) Vernikos, G. S.; Parkhill, J. Interpolated Variable Order Motifs for Identification of Horizontally Acquired DNA: Revisiting the Salmonella Pathogenicity Islands. Bioinformatics 2006, 22 (18), 2196–2203.
(6) Przytycka, T. M.; Zheng, J. Hidden Markov Models. Encycl. Life Sci.
(7) Merkl, R. SIGI: Score-Based Identification of Genomic Islands. BMC Bioinformatics 2004, 5, 1–14.
(8) Qi, J.; Luo, H.; Hao, B. CVTree: A Phylogenetic Tree Reconstruction Tool Based on Whole Genomes. Nucleic Acids Res. 2004, 32 (Web Server issue), 45–47.
(9) Chu, K. H.; Qi, J.; Yu, Z. G.; Anh, V. Origin and Phylogeny of Chloroplasts Revealed by a Simple Correlation Analysis of Complete Genomes. Mol. Biol. Evol. 2004, 21 (1), 200–206.
(10) Schoeniger, J. S.; Hudson, C. M.; Bent, Z. W.; Sinha, A.; Williams, K. P. Experimental Single-Strain Mobilomics Reveals Events That Shape Pathogen Emergence. Nucleic Acids Res. 2016, 44 (14), 6830–6839.
(11) Asante, J.; Sekyere, J. O. Minireview Understanding Antimicrobial Discovery and Resistance from a Metagenomic and Metatranscriptomic Perspective: Advances and Applications. Environ. Microbiol. Rep. 2019, 11 (2), 62–86.
(12) Gomez-Valero, L.; Rusniok, C.; Rolando, M.; Neou, M.; Dervins-Ravault, D.; Demirtas, J.; Rouy, Z.; Moore, R. J.; Chen, H.; Petty, N. K.; et al. Comparative Analyses of Legionella Species Identifies Genetic Features of Strains Causing Legionnaires' Disease. Genome Biol. 2014, 15 (505), 1–21.
(13) Zink, S. D.; Pedersen, L.; Cianciotto, N. P.; Abu Kwaik, Y. The Dot/Icm Type IV Secretion System of Legionella pneumophila Is Essential for the Induction of Apoptosis in Human Macrophages. Infect. Immun. 2002, 70 (3), 1657–1663.
(14) Pecora, N. D.; Li, N.; Allard, M.; Li, C.; Albano, E.; Delaney, M.; Dubois, A.; Onderdonk, A. B.; Bry, L. Genomically Informed Surveillance for Carbapenem-Resistant Enterobacteriaceae in a Health Care System. MBio 2015, 6 (4), 1–11.