Using informatics to focus bacterial pathogenicity studies
Goal:
• Use informatic analyses to generate new testable hypotheses about pathogen protein function and pathogenicity mechanisms
• Test the hypotheses in the laboratory

Need for informatics in biology: origins
• Gramicidin S (Consden et al., 1947), partial insulin sequence (Sanger and Tuppy, 1951)
• First codon assignment, UUU/Phe (Nirenberg and Matthaei, 1961)
• 3.5 kb RNA bacteriophage MS2 (Fiers et al., 1976); 5.4 kb bacteriophage φX174 (Sanger et al., 1977)
• Early databases: Dayhoff, 1972; Erdmann, 1978

Explosion of data
• 22 of the 33 publicly available microbial genome sequences are for bacterial pathogens
• Approximately 18,000 pathogen genes with no known function!
• More than 95 bacterial pathogen genome projects in progress…

Pathogen Informatics
• Pseudomonas aeruginosa
  - Three-dimensional comparative protein modeling
  - Phylogenetic analysis of gene families
  - Other analyses: regulatory network complexity
• Pathogenomics Project
  - Detecting eukaryote:pathogen homologs
  - Detecting pathogenicity islands

Pseudomonas aeruginosa
• Found in soil, water, plants, animals
• Common cause of hospital-acquired infection: ICU patients, burn victims, cancer patients
• Almost all cystic fibrosis (CF) patients infected by age 10
• Intrinsically resistant to many antibiotics
• No vaccine

Outer membrane protein OprF
• Nonspecific porin
• Required for
  - Maintenance of cell shape
  - Growth in low-osmolarity environments
• OprF- clinical mutant with multiple antimicrobial resistance being characterized
• Adhesin in plant-colonizing Pseudomonas species
• Proposed vaccine component

Gram-Negative Cell Envelope
[Figure: cell envelope diagram labeled with pore, LPS, porin, Mg++, outer membrane, peptidoglycan, periplasm, cytoplasmic membrane]

Structure of the outer membrane protein A (OmpA) transmembrane domain
Pautsch and Schulz (1998). Nature Structural Biology 5:1013-1017
• No channel formation detected

OprF and OmpA share only 15% identity
OprF   1 -QGQNSVEIEAFGKRYFTDSVRNMKN-------ADLYGGSIGYFLTDDVELALSYGEYH
OmpA   1 APKDNTWYTGAKLGWSQYHDTGLINNNGPTHENKLGAGAFGGYQVNPYVGFEMGYDWLG
             *     *              *           *   **     *     *
OprF  52 DVRGTYETGNKKVHGNLTSLDAIYHFGTPGVGLRPYVSAGLA-HQNITNINSDSQGRQQ
OmpA  60 RMPYKGSVENGAYKAQGVQLTAKLGYPIT-DDLDIYTRLGGMVWRADTYSNVYGKNHDT
                  *         * *          *  *   *       *  *
OprF 110 MTMANIGAGLKYYFTENFFAKASLDGQYGLEKRDNGHQG--EWMAGLGVGFNFG
OmpA 118 GVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFG
                 *  *  *                            *  ***   **
(* = identical residue)

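The 15% figure can be checked directly from the alignment above. The following Python sketch counts identical residues over the columns where both sequences have a residue. The function name `percent_identity` and the choice of denominator (gapped columns excluded) are assumptions made for illustration and may differ from the convention used for the slide.

```python
# Aligned sequences copied from the slide; '-' marks a gap.
OPRF = (
    "-QGQNSVEIEAFGKRYFTDSVRNMKN-------ADLYGGSIGYFLTDDVELALSYGEYH"
    "DVRGTYETGNKKVHGNLTSLDAIYHFGTPGVGLRPYVSAGLA-HQNITNINSDSQGRQQ"
    "MTMANIGAGLKYYFTENFFAKASLDGQYGLEKRDNGHQG--EWMAGLGVGFNFG"
)
OMPA = (
    "APKDNTWYTGAKLGWSQYHDTGLINNNGPTHENKLGAGAFGGYQVNPYVGFEMGYDWLG"
    "RMPYKGSVENGAYKAQGVQLTAKLGYPIT-DDLDIYTRLGGMVWRADTYSNVYGKNHDT"
    "GVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFG"
)


def percent_identity(a, b):
    """Return (identical positions, aligned residue pairs, percent identity).

    Assumption: columns containing a gap in either sequence are excluded
    from the denominator.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    matches = sum(1 for x, y in pairs if x == y)
    return matches, len(pairs), 100.0 * matches / len(pairs)


if __name__ == "__main__":
    matches, pairs, pct = percent_identity(OPRF, OMPA)
    print(f"{matches} identities over {pairs} aligned pairs = {pct:.1f}%")
```

With these sequences the function finds 25 identical positions in 160 aligned residue pairs (about 15.6%). Dividing by all 172 alignment columns instead gives about 14.5%. Both are consistent with the 15% quoted on the slide.
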
Model of the N-terminus of OprF based on OmpA
Brinkman, Bains and Hancock (2000). Journal of Bacteriology 182:5251-5255

OprF model (yellow and green) aligned with the crystal structure of OmpA (blue)
• Many residues are in the same three-dimensional environment, though on different strands

OprF and OmpA similarity
OprF   1 -QGQNSVEIEAFGKRYFTDSVRNMKN-------ADLYGGSIGYFLTDDVELALSYGEYH
OmpA   1 APKDNTWYTGAKLGWSQYHDTGLINNNGPTHENKLGAGAFGGYQVNPYVGFEMGYDWLG
             *     *              *           *   **     *     *
OprF  52 DVRGTYETGNKKVHGNLTSLDAIYHFGTPGVGLRPYVSAGLA-HQNITNINSDSQGRQQ
OmpA  60 RMPYKGSVENGAYKAQGVQLTAKLGYPIT-DDLDIYTRLGGMVWRADTYSNVYGKNHDT
                  *         * *          *  *   *       *  *
OprF 110 MTMANIGAGLKYYFTENFFAKASLDGQYGLEKRDNGHQG--EWMAGLGVGFNFG
OmpA 118 GVSPVFAGGVEYAITPEIATRLEYQWTNNIGDAHTIGTRPDNGMLSLGVSYRFG
                 *  *  *                            *  ***   **
(* = identical residue)

Residues implicated in blocking channel formation in OmpA are not conserved in OprF
Planar Lipid Bilayer Apparatus
[Figure: schematic labeled with voltage source, current amplifier, protein, planar bilayer membrane, bathing solution]

The N-terminus of OprF forms channels in a lipid bilayer membrane
Upstream of oprF is a probable sigma factor gene, sigX
[Figure: gene map of sigX and oprF, with promoter and transcription terminator marked]

Disruption of sigX reduces expression of OprF
[Figure: gel with samples from P. aeruginosa and P. fluorescens; lanes listed below]
• Marker
• Wildtype
• sigX- mutant
• oprF- mutant

[Figure: the sigX-oprF locus drawn for two cases, "No SigX expression" and "SigX expression"]

Percent Regulators as a Function of Genome Size
[Figure: scatter plot; x-axis: Number of Genes (0 to 7000); y-axis: Regulators (%) (0 to 10); points numbered 1 to 13, grouped as "Specialized environments" or "Free-living"]
Genomes represented: 1, Mycoplasma genitalium; 2, Chlamydia trachomatis; 3, Treponema pallidum; 4, Borrelia burgdorferi; 5, Chlamydia pneumoniae; 6, Helicobacter pylori ---; 7, Helicobacter pylori ---; 8, Haemophilus influenzae; 9, Neisseria meningitidis; 10, Mycobacterium tuberculosis; 11, Bacillus subtilis; 12, Escherichia coli; 13, Pseudomonas aeruginosa.

P. aeruginosa Genome Sequence Analysis: Outer Membrane Proteins (OMPs)
Approximately 150 OMPs predicted, including three large paralogous families:
• OprM family of putative efflux and Type I secretion proteins (18 members)
• OprD family of putative amino acid, peptide and aromatic compound transporters (19 members)
• TonB family of putative iron-siderophore receptors (34 members)

OprM Family
[Figure: phylogenetic tree with groups annotated "Multidrug Efflux?" and "Protein Secretion?"; scale bar 0.1; taxa: OprJ, OprM, OpmJ, OpmB, OpmA, OpmG, OpmE, OpmI, OprN, OpmD, OpmQ, AprF, OpmM, OpmN, OpmH, TolC, OpmK, OpmL, OpmF]

Future Developments
• Modeling of other outer membrane proteins in Neisseria species
• Developing better algorithms for secondary structure prediction

Pathogenomics
Goal: Identify previously unrecognized mechanisms of microbial pathogenicity using a unique combination of informatics, evolutionary biology, microbiology and genetics.

Pathogenicity
• Processes of microbial pathogenicity at the molecular level are still minimally understood
• Pathogen proteins have been identified that manipulate host cells by interacting with, or mimicking, host proteins
• Idea: Could we identify novel virulence factors by identifying pathogen genes that are more similar to host genes than phylogeny would predict?

Eukaryotic-like pathogen genes
- YopH, a protein-tyrosine phosphatase, of Yersinia pestis
- Enoyl-acyl carrier protein reductase (involved in lipid metabolism) of Chlamydia trachomatis
[Figure: phylogenetic tree with bootstrap values (52 to 100); scale bar 0.1; taxa: Aquifex aeolicus, Haemophilus influenzae, Escherichia coli, Anabaena, Synechocystis, Chlamydia trachomatis, Petunia x hybrida, Nicotiana tabacum, Brassica napus, Arabidopsis thaliana, Oryza sativa]

Pathogens
Anthrax, Cat scratch disease, Chancroid, Chlamydia, Cholera, Dental caries, Diarrhea (E. coli etc.), Diphtheria, Epidemic typhus, Mediterranean fever, Gastroenteritis, Gonorrhea, Legionnaires' disease, Leprosy, Leptospirosis, Listeriosis, Lyme disease, Melioidosis, Meningitis, Necrotizing fasciitis, Paratyphoid/enteric fever, Peptic ulcers and gastritis, Periodontal disease, Plague, Pneumonia, Salmonellosis, Scarlet fever, Shigellosis, Strep throat, Syphilis, Toxic shock syndrome, Tuberculosis, Tularemia, Typhoid fever, Urethritis, Urinary tract infections, Whooping cough, Hospital-acquired infections

Pathogens
• Chlamydophila psittaci: respiratory disease, primarily in birds
• Mycoplasma mycoides: contagious bovine pleuropneumonia
• Mycoplasma hyopneumoniae: pneumonia in pigs
• Pasteurella haemolytica: cattle shipping fever
• Pasteurella multocida: cattle septicemia, pig rhinitis
• Ralstonia solanacearum: plant bacterial wilt
• Xanthomonas citri: citrus canker
• Xylella fastidiosa: citrus variegated chlorosis
Bacterial wilt

Interdisciplinary group
• Informatics/Bioinformatics
  - BC Genome Sequence Centre
  - Centre for Molecular Medicine and Therapeutics
• Evolutionary Theory
  - Dept. of Zoology
  - Dept. of Botany
  - Canadian Institute for Advanced Research
• Pathogen Functions
  - Dept. of Microbiology
  - Biotechnology Laboratory
  - Dept. of Medicine
  - BC Centre for Disease Control
• Host Functions
  - Dept. of Medical Genetics
  - C. elegans Reverse Genetics Facility
  - Dept. of Biological Sciences, SFU

Approach
• Screen for candidate genes: search pathogen genes against sequence databases and identify those with eukaryotic similarity/motifs
• Rank candidates
  - How much like host protein?
  - Info available about protein?
• Modify screening method/algorithm
• Evolutionary significance
  - Horizontal transfer?
  - Similar by chance?
• Prioritize for biological study
  - Previously studied biologically?
  - Can UBC microbiologists study it?
  - C. elegans homolog?

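The screening and ranking steps above can be sketched in Python as follows. Everything concrete here is a hypothetical illustration: the `Hit` record, the use of BLAST-style bit scores, the example gene names and scores, and the ranking rule (best eukaryotic score divided by best bacterial score) are assumptions, not the project's actual algorithm.

```python
from typing import Dict, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One database hit for a pathogen gene (hypothetical record layout)."""
    gene: str         # pathogen query gene
    kingdom: str      # "Eukaryota" or "Bacteria"; self-hits assumed removed
    bit_score: float  # similarity score, higher means more similar


def rank_candidates(hits: List[Hit], min_ratio: float = 1.0) -> List[Tuple[str, float]]:
    """Rank genes whose best eukaryotic hit outscores their best bacterial hit.

    Assumed rule: ratio = best eukaryotic score / best bacterial score.
    Genes with a eukaryotic hit and no bacterial hit get an infinite ratio.
    """
    best: Dict[str, Dict[str, float]] = {}
    for h in hits:
        per_gene = best.setdefault(h.gene, {})
        per_gene[h.kingdom] = max(per_gene.get(h.kingdom, 0.0), h.bit_score)

    ranked = []
    for gene, scores in best.items():
        euk = scores.get("Eukaryota", 0.0)
        bac = scores.get("Bacteria", 0.0)
        if euk == 0.0:
            continue  # no eukaryotic similarity at all: not a candidate
        ratio = euk / bac if bac > 0 else float("inf")
        if ratio > min_ratio:
            ranked.append((gene, ratio))
    # Most eukaryote-like candidates first
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Made-up scores for three made-up genes
EXAMPLE_HITS = [
    Hit("geneA", "Eukaryota", 300.0),
    Hit("geneA", "Bacteria", 150.0),
    Hit("geneB", "Eukaryota", 80.0),
    Hit("geneB", "Bacteria", 400.0),
    Hit("geneC", "Eukaryota", 120.0),
]

if __name__ == "__main__":
    for gene, ratio in rank_candidates(EXAMPLE_HITS):
        print(gene, ratio)
```

A score ratio is only a first-pass filter. The later steps on the slide (checking for horizontal transfer versus chance similarity, and prioritizing for biological study) would still need phylogenetic analysis and manual review.
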
Bacterium → Eukaryote Horizontal Transfer
[Figure: phylogenetic tree; scale bar 0.1; taxa: Bacillus subtilis, Escherichia coli, Salmonella typhimurium, Staphylococcus aureus, Clostridium perfringens, Clostridium difficile, Trichomonas vaginalis, Haemophilus influenzae, Actinobacillus actinomycetemcomitans, Pasteurella multocida]
N-acetylneuraminate lyase (NanA) of the protozoan Trichomonas vaginalis is 92-95% similar to NanA of Pasteurellaceae bacteria.

N-acetylneuraminate lyase: role in pathogenicity?
• Pasteurellaceae
  - Mucosal pathogens of the respiratory tract
• T. vaginalis
  - Mucosal pathogen, causative agent of the STD trichomoniasis

N-acetylneuraminate lyase (sialic acid lyase, NanA)
• Pathway: glycoproteins, glycolipids → (sialidase) → free sialic acid → (transporter) → free sialic acid → (NanA) → N-acetyl-D-mannosamine + pyruvate
• Sialidase: hydrolysis of glycosidic linkages of terminal sialic residues in glycoproteins and glycolipids
• NanA is involved in sialic acid metabolism
• Role in bacteria: proposed to parasitize the mucous membranes of animals for nutritional purposes
• Role in Trichomonas: ?

Eukaryote → Bacteria Horizontal Transfer?
[Figure: phylogenetic tree; scale bar 0.1; taxa: Rat, Human, Escherichia coli, Caenorhabditis elegans, Pig roundworm, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Bacillus subtilis, Streptococcus pyogenes, Aquifex aeolicus, Acinetobacter calcoaceticus, Haemophilus influenzae, Chlorobium vibrioforme]
• GMP reductase of E. coli is 81% similar to the corresponding enzyme studied in humans and rats
• Role in virulence not yet investigated

Eukaryote → Bacteria Horizontal Transfer?
[Figure: phylogenetic tree; taxa: Hypocrea jecorina EGLII, Trichoderma viride EGL2, Penicillium janthinellum EGL2, Macrophomina phaseolina EGL2, Cryptococcus flavus CMC1, Ralstonia solanacearum egl, Humicola insolens CMC3, Humicola grisea CMC3, Aspergillus aculeatus CMC2, Aspergillus nidulans EGLA, Macrophomina phaseolina egl1, Aspergillus aculeatus CEL1, Aspergillus niger EGLB, Vibrio species manA]
• Ralstonia solanacearum cellulase (endo-1,4-beta-glucanase) is 56% similar to endoglucanases present in a number of fungi
• Demonstrated virulence factor for plant bacterial wilt

Functional studies
• Prioritized candidates
• Study function of gene
• Investigate role of bacterial gene in disease: infection study in model host
• Study function of similar gene in model host, C. elegans
• Contact other groups for possible collaborations
[Flowchart labels: C. elegans database, World Research Community]

Pathogenicity Islands
• Virulence genes commonly in clusters
• Associated with
  - tRNA sequences
  - Transposases, integrases and other mobility genes
• Flanked by repeats

G+C Analysis: Identifying Pathogenicity Islands
[Figure legend; the symbol for each entry is not preserved]
• Yellow circle = high %G+C
• Pink circle = low %G+C
• tRNA gene lies between the two dots
• rRNA gene lies between the two dots
• Both tRNA and rRNA genes lie between the two dots
• Dot is annotated as a transposase
• Dot is annotated as an integrase

Neisseria meningitidis serogroup B strain MC58
Mean %G+C: 51.37   STD DEV: 7.57

%G+C   SD  Location            Strand  Product
37.22  -1  1831577..1832527    +       pilin gene inverting
39.95  -1  1834676..1835113    +       VapD-related
51.96      1835110..1835211    -       cryptic plasmid A-related
39.13  -1  1835357..1835701    +       hypothetical
40.00  -1  1836009..1836203    +       hypothetical
42.86  -1  1836558..1836788    +       hypothetical
34.74  -2  1837037..1837249    +       hypothetical
43.96      1837432..1838796    +       conserved hypothetical
40.83  -1  1839157..1839663    +       conserved hypothetical
42.34  -1  1839826..1841079    +       conserved hypothetical
47.99      1841404..1843191    -       put. hemolysin activ. HecB
45.32      1843246..1843704    -       put. toxin-activating
37.14  -1  1843870..1844184    -       hypothetical
31.67  -2  1844196..1844495    -       hypothetical
37.57  -1  1844476..1845489    -       hypothetical
20.38  -2  1845558..1845974    -       hypothetical
45.69      1845978..1853522    -       hemagglutinin/hemolysin-rel.
51.35      1854101..1855066    +       transposase, IS30 family

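The SD column in the table can be reproduced from the %G+C values and the genome-wide mean and standard deviation. The Python sketch below assumes that "-1" marks an ORF more than one standard deviation below the mean and "-2" more than two. That rule is inferred from the table rather than stated on the slide. The symmetric handling of high-%G+C ORFs and the helper names `gc_percent` and `sd_flag` are illustrative assumptions.

```python
def gc_percent(seq: str) -> float:
    """Percent G+C of a nucleotide sequence, ignoring non-ACGT characters."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * (seq.count("G") + seq.count("C")) / total


def sd_flag(gc: float, mean: float, sd: float) -> int:
    """Return -2, -1, 0, 1 or 2 by how many SDs the value lies from the mean.

    Assumption: flags are capped at +/-2, matching the values seen in the table.
    """
    z = (gc - mean) / sd
    if z <= -2:
        return -2
    if z <= -1:
        return -1
    if z >= 2:
        return 2
    if z >= 1:
        return 1
    return 0


# Genome-wide statistics and per-ORF %G+C values, copied from the table
MEAN_GC, SD_GC = 51.37, 7.57
ORF_GC = [37.22, 39.95, 51.96, 39.13, 40.00, 42.86, 34.74, 43.96, 40.83,
          42.34, 47.99, 45.32, 37.14, 31.67, 37.57, 20.38, 45.69, 51.35]

FLAGS = [sd_flag(gc, MEAN_GC, SD_GC) for gc in ORF_GC]

if __name__ == "__main__":
    for gc, flag in zip(ORF_GC, FLAGS):
        print(f"{gc:6.2f}  {flag if flag else ''}")
```

Under this rule the thresholds are 43.80% (mean minus one SD) and 36.23% (mean minus two SDs), and the computed flags agree with every row of the table. A run of adjacent flagged ORFs next to a mobility gene, such as the IS30-family transposase in the last row, is the kind of pattern the analysis is meant to surface.
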
%G+C of ORFs: Analysis of Variance
• %G+C variance is similar within a given species
• Low %G+C variance correlates with an intracellular lifestyle for the bacterium and a clonal nature (P = 0.004)
• Neisseria meningitidis +/- 7%
• Chlamydia species +/- 2%
• Intracellular bacteria ecologically isolated?

Future Developments
• Identify eukaryotic motifs and domains in pathogen genes
• Identify further motifs associated with
  - Pathogenicity islands
  - Virulence determinants
• Functional tests for new potential virulence factors
• www.pathogenomics.bc.ca

Informatics as a focus
• Outer membrane protein modeling: focus mutational studies and studies of surface-exposed sequences
• Phylogenetic analyses: focus study of gene mutants under certain environmental conditions
• Other analyses (regulatory network complexity): change focus of regulation studies
• Eukaryote:pathogen homologs: focus identification of "mimics"
• Pathogenicity islands: focus identification of recently obtained virulence determinants

Acknowledgements
• Pathogenomics group: Ann Rose, Steven Jones, Ivan Wan, Hans Greberg, Yossef Av-Gay, David Baillie, Bob Brunham, Stefanie Butland, Rachel Fernandez, Brett Finlay, Patrick Keeling, Audrey de Koning, Sarah Otto, Francis Ouellette, Peter Wall Institute
• Pseudomonas Genome Project: PathoGenesis Corp. (Ken Stover) and University of Washington (Maynard Olson)
• Outer membrane proteins: Manjeet Bains, Kendy Wong, Canadian Cystic Fibrosis Foundation
• Bob Hancock