Evolution of regulatory interactions in bacteria. Mikhail Gelfand, Institute for Information Transmission Problems, RAS. 4th Bertinoro Computational Biology (BCB) Meeting “Evolution of and Comparative Approaches to Gene Regulation”, 24-30 June 2006.
Это – ряд наблюдений. В углу – тепло. Взгляд оставляет на вещи след. Вода представляет собой стекло. Человек страшней, чем его скелет. (Иосиф Бродский)
A list of some observations. In a corner, it’s warm. A glance leaves an imprint on anything it’s dwelt on. Water is glass’s most public form. Man is more frightening than its skeleton. (Joseph Brodsky)
Plan • Evolution of individual sites • Coevolution of transcription factors and their binding signals • Distribution of transcription factor families in various genomes • Evolution of simple and complex regulatory systems
Birth and death of sites is a very dynamic process: NadR-binding sites upstream of pnuB seem to be absent in Klebsiella pneumoniae and Serratia marcescens
Loss of regulators and cryptic sites: loss of RbsR in Y. pestis (the ABC transporter is also lost). Figure labels: RbsR-binding site; start codon of rbsD
Unexpected conservation of non-consensus positions in orthologous sites. Regulatory site of LexA upstream of lexA (consensus nucleotides are in caps). Wrong consensus?
TF PurR, gene purL. TF PurR, gene purM.
Non-consensus positions are more conserved than synonymous codon positions
Relative conservation of non-consensus nucleotides may be higher than conservation of consensus nucleotides
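One way to make this comparison concrete is to split the positions of a site into consensus and non-consensus classes and count how often each class is identical across orthologous sites. The sketch below is only an illustration: the function name, the toy sites and the rule "conserved means identical in all orthologs" are assumptions, not the procedure used for the slides.

```python
def conservation_by_class(sites, consensus):
    """Fraction of conserved positions among consensus and non-consensus positions.

    sites: equal-length orthologous sites, one per genome (toy input, an assumption).
    consensus: consensus of the signal, same length as the sites.
    A position is classed by the first site (the reference): 'consensus' if the
    reference nucleotide matches the consensus letter, otherwise 'non-consensus'.
    A position counts as conserved when every site has the same nucleotide there.
    """
    length = len(consensus)
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length as the consensus")
    reference = sites[0].upper()
    stats = {"consensus": [0, 0], "non-consensus": [0, 0]}  # [conserved, total]
    for i in range(length):
        column = {s[i].upper() for s in sites}
        kind = "consensus" if reference[i] == consensus[i].upper() else "non-consensus"
        stats[kind][1] += 1
        if len(column) == 1:
            stats[kind][0] += 1
    # None marks a class with no positions at all
    return {k: (c / t if t else None) for k, (c, t) in stats.items()}


# Hypothetical orthologous sites and consensus, for illustration only
example = conservation_by_class(["TGTCA", "TGTCA", "TGACA"], "TGTGA")
print(example)
```

In this toy input the single non-consensus position is identical in all three sites, while one of the four consensus positions varies.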
Regulators and their signals • Subtle changes at close evolutionary distances • Cases of signal conservation at surprisingly large distances • Changes in spacing / geometry of dimers • Correlation between contacting nucleotides and amino acid residues
The LacI family: subtle changes in signals at close distances G n A CG Gn GC
NrdR (regulator of ribonucleotide reductases and some other replication-related genes): conservation at large distances
BirA (biotin regulator in eubacteria and archaea): conserved signal, changed spacing. Profile 1: Gram-positive bacteria, Archaea. Profile 2: Gram-negative bacteria.
DNA signals and protein-DNA interactions: entropy at aligned sites and the number of contacts (heavy atoms in a base pair at a distance < cutoff from a protein atom). Panels: CRP, PurR, IHF, TrpR
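The entropy half of this comparison can be sketched as the Shannon entropy of each column in a set of aligned binding sites. This is a minimal illustration that assumes ungapped, equal-length sites and no pseudocounts. The function name and the toy sites are hypothetical, and the contact counts from the structures are not modelled.

```python
import math
from collections import Counter


def column_entropy(sites):
    """Shannon entropy (in bits) of every column of equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned to the same length")
    entropies = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in sites)
        total = sum(counts.values())
        # p * log2(1/p) summed over the nucleotides observed in this column
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


# Toy alignment: the first column is invariant, the second is uniform over A, C, G, T
print(column_entropy(["AC", "AG", "AT", "AA"]))
```

A column with low entropy is strongly constrained. Comparing these values with per-position contact counts would follow the comparison named on the slide.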
Specificity-determining positions (SDPs) in the LacI family • Training set: 459 sequences, average length 338 amino acids, 85 specificity groups; 44 SDPs • 10 residues contact NPF (analog of the effector) • 7 residues in the effector contact zone (5 Å < dmin < 10 Å) • 6 residues in the intersubunit contacts • 5 residues in the intersubunit contact zone (5 Å < dmin < 10 Å) • 7 residues contact the operator sequence • 6 residues in the operator contact zone (5 Å < dmin < 10 Å) • Structure shown: LacI from E. coli
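The slide does not say how the SDPs were selected. One common way to score an alignment column against a partition into specificity groups is the mutual information between the residue and the group label. The sketch below shows that score only, under that assumption. The function name and toy data are hypothetical, and no significance test or correction for small samples is included.

```python
import math
from collections import Counter


def mutual_information(column, groups):
    """Mutual information (bits) between residues in one column and group labels.

    column: residues of one alignment column, one per sequence.
    groups: specificity-group label of each sequence, in the same order.
    """
    if len(column) != len(groups):
        raise ValueError("one group label is needed per sequence")
    n = len(column)
    joint = Counter(zip(column, groups))
    residues = Counter(column)
    labels = Counter(groups)
    mi = 0.0
    for (res, grp), count in joint.items():
        p_joint = count / n
        p_indep = (residues[res] / n) * (labels[grp] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi


# Toy data: residues that follow the grouping exactly, then residues unrelated to it
print(mutual_information("AAGG", [1, 1, 2, 2]))
print(mutual_information("AGAG", [1, 1, 2, 2]))
```

Columns whose residues track the grouping score high, and columns that vary independently of it score near zero.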
Correlation between contacting nucleotides and amino acid residues • CooA in Desulfovibrio spp. • CRP in Gamma-proteobacteria • HcpR in Desulfovibrio spp. • FNR in Gamma-proteobacteria
Contacting residues: REnnnR. TG: 1st arginine. GA: glutamate and 2nd arginine.
Aligned fragments:
DD COOA ALTTEQLSLHMGATRQTVSTLLNNLVR
DV COOA ELTMEQLAGLVGTTRQTASTLLNDMIR
EC CRP KITRQEIGQIVGCSRETVGRILKMLED
YP CRP KXTRQEIGQIVGCSRETVGRILKMLED
VC CRP KITRQEIGQIVGCSRETVGRILKMLEE
DD HCPR DVSKSLLAGVLGTARETLSRALAKLVE
DV HCPR DVTKGLLAGLLGTARETLSRCLSRMVE
EC FNR TMTRGDIGNYLGLTVETISRLLGRFQK
YP FNR TMTRGDIGNYLGLTVETISRLLGRFQK
VC FNR TMTRGDIGNYLGLTVETISRLLGRFQK
Binding signals (in the order shown on the slide):
TGTCGGCnnGCCGACA
TTGTGAnnnnnnTCACAA
TTGTgAnnnnnnTcACAA
TTGATnnnnATCAA
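The three REnnnR positions can be read off the aligned fragments programmatically. The sketch below copies the fragments from the slide and assumes they are aligned without gaps, so the motif columns located on the E. coli CRP fragment apply to every row. The names `FRAGMENTS`, `CONTACT_COLUMNS` and `contact_residues` are introduced here for illustration only.

```python
# Aligned fragments copied from the slide (label -> sequence)
FRAGMENTS = {
    "DD COOA": "ALTTEQLSLHMGATRQTVSTLLNNLVR",
    "DV COOA": "ELTMEQLAGLVGTTRQTASTLLNDMIR",
    "EC CRP": "KITRQEIGQIVGCSRETVGRILKMLED",
    "YP CRP": "KXTRQEIGQIVGCSRETVGRILKMLED",
    "VC CRP": "KITRQEIGQIVGCSRETVGRILKMLEE",
    "DD HCPR": "DVSKSLLAGVLGTARETLSRALAKLVE",
    "DV HCPR": "DVTKGLLAGLLGTARETLSRCLSRMVE",
    "EC FNR": "TMTRGDIGNYLGLTVETISRLLGRFQK",
    "YP FNR": "TMTRGDIGNYLGLTVETISRLLGRFQK",
    "VC FNR": "TMTRGDIGNYLGLTVETISRLLGRFQK",
}

# 0-based columns of R, E and the second R in "RETVGR" of the EC CRP fragment
# (assumption: the same columns hold for all fragments, i.e. no gaps)
CONTACT_COLUMNS = (14, 15, 19)


def contact_residues(fragment):
    """Residues found at the three REnnnR columns of one aligned fragment."""
    return "".join(fragment[i] for i in CONTACT_COLUMNS)


for label, fragment in FRAGMENTS.items():
    print(label, contact_residues(fragment))
```

On these fragments CRP and HcpR carry R, E, R at the three columns, FNR has V in place of the first arginine, and CooA has Q and T in place of the glutamate and the second arginine. These differences can be set against the binding signals listed on the slide.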
Distribution of TF families in bacterial genomes (ExtraTrain database). Families: LysR, TetR, AraC, LuxR, GntR, LacI. Genomes: Streptomyces coelicolor, Pseudomonas aeruginosa, Agrobacterium tumefaciens, Escherichia coli, Bacillus subtilis
Strategies of successful TF families • One ortholog per genome: • LexA, NrdR, HrcA, ArgR • present even in archaea: BirA (also an enzyme), ModE • Several (2-3) orthologs per genome • CRP/FNR, FUR • Local explosions • LacI in alpha- and gamma-proteobacteria • two-component systems (2CS) in delta-proteobacteria • sigma factors in Streptomyces • Because TFs in a family tend to have related functions, and these might depend on the lifestyle?
LacI family regulons in closely related strains (top: TFs, bottom: regulated genes). Panels: seven Escherichia and Shigella spp.; four Bacillus cereus and B. anthracis strains; five Salmonella spp.
What are the driving forces for the present-day state? • Expansion and contraction of regulons • Duplications of regulators with or without regulated loci • Loss of regulators with or without regulated loci • Re-assortment of regulators and structural genes • … especially in complex systems • Horizontal transfer
Regulon expansion: how FruR has become CRA. Level shown: Gamma-proteobacteria. [Figure: pathway diagram with sugars Mannose, Glucose, Mannitol, Fructose and genes ptsHI-crr, manXYZ, edd, epd, eda, adhE, aceEF, icdA, pykF, ppsA, mtlD, mtlA, pgk, gpmA, pckA, gapA, fbp, pfkA, aceA, tpiA, fruBA, fruK, aceB]
Common ancestor of Enterobacteriales. Levels shown: Gamma-proteobacteria, Enterobacteriales. [Figure: the same pathway diagram]
Common ancestor of Escherichia and Salmonella. Levels shown: Gamma-proteobacteria, Enterobacteriales, E. coli and Salmonella spp. [Figure: the same pathway diagram]
Trehalose/maltose catabolism in alpha-proteobacteria. Duplicated LacI-family regulators: lineage-specific post-duplication loss
The binding signals are very similar (the blue branch is somewhat different: to avoid cross-recognition?)
Utilization of an unknown galactoside in gamma-proteobacteria. Yersinia and Klebsiella: two regulons, GalR (not shown, includes genes galK and galT) and LacI-X. Erwinia: one regulon, GalR. Loss of regulator and merger of regulons: it seems that lacI-X was present in the common ancestor (Klebsiella is an outgroup)
Utilization of maltose/maltodextrin in Firmicutes. Figure legend: two different ABC transporters (shades of red); PTS (pink); glucoside hydrolases (shades of green); two regulators (black and grey)
Modularity of the functional subsystem. Two different ABC systems. Three hydrolases in one operon (E. faecalis) or separately
Changes of regulation. Two different ABC systems. Displacement: invasion of a regulator from a different subfamily (horizontal transfer from a related species?), shown as blue sites
Utilization of xylose in alpha-proteobacteria. xylBA. Three different ABC transporters. Three regulators: two from the LacI family and one from the ROK family
Changes in regulation. Displacement: operon regulation changed from XylR-1 to XylR-2 (different subfamily). Duplication and displacement: duplicated XylR-1a assumed the role of the ROK-family regulator
Extreme variability of regulation of “marginal” regulon members. Figure labels: β, γ, Pseudomonas spp.
Regulation of amino acid biosynthesis in Firmicutes • Interplay between regulatory RNA elements and transcription factors • Expansion of T-box systems (normally RNA structures regulating aminoacyl-tRNA synthetases)
Five regulatory systems for methionine biosynthesis • A. SAM-dependent RNA riboswitch • B. Met-tRNA-dependent T-box (RNA) • C, D, E. Repressors of transcription
Methionine regulatory systems: loss of S-box regulons
• S-boxes (SAM-I riboswitch): Bacillales; Clostridiales; the Zoo: Petrotoga, actinobacteria (Streptomyces, Thermobifida), Chlorobium, Chloroflexus, Cytophaga, Fusobacterium, Deinococcus, proteobacteria (Xanthomonas, Geobacter)
• Met-T-boxes (Met-tRNA-dependent attenuator) + SAM-2 riboswitch for metK: Lactobacillales
• MET-boxes (candidate transcription signal): Streptococcales
Tree labels: Lact., Strep., Bac., Clostr., ZOO
Mapping the events to the phylogenetic tree. Events marked on the branches: loss of S-boxes (SAM-I riboswitches); expansion of Met-T-boxes, emergence of SAM-2 riboswitches; emergence of MtaR; Trp-T-boxes; TRAP; Tyr-T-boxes; PCE; Tyr-T-boxes; ARO. Leaves: Bacillus subtilis and related species; Bacillus cereus and related species; Lactobacillus spp.; Streptococcus spp.; Clostridium spp.
Combined regulatory network for iron homeostasis genes in α-proteobacteria. [Figure: regulators Irr (degraded; signal: heme), RirA (signal: FeS), IscR (signal: FeS status of the cell) and Fur (signal: Fe, with [+Fe] and [-Fe] states); target groups: siderophore uptake and Fe2+/Fe3+ uptake (iron uptake systems), iron-requiring enzymes [iron cofactor], iron storage ferritins, FeS synthesis, heme synthesis, transcription factors.] The connecting lines denote regulatory interactions, with the thickness reflecting the frequency of the interaction in the analyzed genomes. The suggested negative or positive mode of operation is shown by a dead end or an arrow end of the line.
Distribution of Irr, Fur/Mur, MntR, RirA, and IscR regulons in α-proteobacteria
[Table: columns are Group, Organism, Abbreviation, and presence (+) or absence (-) of the Fe and Mn regulons Fur/Mur, MntR, Irr, RirA, IscR. Rows by group:
A. Rhizobiales (incl. Rhizobiaceae): Sinorhizobium meliloti (SM), Rhizobium leguminosarum (RL), Rhizobium etli (RHE), Agrobacterium tumefaciens (AGR), Mesorhizobium loti (ML), Mesorhizobium sp. BNC1 (MBNC), Brucella melitensis (BME), Bartonella quintana and spp. (BQ)
B. Bradyrhizobiaceae: Bradyrhizobium japonicum (BJ), Rhodopseudomonas palustris (RPA), Nitrobacter hamburgensis (Nham), Nitrobacter winogradskyi (Nwi)
C. Rhodobacterales (Rhodobacteraceae): Rhodobacter capsulatus (RC), Rhodobacter sphaeroides (Rsph), Silicibacter sp. TM1040 (STM), Silicibacter pomeroyi (SPO), Jannaschia sp. CC51 (Jann), Rhodobacterales bacterium HTCC2654 (RB2654), Roseobacter sp. MED193 (MED193), Roseovarius nubinhibens ISM (ISM), Roseovarius sp. 217 (ROS217), Loktanella vestfoldensis SKA53 (SKA53), Sulfitobacter sp. EE-36 (EE36), Oceanicola batsensis HTCC2597 (OB2597)
D. Hyphomonadaceae: Oceanicaulis alexandrii HTCC2633 (OA2633); Caulobacterales: Caulobacter crescentus (CC); Parvularculales: Parvularcula bermudensis HTCC2503 (PB2503); Sphingomonadales: Erythrobacter litoralis (ELI), Novosphingobium aromaticivorans (Saro), Sphingopyxis alaskensis RB2256 (Sala), Zymomonas mobilis (ZM); Rhodospirillales: Gluconobacter oxydans (GOX), Rhodospirillum rubrum (Rrub), Magnetospirillum magneticum (Amb)
SAR11 cluster: Pelagibacter ubique HTCC1002 (PU1002)
Rickettsiales: Rickettsia and Ehrlichia species]
'#?' in the RirA column denotes the absence of the rirA gene in an unfinished genomic sequence and the presence of candidate RirA-binding sites upstream of the iron uptake genes.
Distribution of the conserved members of the Fe- and Mn-responsive regulons and the predicted RirA, Fur/Mur, Irr, and DtxR binding sites in α-proteobacteria. Gene functions: iron uptake; iron storage; FeS synthesis; iron usage; heme biosynthesis; regulatory genes; manganese uptake
Phylogenetic tree of the Fur family of transcription factors in α-proteobacteria - I. [Figure: tree. Reference Fur sequences: Escherichia coli (sp|P0A9A9), Pseudomonas aeruginosa (sp|Q03456), Neisseria meningitidis (sp|P0A0S7), Helicobacter pylori (sp|O25671), Bacillus subtilis (sp|P54574). Clade labels: Fur in γ- and β-proteobacteria; Fur in ε-proteobacteria; Fur in Firmicutes; Mur in α-proteobacteria: regulator of manganese uptake genes (sit, mntH); Fur in α-proteobacteria: regulator of iron uptake and metabolism genes; Irr]
Sequence logos for the identified Fur-binding sites in the D group of α-proteobacteria: Erythrobacter litoralis, Caulobacter crescentus, Novosphingobium aromaticivorans, Zymomonas mobilis, Oceanicaulis alexandrii, Sphingopyxis alaskensis, Rhodospirillum rubrum, Gluconobacter oxydans, Parvularcula bermudensis, Magnetospirillum magneticum. Identified Mur-binding sites: the A, B, and C groups of α-proteobacteria. Sequence logos for the known Fur-binding sites in Escherichia coli and Bacillus subtilis.
Phylogenetic tree of the Fur family of transcription factors in α-proteobacteria - II. [Figure: tree. Reference Fur sequences: Escherichia coli (sp|P0A9A9), Pseudomonas aeruginosa (sp|Q03456), Neisseria meningitidis (sp|P0A0S7), Helicobacter pylori (sp|O25671), Bacillus subtilis (sp|P54574). Clade labels: Fur in γ- and β-proteobacteria; Fur in ε-proteobacteria; Fur in Firmicutes; Mur / Fur in α-proteobacteria; Irr in α-proteobacteria: regulator of iron homeostasis]