Evolution of bacterial regulatory systems. Mikhail Gelfand, Research and Training Center “Bioinformatics”, Institute for Information Transmission Problems, Moscow, Russia. January 2008. Plan: • Individual sites • Transcription factors and their binding signals • Regulatory systems and regulons
Birth and death of sites is a very dynamic process NadR-binding sites upstream of pnuB seem absent in Klebsiella pneumoniae and Serratia marcescens
Cryptic sites and loss of regulators Loss of RbsR in Y. pestis (ABC-transporter also is lost) RbsR binding site Start codon of rbsD
Unexpected conservation of non-consensus positions in orthologous sites. Regulatory site of LexA upstream of lexA; consensus nucleotides are in caps. Wrong consensus?
TF PurR, gene purL. TF PurR, gene purM.
Non-consensus positions are more conserved than synonymous codon positions
Regulators and their motifs • Cases of motif conservation at surprisingly large distances • Subtle changes at close evolutionary distances • Correlation between contacting nucleotides and amino acid residues • Changes in symmetry patterns
NrdR (regulator of ribonucleotide reductases and some other replication-related genes): conservation at large distances
DNA motifs and protein-DNA interactions. Entropy at aligned sites and the number of contacts (heavy atoms in a base pair at a distance < cutoff from a protein atom) • CRP • PurR • IHF • TrpR
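The per-position entropy named on this slide can be computed directly from a set of aligned binding sites. The following is a minimal sketch, assuming equal-length sites and plain Shannon entropy with no pseudocounts or small-sample correction; the slides do not say which variant was used, and the example sites are made up for illustration.

```python
import math
from collections import Counter

def column_entropy(sites):
    """Shannon entropy (in bits) of each column of equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    entropies = []
    for i in range(length):
        counts = Counter(s[i].upper() for s in sites)
        total = sum(counts.values())
        # H = sum over observed symbols of p * log2(1/p)
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies

# Hypothetical aligned sites (illustrative only, not data from the talk).
toy_sites = ["TGTGA", "TGTGA", "TGCGA", "TGAGA"]
print([round(h, 2) for h in column_entropy(toy_sites)])
```

A fully conserved column scores 0 bits and a column with all four bases equally frequent scores 2 bits, so low entropy marks the positions that can then be compared with the number of protein contacts.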
The LacI family: subtle changes in motifs at close distances G n A CG Gn GC
Specificity-determining positions in the LacI family. Training set: 459 sequences, average length: 338 amino acids, 85 specificity groups: 44 SDPs • 10 residues contact NPF (analog of the effector) • 7 residues in the effector contact zone (5 Å < dmin < 10 Å) • 6 residues in the intersubunit contacts • 5 residues in the intersubunit contact zone (5 Å < dmin < 10 Å) • 7 residues contact the operator sequence • 6 residues in the operator contact zone (5 Å < dmin < 10 Å) • Structure: LacI from E. coli
Correlation between contacting nucleotides and amino acid residues
Contacting residues: REnnnR. TG: 1st arginine. GA: glutamate and 2nd arginine.
• CooA in Desulfovibrio spp. • CRP in Gamma-proteobacteria • HcpR in Desulfovibrio spp. • FNR in Gamma-proteobacteria
DD COOA   ALTTEQLSLHMGATRQTVSTLLNNLVR
DV COOA   ELTMEQLAGLVGTTRQTASTLLNDMIR
EC CRP    KITRQEIGQIVGCSRETVGRILKMLED
YP CRP    KXTRQEIGQIVGCSRETVGRILKMLED
VC CRP    KITRQEIGQIVGCSRETVGRILKMLEE
DD HCPR   DVSKSLLAGVLGTARETLSRALAKLVE
DV HCPR   DVTKGLLAGLLGTARETLSRCLSRMVE
EC FNR    TMTRGDIGNYLGLTVETISRLLGRFQK
YP FNR    TMTRGDIGNYLGLTVETISRLLGRFQK
VC FNR    TMTRGDIGNYLGLTVETISRLLGRFQK
Motifs: TGTCGGCnnGCCGACA • TTGTGAnnnnnnTCACAA • TTGTgAnnnnnnTcACAA • TTGATnnnnATCAA
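The four consensus motifs quoted on this slide are all inverted repeats, which is the symmetry expected for sites bound by dimeric regulators. The sketch below checks this by comparing each motif with its reverse complement. It assumes that `n` means any base and that letter case can be ignored for the comparison; the helper names are illustrative.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}

def reverse_complement(motif):
    """Reverse complement of a consensus string; 'n' (any base) maps to itself."""
    out = []
    for ch in reversed(motif):
        comp = COMPLEMENT[ch.upper()]
        # Keep the case of the source character in the output.
        out.append(comp if ch.isupper() else comp.lower())
    return "".join(out)

def is_palindromic(motif):
    """True if the motif equals its own reverse complement, ignoring case."""
    return reverse_complement(motif).upper() == motif.upper()

# Consensus motifs as listed on the slide.
motifs = [
    "TGTCGGCnnGCCGACA",
    "TTGTGAnnnnnnTCACAA",
    "TTGTgAnnnnnnTcACAA",
    "TTGATnnnnATCAA",
]
for m in motifs:
    print(m, is_palindromic(m))
```

The motifs differ in the half-site sequence and in the spacer length (2, 6, 6 and 4 unspecified positions), while the palindromic arrangement itself is shared.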
NrtR (regulator of NAD metabolism): systematic search for correlated positions • analysis of correlated positions in proteins and sites • analysis of specificity determining positions • the same positions in one alpha-helix identified • plans for experimental verification
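One common way to look for correlated positions is to compute the mutual information between a column of the protein alignment and a column of the aligned binding sites, taken over matched regulator/site pairs. The sketch below assumes that statistic; the slides do not state which measure was actually used, and the toy data are made up. A real analysis would also need a significance estimate, for example by shuffling, which is omitted here.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two paired symbol sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must be paired one-to-one")
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # I(X;Y) = sum over observed pairs of p(x,y) * log2(p(x,y) / (p(x) * p(y)))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )

# Hypothetical data: one amino acid column and one site column,
# with entry i of each taken from the same regulator/site pair.
residues = "RRRRKKKK"
bases = "GGGGAAAA"
print(mutual_information(residues, bases))   # perfectly coupled columns
print(mutual_information("RRKK", "GAGA"))    # independent columns
```

Column pairs with high mutual information are candidates for direct residue-base contacts, which is what the experimental verification mentioned on the slide would test.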
• NiaR: changed dimer structure? • The GalR family and C-proteins of RM-systems: direct and inverted repeats • BirA: changed spacing
What are the events leading to the present-day state? • Expansion and contraction of regulons • New regulators (where from?) • Duplications of regulators with or without regulated loci • Loss of regulators with or without regulated loci • Re-assortment of regulators and structural genes • … especially in complex systems • Horizontal transfer
Trehalose/maltose catabolism in alpha-proteobacteria Duplicated LacI-family regulators: lineage-specific post-duplication loss
The binding motifs are very similar (the blue branch is somewhat different: to avoid cross-recognition?)
Utilization of an unknown galactoside in gamma-proteobacteria • Yersinia and Klebsiella: two regulons, GalR and LacI-X • Erwinia: one regulon, GalR • Loss of regulator and merger of regulons: it seems that LacI-X was present in the common ancestor (Klebsiella is an outgroup)
Utilization of maltose/maltodextrin in Firmicutes. Displacement: invasion of a regulator from a different subfamily (horizontal transfer from a related species?) – blue sites
Orthologous TFs with completely different regulons (alpha-proteobacteria and Xanthomonadales)
Extreme variability of the regulation of “marginal” regulon members β γ Pseudomonas spp.
Regulation of amino acid biosynthesis in Firmicutes • Interplay between regulatory RNA elements and transcription factors • Expansion of T-box systems (normally – RNA structures regulating aminoacyl-tRNA-synthetases)
Three regulatory systems for methionine biosynthesis • SAM-dependent riboswitch • Met-T-box • MtaR: repressor of transcription
Methionine regulatory systems: loss of S-box regulons
• S-boxes (SAM-1 riboswitch): Bacillales; Clostridiales; the Zoo: Petrotoga, actinobacteria (Streptomyces, Thermobifida), Chlorobium, Chloroflexus, Cytophaga, Fusobacterium, Deinococcus, proteobacteria (Xanthomonas, Geobacter)
• Met-T-boxes (Met-tRNA-dependent attenuator) + SAM-2 riboswitch for metK: Lactobacillales
• MET-boxes (candidate transcription signal): Streptococcales
Tree labels: Lact., Strep., Bac., Clostr., ZOO
Recent duplications and bursts: Arg-T-box in Clostridium difficile
Regulon expansion, or how FruR has become CRA • CRA (a.k.a. FruR) in Escherichia coli: global regulator; well studied experimentally (many regulated genes known) • Going back in time: looking for candidate CRA/FruR sites upstream of (orthologs of) genes known to be regulated in E. coli
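The "going back in time" step amounts to scanning the upstream regions of orthologous genes for sequences that resemble known sites. The sketch below shows one generic way to do that, with a log-odds position weight matrix built from known sites and a score threshold. It is an illustration only: the talk's own software is not described here, the training sites and upstream sequence are made up, and the pseudocount, uniform background and threshold are assumptions. Only one strand is scanned.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Log-odds position weight matrix from equal-length aligned sites."""
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(len(sites[0])):
        column = [s[i] for s in sites]
        pwm.append({
            b: math.log2((column.count(b) + pseudocount) / total / background)
            for b in BASES
        })
    return pwm

def score(pwm, window):
    """Sum of per-position log-odds scores for one window."""
    return sum(col[b] for col, b in zip(pwm, window))

def scan(pwm, upstream, threshold):
    """Return (start, window, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(upstream) - width + 1):
        window = upstream[start:start + width]
        if set(window) <= set(BASES):  # skip windows with ambiguous characters
            s = score(pwm, window)
            if s >= threshold:
                hits.append((start, window, s))
    return hits

# Hypothetical training sites and upstream region (not real CRA/FruR data).
known_sites = ["GCTGAA", "GCTGAA", "GCTGAT", "ACTGAA"]
pwm = build_pwm(known_sites)
for start, window, s in scan(pwm, "TTTTGCTGAACCCC", threshold=5.0):
    print(start, window, round(s, 2))
```

Running the same scan over the orthologous upstream regions in each genome gives, per gene, the set of lineages in which a candidate site is present, which is the raw material for placing regulon expansion on the species tree.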
Three slides with the same pathway map, one per taxonomic level: • Common ancestor of gamma-proteobacteria (Gamma-proteobacteria) • Common ancestor of the Enterobacteriales (Gamma-proteobacteria, Enterobacteriales) • Common ancestor of Escherichia and Salmonella (Gamma-proteobacteria, Enterobacteriales, E. coli and Salmonella spp.)
[Figure labels. Sugars: Mannose, Glucose, Mannitol, Fructose. Genes: ptsHI-crr, manXYZ, edd, epd, eda, adhE, aceEF, icdA, pykF, ppsA, mtlD, mtlA, pgk, gpmA, pckA, gapA, fbp, pfkA, aceA, tpiA, fruBA, fruK, aceB]
Regulation of iron homeostasis (the Escherichia coli paradigm) Iron: • essential cofactor (limiting in many environments) • dangerous at large concentrations FUR (responds to iron): • synthesis of siderophores • transport (siderophores, heme, Fe2+, Fe3+) • storage • iron-dependent enzymes • synthesis of heme • synthesis of Fe-S clusters Similar in Bacillus subtilis
Regulation of iron homeostasis in α-proteobacteria
[Figure labels. Regulators: Fur, Irr, RirA, IscR. Signals: [+Fe], [-Fe], heme, FeS, FeS status of cell; "degraded". Targets: siderophore uptake, Fe2+/Fe3+ uptake, iron uptake systems, iron-requiring enzymes [iron cofactor], iron storage ferritins, FeS synthesis, heme synthesis, transcription factors]
Experimental studies: • FUR/MUR: Bradyrhizobium, Rhizobium and Sinorhizobium • RirA (Rrf2 family): Rhizobium and Sinorhizobium • Irr (FUR family): Bradyrhizobium, Rhizobium and Brucella
Distribution of transcription factors in genomes Search for candidate motifs and binding sites using standard comparative genomic techniques
Regulation of genes in functional subsystems • Rhizobiales • Bradyrhizobiaceae • Rhodobacterales • The Zoo (likely ancestral state)
Reconstruction of history • Frequent co-regulation with Irr • Strict division of function with Irr • Appearance of the iron-Rhodo motif
All logos and Some Very Tempting Hypotheses: • Cross-recognition of FUR and IscR motifs in the ancestor. • When FUR had become MUR, and IscR had been lost in Rhizobiales, emerging RirA (from the Rrf2 family, with a rather different general consensus) took over their sites. • Iron-Rhodo boxes are recognized by IscR: directly testable
Summary and open problems • Regulatory systems are very flexible • easily lost • easily expanded (in particular, by duplication) • may change specificity • rapid turnover of regulatory sites • With more stories like these, we can start thinking about a general theory • catalog of elementary events; how frequent? • mechanisms (duplication, birth e.g. from enzymes, horizontal transfer) • conserved (regulon cores) and non-conserved (marginal regulon members) genes in relation to metabolic and functional subsystems/roles • (TF family-specific) protein-DNA recognition code • distribution of TF families in genomes; distribution of regulon sizes; etc.
People
• Andrei A. Mironov: software, algorithms
• Alexandra Rakhmaninova: SDP, protein-DNA correlations
• Anna Gerasimova (now at U. Michigan): NadR
• Olga Kalinina (on loan to EMBL): SDP
• Yuri Korostelev: protein-DNA correlations
• Ekaterina Kotelnikova (now at Ariadne Genomics): evolution of sites
• Olga Laikova: LacI
• Dmitry Ravcheev: CRA/FruR
• Dmitry Rodionov (on loan to Burnham Institute): iron etc.
• Alexei Vitreschak: T-boxes and riboswitches
• Andy Johnston (U. of East Anglia): experimental validation (iron)
• Leonid Mirny (MIT): protein-DNA, SDP
• Andrei Osterman (Burnham Institute): experimental validation
Howard Hughes Medical Institute • Russian Foundation of Basic Research • Russian Academy of Sciences, program “Molecular and Cellular Biology” • INTAS