The lac repressor-operator system: Swimming in Data
• Collaborators: Mitch Lewis, Bob Daber, Leslie Milk, Matt Sochor, Chuck Bell, Steve Stayrook
• Thermodynamics of Allostery
• Kinetics of Allostery: Induced Fit or Landscape Shift?
• Large Scale Analysis of base sequence specificity/affinity
Repressor has two conformations
• R: Active form, binds DNA tightly, inducer weakly
• R*: Induced form, binds DNA weakly, inducer tightly
Repressor binds O1 operator site >1000-fold more tightly than non-specific DNA
Symmetrized O1 operator
Position: L 87654321.12345678 R
Base:       TTGTGAGC.GCTCACAA
Residue: RQY … YQR (aa numbers 22, 18, 17 for R, Q, Y)
Labeled contacts: bases G4, T5, G6; residues Y17, Q18, R22
How does the lac genetic switch work?
• Mechanism of allostery: Thermodynamics, Kinetics
• The origin of base sequence specific recognition of DNA by proteins
• Prototype for gene therapy
• Design of tools for DNA manipulation
Cronin, et al. lac operator-repressor system is functional in the mouse. Genes & Dev. 2001. 15: 1506-1517
In vivo system for evolution and functional characterization of lac repressor (Lewis Lab)
Expression/Assay System
• Two-plasmid system: one contains a Lac repressor gene, the other contains the GFPmut3.1 gene controlled by the Lac promoter and a given operator.
• FACS used to screen and separate phenotypes by GFP fluorescence.
• Directed evolution: randomize the plasmid sequence corresponding to given aa positions in the repressor, then screen for a given phenotype.
• Engineered heterodimer: permits asymmetric DNA recognition domains to target non-symmetric operator sequences.
• Knockout of one inducer site: probes the allosteric mechanism.
E. coli with GFPmut3.1 reporter and repressor plasmid
• Fluorescence quantified by plate reader
• Fractional GFP expression relative to that with no repressor plasmid
• Induced by IPTG
MWC model for Allostery
• KRR*: Repressor conformational equilibrium (induced/active)
• KIR*, KIR: Inducer binding affinities for induced, active repressor
• KR*O, KRO: Operator DNA binding affinities for induced, active repressor
MWC model for Allostery
O/(O+RO) → Transcription (mRNA) → Translation (GFP level)
• Fractional GFP expression with no inducer
• Fractional GFP expression at saturation (n=1 inducer site), (n=2 inducer sites)
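The expressions for these two limits are not legible in the slide text. A minimal sketch of what they could look like under standard MWC assumptions follows. The symbols $R_T$ (total repressor concentration) and $c = K_{IR^*}/K_{IR}$ are introduced here for illustration, and two simplifications are assumed rather than taken from the slides: repressor is in large excess over operator, and operator binding by the induced form R* ($K_{R^*O}$) is neglected.

```latex
% Fractional expression is taken as the free-operator fraction
f = \frac{[O]}{[O] + [RO]} = \frac{1}{1 + K_{RO}\,[R]}

% No inducer: the active form is a fraction 1/(1 + K_{RR^*}) of the total
f_{0} = \frac{1}{1 + \dfrac{K_{RO}\,R_T}{1 + K_{RR^*}}}

% Saturating inducer, n inducer sites: the conformational equilibrium
% is shifted by a factor c^{n}
f_{\mathrm{sat}}(n) = \frac{1}{1 + \dfrac{K_{RO}\,R_T}{1 + K_{RR^*}\,c^{\,n}}},
\qquad n = 1, 2
```

Under these assumptions the difference between the n=1 and n=2 cases comes entirely from the exponent on the inducer affinity ratio.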
• Repressor Conformation Equilibrium: [R*]/[R] = 2
• Inducer Binding Affinity Ratio: KIR*/KIR = 15
• In Vivo Repressor Concentration: [R]KRO = 150
• Inducer-Repressor Binding Affinity: KD,IR* = 4 µM
All constants are obtained in vivo, without doing a single binding measurement!
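To see how these four constants could shape an induction curve, here is a minimal Python sketch of a two-state MWC model. It is not the authors' fitting code. Several things are assumptions made for illustration: the function and parameter names are hypothetical, [R]KRO = 150 is read as total repressor times KRO, the affinity ratio of 15 is read as KD,IR = 15 × KD,IR*, and operator binding by R* is neglected.

```python
def fractional_expression(inducer_uM, n_sites=2, k_rr_star=2.0,
                          affinity_ratio=15.0, rt_kro=150.0,
                          kd_ir_star_uM=4.0):
    """Free-operator fraction O/(O+RO) for a two-state (MWC) repressor."""
    # Assumption: the active form R binds inducer 15x more weakly than R*.
    kd_ir_uM = affinity_ratio * kd_ir_star_uM
    # Statistical weights summed over all inducer-bound states of each form.
    weight_active = (1.0 + inducer_uM / kd_ir_uM) ** n_sites
    weight_induced = k_rr_star * (1.0 + inducer_uM / kd_ir_star_uM) ** n_sites
    fraction_active = weight_active / (weight_active + weight_induced)
    # Assumption: only the active form R binds the operator appreciably.
    return 1.0 / (1.0 + rt_kro * fraction_active)


if __name__ == "__main__":
    for conc in (0.0, 1.0, 10.0, 100.0, 1000.0):
        value = fractional_expression(conc)
        print(f"[I] = {conc:7.1f} uM   fractional GFP = {value:.3f}")
```

Passing `n_sites=1` mimics the repressor with one inducer site knocked out, which in this sketch lowers the expression reached at saturating inducer.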
KRR*=2 in ‘wrong’ direction. ΔG ≈ 0. This explains why Xtal structures of lac with and without IPTG bound are so similar. But why is the repressor conformational equilibrium so weak? The ΔG available from inducer binding to drive the conformational change is about 1.6 kcal/mole per inducer site, or about 3.2 kcal/mole total, a fairly modest amount.
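The 1.6 and 3.2 kcal/mole figures are consistent with RT ln(15) per inducer site, using the inducer affinity ratio of 15. A short check, assuming T = 298 K (the temperature is not stated on the slides, and the function name is hypothetical):

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol K)


def free_energy_kcal(equilibrium_ratio, temperature_K=298.0):
    """Magnitude of RT ln(K) in kcal/mol (temperature is an assumed value)."""
    return GAS_CONSTANT_KCAL * temperature_K * math.log(equilibrium_ratio)


per_site = free_energy_kcal(15.0)       # inducer affinity ratio KIR*/KIR
both_sites = 2 * per_site               # two inducer sites per dimer
conformational = free_energy_kcal(2.0)  # KRR* = 2

if __name__ == "__main__":
    print(f"per inducer site: {per_site:.2f} kcal/mol")
    print(f"two sites:        {both_sites:.2f} kcal/mol")
    print(f"KRR* = 2:         {conformational:.2f} kcal/mol")
```

The same function applied to KRR* = 2 gives a free energy well under 1 kcal/mol, which is the sense in which the conformational ΔG is close to zero.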
The cell achieves effective repression in spite of the weak equilibrium by setting [R] at 150-fold excess. The lac switch has evolved to remain effectively switchable with only a modest driving force from inducer binding, balancing the conflicting requirements of repression and induction.
Comparison of Allostery in lac and Hb
                     Lac      Hb
# of ligands         2        4
Binding Ratio        15-20    30
Conf. Equilibrium    2        1/1000
Hill #               1.2      >3
Comparison of equilibrium constants with previous in vitro studies
‘Classic’ view of ligand induced conformational change of a protein
• Ligand L binds, induces conformational change A→B (induced fit)
• B is of higher free energy than A
• L binds to B tighter than to A, so now LB has lower free energy than A or LA
[Diagram: free energy (ΔG) levels of A and B, with ligand L]
‘New’ view of ligand induced conformational change of a protein
• Protein exists in an ensemble of conformations A, B, C… Higher energy forms are less populated.
• L binds to and ‘selects’ one of the higher energy conformers, lowering its free energy so it becomes the dominant form
• This is the population selection model, aka the protein landscape model or the protein ensemble model
[Diagram: free energy (ΔG) levels of A and B, with ligand L]
…applied to the Lac-Operon system
Scheme: RO ⇌ R+O and RIO ⇌ RI+O, linked by inducer (I) binding
Low inducer, R binds O tightly
…applied to the Lac-Operon system
Scheme: RO ⇌ R+O and RIO ⇌ RI+O, linked by inducer (I) binding
High inducer, R dissociates from O
…applied to the Lac-Operon system
Population selection route (RO → R+O → RI+O)? Induced fit route (RO → RIO → RI+O)?
This can only be determined by kinetics, not equilibria. Lac is one of the few systems where there is enough kinetic data to definitively discriminate between the two.
…applied to the Lac-Operon system
Scheme: RO ⇌ R+O and RIO ⇌ RI+O, linked by inducer (I) binding
Rate constants (in slide order): 2×10^9 /M/s; 0.08 /s; 5×10^4 /M/s; 0.2 /s; 5 /s; 5×10^4 /M/s; 40 /s; 2×10^9 /M/s
Association rates depend on concentration. In cell, [R] = 1 nM; [I] varies.
…applied to the Lac-Operon system
Time constants for various steps at I = 1 µM (in slide order): 0.5 s; 12 s; 20 s; 5 s; 0.2 s; 20 s; 25 ms; 0.5 s
…applied to the Lac-Operon system
Time constants for various steps at I = 10 µM (in slide order): 0.5 s; 12 s; 2 s; 5 s; 0.2 s; 2 s; 25 ms; 0.5 s
…applied to the Lac-Operon system
Time constants for various steps at I = 100 µM (in slide order): 0.5 s; 12 s; 0.2 s; 5 s; 0.2 s; 0.2 s; 25 ms; 0.5 s
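The time constants on these slides follow from the rate constants as the reciprocal of the (pseudo-)first-order rate: 1/k for a dissociation step and 1/(k × concentration) for an association step. A small sketch, assuming [R] = 1 nM as stated and using hypothetical helper and label names. The two inducer off-rates (0.2 /s and 5 /s) are left unassigned because the flattened slide does not show which complex each belongs to.

```python
def time_constant_s(rate_constant, concentration_M=None):
    """Return 1/rate in seconds; association steps need a concentration."""
    if concentration_M is None:
        rate_per_s = rate_constant                      # first order, /s
    else:
        rate_per_s = rate_constant * concentration_M    # second order, /M/s
    return 1.0 / rate_per_s


REPRESSOR_M = 1e-9  # in-cell repressor concentration quoted on the slide

if __name__ == "__main__":
    for inducer_uM in (1, 10, 100):
        inducer_M = inducer_uM * 1e-6
        steps = {
            "operator association (2e9 /M/s x [R])": time_constant_s(2e9, REPRESSOR_M),
            "operator dissociation (0.08 /s)": time_constant_s(0.08),
            "inducer association (5e4 /M/s x [I])": time_constant_s(5e4, inducer_M),
            "inducer dissociation (0.2 /s)": time_constant_s(0.2),
            "inducer dissociation (5 /s)": time_constant_s(5.0),
            "operator dissociation (40 /s)": time_constant_s(40.0),
        }
        print(f"[I] = {inducer_uM} uM")
        for label, tau in steps.items():
            print(f"  {label}: {tau:.3g} s")
```

Only the inducer association step changes with [I], which is why the other time constants are identical across the three slides.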
PS route (scheme: R+O, RO, RIO, RI+O)
Flux at 1 µM IPTG (below induction midpoint): RO→R+O, RO+I→RIO, RIO→RI+O
IF route (scheme: R+O, RO, RIO, RI+O)
Flux at 10 µM IPTG (near midpoint): RO+I→RIO, RIO→RI+O, RO→R+O
IF route (scheme: R+O, RO, RIO, RI+O)
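One rough way to see why the dominant route switches between 1 µM and 10 µM IPTG is to compare the two ways out of RO: operator dissociation (0.08 /s, the first step of the population selection route) against inducer binding to RO (5×10^4 /M/s × [I], the first step of the induced fit route). This is a simplified branching estimate, not the full flux analysis on the slides, and the function names are hypothetical.

```python
K_OFF_RO_PER_S = 0.08          # RO -> R + O
K_ON_INDUCER_PER_M_S = 5e4     # RO + I -> RIO


def induced_fit_fraction(inducer_M):
    """Fraction of RO that leaves by binding inducer rather than dissociating."""
    induced_fit_rate = K_ON_INDUCER_PER_M_S * inducer_M
    return induced_fit_rate / (induced_fit_rate + K_OFF_RO_PER_S)


def crossover_inducer_M():
    """Inducer concentration at which both exits from RO are equally fast."""
    return K_OFF_RO_PER_S / K_ON_INDUCER_PER_M_S


if __name__ == "__main__":
    for inducer_uM in (1, 10, 100):
        fraction = induced_fit_fraction(inducer_uM * 1e-6)
        print(f"[I] = {inducer_uM} uM: induced-fit share of exits from RO = {fraction:.2f}")
    print(f"crossover at {crossover_inducer_M() * 1e6:.1f} uM")
```

In this estimate the population selection exit is faster at 1 µM and the induced fit exit is faster at 10 µM, in line with the route labels on the flux slides.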
• Repressor is leaky. This is functionally important, since in vivo the inducer is a metabolic product of enzymes repressed by lac
• Leakiness is directly related to repressor-operator affinity, KRO
• Changes in leakiness, as measured by GFP levels, due to mutation/base changes → R-O affinity changes
Screening for functional Repressor-Operator Sequence Pairs
Functional Rules for Lac Repressor-Operator Associations and Implications for Protein-DNA Interactions. Milk, Daber and Lewis, Protein Science (2010) Vol 19.
• A library of Lac mutants, fully randomized at positions 17, 18 and 22, screened against 64 symmetric Lac operator variants.
• Functional repressors sequenced, purified and assayed with the corresponding operators.
• Lower GFP expression = tighter binding. Increase in GFP by IPTG = inducibility.
• GFP levels in the absence of inducer (leakiness) used to calculate the change in repressor-operator affinity relative to wild type (YQR-GTG).
• Changes in affinity occur due to localized sequence changes in 3 aa’s or 3 bp’s within the framework of the rest of the lac-operator.
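A minimal sketch of how a relative affinity could be backed out of leakiness, assuming the fractional GFP expression equals the free-operator fraction O/(O+RO) = 1/(1 + [R]KRO), and that repressor concentration and conformational equilibrium are the same for the variant and the wild type. Neither assumption is stated on this slide. The function names and the example GFP fractions are hypothetical, and T = 298 K is assumed.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal / (mol K)


def binding_term(fractional_gfp):
    """Invert f = 1 / (1 + [R]*K_RO) to recover the product [R]*K_RO."""
    if not 0.0 < fractional_gfp < 1.0:
        raise ValueError("fractional GFP must lie strictly between 0 and 1")
    return 1.0 / fractional_gfp - 1.0


def relative_affinity(gfp_variant, gfp_wild_type):
    """K_RO(variant) / K_RO(wild type), assuming equal repressor levels."""
    return binding_term(gfp_variant) / binding_term(gfp_wild_type)


def delta_delta_g_kcal(gfp_variant, gfp_wild_type, temperature_K=298.0):
    """Binding free energy change relative to wild type (positive = weaker)."""
    ratio = relative_affinity(gfp_variant, gfp_wild_type)
    return -GAS_CONSTANT_KCAL * temperature_K * math.log(ratio)


if __name__ == "__main__":
    # Hypothetical leakiness values, for illustration only.
    print(relative_affinity(0.20, 0.02))
    print(delta_delta_g_kcal(0.20, 0.02))
```

A leakier variant (higher uninduced GFP) comes out with a relative affinity below 1 and a positive free energy change.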
Base sequences recognized by a given aa triplet sequence
AA    Bases                  AA    Bases
AAN   TGA TTA                HNR   GTG
AAR   GAG GGA GTA            HQN   TTT
ACR   GAA GCA                HQR   GTG
AGN   TGA TTA                HSN   TGG TTT
AGR   GAA GGA GGG GTA GTG    HSR   GAG GAT GGG GTG
AIR   GGT                    HTA   CTT
AKN   TAC                    HTK   CTT
AKR   GAC                    HTN   TTG TTT
AMR   GAT GGT GTG            HTR   GTA GTG
ANR   GTG                    HVR   GTA
APR   GAA                    HYR   GTG
AQR   GAT GGG GTG            IAA   CTA
ASA   CGA CGT                IAF   CTA
ASL   TAG                    IAG   CTA
ASN   TGA TGG TGT            IAN   TGA TTA
ASR   GCA GGG GGT            IAR   GAA GTA
ASS   CGA                    IAY   CTA TTA
...
CAN   TTA                    IGR   GAA GGA GTG TAA
CMR   GGT GTG                IKR   GAC
CQR   GTG                    IMR   GAG
CSR   GGG GGT                INR   GTG
CTR   GAA GGA GGT            IQR   GTG
DAR   GTA                    ISL   CGA
EAR   GTA                    ISR   GAA GCA
EMR   GTG                    ITR   GAA GCA GTG
ESR   GGG                    IWK   CTA
FAR   GAA                    KAN   TGG
FKR   GAC                    KAR   GAG GGG
FMR   GTG                    KGR   GTG
GAN   TTA                    KMR   GGG GTG
GAR   GAA GCA GGA GTA        KNR   GGG
GCR   GAA                    ...
...
GGR   GTG                    YQR   GTG (Wild Type)
GKR   GAC                    YTR   GTG
196 different AA sequences, 26 different base sequences
AA sequences recognized by a given base triplet sequence
Base triplets (columns): AGG CGA CGG CGT CTA CTT GAA GAC GAG GAT GCA GGA GGC GGG GGT GTA GTG TAA TAC TAG TGA TGG TGT TTA TTG TTT
AA triplets (table entries in row order): KSL ASA KSA ASA IAA HTA ACR AKR AAR AMR ACR AAR GSR AGR AIR AAR AGR IGR AKN ASL AAN ASN ASN AAN HTN HMN ASS KSC KSA IAF HTK AGR FKR HAR AQR ASR AGR AQR AMR AGR AMR PAN PKN HGN AGN HGN KSN AGN HQN ATA KSL PSA IAG APR GKR HCR HSR GAR ATR ASR ASR ATR ANR PSN KSL ASN HSN PSN CAN HSN ISL KSM TSA IAY AVR IKR HGR PAR GSR AVR CSR ATR DAR AQR PAN ATN KAN TSN GAN HTN PSA KSY ICK CTR MKR HSR PMR GTR CTR ESR AVR EAR CMR PSN IAN KSA HAN PTA KTA ICN FAR NKR IMR SMR ISR GAR GSR CMR GAR CQR RSL LGN KSC IAN SSA KTD ICY GAR PKR KAR TMR ITR GSR HGR CSR GTR EMR PAN KSF IAY STA KTM IWK GCR SKR PAR PCR GTR HSR CTR HTR FMR PSN KSG IGN KTN TAA GTR TKR PQR PGR IGR KAR GSR HVR GGR PTN KSH SAH TAY HAR RAR PSR NTR KMR GTR IAR GMR TGH KSL SAN IAR RSR SAR PVR KNR KQR LAR GNR TGN KSM SAY IGR SSR SCR SGR KSR KSR MAR GQR KSS SGN ISR SSR STR KTR KTR PAR GTR KSY STN ITR STR TTR NSR PIR PTR HAR KTN TAH LAR TAR NTR PVR QAR HCR PSN TAN MAR TTR PSR SMR SAR HGR RSL TAY MTR PTR SSR SCR HNR RSN TGN PCR QSR STR SGR HQR SSN VAN PVR RAR TMR STR HSR YAN QAR RGR TSR TAR HTR SAR RQR TTR TCR HYR SCR RSR VMR TGR IGR SGR RTR VTR TTR INR TAR SGR VAR IQR TSR SSR VYR ITR TTR STR KGR VAR TGR KMR TSR KQR TTR LMR VSR ...
>300 aa-base pair combinations now screened. Now that we have a thermodynamic model for induction, all 300+ affinities can be extracted from the leakiness…
[Figure: Relative Affinity plotted for each base triplet (axis labels: AGG CGA CGG CGT CTA CTT GAA GAC GAG GAT GCA GGA GGC GGG GGT GTA GTG TAA TAC TAG TGA TGG TGT TTA TTG TTT)]
Origin of sequence specific Protein-DNA Recognition I
Given: 196 variants of Lac differing in aa sequence in the recognition helix, each of which binds specifically to a different subset of 26 DNA base pair sequences, for a total of 331 aa-bp complexes with known affinity.
Extract as much sequence level information about specificity as possible to infer sequence recognition ‘rules’. Can take a ‘bioinformatics’ approach.
Analysis of aa-bp sequence pair recognition by clustering
[Figure: AA’s and Bases as the two sides of a bipartite graph]
Bipartite graph partitioning
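The slide does not say which partitioning algorithm was used. As a simple stand-in, the sketch below builds the bipartite graph (aa triplets on one side, base triplets on the other, an edge for each functional pair) and splits it into connected components. The function name is hypothetical. The pairs are a small subset copied from the aa-to-base table in this deck. On the full data set many components would merge, so a real analysis would likely need affinity-weighted partitioning.

```python
from collections import defaultdict


def bipartite_components(pairs):
    """Group (aa_triplet, base_triplet) pairs into connected components."""
    adjacency = defaultdict(set)
    for aa, base in pairs:
        # Tag each node with its side so an aa triplet such as "TAA"
        # cannot be confused with the base triplet of the same spelling.
        adjacency[("aa", aa)].add(("base", base))
        adjacency[("base", base)].add(("aa", aa))

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        aas, bases = set(), set()
        while stack:  # depth-first traversal of one component
            kind, name = node = stack.pop()
            (aas if kind == "aa" else bases).add(name)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append((sorted(aas), sorted(bases)))
    return components


# Subset of pairs taken from the aa-to-base table in this deck.
EXAMPLE_PAIRS = [
    ("AAN", "TGA"), ("AAN", "TTA"), ("AGN", "TGA"), ("AGN", "TTA"),
    ("ASN", "TGA"), ("ASN", "TGG"), ("ASN", "TGT"),
    ("HNR", "GTG"), ("HQR", "GTG"), ("YQR", "GTG"),
    ("AIR", "GGT"), ("ASA", "CGA"), ("ASA", "CGT"), ("ASS", "CGA"),
]

if __name__ == "__main__":
    for aas, bases in bipartite_components(EXAMPLE_PAIRS):
        print(aas, "<->", bases)
```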
Origin of sequence specific Protein-DNA Recognition II
Given: 331 (and counting) amino-acid, base sequence variants and their relative affinities.
Identify the structural basis for sequence specific protein-DNA recognition using a conformational analysis approach, i.e. by searching through protein and base sequence/conformation space to generate Lac-DNA structural models that explain, and ultimately predict, which amino-acid sequences recognize which base sequences.
• What structural features determine high affinity, and/or sequence specificity?
• Can we predict, and so design, repressor sequences that will bind given lac-operator sequences, and more generally, bind any base sequence of the same length?
EVOLVE: Searches in both protein and DNA sequence space, with full amino-acid and base rotamer exploration and torsional minimization. Simultaneously generates conformers for bound and unbound states and evaluates the energy difference.
Analysis of 75 of 331 amino acid-base pair variants so far
EVOLVE energy difference vs. measured affinity difference: correlation coefficient = 0.66
Not bad: without full rotamer exploration (depth first), no solvent, and no binding entropy yet
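For reference, a correlation coefficient of this kind is commonly computed as the Pearson r between the computed energy differences and the measured affinity differences. The slide does not say which coefficient was used, so Pearson is an assumption here. The function name is hypothetical and the numbers in the usage lines are placeholders, not values from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values for illustration only (not data from the study).
    computed_energy_diff = [0.1, 0.9, 1.4, 2.2, 2.5]
    measured_affinity_diff = [0.3, 0.6, 1.8, 1.7, 2.9]
    print(pearson_r(computed_energy_diff, measured_affinity_diff))
```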