Multidimensional Molecular Replacement. Nicholas M. Glykos & Michael Kokkinidis, IMBB, FORTH, Heraklion, Crete, GREECE
The program : • Name : “Queen of Spades” • Availability : absolutely free, open-source software, no warranties whatsoever. • The distribution includes source code, plenty of documentation, plus pre-compiled executables for Irix, OSF, Linux, Solaris, VMS & windoze. • Download the latest version via http://origin.imbb.forth.gr/software/ • Current stable version : β , Release 1.0.
Using the program : • Input : a .pdb file containing the model, and a formatted (ASCII) file containing h,k,l,F,σ(F). • Output : .pdb files containing the final coordinates for each model, plus a packing diagram for each solution.
Running the program (1) : $ Qs -auto 1 or, $ Qs -auto 2 etc.
Running the program (2) :
##########################################################
# Target function (can be R-FACTOR, CORR-1 or CORR-2) and
# number of minimisations and steps.
#
TARGET R-FACTOR
CYCLES 5
STEPS 100000000
############################################################
# Annealing schedule & move size control.
#
BOLTZMANN
START 0.06800
############################################################
# Reflection selection.
#
KEEP 0.70
AMPLIT_CUTOFF 1.0
SIGMA_CUTOFF 2.0
RESOLUTION 15.0 3.5
. . . . . . .
The algorithm : • (1) Assign random initial positions & orientations to all molecules present in the asymmetric unit of the target crystal structure. Calculate Fc’s from this arrangement. • (2) Calculate the R-factor between the Fo’s and the Fc’s. Call this Rold.
The algorithm : • (3) Randomly choose and alter the orientation and position of one of the molecules. Calculate the R-factor resulting from the new arrangement (Rnew). • (4) If Rnew < Rold, the new arrangement is accepted and we start again from (3). • (5) If the new R-factor is worse, we still accept the move with probability exp[ –(Rnew – Rold) / T ].
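The acceptance rule above is the standard Metropolis criterion. A minimal sketch of the loop follows, assuming a toy score in place of the real R-factor: `toy_score`, `toy_perturb`, `TARGET` and all numerical values are hypothetical stand-ins for illustration and are not taken from the program.

```python
import math
import random


def metropolis_anneal(score, state, perturb, temperature, steps, seed=0):
    """Minimise score(state) with Metropolis moves at a fixed temperature (> 0)."""
    rng = random.Random(seed)
    r_old = score(state)
    best_state, best_r = list(state), r_old
    for _ in range(steps):
        # Randomly choose one molecule and alter its parameters.
        idx = rng.randrange(len(state))
        trial = list(state)
        trial[idx] = perturb(state[idx], rng)
        r_new = score(trial)
        # Always accept an improvement; accept a worse arrangement with
        # probability exp[-(Rnew - Rold) / T].
        if r_new < r_old or rng.random() < math.exp(-(r_new - r_old) / temperature):
            state, r_old = trial, r_new
            if r_old < best_r:
                best_state, best_r = list(state), r_old
    return best_state, best_r


# Hypothetical toy problem: two "molecules", six parameters each.
TARGET = [(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), (0.9, 0.8, 0.7, 0.6, 0.5, 0.4)]


def toy_score(state):
    """Mean absolute deviation from TARGET (a stand-in for the R-factor)."""
    return sum(abs(a - b) for mol, ref in zip(state, TARGET)
               for a, b in zip(mol, ref)) / 12.0


def toy_perturb(mol, rng):
    """Shift every parameter of one molecule by a small random amount."""
    return tuple(p + rng.uniform(-0.05, 0.05) for p in mol)


start = [(0.5,) * 6, (0.5,) * 6]
final_state, final_r = metropolis_anneal(toy_score, start, toy_perturb,
                                         temperature=0.005, steps=20000)
print(f"score: {toy_score(start):.4f} -> {final_r:.4f}")
```

The sketch keeps the temperature fixed; a cooling schedule would simply make `temperature` a function of the step number.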
Speeding it up : • Avoid FFTs : calculate and store (in core) the molecular transform of the search model. • Keep a table containing the contribution of each molecule to each reflection. • CPU time per step ~ Number of reflections in P1.
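The bookkeeping behind the per-molecule table can be sketched as follows. This is only an illustration under strong simplifications: each "molecule" is a single point scatterer, and the hypothetical `point_contribution` stands in for sampling the stored molecular transform; symmetry mates are ignored.

```python
import cmath


def point_contribution(position, hkl):
    """Toy stand-in for a molecular-transform lookup: one point scatterer."""
    phase = 2.0 * cmath.pi * sum(h * x for h, x in zip(hkl, position))
    return cmath.exp(1j * phase)


class ContributionTable:
    """Keeps each molecule's contribution to each reflection, plus the sum."""

    def __init__(self, positions, reflections):
        self.reflections = reflections
        self.table = [[point_contribution(p, hkl) for hkl in reflections]
                      for p in positions]
        # Total calculated structure factor per reflection.
        self.total = [sum(column) for column in zip(*self.table)]

    def move(self, index, new_position):
        """Update the totals after moving one molecule: one pass over reflections."""
        row = self.table[index]
        for k, hkl in enumerate(self.reflections):
            new = point_contribution(new_position, hkl)
            self.total[k] += new - row[k]  # swap old contribution for new one
            row[k] = new


reflections = [(1, 0, 0), (0, 1, 0), (1, 1, 1), (2, 0, 1)]
table = ContributionTable([(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)], reflections)
table.move(1, (0.45, 0.5, 0.62))
print([round(abs(f), 4) for f in table.total])
```

Because only the moved molecule's row is recomputed, the cost of a step grows with the number of reflections and not with the number of molecules.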
Annealing schedules : • Constant temperature run. • Linear temperature gradient (slow cooling). • Boltzmann annealing (logarithmic schedule). • “Heating bath” mode. The temperature is automatically adjusted in such a way as to keep the fraction of moves performed against the gradient of the target function constant and equal to a user-defined value.
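The four modes can be sketched as small functions. The slides do not give the exact formulas, so everything below is an assumption: the linear form, the logarithmic form T0 / ln(e + step), and the multiplicative `gain` used for the "heating bath" feedback are illustrative choices only.

```python
import math


def constant_schedule(t_start, step, total_steps):
    """Constant temperature run."""
    return t_start


def linear_schedule(t_start, step, total_steps):
    """Slow cooling: temperature falls linearly and reaches 0 at the last step."""
    return t_start * (1.0 - step / total_steps)


def boltzmann_schedule(t_start, step, total_steps):
    """Logarithmic cooling (assumed form): T = T0 / ln(e + step)."""
    return t_start / math.log(math.e + step)


def heating_bath_update(temperature, uphill_fraction, target_fraction, gain=1.05):
    """Nudge T so the observed fraction of uphill moves approaches the target."""
    if uphill_fraction < target_fraction:
        return temperature * gain   # too few uphill moves: heat up
    if uphill_fraction > target_fraction:
        return temperature / gain   # too many uphill moves: cool down
    return temperature


for step in (0, 10, 1000):
    print(step, round(boltzmann_schedule(0.068, step, 1000), 5))
```

Note that the linear form reaches zero on the final step, so a Metropolis test dividing by T would need to stop one step earlier or clamp T to a small positive value.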
Temperature determination :
At T=0.3125000, average R=0.59937
At T=0.1562500, average R=0.59707
At T=0.0781250, average R=0.59861
At T=0.0390625, average R=0.59028
At T=0.0195312, average R=0.58783
At T=0.0097656, average R=0.57545
At T=0.0048828, average R=0.55527
At T=0.0024414, average R=0.53016
At T=0.0012207, average R=0.52038
At T=0.0006104, average R=0.51799
At T=0.0003052, average R=0.51524
Move size control : • Constant move size : max(Δt) = dmin / max(a,b,c), max(Δκ) = dmin (in degrees). • Move size linearly dependent on current R-factor and time step : max(Δt) = 0.5 R (1.0 – t/ttotal), max(Δκ) = π R (1.0 – t/ttotal)
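The two rules translate directly into code. In the sketch below the function and argument names are hypothetical, and the units are assumptions: translations are taken as fractional coordinates, and the R-dependent rotation limit is taken to be in radians because of the factor π.

```python
import math


def constant_move_size(d_min, cell_edges):
    """max(dt) = dmin / max(a, b, c); max(dkappa) = dmin, in degrees."""
    max_dt = d_min / max(cell_edges)
    max_dkappa_deg = d_min
    return max_dt, max_dkappa_deg


def adaptive_move_size(r_factor, step, total_steps):
    """Limits shrink linearly with both the current R-factor and elapsed time."""
    remaining = 1.0 - step / total_steps
    max_dt = 0.5 * r_factor * remaining
    max_dkappa_rad = math.pi * r_factor * remaining
    return max_dt, max_dkappa_rad


# Hypothetical cell edges (in Angstrom) with 4 Angstrom data.
print(constant_move_size(4.0, (40.0, 50.0, 80.0)))
print(adaptive_move_size(0.6, 0, 1000))
print(adaptive_move_size(0.5, 900, 1000))
```

With the adaptive rule a run that starts at R close to 0.6 can initially move a molecule by up to 0.3 of a cell edge, and the allowed moves shrink as the fit improves and the run nears its end.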
Scaling & bulk solvent correction • The default is to scale |Fc|’s to |Fo|’s using both a scale and a temperature factor even at the relatively low resolution used for molecular replacement calculations. • The program implements the exponential scaling model algorithm which allows a computationally efficient and model-independent correction to be applied : Fcorrected = Fp { 1.0 – ksol exp[ –Bsol / d² ] }
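The correction formula can be evaluated per reflection as in the sketch below. The function name and the values chosen for ksol and Bsol are hypothetical and serve only to show how the factor behaves across the resolution range.

```python
import math


def bulk_solvent_correct(f_protein, d_spacing, k_sol, b_sol):
    """Fcorrected = Fp * {1.0 - ksol * exp[-Bsol / d^2]}, d in Angstrom."""
    return f_protein * (1.0 - k_sol * math.exp(-b_sol / d_spacing ** 2))


# Hypothetical parameters, for illustration only.
K_SOL, B_SOL = 0.8, 150.0
for d in (15.0, 8.0, 3.5):
    print(d, round(bulk_solvent_correct(100.0, d, K_SOL, B_SOL), 2))
```

The exponential tends to 1 as d grows, so the low-resolution terms are attenuated most strongly, while at high resolution the factor approaches 1 and the amplitudes are left almost unchanged.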
Examples : An 11D problem. • Target structure 1lys, model 2ihl (rmsd 1.52 & 1.56Å). • Two molecules of lysozyme per asymmetric unit. • Monoclinic space group (P2₁), 4Å data. • ±20% noise added to error-free data. • Solutions appear after ~3.8 hours of CPU time.
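The dimensionalities quoted for this and the following examples are consistent with six parameters per molecule, minus one for every direction along which the space group leaves the origin free. The slides do not spell this out, so the sketch below is an inference; `FREE_ORIGIN_AXES` and `search_dimensions` are hypothetical names.

```python
# Number of directions along which the origin can be chosen freely.
FREE_ORIGIN_AXES = {"P1": 3, "P21": 1, "C2": 1, "P32": 1, "C2221": 0}


def search_dimensions(n_bodies, space_group):
    """Three rotations + three translations per body, minus free origin shifts."""
    return 6 * n_bodies - FREE_ORIGIN_AXES[space_group]


print(search_dimensions(2, "P21"))    # two lysozyme molecules
print(search_dimensions(2, "C2221"))  # two helices of one Rop monomer
print(search_dimensions(3, "P32"))    # three ribonuclease molecules
print(search_dimensions(4, "C2"))     # four poly-Ala helices
```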
Examples : A 12D problem. • Target structure 1b6q. • 30% solvent. • Search model : one poly-Alanine helix. • One monomer of Rop per a.u. • Orthorhombic space group (C222₁). • Real 15-4Å data. • About 120 minutes of CPU time per run.
Examples : A 17D problem. • Target structure 1a2p, model 2bni. • Three molecules of ribonuclease per asymmetric unit. • Trigonal space group (P3₂), 15-4Å data. • ±10% noise added to error-free data. • 2.5 days per run on an Intel PIII at 800MHz.
Examples : A 23D problem. • Target structure : monoclinic form of the A31P Rop mutant containing the equivalent of one 4-α-helix bundle in the asymmetric unit (two monomers). • The structure of the orthorhombic form of the same mutant is known (1B6Q.pdb). • We had been consistently failing to make any progress since December 1998.
Examples : A 23D problem. • Tried AMoRe & molrep using as search models individual helices, one monomer (helix-turn-helix), or the complete 4-α-helical bundle, with or without side-chains, and at various resolution ranges. • Tried X-plor and CNS with several combinations of models, data and PC-refinement protocols. • Even did an extensive heavy-atom derivative search.
Examples : A 23D problem. • Systematic search with AMoRe using one poly-Ala helix as search model : • Keep the best 750 models for the first helix (by combining the best 15 orientations with the best 50 positions). • For each of those one-helix models, search with a second helix (562,500 models). Keep only those solutions that simultaneously decrease R and increase correlation (29,638 two-helix models). • For each of these, search with a third helix (22.2 million models). Keep only those models for which the addition of the third helix both decreased R and increased correlation (273,258 models). • Best R=0.583, best Corr=0.37.
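The model counts in this systematic search follow from simple multiplication, as the short check below shows. It assumes, as the quoted figures imply, that each added helix is tried in all 750 orientation-position combinations; the variable names are hypothetical.

```python
# Candidate placements for a single helix: best orientations x best positions.
placements = 15 * 50

# Every one-helix model is combined with every placement of a second helix.
two_helix_trials = placements * placements

# Only the two-helix models that survived the filter are extended further.
two_helix_kept = 29_638
three_helix_trials = two_helix_kept * placements

print(placements, two_helix_trials, three_helix_trials)
```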
Examples : A 23D problem. • Target structure : monoclinic form of 1b6q, model : one poly-Ala helix (13% of atoms). • Four helices per asymmetric unit. • Space group C2, 15-3.5Å data. • Target function 1.0-Corr(Fo,Fc) • 36 hours per run on an Intel PIII at 800MHz.
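The two kinds of target function mentioned in these slides, the R-factor and 1.0-Corr(Fo,Fc), can be sketched on plain lists of amplitudes. This is a simplified illustration: the single scale factor used here (ratio of sums) is an assumption and omits the temperature-factor scaling the program applies by default, and the function names are hypothetical.

```python
import math


def r_factor(f_obs, f_calc):
    """R = sum|Fo - k*Fc| / sum(Fo), with k chosen as sum(Fo) / sum(Fc)."""
    scale = sum(f_obs) / sum(f_calc)
    return sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)


def one_minus_corr(f_obs, f_calc):
    """1.0 - linear correlation coefficient between Fo and Fc."""
    n = len(f_obs)
    mean_o, mean_c = sum(f_obs) / n, sum(f_calc) / n
    cov = sum((fo - mean_o) * (fc - mean_c) for fo, fc in zip(f_obs, f_calc))
    var_o = sum((fo - mean_o) ** 2 for fo in f_obs)
    var_c = sum((fc - mean_c) ** 2 for fc in f_calc)
    return 1.0 - cov / math.sqrt(var_o * var_c)


f_obs = [120.0, 80.0, 45.0, 200.0, 10.0]
f_calc = [100.0, 95.0, 40.0, 170.0, 30.0]
print(round(r_factor(f_obs, f_calc), 4), round(one_minus_corr(f_obs, f_calc), 4))
```

Both functions decrease as the agreement improves, so either can be minimised by the same annealing loop. The correlation does not depend on the overall scale between Fo and Fc.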
Disadvantages : • In most cases, treating the problem as 6n-dimensional is a waste of CPU time. • You can only have one search model (i.e. you cannot search simultaneously with your DNA & protein models). • The structure of the search model is kept fixed throughout the calculation.
Disadvantages : • The (putative) evidence from the self-rotation function and/or the native Patterson function is ignored. • When the starting model deviates significantly from the target structure, (i) there is no guarantee that the global minimum of any chosen statistic will correspond to the correct solution, (ii) traditional methods may be more sensitive in identifying the correct solution.
Advantages : • If there are just one or two molecules per asymmetric unit and CPU time is not a problem, the method can be used as a last-ditch effort to conclusively show that there is no such thing as a pronounced global minimum (or otherwise ?). • The computational procedures differ so much from those used in conventional methods that the results obtained can be considered as independent.