
Reusing phenix.refine for powder data?

Ralf W. Grosse-Kunstleve, Computational Crystallography Initiative, Lawrence Berkeley National Laboratory. Workshop on developments and directions of powder diffraction on proteins, June 22/23, 2007.


Presentation Transcript


  1. Reusing phenix.refine for powder data? Ralf W. Grosse-Kunstleve, Computational Crystallography Initiative, Lawrence Berkeley National Laboratory. Workshop on developments and directions of powder diffraction on proteins, June 22/23, 2007

  2. My two lives • Life 1 (PhD project): • Zeolite structure determination from powder data using extracted intensities • Life 2: • Contributions to Xplor/CNS • Single-crystal protein crystallography • About 80% of all PDB entries refined with Xplor/CNS • Phenix project • Fresh start after losing a legal battle

  3. Phenix Collaboration
     • Computational Crystallography Initiative (LBNL), CCI APPS: Paul Adams, Ralf Grosse-Kunstleve, Pavel Afonine, Nigel Moriarty, Nicholas Sauter, Peter Zwart
     • Los Alamos National Lab (LANL), SOLVE / RESOLVE: Tom Terwilliger, Li-Wei Hung
     • Cambridge University, PHASER: Randy Read, Airlie McCoy
     • Texas A&M University, TEXTAL: Tom Ioerger, Jim Sacchettini, Erik McKee
     • Duke University, MolProbity / REDUCE: Jane Richardson, David Richardson, Ian Davis
     • Funding: NIH Program Project (NIGMS, PSI), Director - Paul Adams

  4. Spectrum of phenix components • Automated analysis of data quality: phenix.xtriage • Rapid substructure determination: phenix.hyss • Phasing: Maximum likelihood – SOLVE, PHASER for SAD • Density modification: Statistical density modification (RESOLVE) • Automated model building: • Pattern matching methods (RESOLVE or TEXTAL) • Structure refinement: phenix.refine (likelihood, annealing, TLS) • Advanced automation: AutoSol – hkl to map • Ligand building and fitting: eLBOW, AutoLigand • Validation and Hydrogens: MolProbity + Reduce

  5. phenix.refine - Restrained refinement (xyz, iso/aniso ADP) - Automatic water picking - Bond density - Unrestrained refinement • FFT or direct summation • Hydrogens - Group ADP refinement - Rigid body refinement - Automatic NCS restraints - Simulated Annealing - Occupancies (individual, group) - TLS refinement - Twinned data • X-ray, Neutron, joint X-ray + Neutron refinement

  6. Refinement flowchart
     • PDB model, any data format (CNS, Shelx, MTZ, …)
     • Input data and model processing
     • Refinement strategy selection
     • Repeated several times:
       • Bulk-solvent, anisotropic scaling, twinning parameters refinement
       • Ordered solvent (add / remove)
       • Target weights calculation
       • Coordinate refinement (rigid body, individual) (minimization or Simulated Annealing)
       • ADP refinement (TLS, group, individual iso / aniso)
       • Occupancy refinement (individual, group)
     • Output: Refined model, various maps, structure factors, complete statistics
     • Files for COOT, O, PyMol

  7. Designed to be very easy to use
     Refinement of individual coordinates and B-factors:
       % phenix.refine model.pdb data.hkl
     Same as above plus water picking:
       % phenix.refine model.pdb data.hkl ordered_solvent=true
     Run with parameter file:
       % phenix.refine model.pdb data.hkl parameter_file
       refinement.main {
         high_resolution = 2.0
         simulated_annealing = True
         ordered_solvent = True
         number_of_macro_cycles = 5
       }
       refinement.refine.adp {
         tls = chain A
         tls = chain B
       }
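
The parameter file on slide 7 is plain nested text, so it can be generated by a script when many refinements are set up. The following is a minimal sketch, not part of Phenix: `render_scope` and the file name `refine.params` are hypothetical, the scope and parameter names are copied from the slide, and the command is only assembled, never executed.

```python
def render_scope(name, assignments):
    """Render one `name { key = value ... }` block.

    `assignments` is a list of (key, value) pairs rather than a dict,
    so that a key such as `tls` can be repeated.
    """
    lines = [f"{name} {{"]
    lines += [f"  {key} = {value}" for key, value in assignments]
    lines.append("}")
    return "\n".join(lines)


# Values copied from the slide.
scopes = [
    ("refinement.main", [
        ("high_resolution", 2.0),
        ("simulated_annealing", True),
        ("ordered_solvent", True),
        ("number_of_macro_cycles", 5),
    ]),
    ("refinement.refine.adp", [
        ("tls", "chain A"),
        ("tls", "chain B"),
    ]),
]

parameter_text = "\n".join(render_scope(n, a) for n, a in scopes) + "\n"

# Hypothetical file name; the argument list mirrors the command on the slide.
parameter_file = "refine.params"
command = ["phenix.refine", "model.pdb", "data.hkl", parameter_file]

if __name__ == "__main__":
    print(parameter_text)
    print(" ".join(command))
```

Writing `parameter_text` to `parameter_file` and passing `command` to a process launcher would be the obvious next step, left out here so the sketch runs without Phenix installed.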

  8. How to best make ends meet? • GSAS & proteins • Extending a small-molecule powder program to deal with proteins • Advantage: program designed for the field • Community used to inputs, outputs, idiosyncrasies • Disadvantage: some approaches suitable for small molecules don’t scale • Direct-summation structure factor calculation • Neighborhood calculations (nonbonded interactions, a.k.a. anti-bumping restraints) • phenix.refine • Extending a single-crystal protein program to deal with powders • Advantage: program designed to deal with large structures • Protein, RNA/DNA restraint libraries, optimized algorithms • Disadvantage: new data formats, differences in terminology
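
Why direct summation does not scale to proteins can be seen from a bare-bones version of the calculation. The sketch below is illustrative only and rests on simplifying assumptions that are not in the slides: point atoms with constant weights, space group P1, no form factors and no ADPs. The function name `direct_summation` is hypothetical.

```python
import cmath


def direct_summation(miller_indices, sites_frac, weights=None):
    """F(hkl) = sum_j w_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j)).

    Simplified: point atoms, P1, fractional coordinates.
    """
    if weights is None:
        weights = [1.0] * len(sites_frac)
    structure_factors = []
    for h, k, l in miller_indices:                      # every reflection ...
        f = 0j
        for w, (x, y, z) in zip(weights, sites_frac):   # ... times every atom
            f += w * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
        structure_factors.append(f)
    return structure_factors


if __name__ == "__main__":
    sites = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
    for hkl, f in zip([(1, 0, 0), (2, 0, 0)],
                      direct_summation([(1, 0, 0), (2, 0, 0)], sites)):
        print(hkl, abs(f))
```

The nested loop costs (number of reflections) × (number of atoms) exponential evaluations. Both factors grow with the size of the structure, which is why an FFT-based calculation becomes attractive for proteins.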

  9. Two main challenges • Challenge 1: • Input/output of powder-specific format • Fundamentally trivial but potentially tedious • New command? • No interference with existing, non-trivial algorithms for automatic recognition, processing, and consolidation of already very heterogeneous inputs • Extend the existing input algorithms? • Nicer, but requires higher degree of collaboration • Challenge 2: • Development of a powder-specific target function • Based on extracted intensities or primary pattern + pre-fitted profile parameters? • Maximum likelihood with or without cross-validation? • Will probably require some refactoring of the refinement engine
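
For Challenge 2, one conceivable starting point based on extracted intensities is a least-squares residual in which overlapping reflections contribute only through their summed intensity. The sketch below is an assumption made for illustration, not a target that exists in phenix.refine or one proposed in the talk; the function name `overlap_ls_target` and its data layout are hypothetical.

```python
def overlap_ls_target(i_obs_groups, groups, i_calc):
    """Scaled least-squares residual on group-summed intensities.

    i_obs_groups: observed total intensity for each overlap group
    groups:       for each group, the indices into i_calc of its reflections
    i_calc:       calculated intensity for each reflection
    Returns (normalized residual, least-squares scale factor).
    """
    # Sum calculated intensities within each overlap group.
    i_calc_groups = [sum(i_calc[j] for j in members) for members in groups]
    # Optimal scale k minimizing sum (obs - k * calc)^2.
    numerator = sum(o * c for o, c in zip(i_obs_groups, i_calc_groups))
    denominator = sum(c * c for c in i_calc_groups)
    scale = numerator / denominator if denominator > 0 else 0.0
    residual = sum((o - scale * c) ** 2
                   for o, c in zip(i_obs_groups, i_calc_groups))
    norm = sum(o * o for o in i_obs_groups)
    return residual / norm, scale


if __name__ == "__main__":
    i_calc = [1.0, 2.0, 3.0, 4.0]
    groups = [[0, 1], [2], [3]]          # reflections 0 and 1 overlap
    i_obs_groups = [6.0, 6.0, 8.0]
    print(overlap_ls_target(i_obs_groups, groups, i_calc))
```

A maximum-likelihood target with cross-validation, as raised on the slide, would replace the residual but could keep the same grouping of overlapped reflections.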

  10. Modular design • Application level • phenix wizards (data in, structure out) • phenix.refine • phenix.hyss (hybrid substructure search) • Visible source • Library level • cctbx project, organized in modules • libtbx, scitbx, cctbx, iotbx, mmtbx • cctbx is intended to cover small-molecule work • But nothing yet specific to powders • Unrestricted open source

  11. Existing target functions • Least-squares (variety) • Maximum likelihood on amplitudes • Maximum likelihood with experimental phases • Least-squares twin target • SAD-specific maximum likelihood target implemented in Phaser • Reusing target from external application! • Dirty laundry • Severe code duplication in implementation of twin target • Needs to be consolidated • Some friction integrating the Phaser ML-SAD target • Phaser target relatively slow: we need better bookkeeping to avoid repeated calculations with exactly the same input

  12. Precedent for reusing cctbx? • cctbx used heavily by all phenix collaborators • Phaser uses cctbx -> cctbx supported by CCP4 6.0 and up • smtbx: small-molecule toolbox • Group at Durham University, U.K. collaborating with David Watkin at Oxford University, U.K. • Long-term goal: highly integrated single-crystal structure determination (direct methods), automatic model building and refinement • Initial focus: iterative model building and refinement • Initial approach: reuse + adjust cctbx core libraries directly combined with copying sub-modules to smtbx where they are modified • Long term: consolidate duplications as much as possible • half the code = half the bugs, reuse of optimizations

  13. Summary of ideas • Implement powder-specific target function(s) that plug into the refinement engine in the open source cctbx libraries • Can be done stand-alone using ad-hoc input/output methods • Collaborate in making the necessary adjustments to the existing libraries • Figure out the best way to handle input/output at the application level • Learn and re-evaluate as we go • If the powder field joins in there will be the potential for direct cross-fertilization between three specializations in crystallography • Single-crystal protein • Single-crystal small-molecule • Powder diffraction protein • More? (powder diffraction small-molecule) • cctbx libraries are very general • Ever increasing integration is the secret behind the stunning successes in the development of computing technology • Can we make this idea work in crystallography?
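
The first idea on slide 13, target functions that plug into a shared refinement engine, amounts to keeping the minimizer ignorant of where the target comes from. The toy sketch below illustrates that separation only. It is not the cctbx or phenix.refine API: all names are hypothetical, the targets are simple quadratics, and a finite-difference steepest descent stands in for the real minimizer.

```python
from typing import Callable, List, Sequence

Target = Callable[[Sequence[float]], float]


def numerical_gradient(target: Target, params: Sequence[float],
                       step: float = 1e-6) -> List[float]:
    """Central finite-difference gradient of target at params."""
    gradient = []
    for i in range(len(params)):
        plus, minus = list(params), list(params)
        plus[i] += step
        minus[i] -= step
        gradient.append((target(plus) - target(minus)) / (2 * step))
    return gradient


def minimize(target: Target, start: Sequence[float],
             rate: float = 0.1, n_cycles: int = 200) -> List[float]:
    """Toy engine: fixed-step steepest descent on any plugged-in target."""
    params = list(start)
    for _ in range(n_cycles):
        gradient = numerical_gradient(target, params)
        params = [p - rate * g for p, g in zip(params, gradient)]
    return params


def single_crystal_like_target(params: Sequence[float]) -> float:
    # Toy: every parameter is individually "observed".
    return (params[0] - 1.0) ** 2 + (params[1] - 2.0) ** 2


def powder_like_target(params: Sequence[float]) -> float:
    # Toy: only the sum is "observed", mimicking overlapped data.
    return (params[0] + params[1] - 3.0) ** 2


if __name__ == "__main__":
    print(minimize(single_crystal_like_target, [0.0, 0.0]))
    print(minimize(powder_like_target, [0.0, 0.0]))
```

The same `minimize` call serves both targets. The second toy target also shows the price of overlap: only the sum of the two parameters is determined, so the result depends on the starting point.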

  14. Availability • Phenix incl. Graphical User Interface • http://www.phenix-online.org/ • Freely available to academic (non-profit) groups • Core libraries (cctbx) • http://cctbx.sourceforge.net/ • Freely available to all

  15. Acknowledgments • Phenix developers • P.D. Adams • P. Afonine • T.R. Ioerger • A.J. McCoy • E.W. McKee • N.W. Moriarty • R.J. Read • N.K. Sauter • J.N. Smith • L.C. Storoni • T.C. Terwilliger • P.H. Zwart • Funding: • LBNL (DE-AC03-76SF00098) • NIH/NIGMS (1P01GM063210) • PHENIX Industrial Consortium
