This text explores how Syd influenced the author's career in computational crystallography, with a focus on Hall symbols and their impact on space group representation. It also discusses the development of CIF and Python in crystallography, as well as the contributions of the PHENIX project.
Not retired: Hall symbols + CIF, or How Syd influenced my life without me noticing it. Ralf W. Grosse-Kunstleve, Computational Crystallography Initiative, Lawrence Berkeley National Laboratory. White-Hall Retirement Symposium, July 16/17, 2007
My connections to Syd • PhD project: Zeolite structure determination from powder data using extracted intensities • focus • sginfo: Hall symbols • Contributions to Xplor/CNS • Joined Axel Brunger’s group encouraged by Syd (ECM 1995) • Single-crystal protein crystallography • About 80% of all PDB entries refined with Xplor/CNS • CNS PDB deposition via mmCIF files • Phenix project • Automation of protein structure determination • Fresh start after losing a legal battle • The P in Phenix is for Python: recommended by Syd
Hall (1981) symbols: concise space group symbols • Symbols for crystallographic space groups • Designed to overcome limitations of the familiar Hermann-Mauguin symbols (e.g. P 21 21 21) • H-M symbols are defined in International Tables for Crystallography • Created by a generation of scientists who didn’t have computers • Define the space group type uniquely • But not the exact setting: inadequate for automatic processing • Attempts to add a rule set to H-M symbols lead to complicated algorithms (never standardized) • H-M symbols cover only a very limited subset of the settings that appear, e.g. in the generation of group-subgroup relations
Hall symbols • Designed for automatic processing • No ambiguities • Via attached transformation symbols, any setting of any crystallographic space group can be represented (Int. Tab. Vol. B, 2001) • Applications • Determination of space group type • Automatic determination of allowed origin shifts • Automatic group-subgroup processing • Automatic derivation of twin laws • Primitive setting of centered space groups • Reduces memory requirements
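To illustrate what "designed for automatic processing" can mean in practice, here is a minimal sketch that expands the two generators encoded by the Hall symbol "P 2ac 2ab" (P 21 21 21) into the full set of symmetry operations. The function names and the tuple-based representation are assumptions made for this example; this is not the sginfo or cctbx implementation, and it does not parse Hall symbols itself.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ZERO = Fraction(0)

# An operation is (rotation matrix, translation vector), translations taken mod 1.
IDENTITY = (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (ZERO, ZERO, ZERO))


def multiply(a, b):
    """Return the product a*b (apply b first, then a)."""
    (ra, ta), (rb, tb) = a, b
    rot = tuple(
        tuple(sum(ra[i][k] * rb[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )
    trans = tuple(
        (sum(ra[i][k] * tb[k] for k in range(3)) + ta[i]) % 1
        for i in range(3)
    )
    return (rot, trans)


def expand(generators):
    """Close a list of generators under multiplication (modulo lattice translations)."""
    group = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        op = frontier.pop()
        for gen in generators:
            new_op = multiply(gen, op)
            if new_op not in group:
                group.add(new_op)
                frontier.append(new_op)
    return group


# Generators written out by hand from "P 2ac 2ab":
# "2ac" is a twofold along c with translation (1/2, 0, 1/2),
# "2ab" is a twofold along a with translation (1/2, 1/2, 0).
TWO_Z_AC = (((-1, 0, 0), (0, -1, 0), (0, 0, 1)), (HALF, ZERO, HALF))
TWO_X_AB = (((1, 0, 0), (0, -1, 0), (0, 0, -1)), (HALF, HALF, ZERO))

P212121 = expand([TWO_Z_AC, TWO_X_AB])
```

The closure contains four operations, including the third screw axis (-x, y+1/2, -z+1/2), which is not written in the symbol but follows from the two generators.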
Translation table: Hall / Hermann-Mauguin
/* 081 */ " P -4", /* P -4 */
/* 082 */ " I -4", /* I -4 */
/* 083 */ "-P 4", /* P 4/m */
/* 084 */ "-P 4c", /* P 42/m */
/* 085 */ "-P 4a", /* P 4/n :2 */
/* 086 */ "-P 4bc", /* P 42/n :2 */
/* 087 */ "-I 4", /* I 4/m */
/* 088 */ "-I 4ad", /* I 41/a :2 */
/* 089 */ " P 4 2", /* P 4 2 2 */
/* 090 */ " P 4ab 2ab", /* P 4 21 2 */
/* 091 */ " P 4w 2c", /* P 41 2 2 */
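The same excerpt can be held as a small lookup table. The sketch below copies the eleven entries shown above into a Python dict; the names `HALL_TABLE` and `has_leading_inversion` are hypothetical and chosen for this example only. It relies on the Hall convention that a leading "-" marks a centre of inversion at the origin.

```python
# Space group number -> (Hall symbol, Hermann-Mauguin symbol), copied from the table above.
HALL_TABLE = {
    81: (" P -4", "P -4"),
    82: (" I -4", "I -4"),
    83: ("-P 4", "P 4/m"),
    84: ("-P 4c", "P 42/m"),
    85: ("-P 4a", "P 4/n :2"),
    86: ("-P 4bc", "P 42/n :2"),
    87: ("-I 4", "I 4/m"),
    88: ("-I 4ad", "I 41/a :2"),
    89: (" P 4 2", "P 4 2 2"),
    90: (" P 4ab 2ab", "P 4 21 2"),
    91: (" P 4w 2c", "P 41 2 2"),
}


def has_leading_inversion(hall_symbol):
    """True if the Hall symbol starts with '-', i.e. an inversion centre at the origin."""
    return hall_symbol.startswith("-")


# Numbers in this excerpt whose Hall symbol carries the leading '-'.
CENTRIC_NUMBERS = sorted(
    number for number, (hall, _hm) in HALL_TABLE.items() if has_leading_inversion(hall)
)
```

Note that the check looks only at the first character: the "-4" inside " P -4" is a rotoinversion axis, not the leading inversion flag.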
STAR + CIF • Situation before CIF • Vast, diverse variety of data formats • Need to reformat data all the time is a real impediment to scientific progress • What does “gof” mean? • What does it mean exactly? • STAR: Self-defining Text Archival and Retrieval format • Hall, S. R., "The STAR File: A New Format for Electronic Data Transfer and Archiving," J. Chem. Inf. Comput. Sci. 31, 326-333 (1991). • Defines format • Framework for defining semantics • CIF (1991) • Based on STAR, defines semantics • Similar in concept to XML schema, but a decade ahead • CIF is the de facto standard in small-molecule crystallography • Macromolecular community has more difficulties with the semantics part of CIF
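As a rough illustration of the split between format and semantics, the sketch below reads simple `_tag value` pairs from a CIF-style data block. It is a deliberately minimal reader written for this example (the function name `parse_simple_cif` is hypothetical): it ignores `loop_` constructs and multi-line text fields, and it borrows `shlex` quoting rules, which only approximate CIF's. The tag names in the sample are taken from the core CIF dictionary; the values are made up.

```python
import shlex

SAMPLE = """\
# made-up values, for illustration only
data_example
_symmetry_space_group_name_H-M   'P 21 21 21'
_symmetry_space_group_name_Hall  'P 2ac 2ab'
_refine_ls_goodness_of_fit_ref   1.05
"""


def parse_simple_cif(text):
    """Return {block name: {tag: value}} for single-line tag/value pairs only."""
    blocks = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower().startswith("data_"):
            current = blocks.setdefault(line[len("data_"):], {})
        elif line.startswith("_") and current is not None:
            parts = shlex.split(line)
            if len(parts) == 2:  # anything else is outside this sketch's scope
                current[parts[0]] = parts[1]
    return blocks


BLOCKS = parse_simple_cif(SAMPLE)
```

The format part tells a program where tags and values are; the dictionary entry behind a tag such as `_refine_ls_goodness_of_fit_ref` is what answers "what does gof mean exactly?".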
Python • Situation: forced end of CNS development • Legal reasons • Axel Brunger wanted Paul Adams and me to continue methods development in a different way • We started exploring the world around us • We wanted a scripting language like CNS with additional compiled components • IUCr meeting Glasgow 1999 • Watching solar eclipse with Syd • BTW: use Python
The PHENIX project: a collaboration between several groups • Computational Crystallography Initiative (LBNL): Paul Adams, Ralf Grosse-Kunstleve, Pavel Afonine, Nigel Moriarty, Nicholas Sauter, Peter Zwart (CCI APPS) • Los Alamos National Lab (LANL): Tom Terwilliger, Li-Wei Hung (SOLVE / RESOLVE) • Cambridge University: Randy Read, Airlie McCoy (PHASER) • Texas A&M University: Tom Ioerger, Jim Sacchettini, Erik McKee (TEXTAL) • Duke University: Jane Richardson, David Richardson, Ian Davis (MolProbity / REDUCE) • Funding: NIH Program Project (NIGMS, PSI), Director: Paul Adams
Spectrum of phenix components • Automated analysis of data quality: phenix.xtriage • Rapid substructure determination: phenix.hyss • Phasing: Maximum likelihood – SOLVE, PHASER for SAD • Density modification: Statistical density modification (RESOLVE) • Automated model building: • Pattern matching methods (RESOLVE or TEXTAL) • Structure refinement: phenix.refine (likelihood, annealing, TLS) • Advanced automation: AutoSol – hkl to map • Ligand building and fitting: eLBOW, AutoLigand • Validation and Hydrogens: MolProbity + Reduce
phenix.refine - Restrained refinement (xyz, iso/aniso ADP) - Automatic water picking - Bond density - Unrestrained refinement • FFT or direct summation • Hydrogens - Group ADP refinement - Rigid body refinement - Automatic NCS restraints - Simulated Annealing - Occupancies (individual, group) - TLS refinement - Twinned data • X-ray, Neutron, joint X-ray + Neutron refinement
Refinement flowchart • Input: PDB model, any data format (CNS, Shelx, MTZ, …) • Input data and model processing • Refinement strategy selection • Bulk-solvent, anisotropic scaling, twinning parameters refinement • Ordered solvent (add / remove) • Target weights calculation • Coordinate refinement (rigid body, individual; minimization or simulated annealing) • ADP refinement (TLS, group, individual iso / aniso) • Occupancy refinement (individual, group) • Refinement steps are repeated several times • Output: refined model, various maps, structure factors, complete statistics; files for COOT, O, PyMol
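The control flow of the flowchart can be sketched as a plain loop. Everything here is an assumption made for illustration: the step names are paraphrased from the slide, the choice of which steps sit inside the repeated part is a guess from their order, the default of three cycles is arbitrary, and none of the names correspond to the phenix.refine API.

```python
# Steps assumed to run once, before the repeated part of the flowchart.
SETUP_STEPS = [
    "process input data and model",
    "select refinement strategy",
]

# Steps assumed to be "repeated several times" (a guess from the slide order).
CYCLE_STEPS = [
    "bulk-solvent, anisotropic scaling, twinning parameters",
    "ordered solvent (add / remove)",
    "target weights",
    "coordinates",
    "ADPs",
    "occupancies",
]

OUTPUT_STEP = "write refined model, maps, structure factors, statistics"


def refinement_schedule(n_cycles=3):
    """Return the ordered list of step labels for n_cycles passes through the loop."""
    schedule = list(SETUP_STEPS)
    for cycle in range(1, n_cycles + 1):
        schedule.extend(f"cycle {cycle}: {step}" for step in CYCLE_STEPS)
    schedule.append(OUTPUT_STEP)
    return schedule


SCHEDULE = refinement_schedule()
```

The sketch only records the order of operations; each label stands in for a substantial calculation in the real program.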
Summary • Hall symbols open the door to automatic processing of crystallographic symmetry • STAR + CIF enable automation of data flow between researchers and archives • Our experience with Python suggests: it is a good idea to listen to Syd! • Syd’s wisdom! … Gnu Xtal System, available as an open source project, … but with the expectation that Sourceforge may be its twilight resting place, afore it's entombed by the sands of time, right alongside The King of Kings.
Acknowledgments • Syd • for shining the light in the right direction • for showing me Perth • Phenix developers • P.D. Adams • P. Afonine • T.R. Ioerger • A.J. McCoy • E.W. McKee • N.W. Moriarty • R.J. Read • N.K. Sauter • J.N. Smith • L.C. Storoni • T.C. Terwilliger • P.H. Zwart • Funding: • LBNL (DE-AC03-76SF00098) • NIH/NIGMS (1P01GM063210) • PHENIX Industrial Consortium