
What’s new and Automation developments in CCP4

What’s new and Automation developments in CCP4. Ronan Keegan CCP4, STFC Daresbury Laboratory, U.K. Quick Overview. Brief introduction to CCP4 New programs and features in CCP4 Upcoming features in version 6.1 Automation projects MrBUMP – automated Molecular Replacement


Presentation Transcript


  1. What’s new and Automation developments in CCP4 Ronan Keegan CCP4, STFC Daresbury Laboratory, U.K.

  2. Quick Overview • Brief introduction to CCP4 • New programs and features in CCP4 • Upcoming features in version 6.1 • Automation projects • MrBUMP – automated Molecular Replacement • Other automation projects

  3. What is CCP4? • Collaborative Computational Project Number 4 • Set up in the late 1970s to support collaboration between researchers working on protein crystallography (PX) software in the UK and to assemble a comprehensive collection of software to satisfy the computational requirements of the relevant UK groups. • Many functions: • Support and distribution of the CCP4 suite of programs for PX • Education – workshops, university visits, summer schools, study weekend • Maintaining the CCP4 bulletin board and website • Academic users can use the suite for free; commercial users pay a licence fee.

  4. CCP4 Organisational Structure [Organisation chart. Labels: Steering Committees; Exec; WG 1; WG 2; STAB; Project Leader; DL CCP4 Group; Core developments & activities; Funded Developers; Associated Developers; Occasional Contributors; Core projects, e.g. CCP4mg, mmdb, PIMS, Automation, BIOXHIT …; Major programs, e.g. Mosflm, Refmac, Scala, Phaser, Clipper, Coot …; Lots of other useful software, e.g. PDBExtract]

  5. New programs and features in CCP4 • New Packages in CCP4 6.0: • CCP4mg – Molecular Graphics • Coot – graphical toolkit for model building, model completion and validation • Phaser – molecular replacement (version 1.3.3) • Chainsaw – MR model preparation • Pirate: statistical phase improvement • Superpose: secondary structure alignment • BP3: heavy atom phasing and refinement • Chooch: anomalous scattering factors from raw fluorescence spectra • New features in CCP4i

  6. CCP4mg • The aim is to provide a molecular graphics program that is fully compatible with the CCP4 environment and programs. • Features: • Displays molecules with simple, flexible selection tools and a variety of display styles and colouring schemes. • A simple graphical interface to select the atoms to display, the colour scheme and the display style. • Surfaces and electrostatic potential calculations • Displays maps with a 'continuous crystal' and real time update of contouring level.

  7. CCP4mg (continued) • Superposes two or more protein structures automatically. • Also performs structure analysis: secondary structure, solvent accessible surface area, hydrogen bonds, close contacts. • Writes 'snapshot' images and creates movies. Also creates POV-Ray input files and PostScript files. • Runs on Linux, Windows (2000, NT and XP) and Mac OS X.

  8. Normal Mode Analysis • CCP4mg can currently perform approximate normal mode calculations using two elastic network models. • Only one atom per residue (CA) is considered • All force constants are assumed to be the same • Gaussian Network and Anisotropic Network methods are employed
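
The Gaussian Network Model mentioned on this slide starts from a connectivity (Kirchhoff) matrix built from the CA positions alone. The following is a minimal sketch of that first step, not CCP4mg's own code; the 7.0 Å contact cutoff and the function name `kirchhoff_matrix` are assumptions chosen for illustration. Because all force constants are taken to be equal, every contact contributes the same unit entry.

```python
import math


def kirchhoff_matrix(ca_coords, cutoff=7.0):
    """Build a GNM connectivity matrix from C-alpha coordinates.

    ca_coords: list of (x, y, z) tuples, one per residue.
    cutoff: contact distance in Angstrom (7.0 is an assumed, commonly used value).
    """
    n = len(ca_coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                # One identical spring per contact: off-diagonal entries are -1 ...
                gamma[i][j] = gamma[j][i] = -1
                # ... and each diagonal entry counts the contacts of that residue.
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Toy chain: four CA atoms spaced 3.8 Angstrom apart along the x axis.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
gamma = kirchhoff_matrix(coords)
```

The approximate modes are then obtained from the eigenvectors of this matrix, which in practice would be computed with a linear algebra library; that step is left out here.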

  9. Coot • Coot is for model building, model completion and validation. • It will display maps and models and allows model manipulations such as idealization, real space refinement, manual rotation/translation, rigid-body fitting, ligand search, solvation, mutations, rotamers, and Ramachandran plots. • File formats handled: PDB, mmCIF, MTZ files, Phases (.phs) and others. • Most of its functions are also accessible for scripting. http://www.ysbl.york.ac.uk/~emsley/coot/

  10. Coot

  11. Phaser • Phaser is a program for phasing macromolecular crystal structures with maximum likelihood methods. Version 1.3.3 in CCP4 6.0.2 supports the molecular replacement method. The next version will include the experimental phasing method. • Features: • Brute-force rotation and translation searches • FFT-based fast rotation and translation searches • Correction for anisotropic diffraction • Search for multiple molecules in multiple space groups • http://www-structmed.cimr.cam.ac.uk/phaser/

  12. Pirate & Superpose • Pirate: • Pirate is a new statistical phase improvement program. • It classifies the electron density map by sparseness/denseness and order/disorder, with the aim of obtaining superior results to conventional solvent mask based methods without requiring knowledge of the solvent content. • Currently available for Linux and Mac OS X. • Superpose: • Superpose aligns two structures by matching graphs built on the protein's secondary-structure elements, followed by an iterative three-dimensional alignment of protein backbone C-alpha atoms.

  13. BP3 • BP3 is a new program for obtaining phase information from an S/MIR(AS) and/or S/MAD experiment(s) by multivariate likelihood estimation. • It will refine heavy and/or anomalously scattering atomic parameters along with error parameters to generate phase information.

  14. Chooch • Program to determine what wavelengths to use to do your MAD experiment. • Determines values of anomalous scattering factors from raw fluorescence spectra and pinpoints the position of the f'' maximum and the f' minimum values. • Command line driven with all options controlled by switches. • Optional PGPLOT visual output. • Publication quality PS output generated on request.
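
The wavelength choice described on this slide can be illustrated with a short sketch. It assumes the f' and f'' curves have already been derived from the fluorescence scan (the step Chooch itself performs) and only shows the final selection: the peak wavelength at the f'' maximum and the inflection wavelength at the f' minimum. The function name and the sample numbers are hypothetical and are not Chooch output.

```python
HC_EV_ANGSTROM = 12398.42  # photon energy (eV) times wavelength (Angstrom)


def pick_mad_wavelengths(energies_ev, f_prime, f_double_prime):
    """Return (energy in eV, wavelength in Angstrom) for the peak and inflection points."""
    indices = range(len(energies_ev))
    peak_idx = max(indices, key=lambda i: f_double_prime[i])  # f'' maximum
    infl_idx = min(indices, key=lambda i: f_prime[i])         # f' minimum
    return {
        "peak": (energies_ev[peak_idx], HC_EV_ANGSTROM / energies_ev[peak_idx]),
        "inflection": (energies_ev[infl_idx], HC_EV_ANGSTROM / energies_ev[infl_idx]),
    }


# Made-up curves around an absorption edge, for illustration only.
energies = [12650.0, 12655.0, 12660.0, 12665.0, 12670.0]
fp = [-8.0, -10.5, -9.0, -7.0, -6.0]
fpp = [1.0, 3.0, 6.0, 5.0, 4.0]
choice = pick_mad_wavelengths(energies, fp, fpp)
```

A remote wavelength away from the edge is normally collected as well; choosing it is a matter of experimental judgement and is not covered by this sketch.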

  15. Chainsaw • Molecular replacement model preparation utility that mutates a template PDB file according to a sequence alignment. • Features: • Examines the sequence alignment between target and template and modifies the template PDB file by pruning non-conserved residues back to the gamma atom • More atoms are preserved than in a polyalanine model, but parts of the model which are unlikely to be present in the crystal structure and thus would only degrade the signal are pruned. [Figure: 1mr6 used as a template for 1tgx (38% sequence identity). From left to right: unmodified template, Chainsaw template, polyalanine template.]
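
The pruning rule on this slide can be sketched as follows. This is an illustration of the idea rather than Chainsaw's actual implementation: the set of atom names kept "back to the gamma atom", the decision to drop template residues aligned to a gap in the target, and the function name `prune_template` are all assumptions.

```python
BACKBONE = {"N", "CA", "C", "O"}
# Atoms kept for a non-conserved residue: backbone, CB and any gamma-position atom.
UP_TO_GAMMA = BACKBONE | {"CB", "CG", "CG1", "CG2", "OG", "OG1", "SG"}


def prune_template(aligned_target, aligned_template, template_residues):
    """Prune template residues according to a pairwise alignment.

    aligned_target, aligned_template: equal-length strings with '-' for gaps.
    template_residues: one list of atom names per template residue, in sequence order.
    """
    pruned = []
    residues = iter(template_residues)
    for tgt, tpl in zip(aligned_target, aligned_template):
        if tpl == "-":
            continue  # target-only position: there is no template residue to keep
        atoms = next(residues)
        if tgt == "-":
            continue  # assumption: template residues with no target partner are removed
        if tgt == tpl:
            pruned.append(list(atoms))  # conserved: keep the whole residue
        else:
            pruned.append([a for a in atoms if a in UP_TO_GAMMA])
    return pruned


# Toy alignment: M conserved, K has no template partner, L vs I differs,
# template G has no target partner, F conserved.
template = [
    ["N", "CA", "C", "O", "CB", "CG", "SD", "CE"],                     # MET
    ["N", "CA", "C", "O", "CB", "CG1", "CG2", "CD1"],                  # ILE
    ["N", "CA", "C", "O"],                                             # GLY
    ["N", "CA", "C", "O", "CB", "CG", "CD1", "CD2", "CE1", "CE2", "CZ"],  # PHE
]
pruned = prune_template("MKL-F", "M-IGF", template)
```

In this toy case the isoleucine loses only its CD1 atom, which shows why more atoms survive than in a polyalanine model.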

  16. New features in CCP4i • Interfaces for new programs: • Phaser, • Pirate/Clipper, • BP3, • Chainsaw, • CCP4mg launcher, • CRANK, • Shelx_C/D/E.

  17. New features in CCP4i • Database search and sort • Project shortcuts • Customise job database view • Help shortcuts

  18. CCP4 6.1 and beyond • Version 6.1 in 6-12 months' time • New Programs for 6.1 • Rapper – protein modelling, automated conformer generation • Rampage – generates Ramachandran plots for structure validation • Buccaneer – chain tracing • Pointless – determines space/Laue group from unmerged data • Oasis • Crunch2 • Afro • Clipper2 libraries • Automation scripts • MrBUMP • XIA2

  19. iMosflm • New, improved graphical user interface for Mosflm. • More user-friendly than the old one.

  20. Updates to popular CCP4 programs • Acorn • Ab initio procedure for the determination of protein structure using atomic resolution data or data artificially extended to atomic resolution, and for finding sub-structures from anomalous or isomorphous differences. • Truncate (Uboat) • New improved version written in C++. • In the longer term there will be new tests for twinning, anisotropy corrections and the ability to handle unmerged data (useful if radiation damage occurs), but these won't be in the initial release. • Phaser 2.0/2.1 • Will include experimental phasing • Refmac 5.3/6.0 • The latest version of Refmac; it will supersede version 5.2.x in the CCP4 6.0.x series.

  21. CCP4 6.1 and beyond • Plans for CCP4i • CCP4i Classic reworked • CCP4i Auto – automation scripts • CCP4i database • New database handler • Allow for greater flexibility and control of jobs • Job/DB viewer program built on top of the DB (more about this later)

  22. CCP4 6.1 and beyond • Long term plans • Better integration between CCP4i, CCP4mg and Coot • More intuitive interfaces to programs • More automation

  23. CCP4 Automation • Reasons • Higher throughput at synchrotron beamlines • Crystallography is increasingly becoming a tool for researchers in other fields. Not all have the time to learn how to use the complex set of programs for solving structures. Users prefer to concentrate on the biology.

  24. MrBUMP - Molecular Replacement with Bulk Model Preparation

  25. Aim of MrBUMP • Automated framework for Molecular Replacement • Particular emphasis on generating a variety of search models • Wraps Phaser, Molrep and Acorn • Uses a variety of helper applications (e.g. Chainsaw) and bioinformatics tools (e.g. FASTA, Mafft) • Uses on-line databases (e.g. PDB, SCOP) • Can make use of computational cluster resources to speed up the processing • In favourable cases, gives “one-button” solution • In unfavourable cases, suggests likely search models for manual investigation

  26. Pipeline [Flow diagram: Target Details (Target MTZ & Sequence) → Template Search → Model Preparation → Molecular Replacement & Refinement → Check scores and exit or select the next model]

  27. Template Search • Sequence based search (FASTA) • Secondary structure based search (SSM) • Domain search (SCOP) • Identification of possible multimers (PQS & PISA) • Users can also enter their own templates by ID or from locally held files.

  28. Model Preparation • Search models can be prepared for MR in several ways • Chainsaw – non-conserved residues are pruned (sequence provided) • Molrep – pruning of non-conserved side-chains (internal sequence alignment) • Polyalanine – all side chain atoms are pruned beyond the CB atom • PDBclip – models are not modified • An ensemble of the best models is also created for Phaser

  29. Molecular Replacement & Refinement • For each search model, MR is done with Molrep or Phaser or both. • MR programs run mostly with defaults • MrBUMP provides LABIN columns, MW of target, sequence identity of search model, number of copies to search for, number of clashes tolerated • Allow Molrep / Phaser to set resolution limits and weights • After MR, models are passed to Refmac for restrained refinement • “Success”: final Rfree < 0.35, or final Rfree < 0.5 and dropped by 20% • “Marginal”: final Rfree < 0.48, or final Rfree < 0.52 and dropped by 5% • “Failure” otherwise
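
The Rfree thresholds on this slide translate directly into a small classifier. The sketch below is an illustration, not MrBUMP's code. It reads "dropped by 20%" as a relative drop from the initial to the final Rfree and groups the conditions as "A, or (B and C)"; both readings are assumptions, as is the function name.

```python
def classify_mr_result(initial_rfree, final_rfree):
    """Label a refined MR solution as 'success', 'marginal' or 'failure'."""
    # Assumed interpretation: fractional drop relative to the starting Rfree.
    drop = (initial_rfree - final_rfree) / initial_rfree
    if final_rfree < 0.35 or (final_rfree < 0.50 and drop >= 0.20):
        return "success"
    if final_rfree < 0.48 or (final_rfree < 0.52 and drop >= 0.05):
        return "marginal"
    return "failure"


results = [
    classify_mr_result(0.55, 0.33),  # below 0.35 outright
    classify_mr_result(0.60, 0.45),  # below 0.5 with a 25% relative drop
    classify_mr_result(0.50, 0.47),  # below 0.48 but only a small drop
    classify_mr_result(0.52, 0.53),  # Rfree went up
]
```

If the intended meaning of the drop is an absolute change in Rfree, only the `drop` line would need to change.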

  30. MrBUMP and cluster computing • MrBUMP is usually run on a desktop from CCP4i or the command line • However, MrBUMP can take advantage of a compute cluster to farm out the Molecular Replacement jobs. • Currently Sun Grid Engine enabled clusters are supported, but support will be added for other types of queuing system (e.g. LSF, Condor) if there is enough demand. • Job control: all nodes terminate when one finds a solution • Current (known) cluster installations at Daresbury, Diamond and University of Dundee.
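
The job-control rule on this slide (all nodes stop once one finds a solution) can be sketched with local threads standing in for cluster nodes. This is not how MrBUMP talks to Sun Grid Engine; the function `run_until_first_solution`, the `run_mr_job` callback and its stop-event argument are all hypothetical and only illustrate the early-termination pattern.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_until_first_solution(search_models, run_mr_job, max_workers=4):
    """Run one MR job per search model and stop scheduling work after a success.

    run_mr_job(model, stop_event) is a hypothetical callback that returns True
    when the model gives a solution; long-running jobs can poll stop_event.
    """
    found = threading.Event()
    solutions = []
    lock = threading.Lock()

    def worker(model):
        if found.is_set():
            return  # another job already succeeded, so skip this one
        if run_mr_job(model, found):
            with lock:
                solutions.append(model)
            found.set()  # tell every other job to stop

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for model in search_models:
            pool.submit(worker, model)
    return solutions[0] if solutions else None


def demo_job(model, stop_event):
    # Stand-in for a real MR run: only one model "solves" the structure.
    return model == "model_c"


best = run_until_first_solution(["model_a", "model_b", "model_c", "model_d"], demo_job)
```

On a real queuing system the same effect would be achieved by cancelling the outstanding queued jobs, which depends on the scheduler in use.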

  31. MrBUMP on the Grid • Currently under development • Large parameter space searches. Submit many jobs to U.K. computational grid resources using recently developed e-Science tools (MCS, AgentX, Rcommands, SRB) • Goals: • To improve the performance/success rate of the method • Possibly extract useful Biological information • Make grid-enabled version available to users

  32. MrBUMP Output • Currently produces a long log file listing search results, model preparation steps, summaries from each MR and refinement job and relevant references for programs used. • Not ideal: there’s a lot of information to trawl through. A summary of results is now provided at the end of the log file. • Future versions will provide results in marked-up web page format for more clarity.

  33. MrBUMP Output – CCP4i dbviewer

  34. MrBUMP pre-release • Beta version first released in Jan ’06 (current version is 0.3.3) • Currently supported on Linux and Mac OS X; a Windows version will be available when MrBUMP is included in the suite. • Will be included in next release of CCP4 (version 6.1) • MrBUMP paper to be published in Acta Cryst. D in April ’07 • First citations in Obiero et al., Acta Cryst. (2006). F62, 757-760; El Omari et al., Acta Cryst. (2006). F62, 949-953 • http://www.ccp4.ac.uk/MrBUMP/

  35. New features • Run Acorn after refinement for phase improvement (high resolution data) • Support for searching in enantiomorphic space groups. • Users can now specify template models by PDB ID or add local PDB files. • “Generate models only” option. • XML Output. • Additional multiple alignment programs supported – T-Coffee and ProbCons.

  36. Future versions • Improvements to multimeric search models (using PISA) • Supplement multiple alignment with additional sequences and/or structural information • Model completion and/or re-building • Target complexes. • Improved output presentation

  37. Conclusions • Test cases and the examples demonstrated the utility of trying a range of search models, a protocol that can only be attempted adequately by automation. • MrBUMP is not meant to compete with careful analysis of the data and model by an experienced crystallographer. However, it may succeed in difficult cases by finding a combination of models and protocols that would not otherwise have been tried. • In more straightforward cases the advantage is simply one of convenience.

  38. CCP4 Automation - BALBES • Authors: Garib Murshudov, Alexei Vagin, Fei Long (YSBL) • Built around Molrep MR and model preparation, Refmac and Sfcheck • Model preparation based on using a custom database derived from the PDB database • Best model is derived from the database and used in Molrep. • Protocols • Simple molecular replacement • Domains iterated with refinement • Use of tertiary structure if available • Completion of MR using phased MR and refinement • Released early 2007

  39. XIA2 Automated Data Reduction • xia2 is a new automated data reduction system designed to work from raw diffraction data and a little metadata, and produce usefully reduced data in a form suitable for immediately starting phasing and structure solution. • Pre-release version is currently available. http://www.ccp4.ac.uk/xia/

  40. XIA2 • Requires image data + input specification script with target and experiment data: • Sequence • Number of heavy atoms • Wavelength • Location of image data • Example input specification:
      BEGIN PROJECT TM1553
      BEGIN CRYSTAL 13185
      BEGIN AA_SEQUENCE
      MHKMWPSDSNDHRVTRRNVIIFSSLLLGSLAILLALLLIRTKDQYYELRDFALGTSVRIV
      VSSQKINPRTIAEAILEDMKRITYKFSFTDERSVVKKINDHPNEWVEVDEETYSLIKAAC
      AFAELTDGAFDPTVGRLLELWGFTGNYENLRVPSREEIEEALKHTGYKNVLFDDKNMRVM
      VKNGVKIDLGGIAKGYALDRARQIALSFDENATGFVEAGGDVRIIGPKFGKYPWVIGVKD
      PRGDDVIDYIYLKSGAVATSGDYERYFVVDGVRYHHILDPSTGYPARGVWSVTIIAEDAT
      TADALSTAGFVMAGKDWRKVVLDFPNMGAHLLIVLEGGAIERSETFKLFERE
      END AA_SEQUENCE
      BEGIN HA_INFO
      ATOM SE
      NUMBER_PER_MONOMER 5
      END HA_INFO
      BEGIN WAVELENGTH INFL
      WAVELENGTH 0.97950
      F' -12.1
      F'' 5.8
      END WAVELENGTH INFL
      BEGIN WAVELENGTH LREM
      WAVELENGTH 1.00000
      F' -2.5
      F'' 0.5
      END WAVELENGTH LREM
      BEGIN SWEEP INFL
      WAVELENGTH INFL
      BEAM 109.0 105.0
      IMAGE 13185_2_E1_001.img
      DIRECTORY /data/jcsg/als1/8.2.1/20050121/collection/TM1553/13185/
      END SWEEP
      BEGIN SWEEP LREM
      WAVELENGTH LREM
      BEAM 109.0 105.0
      IMAGE 13185_2_E2_001.img
      DIRECTORY /data/jcsg/als1/8.2.1/20050121/collection/TM1553/13185/
      END SWEEP
      END CRYSTAL 13185
      END PROJECT TM1553
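
The nested BEGIN/END layout of the input specification on this slide lends itself to a simple stack-based reader. The sketch below is not xia2's parser; it assumes one statement per line (the transcript has lost the original line breaks), uses a shortened sample, and the function name `parse_blocks` and the dictionary layout are hypothetical.

```python
def parse_blocks(text):
    """Parse nested BEGIN/END blocks into a tree of dictionaries."""
    root = {"kind": "ROOT", "name": "", "fields": {}, "children": []}
    stack = [root]
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0] == "BEGIN":
            block = {"kind": tokens[1], "name": " ".join(tokens[2:]),
                     "fields": {}, "children": []}
            stack[-1]["children"].append(block)
            stack.append(block)
        elif tokens[0] == "END":
            if len(stack) == 1 or stack[-1]["kind"] != tokens[1]:
                raise ValueError(f"unbalanced END: {raw.strip()}")
            stack.pop()
        elif stack[-1]["kind"] == "AA_SEQUENCE":
            # Sequence lines have no keyword: join them into one string.
            fields = stack[-1]["fields"]
            fields["SEQUENCE"] = fields.get("SEQUENCE", "") + tokens[0]
        else:
            # Ordinary statement: keyword followed by its values (kept as strings).
            stack[-1]["fields"][tokens[0]] = tokens[1:]
    if len(stack) != 1:
        raise ValueError("missing END for block " + stack[-1]["kind"])
    return root


sample = """
BEGIN PROJECT TM1553
BEGIN CRYSTAL 13185
BEGIN HA_INFO
ATOM SE
NUMBER_PER_MONOMER 5
END HA_INFO
BEGIN WAVELENGTH INFL
WAVELENGTH 0.97950
F' -12.1
F'' 5.8
END WAVELENGTH INFL
END CRYSTAL 13185
END PROJECT TM1553
"""
tree = parse_blocks(sample)
crystal = tree["children"][0]["children"][0]
```

Keeping values as strings avoids guessing at types; a caller can convert `WAVELENGTH` or `F'` to floats where needed.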

  41. Through your favourite phasing pipeline…

  42. CCP4 Automation - HAPPy – Heavy Atom Phasing in Python • What it is: • Automated Experimental Phasing Pipeline • Replaces and expands on the capabilities of the CHART package • What it will do: • Take integrated and merged experimental data amplitudes (post-TRUNCATE), de-twinned, consistently indexed. • Determine the heavy atom structure and phase probabilities. • Optimize the density map to give an interpretable map. • Build structure. • First release will handle SAD data only. MAD, MIR, MIRAS modes later. • http://www.ccp4.ac.uk/HAPPy

  43. Acknowledgements: • Core Group (Daresbury): • Martyn Winn, Charles Ballard, Peter Briggs, Francois Remacle, Norman Stein, Wendy Yang, Maeri Howard. • CCP4MG (York): • Liz Potterton, Stuart McNicholas • Coot (Oxford & York): • Paul Emsley, Kevin Cowtan • Program Developers (York, Cambridge, Diamond & Leiden University): • Garib Murshudov, Alexei Vagin, Fei Long, Randy Read, Airlie McCoy, Harry Powell, Gwyndaf Evans, Phil Evans, Eleanor Dodson, Nick Furnham, Steve Ness. • BBSRC for their funding • And many others…
