Automated Model-Building with TEXTAL Thomas R. Ioerger Department of Computer Science Texas A&M University
Overview of TEXTAL
• Automated model-building program
• Can we automate the kind of visual processing of patterns that crystallographers use?
• Intelligent methods to interpret density, despite noise
• Exploit knowledge about typical protein structure
• Focus on medium-resolution maps
    • optimized for 2.8 Å (actually, 2.6-3.2 Å is fine)
    • typical for MAD data (useful for high-throughput)
    • other programs exist for higher-res data (ARP/wARP)
Input: electron density map (not structure factors) → TEXTAL → Output: protein model (may need refinement)
Main Stages of TEXTAL
electron density map
→ CAPRA → Cα chains
→ LOOKUP (build in side-chain and main-chain atoms locally around each Cα) → model (initial coordinates)
→ Post-processing routines (example: real-space refinement) → model (final coordinates)
Also in the diagram: Human Crystallographer (editing); Reciprocal-space refinement/DM
CAPRA: C-Alpha Pattern-Recognition Algorithm
• Tracing
• Neural network: estimates which pseudo-atoms are closest to true Cα's
• Linking
Example of Cα-chains fit by CAPRA
Rat α2 urinary protein (P. Adams)
• data: 2.5 Å MR map generated at 2.8 Å
• % built: 84%
• # chains: 2
• lengths: 47, 88
• RMSD: 0.82 Å
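For reference, the RMSD figure quoted above can be read as a root-mean-square distance between built and true Cα positions. The sketch below assumes the two coordinate lists are already paired residue by residue and share one coordinate frame; the function name `ca_rmsd` is hypothetical and this is not TEXTAL's own evaluation code.

```python
import math

def ca_rmsd(built, reference):
    """RMSD in Å between paired (x, y, z) Cα coordinates.

    Assumption: built[i] and reference[i] refer to the same residue and
    both sets are in the same frame, so no superposition is done here.
    """
    if len(built) != len(reference) or not built:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (bx - rx) ** 2 + (by - ry) ** 2 + (bz - rz) ** 2
        for (bx, by, bz), (rx, ry, rz) in zip(built, reference)
    )
    return math.sqrt(squared / len(built))
```

How built atoms are paired with true atoms (for example, nearest true Cα) changes the number, so the pairing rule should be stated alongside any reported RMSD.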
Stage 2: LOOKUP
• LOOKUP is based on Pattern Recognition
    • Given a local (5 Å spherical) region of density, have we seen a pattern like this before (in another map)?
    • If so, use similar atomic coordinates.
• Use a database of maps with known structures
    • 200 proteins from PDB-Select (non-redundant)
    • back-transformed (calculated) maps at 2.8 Å (no noise)
    • regions centered on 50,000 Cα's
• Use feature extraction to match regions efficiently
    • features (e.g. moments) represent local density patterns
    • features must be rotation-invariant (independent of 3D orientation)
    • use density correlation for more precise evaluation
Examples of Numeric Density Features
• Distance from center-of-sphere to center-of-mass
• Moments of inertia: relative dispersion along orthogonal axes
• Geometric features like “Spoke angles”
• Local variance and other statistics
TEXTAL uses 19 distinct numeric features to represent the pattern of density in a region, each calculated over 4 different radii, for a total of 76 features.
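A minimal sketch of how rotation-invariant features of this kind might be computed, assuming a region is given as (x, y, z, density) samples with coordinates relative to the region center. It covers only three of the listed feature types (center-of-mass distance, mean density, local variance); the radii in `RADII` and all function names are assumptions, not values or code from TEXTAL.

```python
import math

RADII = (3.0, 4.0, 5.0, 6.0)  # assumed radii in Å; the slide only says "4 different radii"

def features_at_radius(points, radius):
    """Return (center-of-mass distance, mean density, density variance)
    for the samples inside a sphere of the given radius.

    points: iterable of (x, y, z, rho), coordinates relative to the center.
    Assumes mostly non-negative density so the center of mass is meaningful.
    """
    inside = [p for p in points if p[0] ** 2 + p[1] ** 2 + p[2] ** 2 <= radius ** 2]
    if not inside:
        return (0.0, 0.0, 0.0)
    total = sum(p[3] for p in inside)
    mean = total / len(inside)
    variance = sum((p[3] - mean) ** 2 for p in inside) / len(inside)
    if total == 0:
        return (0.0, mean, variance)
    cx = sum(p[0] * p[3] for p in inside) / total
    cy = sum(p[1] * p[3] for p in inside) / total
    cz = sum(p[2] * p[3] for p in inside) / total
    return (math.sqrt(cx * cx + cy * cy + cz * cz), mean, variance)

def feature_vector(points, radii=RADII):
    """Concatenate the per-radius features into one flat vector."""
    vector = []
    for radius in radii:
        vector.extend(features_at_radius(points, radius))
    return vector
```

Each quantity depends only on distances from the center and on density values, so rotating the region about its center leaves the vector unchanged, which is the property the slide asks of every feature.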
Example feature vectors (one per density region):
F=<1.72,-0.39,1.04,1.55...>
F=<1.58,0.18,1.09,-0.25...>
F=<0.90,0.65,-1.40,0.87...>
F=<1.79,-0.43,0.88,1.52...>
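To illustrate how such vectors can be compared, the sketch below ranks candidate regions by distance to a query in feature space. It uses only the four components visible on the slide (the rest are elided and not guessed). Treating the first vector as the query, the labels `region_A` to `region_C`, and the use of plain unweighted Euclidean distance are all assumptions made for illustration.

```python
import math

def feature_distance(f, g):
    """Unweighted Euclidean distance between two feature vectors (an assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))

def nearest_regions(query, database, k=2):
    """Return the names of the k database regions closest to the query."""
    ranked = sorted(database.items(), key=lambda item: feature_distance(query, item[1]))
    return [name for name, _ in ranked[:k]]

# First four components of the vectors shown above; labels are hypothetical.
query = (1.72, -0.39, 1.04, 1.55)
database = {
    "region_A": (1.58, 0.18, 1.09, -0.25),
    "region_B": (0.90, 0.65, -1.40, 0.87),
    "region_C": (1.79, -0.43, 0.88, 1.52),
}

print(nearest_regions(query, database))  # ['region_C', 'region_A']
```

On these truncated values the fourth vector lies closest to the first (distance about 0.18, against about 1.89 and 2.86 for the other two), so it would be the first candidate passed on to the more precise density-correlation check.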
The LOOKUP Process
• Region in map to be interpreted
• Database of known maps
• Find optimal rotation
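The "find optimal rotation" step can be sketched as a search over candidate orientations of a database region, scoring each by density correlation against the region being interpreted. The version below is a coarse stand-in: it tries only the 24 rotations that map the grid axes onto themselves, and it represents a region as a dictionary from integer grid points to density. Both choices, and every function name, are assumptions for illustration; the slides do not describe how TEXTAL samples rotations.

```python
import itertools
import math

def axis_rotations():
    """Return the 24 proper rotations that map the grid axes onto themselves."""
    rotations = []
    for perm in itertools.permutations(range(3)):
        inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if perm[i] > perm[j])
        for signs in itertools.product((1, -1), repeat=3):
            det = (-1) ** inversions * signs[0] * signs[1] * signs[2]
            if det == 1:  # keep rotations, drop mirror images
                rotations.append((perm, signs))
    return rotations

def rotate(region, rotation):
    """Rotate a region given as {(i, j, k): density} about the origin."""
    perm, signs = rotation
    return {tuple(signs[a] * point[perm[a]] for a in range(3)): rho
            for point, rho in region.items()}

def correlation(region_a, region_b):
    """Pearson correlation of densities over the grid points of region_a."""
    keys = sorted(region_a)
    a = [region_a[k] for k in keys]
    b = [region_b.get(k, 0.0) for k in keys]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return cov / norm if norm > 0 else 0.0

def best_rotation(query, template):
    """Return (rotation, score) for the template orientation that best matches the query."""
    scored = [(correlation(query, rotate(template, r)), r) for r in axis_rotations()]
    score, rotation = max(scored, key=lambda item: item[0])
    return rotation, score
```

A realistic search would sample orientations far more finely and interpolate density between grid points; the axis-aligned set is used here only because it needs no interpolation.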
Interfaces for Using TEXTAL
• Stand-alone commands and scripts
    • capra-scale prot.xplor prot-scaled.xplor
    • neotex.sh myprotein > textal.log
    • lots of intermediate files and logs…
• WINTEX: Tcl/Tk interface
    • creates jobs in sub-directories
• Public Release: July 2004
    • http://textal.tamu.edu:12321
• Integrated into Phenix
    • http://phenix-online.org
    • Python module
    • model-building tasks in GUI
Conclusions
• Pattern recognition is a successful technique for macromolecular model-building
• Future directions:
    • building ligands, co-factors, etc.
    • recognizing disulfide bridges
    • phase improvement (iterating with refinement)
    • loop-building
    • further integration with Phenix
    • Intelligent Agent-based methods for guiding/automating model-building
    • interactive graphics for specialized needs (e.g. fixing chains, editing identities)
Acknowledgements
• Funding:
    • National Institutes of Health
• People:
    • James C. Sacchettini
    • Kevin Childs, Kreshna Gopal, Lalji Kanbi, Erik McKee, Reetal Pai, Tod Romo
• Our association with the PHENIX group:
    • Paul Adams (Lawrence Berkeley National Lab)
    • Randy Read (Cambridge University)
    • Tom Terwilliger (Los Alamos National Lab)