TEXTAL: A System for Automated Model Building Based on Pattern Recognition

Thomas R. Ioerger, Department of Computer Science, Texas A&M University

Main Stages of TEXTAL
[Flow diagram] electron density map → CAPRA → C-alpha chains → LOOKUP (build in side-chain and main-chain atoms locally around each CA) → model (initial coordinates) → post-processing routines (example: real-space refinement) → model (final coordinates)
Also shown in the diagram: reciprocal-space refinement/ML DM; human crystallographer (editing)

[Figure: example feature vectors]
F=<1.72,-0.39,1.04,1.55...>
F=<1.58,0.18,1.09,-0.25...>
F=<0.90,0.65,-1.40,0.87...>
F=<1.79,-0.43,0.88,1.52...>

CAPRA: C-Alpha Pattern Recognition Algorithm

Overview of CAPRA
• goal: predict CA chains from density map
• not just “tracing” - more than Bones
• desire 1:1 correspondence, ~3.8 Å apart
• based on principles of pattern recognition
• use neural net to estimate which pseudo-atoms in trace “look” closest to true C-alphas
• use feature extraction to capture 3D patterns in density for input to neural net
• use other heuristics for “linking” together into chains, including geometric analysis (secondary structure)

CAPRA: C-Alpha Pattern-Recognition Algorithm
• Tracer - remove lattice points from map (lowest density first) without breaking connectivity
• Neural network - for each pseudo-atom, extract features, input to network, predict distances to CAs (1:10 in trace), trained on example points in real maps
• Linking - desire long chains, good CA predictions (not in side-chains), “structurally plausible” (e.g. linear, helical)
[Pipeline figure] map → Density Trace → pseudo-atoms → Neural Network → predictions of distance to true CA → Linking into C-alpha chains → C-alpha coordinates

Steps in CAPRA

Examples of CAPRA Steps

Tracer

Neural Network

Feature Extraction
• characterize 3D patterns in local density
• must be “rotation invariant”
• examples:
  • average density in region
  • standard deviation, kurtosis...
  • distance to center of mass
  • moments of inertia, ratios of moments
  • “spoke angles”
• calculated over spheres of 3 Å and 4 Å radius
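A minimal sketch of a few of the rotation-invariant features listed on this slide (average density, standard deviation, kurtosis, distance to center of mass), computed over spheres of 3 Å and 4 Å radius. The function names, the `(position, density)` input layout, and the choice to clip negative density to zero when weighting the center of mass are assumptions made for illustration, not TEXTAL's actual code. Moments of inertia and spoke angles are left out.

```python
import math

def sphere_features(grid_points, center, radius):
    """Rotation-invariant statistics of the density inside a sphere.

    grid_points: list of ((x, y, z), density) pairs (hypothetical layout).
    """
    inside = [(pos, rho) for pos, rho in grid_points
              if math.dist(pos, center) <= radius]
    if not inside:
        raise ValueError("no grid points inside the sphere")
    n = len(inside)
    dens = [rho for _, rho in inside]
    mean = sum(dens) / n
    var = sum((rho - mean) ** 2 for rho in dens) / n
    std = math.sqrt(var)
    # Kurtosis as the normalized fourth central moment (0.0 if density is flat).
    kurtosis = (sum((rho - mean) ** 4 for rho in dens) / n) / var ** 2 if var > 0 else 0.0
    # Assumption: negative density gets zero weight in the center of mass.
    weights = [max(rho, 0.0) for rho in dens]
    total = sum(weights)
    if total > 0:
        com = [sum(w * pos[i] for (pos, _), w in zip(inside, weights)) / total
               for i in range(3)]
        com_distance = math.dist(com, center)
    else:
        com_distance = 0.0
    return {"mean": mean, "std": std, "kurtosis": kurtosis,
            "com_distance": com_distance}

def feature_vector(grid_points, center, radii=(3.0, 4.0)):
    """Concatenate the features for each radius into one flat list."""
    vec = []
    for r in radii:
        f = sphere_features(grid_points, center, r)
        vec.extend([f["mean"], f["std"], f["kurtosis"], f["com_distance"]])
    return vec
```

Every quantity here depends only on density values and distances from the sphere center, so rotating the map about that center leaves the features unchanged.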

Forward Propagation: Backward Propagation:
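The equations on this slide did not survive in the transcript. As a generic illustration only, here is a sketch of forward and backward propagation for a small feed-forward network with one sigmoid hidden layer and a single linear output (for example, a predicted distance to the nearest CA), trained on squared error. The architecture, learning rate, and all names are assumptions, not CAPRA's actual network.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in, n_hidden, seed=0):
    """Small random weights; sizes and seed are illustrative."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return W1, [0.0] * n_hidden, w2, 0.0

def forward(x, W1, b1, w2, b2):
    """Forward propagation: features -> hidden activations -> scalar output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, output

def backward(x, target, W1, b1, w2, b2):
    """Backward propagation: gradients of E = 0.5 * (output - target)^2."""
    hidden, output = forward(x, W1, b1, w2, b2)
    err = output - target                       # dE/d(output)
    grad_w2 = [err * h for h in hidden]
    grad_b2 = err
    # Chain rule through the sigmoid: d(sigmoid)/dz = h * (1 - h).
    delta = [err * w * h * (1.0 - h) for w, h in zip(w2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta]
    return grad_W1, delta, grad_w2, grad_b2

def train_step(x, target, W1, b1, w2, b2, lr=0.05):
    """One gradient-descent update; returns new parameters."""
    gW1, gb1, gw2, gb2 = backward(x, target, W1, b1, w2, b2)
    W1 = [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    w2 = [w - lr * g for w, g in zip(w2, gw2)]
    return W1, b1, w2, b2 - lr * gb2
```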

Selection of Candidate C-alphas
• method:
  • pick candidates in order of lowest predicted distance first,
  • among all pseudo-atoms in trace,
  • as long as not closer than 2.5 Å to an already-picked candidate
• notes:
  • no 3.8 Å constraint; distance can be as high as 5 Å
  • don’t rely on branch points (though often near)
  • picked in random order throughout map
  • initially covers whole map, including side-chains and disconnected regions (e.g. noise in solvent)
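The selection rule on this slide can be sketched as a greedy pass over the pseudo-atoms, sorted by predicted distance to a true CA. The function name and the `(position, predicted_distance)` input layout are hypothetical. The 2.5 Å separation comes from the slide.

```python
import math

def select_candidates(pseudo_atoms, min_separation=2.5):
    """Greedy pick of candidate C-alphas.

    pseudo_atoms: list of ((x, y, z), predicted_distance_to_CA) pairs.
    A pseudo-atom is accepted unless it lies closer than `min_separation`
    (in Angstroms) to a candidate that was already accepted.
    """
    chosen = []
    for pos, pred in sorted(pseudo_atoms, key=lambda atom: atom[1]):
        if all(math.dist(pos, other) >= min_separation for other, _ in chosen):
            chosen.append((pos, pred))
    return chosen

# Usage: four pseudo-atoms along a line; two survive the 2.5 A filter.
atoms = [((0, 0, 0), 0.2), ((1, 0, 0), 0.1), ((3, 0, 0), 0.5), ((4, 0, 0), 0.3)]
picked = select_candidates(atoms)
```

The pairwise check is quadratic in the number of candidates. A spatial grid or k-d tree would be the usual substitute for a whole map.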

Linking into Chains
• initial connectivity of CA candidates based on the trace
• “over-connected” graph - branches, cycles...
• start by computing connected components (islands, or clusters)
• two strategies:
  • for small clusters (<=20 candidates), find longest internal chain with “good” atoms
  • for large clusters (>20 candidates), incrementally clip branch points using heuristics
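A minimal sketch of the first step, computing connected components of the candidate graph and splitting them at the 20-candidate limit from the slide. The adjacency-dict representation and the function names are assumptions for illustration.

```python
from collections import deque

def connected_components(adj):
    """Breadth-first search over an undirected graph.

    adj: dict mapping each candidate id to an iterable of neighbor ids.
    Returns a list of components, each a list of candidate ids.
    """
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components

def split_by_size(components, limit=20):
    """Small clusters (<= limit) go to exhaustive search, large ones to clipping."""
    small = [c for c in components if len(c) <= limit]
    large = [c for c in components if len(c) > limit]
    return small, large
```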

Extracting Chains from Small Clusters
• exhaustive depth-first search of all paths
• scoring function:
  • length
  • penalty for inclusion of points with high predicted distance to true CA by neural net
  • preference for following secondary structure (locally straight or helical)
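The exhaustive search can be sketched as a depth-first enumeration of all simple paths in a small cluster, keeping the best-scoring one. The score below uses only two of the three terms on the slide (length and a penalty for "bad" atoms). The cutoff of 3.0 Å, the penalty weight, and the names are illustrative assumptions, and the secondary-structure preference is omitted.

```python
def best_chain(adj, predicted_distance, bad_cutoff=3.0, penalty=2.0):
    """Exhaustive DFS over simple paths in a small cluster.

    adj: dict of candidate id -> iterable of neighbor ids.
    predicted_distance: dict of candidate id -> network output (Angstroms).
    Returns (score, path) for the highest-scoring path found.
    """
    best = (float("-inf"), [])

    def score(path):
        bad = sum(1 for n in path if predicted_distance[n] > bad_cutoff)
        return len(path) - penalty * bad

    def dfs(path, visited):
        nonlocal best
        s = score(path)
        if s > best[0]:
            best = (s, list(path))
        for neighbor in adj[path[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                path.append(neighbor)
                dfs(path, visited)
                path.pop()
                visited.remove(neighbor)

    for start in adj:
        dfs([start], {start})
    return best
```

The number of simple paths grows exponentially with cluster size, which is consistent with restricting this strategy to clusters of at most 20 candidates.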

Secondary Structure Analysis
• generate all 7-mers (connected fragments of candidate CAs of length 7)
• evaluate “straightness”
  • ratio of end-to-end distance to sum of link lengths
  • straightness > 0.8 ==> potential beta-strand
• evaluate “helicity”
  • average absolute deviation of angles and torsions along 7-mer from ideal values (95° and 50°)
  • helicity < 20 ==> potential alpha-helix
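A sketch of the two 7-mer measures. Straightness is taken here as end-to-end distance divided by the summed link lengths, so that it is at most 1 and the 0.8 threshold is meaningful. Helicity is read as the mean absolute deviation, over all CA-CA-CA angles and CA-CA-CA-CA torsions in the fragment, from 95° and 50°. Function names and the pooling of angle and torsion deviations into one average are assumptions.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angle_deg(a, b, c):
    """Angle at b formed by points a-b-c, in degrees."""
    u, v = _sub(a, b), _sub(c, b)
    cos_t = _dot(u, v) / (math.sqrt(_dot(u, u)) * math.sqrt(_dot(v, v)))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def torsion_deg(a, b, c, d):
    """Signed dihedral angle about the b-c link, in degrees."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))

def straightness(ca):
    links = sum(math.dist(p, q) for p, q in zip(ca, ca[1:]))
    return math.dist(ca[0], ca[-1]) / links

def helicity(ca, ideal_angle=95.0, ideal_torsion=50.0):
    devs = [abs(angle_deg(*ca[i:i + 3]) - ideal_angle) for i in range(len(ca) - 2)]
    for i in range(len(ca) - 3):
        d = abs(torsion_deg(*ca[i:i + 4]) - ideal_torsion)
        devs.append(min(d, 360.0 - d))          # torsions wrap around
    return sum(devs) / len(devs)

def classify_7mer(ca):
    """Thresholds 0.8 and 20 are the ones quoted on the slide."""
    if straightness(ca) > 0.8:
        return "strand"
    if helicity(ca) < 20.0:
        return "helix"
    return "other"
```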

Handling Large Clusters
• start by breaking cycles (near “bad” atoms)
• clip links at branch points till only linear chains remain
• clip the most “obvious” links first, e.g.
  • if other two links are part of sec. struct.
  • if clipped branch has “bad” atom nearby
  • if clipped branch is small and other 2 are large
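A rough sketch of branch clipping using only the last heuristic on the slide (clip the smallest branch at each branch point). It assumes cycles have already been broken, so the cluster is a tree. The secondary-structure and "bad atom" heuristics, and the ordering by how "obvious" a clip is, are not modeled. All names are hypothetical.

```python
def branch_size(adj, root, first):
    """Count nodes reachable from `first` without passing through `root`."""
    seen = {root, first}
    stack = [first]
    count = 0
    while stack:
        node = stack.pop()
        count += 1
        for neighbor in adj[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return count

def clip_branches(adj):
    """Remove links until no node has more than two neighbors.

    adj: dict of node id -> iterable of neighbor ids (undirected, acyclic).
    Returns a new dict of node id -> set of neighbor ids.
    """
    adj = {node: set(neighbors) for node, neighbors in adj.items()}
    while True:
        branch_points = [n for n in adj if len(adj[n]) > 2]
        if not branch_points:
            return adj
        node = branch_points[0]
        # Clip the link leading into the smallest branch (ties broken by id).
        smallest = min(adj[node], key=lambda m: (branch_size(adj, node, m), m))
        adj[node].discard(smallest)
        adj[smallest].discard(node)
```

Each pass removes one link, so the loop terminates. What remains is a set of linear chains plus any clipped-off fragments.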

Example of CA-chains for CzrA fit by CAPRA

Results for MVK

Results

Availability
• Textal web site: http://textal.tamu.edu:12321
  • server-side processing
  • free access to Capra
  • beta-testing of Textal
• To contact us, email: textal@tamu.edu

Acknowledgements
• Funding
  • National Institutes of Health
  • Welch Foundation
• People
  • Dr. James C. Sacchettini
  • The rest of the TEXTAL Group:
    • Tod Romo
    • Kreshna Gopal
    • Reetal Pai
