
An Introduction to Protein Fold Recognition


wirt


Presentation Transcript


  1. An Introduction to Protein Fold Recognition Protein Fold Recognition and Threading Algorithms

  2. Folding vs. Prediction Folding • Determining the way in which a polypeptide actually folds (biophysics approach) Prediction • Determining secondary, tertiary, or quaternary structure from a polypeptide sequence (computational approach)

  3. Prediction Task: primary structure (MPGAVEG.….GGTGTDS) → tertiary structure

  4. Assumptions in Prediction Anfinsen’s Dogma • Tertiary structure is determined by amino acid interactions and the surrounding medium (i.e. polar effects from water and ions) • a.k.a. Thermodynamic Hypothesis

  5. Types of Prediction Algorithms 1. Ab initio (thermodynamic optimization) 2. Homology Modelling (sequence similarity) 3. Fold Recognition (sequence-structure similarity)

  6. Definition of Fold Recognition • Given • A database of known 3D structures mapped to a concise format (templates, folds) • A primary sequence of unknown tertiary structure • Find • The database structure with the best global sequence-structural alignment (threading)

  7. Related Definitions • Inverse Folding • Given a 3D structure, find all primary sequences that are likely to fold into it. • Threading • Align one polypeptide to one structure optimally (global optimal sequence-structure alignment) • The core of many fold recognition systems

  8. Motivation for Fold Recognition • Proteins have about 1000 structural families (est.) • Over 16,000 primary sequences in PDB alone. • Fold recognition produces a first-order approximation of structure.

  9. Fold Recognition Algorithm Mapping 3D structures to 1D profile • We have a set of 3D structures • How do we represent each one as a 1D string (for threading)? • Map each residue to an environmental class

  10. Fold Recognition Algorithm Mapping 3D structures to 1D profile • Environmental classes • {B1, B2, B3, P1, P2, E} x {α, β, other} • Assign each residue an environmental class by • Area of side chain buried by protein atoms • Area of side chain exposed to polar atoms • Local secondary structure (in α-helix, β-sheet?)
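The class assignment on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `environment_class`, the use of a polar fraction rather than a polar area, and every numeric cutoff below are assumptions made for the example, not values taken from the presentation.

```python
from typing import Tuple

# The six side-chain classes and three secondary-structure classes
# named on the slide (6 x 3 = 18 environmental classes in total).
SIDE_CHAIN_CLASSES = ("B1", "B2", "B3", "P1", "P2", "E")
SECONDARY_CLASSES = ("alpha", "beta", "other")

# Hypothetical cutoffs, chosen only to make the example concrete.
EXPOSED_MAX_BURIED = 40.0    # below this buried area: exposed (E)
PARTIAL_MAX_BURIED = 115.0   # below this buried area: partially buried (P1/P2)


def environment_class(buried_area: float,
                      polar_fraction: float,
                      secondary: str) -> Tuple[str, str]:
    """Map one residue to a (side-chain class, secondary-structure class) pair.

    buried_area:    side-chain area buried by protein atoms
    polar_fraction: fraction of side-chain area in contact with polar atoms
                    (an assumed normalization of the slide's "area exposed
                    to polar atoms")
    secondary:      "alpha", "beta", or "other"
    """
    if secondary not in SECONDARY_CLASSES:
        raise ValueError(f"unknown secondary structure: {secondary!r}")

    if buried_area < EXPOSED_MAX_BURIED:
        side = "E"
    elif buried_area < PARTIAL_MAX_BURIED:
        side = "P1" if polar_fraction < 0.67 else "P2"
    elif polar_fraction < 0.45:
        side = "B1"
    elif polar_fraction < 0.58:
        side = "B2"
    else:
        side = "B3"
    return side, secondary
```

Applying such a function to every residue of a known structure turns the 3D structure into a 1D string of class labels, which is the form the threading step needs.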

  11. Fold Recognition Algorithm Build 3D Structure Profile Matrix
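The slide's figure is not in the transcript, so the following is only one plausible way to build such a matrix: a log-odds score ln(P(residue | class) / P(residue)) estimated from residues observed in known structures. The function name `log_odds_scores`, the input format, and the pseudocount are assumptions for this sketch.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple


def log_odds_scores(observations: Iterable[Tuple[str, str]],
                    pseudocount: float = 1.0) -> Dict[Tuple[str, str], float]:
    """Estimate a score for every (environment class, amino acid) pair.

    observations: (environment_class, amino_acid) pairs collected from
                  residues in known structures.
    Returns ln(P(aa | env) / P(aa)); positive means the residue type is
    seen in that environment more often than its background frequency.
    """
    obs = list(observations)
    pair_counts = Counter(obs)
    env_counts = Counter(env for env, _ in obs)
    aa_counts = Counter(aa for _, aa in obs)
    total = len(obs)
    amino_acids = sorted(aa_counts)

    scores: Dict[Tuple[str, str], float] = {}
    for env in env_counts:
        for aa in amino_acids:
            # Pseudocounts keep unseen pairs from producing log(0).
            p_aa_given_env = (pair_counts[(env, aa)] + pseudocount) / (
                env_counts[env] + pseudocount * len(amino_acids))
            p_aa = aa_counts[aa] / total
            scores[(env, aa)] = math.log(p_aa_given_env / p_aa)
    return scores
```

A profile for one template is then the list of score rows selected by the environmental class at each position of that template.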

  12. Fold Recognition Algorithm • Compatibility Search • Align unknown polypeptide sequence (probe sequence) to each 1D profile • Thread probe onto each profile • Scoring function is the profile matrix • Use dynamic programming
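A minimal sketch of the compatibility search, assuming a standard global (Needleman-Wunsch style) dynamic program with a linear gap penalty. The names `thread_score` and `best_fold`, the gap value, and the default score for residues missing from a profile row are illustrative assumptions; the slides do not specify the gap model.

```python
from typing import Dict, List

# One score table per template position: amino acid -> score.
Profile = List[Dict[str, float]]


def thread_score(probe: str, profile: Profile,
                 gap: float = -2.0, default: float = -1.0) -> float:
    """Best global alignment score of a probe sequence against a 1D profile."""
    n, m = len(probe), len(profile)
    # dp[i][j] = best score aligning probe[:i] to profile[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = dp[i - 1][j - 1] + profile[j - 1].get(probe[i - 1], default)
            skip_probe = dp[i - 1][j] + gap      # probe residue left unaligned
            skip_profile = dp[i][j - 1] + gap    # profile position left unaligned
            dp[i][j] = max(match, skip_probe, skip_profile)
    return dp[n][m]


def best_fold(probe: str, templates: Dict[str, Profile]) -> str:
    """Return the name of the template whose profile scores highest."""
    return max(templates, key=lambda name: thread_score(probe, templates[name]))
```

For example, with `templates = {"a": [{"A": 2.0}, {"L": 2.0}], "b": [{"G": 2.0}, {"K": 2.0}]}`, calling `best_fold("AL", templates)` selects template `"a"`. In practice raw scores from templates of different lengths are not directly comparable, so some normalization would be needed before ranking.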

  13. Alternative Scoring Functions • Learn as a neural network • Input: Arguments to scoring function • Output: score • Knowledge-Based Potentials (Sippl) • include inter-residue effects

  14. PROSPECT • Variant on earlier fold recognition algorithms • Divides each template in database into core and loop regions • Optimizes using energy function:

  15. PROSPECT • Divide and Conquer • Subdivide templates such that each core has its own region • To merge cores A and B • Find subsequences a and b in probe and do ungapped alignment to A and B • Align loop between A and B to subsequence between a and b • Total alignment is the sum of the above scores
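The merge step for two cores can be illustrated with a toy brute-force search. This is not PROSPECT's actual algorithm or energy function: the names `ungapped_score`, `loop_score`, and `merge_cores`, the per-position score tables, and the length-difference loop penalty are all assumptions standing in for details the slides do not give.

```python
from typing import Dict, List, Tuple

# One score table per core position: amino acid -> score.
Profile = List[Dict[str, float]]


def ungapped_score(segment: str, core: Profile, default: float = -1.0) -> float:
    """Score a probe segment against a core with no gaps allowed."""
    return sum(pos.get(res, default) for res, pos in zip(segment, core))


def loop_score(probe_loop_len: int, template_loop_len: int,
               penalty: float = 0.5) -> float:
    """Assumed loop term: penalize the difference in loop length."""
    return -penalty * abs(probe_loop_len - template_loop_len)


def merge_cores(probe: str, core_a: Profile, core_b: Profile,
                template_loop_len: int) -> Tuple[float, int, int]:
    """Try every placement of cores A and B (A before B) on the probe.

    Returns (total score, start of a, start of b), where the total is the
    sum of the two ungapped core scores and the loop score between them.
    """
    la, lb = len(core_a), len(core_b)
    best = None
    for i in range(0, len(probe) - la - lb + 1):
        score_a = ungapped_score(probe[i:i + la], core_a)
        for j in range(i + la, len(probe) - lb + 1):
            total = (score_a
                     + ungapped_score(probe[j:j + lb], core_b)
                     + loop_score(j - (i + la), template_loop_len))
            if best is None or total > best[0]:
                best = (total, i, j)
    if best is None:
        raise ValueError("probe is shorter than the two cores combined")
    return best
```

A divide-and-conquer threader would apply a merge like this recursively over the template's cores instead of enumerating all placements of every core at once.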

  16. PROSPECT • Optimal (!) • Reasonably fast • Can include domain knowledge in energy function • Does well with remote homologs

  17. Alternative Threaders • Branch and Bound • Double Dynamic Programming • Monte Carlo • Heuristic Search

  18. Results • More accurate and faster than ab initio • Can detect relations homology search cannot • Structure is preserved more than sequence • Doolittle’s “twilight zone” • PROSPECT does well even in “twilight zone”

  19. ?

  20. END
