Doug Raiford Lesson 18: Protein Structure Searches
Problem definition • Given a protein conformation, can we find other structurally similar proteins? • We might have a database of structures to search (like the PDB)
If we have a predicted and a known conformation… • Can do a simple RMSD to compare the two conformations • We know precisely which aa’s compare to which
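To make the RMSD comparison concrete, here is a minimal sketch. It assumes the two conformations are given as equal-length lists of (x, y, z) coordinates (for example one alpha-carbon per residue), that point k in one list corresponds to point k in the other, and that the structures are already superimposed. The function name `rmsd` and the toy coordinates are illustrative only.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes point k of coords_a corresponds to point k of coords_b and that
    the two structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired points
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy coordinates (hypothetical, in angstroms)
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
known = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0)]
print(rmsd(predicted, known))
```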
What if the sequences are not identical? • Must map aa’s from one protein to aa’s in the other • How might you do this? • Sequence similarity • MSAs
Have we seen this before? • 3D-PSSM • Sequence alignment integrated with 3D alignment • Stored in a profile (position-specific scoring matrix) • Generates 1D profiles first (MSAs) • Then uses a structural alignment program (SAP) to augment the profiles with structural similarity
SAP (structural alignment program) • Aligns secondary structures
How? • What do you think of when you hear that you will need to align two things? • Dynamic programming
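As a reminder of what dynamic programming alignment looks like, the sketch below is a plain Needleman-Wunsch global alignment applied to two secondary-structure strings (H = helix, E = strand, C = coil). It is not SAP's actual algorithm; the `identity` scoring function, the gap penalty of -2, and the example strings are assumptions chosen for illustration.

```python
def global_align(seq_a, seq_b, score, gap=-2):
    """Needleman-Wunsch global alignment.

    Returns (best score, list of aligned index pairs (i, j)).
    """
    n, m = len(seq_a), len(seq_b)
    # best[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1]),  # match
                best[i - 1][j] + gap,                                    # gap in seq_b
                best[i][j - 1] + gap,                                    # gap in seq_a
            )
    # Trace back from the bottom-right corner to recover the aligned pairs
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs

def identity(a, b):
    """Toy scoring: +1 for the same secondary-structure state, -1 otherwise."""
    return 1 if a == b else -1

total, pairs = global_align("HHHEEC", "HHEEC", identity)
print(total, pairs)
```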
Scoring • Three components • AA similarity (substitution matrix): are the associated aa’s similar sequence-wise (e.g. both glycines)? • Local structure: are they both in a similar local structure (e.g. both members of an alpha helix)? • Solvent exposure: are they both buried or both exposed to solvent?
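The three components could be combined into a single residue-pair score roughly as follows. This is only a sketch: the `Residue` record, the toy `aa_similarity` function (a stand-in for a real substitution matrix), and the equal weights are all assumptions, not the scoring SAP or 3D-PSSM actually use.

```python
from dataclasses import dataclass

@dataclass
class Residue:
    aa: str        # one-letter amino acid code
    ss: str        # local structure: 'H' helix, 'E' strand, 'C' coil
    buried: bool   # True if buried, False if exposed to solvent

def aa_similarity(a, b):
    """Toy stand-in for a substitution matrix lookup (NOT real BLOSUM/PAM values)."""
    return 2 if a == b else -1

def residue_score(r1, r2, w_aa=1.0, w_ss=1.0, w_solv=1.0):
    """Weighted sum of the three components; the weights are hypothetical."""
    seq_term = aa_similarity(r1.aa, r2.aa)          # are the aa's similar?
    ss_term = 1 if r1.ss == r2.ss else 0            # same local structure?
    solv_term = 1 if r1.buried == r2.buried else 0  # both buried or both exposed?
    return w_aa * seq_term + w_ss * ss_term + w_solv * solv_term

same = residue_score(Residue("G", "H", True), Residue("G", "H", True))
different = residue_score(Residue("G", "H", True), Residue("A", "E", False))
print(same, different)
```

A function like this can be passed as the match score to a dynamic programming alignment, so that structure as well as sequence influences which residues are paired.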
Benefits • SAP (structure alignment) allows a profile to be influenced by secondary structure • Useful to 3D-PSSM for threading decisions (which aa’s match to a profile) • Homology-based conformation prediction is enhanced by making better decisions on where to insert gaps/varying-length loops
Another we have already seen • Pfam • Has hidden Markov models for protein families (Pfam, HMMER) • Sequences that match a model have a high probability of matching its conformation • Even though we are not comparing structures (query to target) • we are matching a sequence to its most probable structure
What about finding similar structure in an alternative way? • Can’t really align the sequences • How else might it work?
Dali (distance matrix alignment) • How might two distance matrices look? • All pairwise distances from each aa to all other aa’s • If the proteins are identical, the matrices would be almost identical • Figure: a hairpin (anti-parallel strands) shows up as a low-distance region in the matrix, as do parallel strands
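A distance matrix of the kind described above can be built in a few lines. The sketch assumes one (x, y, z) point per residue; the function name and the toy coordinates are illustrative. It also shows why distance matrices are convenient: moving the whole structure does not change the matrix, so no superposition is needed before comparing.

```python
import math

def distance_matrix(coords):
    """All pairwise distances between residues, given one (x, y, z) point per residue."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

# Toy coordinates (hypothetical, in angstroms)
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
dm = distance_matrix(coords)

# The same structure translated in space gives the same matrix
shifted = [(x + 10.0, y - 5.0, z + 2.0) for x, y, z in coords]
dm_shifted = distance_matrix(shifted)

for row in dm:
    print([round(d, 2) for d in row])
```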
How do we turn this into a similarity score? • Find the optimum set of similar sub-structures • Even if they are in different 1D (sequence) locations • This gives the amino acid equivalences • Once we have the equivalences, we can easily compare structural similarity • E.g. with RMSD
Approach • Break each matrix into a set of overlapping sub-matrices • Do an all-pairs comparison of the sub-matrices • Matching sub-matrices that naturally extend one another are merged • Must find the pairings of sub-matrices that yield the best overall score
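The first two steps of the approach might be sketched like this. The window size `w`, the helper names, and the mean-absolute-difference measure are assumptions for illustration, not Dali's actual parameters or similarity function. The toy matrix is a straight chain with 3.8 between neighbours.

```python
def sub_matrix(dm, i, j, w):
    """w-by-w block of distances between the fragment starting at i and the one starting at j."""
    return [row[j:j + w] for row in dm[i:i + w]]

def all_sub_matrices(dm, w):
    """Every overlapping w-by-w block in the upper triangle, keyed by (i, j)."""
    n = len(dm)
    return {(i, j): sub_matrix(dm, i, j, w)
            for i in range(n - w + 1)
            for j in range(i, n - w + 1)}

def block_difference(block_a, block_b):
    """Mean absolute difference between two blocks; 0 means identical contact patterns."""
    w = len(block_a)
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b)) / (w * w)

# Toy distance matrix for a straight 5-residue chain (hypothetical)
dm = [[abs(i - j) * 3.8 for j in range(5)] for i in range(5)]
blocks = all_sub_matrices(dm, w=3)
print(len(blocks))
print(block_difference(blocks[(0, 1)], blocks[(1, 2)]))
print(block_difference(blocks[(0, 0)], blocks[(0, 2)]))
```

In a real search, blocks from one protein's matrix would be compared against blocks from the other protein's matrix, and low-difference pairs become the candidates for merging.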
How to optimize the choice of pairings • Monte Carlo approach • Randomly generate pairings • Calculate the overall similarity • Run multiple solutions in parallel • Slowly improve each by randomly altering pairings (like a random search) • With some probability, keep a solution that is worse than the previous one
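The Monte Carlo idea can be illustrated on a toy problem. In this sketch a "pairing" is simply a permutation that assigns each fragment of protein A to one fragment of protein B, the overall score is the sum of the paired similarities, and a worse solution is kept with probability exp(delta / temperature) (a Metropolis-style rule). The similarity table, temperature, step count, and number of walkers are all hypothetical, and the walkers are run one after another rather than truly in parallel.

```python
import math
import random

def total_score(pairing, sim):
    """Overall similarity: sum of sim[i][j] over the chosen pairs (i, pairing[i])."""
    return sum(sim[i][j] for i, j in enumerate(pairing))

def monte_carlo_pairing(sim, steps=2000, temperature=0.5, walkers=4, seed=0):
    """Random search over pairings that sometimes accepts a worse solution."""
    rng = random.Random(seed)
    n = len(sim)
    best_pairing, best_score = None, -math.inf
    for _ in range(walkers):                 # several independent solutions
        pairing = list(range(n))
        rng.shuffle(pairing)                 # random starting pairing
        score = total_score(pairing, sim)
        for _ in range(steps):
            a, b = rng.sample(range(n), 2)   # randomly alter two pairings
            pairing[a], pairing[b] = pairing[b], pairing[a]
            new_score = total_score(pairing, sim)
            delta = new_score - score
            if delta >= 0 or rng.random() < math.exp(delta / temperature):
                score = new_score            # keep it (possibly worse)
            else:
                pairing[a], pairing[b] = pairing[b], pairing[a]  # undo the change
            if score > best_score:
                best_pairing, best_score = pairing[:], score
    return best_pairing, best_score

# Toy similarity table: fragment i of A best matches fragment (i + 2) % 5 of B
sim = [[1 if j == (i + 2) % 5 else 0 for j in range(5)] for i in range(5)]
best_pairing, best_score = monte_carlo_pairing(sim)
print(best_pairing, best_score)
```

Occasionally accepting a worse pairing is what lets the search escape local optima that a purely greedy improvement would get stuck in.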
Once we have the aa associations… • Can determine similarity • How?
Have to minimize aa distances • Must perturb XYZ (translation) and pitch, yaw, and roll (rotation) of one of the proteins to minimize RMSD • Like linear regression (a least-squares fit) • Can’t do this until we know which aa’s are associated
Have to minimize aa distances • Some numeric methods start by fixing between 2 and 4 amino acids • Some shortcuts • Center of gravity is the average of all vectors • Translate by ave(p1) - ave(p2) • Singular value decomposition to rotate (like eigenvectors)
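The two shortcuts (center-of-gravity translation, then an SVD-based rotation) correspond to what is commonly called the Kabsch method. The sketch below assumes NumPy is available and that row k of `p1` is already known to correspond to row k of `p2`; the function name and test coordinates are illustrative. Centering both point sets on the origin and then adding the center of p1 back is equivalent to translating p2 by ave(p1) - ave(p2) after rotation.

```python
import numpy as np

def superimpose(p1, p2):
    """Rigidly move p2 onto p1 (rows are paired points). Returns (moved p2, RMSD)."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    c1, c2 = p1.mean(axis=0), p2.mean(axis=0)   # centers of gravity
    q1, q2 = p1 - c1, p2 - c2                   # both sets centered on the origin
    h = q2.T @ q1                               # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(u @ vt))          # +1 rotation, -1 would be a reflection
    rot = u @ np.diag([1.0, 1.0, d]) @ vt       # optimal rotation for row vectors
    moved = q2 @ rot + c1                       # rotate, then shift onto p1's center
    rmsd = float(np.sqrt(((moved - p1) ** 2).sum(axis=1).mean()))
    return moved, rmsd

# Toy test: p2 is p1 rotated 30 degrees about z and shifted (hypothetical coordinates)
p1 = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0], [3.8, 3.8, 3.8]])
theta = np.radians(30.0)
rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta), np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
p2 = p1 @ rz.T + np.array([5.0, -2.0, 1.0])
moved, fit_rmsd = superimpose(p1, p2)
print(fit_rmsd)
```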
The score is more complex, so… • Requires double dynamic programming • If the matrix is n×m, then n times m different matrices are generated, each pinning the return (traceback) path to one aa pair • Used to generate a position-specific score, which is then used in aa similarity scoring • Relaxes the constraint that two particular aa’s are equivalent …
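A heavily simplified sketch of the double dynamic programming idea follows. For every candidate pair (i, j) a low-level matrix is built that asks "how similar does residue k look from i to how residue l looks from j?", a low-level alignment is forced through (i, j), and its total score becomes the high-level score for that pair. A second dynamic programming pass over the high-level matrix then picks the final equivalences. The use of distance differences, free gaps, the constant `c`, and using only the pinned path's total (rather than accumulating scores along each path) are all simplifying assumptions, not the real SAP scheme.

```python
def best_chain(mat):
    """Best order-preserving set of cells in mat (gaps are free). Returns (score, pairs)."""
    n = len(mat)
    m = len(mat[0]) if n else 0
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + mat[i - 1][j - 1],
                             best[i - 1][j], best[i][j - 1])
    pairs, i, j = [], n, m
    while i > 0 and j > 0:                       # trace back the chosen cells
        if best[i][j] == best[i - 1][j - 1] + mat[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    return best[n][m], pairs[::-1]

def double_dp(da, db, c=1.0):
    """High-level score matrix: one pinned low-level alignment per residue pair (i, j)."""
    n, m = len(da), len(db)
    high = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Low-level matrix: compare the view from i in A with the view from j in B
            low = [[c / (c + abs(da[i][k] - db[j][l])) for l in range(m)]
                   for k in range(n)]
            before = [row[:j] for row in low[:i]]          # cells preceding the pin
            after = [row[j + 1:] for row in low[i + 1:]]   # cells following the pin
            high[i][j] = best_chain(before)[0] + low[i][j] + best_chain(after)[0]
    return high

# Toy input: the same straight 5-residue chain compared with itself (hypothetical)
chain = [[abs(i - j) * 3.8 for j in range(5)] for i in range(5)]
high = double_dp(chain, chain)
score, pairs = best_chain(high)
print(pairs)
```

Because every pair gets its own pinned alignment, no single pair has to be assumed equivalent up front; the high-level pass decides that from the accumulated evidence.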