
A New Algorithm for Protein Folding in the HP Model

Alantha Newman. Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 876-884, 2002. Presentation created by: Chia-Chang Wang. Date: Feb. 25, 2005.



Presentation Transcript

  1. A New Algorithm for Protein Folding in the HP Model • Alantha Newman • Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 876-884, 2002 • Created by: Chia-Chang Wang • Date: Feb. 25, 2005

  2. Abstract • We consider the problem of protein folding in the HP model on the two-dimensional square lattice. This problem is combinatorially equivalent to folding a string of 0's and 1's so that the string forms a self-avoiding walk on the lattice and the number of adjacent pairs of 1's is maximized. We present a linear-time 1/3-approximation algorithm for this problem, improving on the previous best approximation factor of 1/4. The approximation guarantee of this algorithm is based on an upper bound used in all previous papers that address this problem.
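To make the objective concrete, here is a minimal sketch of a contact counter for a given folding. The function name `count_contacts` and the representation of a folding as a list of `(x, y)` tuples are assumptions made for illustration; they do not come from the paper. A contact is counted when two 1's sit on neighbouring lattice points without being consecutive in the string.

```python
def count_contacts(s, coords):
    """Count contacts of the 0/1 string s folded onto the lattice points coords.

    coords[i] is the (x, y) lattice point of character s[i] (hypothetical layout).
    """
    if len(s) != len(coords):
        raise ValueError("one lattice point per character is required")
    # Consecutive characters must occupy neighbouring lattice points.
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive characters must be lattice neighbours")
    # Self-avoiding walk: no lattice point may be used twice.
    if len(set(coords)) != len(coords):
        raise ValueError("the walk visits a lattice point twice")

    position = {point: i for i, point in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if s[i] != "1":
            continue
        # Look only right and up so that every pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbour)
            if j is not None and s[j] == "1" and abs(i - j) > 1:
                contacts += 1
    return contacts
```

For example, folding `"1001"` around a unit square with `[(0, 0), (1, 0), (1, 1), (0, 1)]` brings the two 1's together and gives one contact.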

  3. Some Notations • 1) S = [s1, s2, s3, …, sn]. • 2) Odd-1: a 1 at an odd index. Even-1: a 1 at an even index. • 3) For every substring s = [sj, sj+1, …, sk] of S: O[s] is the number of odd-1's in s, and E[s] is the number of even-1's in s.
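The notation can be sketched in a few lines of code. The helper name `count_odd_even_ones` and its `start` parameter (the 1-based index in S of the first character of the substring) are assumptions for illustration only.

```python
def count_odd_even_ones(s, start=1):
    """Return (O[s], E[s]) for a substring s whose first character has
    1-based index `start` in the full string S."""
    odd = sum(1 for k, c in enumerate(s, start) if c == "1" and k % 2 == 1)
    even = sum(1 for k, c in enumerate(s, start) if c == "1" and k % 2 == 0)
    return odd, even
```

Parity is always taken with respect to the index in S, which is why the substring's starting index has to be passed along.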

  4. Assumptions • 1) The length of S is even. If not, we can add an extra 0 at the beginning or end of the string. • 2) The number of odd-1's equals the number of even-1's. If one kind appears more often than the other, we can arbitrarily change 1's to 0's (or 0's to 1's).
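One way to enforce both assumptions is sketched below. The function name `normalize` is hypothetical, and the sketch picks one of the options the slide allows: it appends the extra 0 at the end and turns the first surplus 1's of the more frequent parity into 0's. Which surplus 1's to change is an arbitrary choice here.

```python
def normalize(s):
    """Pad s to even length and balance the odd-1's against the even-1's."""
    if len(s) % 2 == 1:
        s += "0"  # appending at the end keeps every existing index unchanged
    chars = list(s)
    # A 1-based odd index k corresponds to the 0-based index k - 1, which is even.
    odd_ones = [i for i, c in enumerate(chars) if c == "1" and i % 2 == 0]
    even_ones = [i for i, c in enumerate(chars) if c == "1" and i % 2 == 1]
    surplus = odd_ones if len(odd_ones) > len(even_ones) else even_ones
    # Turn just enough 1's of the more frequent kind into 0's.
    for i in surplus[: abs(len(odd_ones) - len(even_ones))]:
        chars[i] = "0"
    return "".join(chars)
```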

  5. Some Lemmas and Proofs • On the square lattice, an even-1 can only be adjacent to an odd-1 and vice versa. • [Figure: square lattice whose points are labeled alternately e (even) and o (odd), checkerboard style, around a 1.]
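The parity claim follows from a checkerboard colouring of the lattice: every step of the walk changes x + y by exactly one, so characters at odd indices land on one colour and characters at even indices on the other. A minimal sketch, with hypothetical helper names:

```python
def parity(x, y):
    """Checkerboard colour (0 or 1) of the lattice point (x, y)."""
    return (x + y) % 2


def neighbours(x, y):
    """The four lattice points adjacent to (x, y)."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def parities_along_walk(coords):
    """Colours visited by a walk given as a list of (x, y) points."""
    return [parity(x, y) for x, y in coords]
```

Since every neighbour of a point has the opposite colour, two characters whose indices have the same parity can never be adjacent on the lattice.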

  6. Some Lemmas and Proofs (Cont.) • Lemma 2.1: Let L be the closed loop obtained from S by joining its endpoints. Any folding of L yields a folding of S with at least as many contacts. • Proof: A folding of L, broken at any point, becomes a folding of a string.

  7. Some Lemmas and Proofs (Cont.) • Lemma 2.2: If a loop L has E[L] = O[L] (equal numbers of odd-1's and even-1's), then there is an element si such that, going around L in one direction from si, O[si, si+1, …] >= E[si, si+1, …] for every substring that starts at si.

  8. Some Lemmas and Proofs (Cont.) • Proof 2.2: Let S = s1, s2, …, sn and define f(i) = O[s1…si] - E[s1…si]. A point p at which f(i) attains its minimum can be found in linear time. • Taking p as the starting point of the loop L gives a new function f'(i) that is non-negative. (Reasoning: p is the point at which the even-1's exceed the odd-1's by the greatest amount; starting the loop there resets the function to zero at p, so it never drops below zero.)

  9. Some Lemmas and Proofs (Cont.) • Proof 2.2, example: for the string 110101101011, p = 6. • E[L] = O[L] implies f(0) = f(n) = 0. • p can always be chosen to be even: assume p is odd. If sp+1 = 1, then sp+1 is an even-1 and f(p+1) < f(p), contradicting the minimality of f(p). If sp+1 = 0, then f(p+1) = f(p) and p+1 is even, so pick p+1 as p.

  10. Some Lemmas and Proofs (Cont.) • Form the new string s' = s'1, s'2, …, s'n with s'1 = sp+1 (a cyclic rotation). This is why p must be even: the rotation then does not swap odd-1's and even-1's. • Example: s' = 101011110101. • Going from left to right, f(i) >= 0; going from right to left, f(i) <= 0.
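The proof of Lemma 2.2 translates into a single linear scan. The sketch below assumes an even-length string and uses hypothetical names (`find_rotation_point`, `rotate`, `prefix_balances`); it follows the rule from the slides of moving an odd p to p+1.

```python
def find_rotation_point(s):
    """Return an even 1-based index p minimising f(i) = O[s1..si] - E[s1..si]."""
    if len(s) % 2 != 0:
        raise ValueError("the string is assumed to have even length")
    f = 0
    best_value, p = 0, 0  # f(0) = 0, so p = 0 means "no rotation needed"
    for i, c in enumerate(s, 1):
        if c == "1":
            f += 1 if i % 2 == 1 else -1
        if f < best_value:
            best_value, p = f, i
    if p % 2 == 1:
        p += 1  # s[p+1] must be 0 here, so f(p+1) = f(p) is still minimal
    return p


def rotate(s, p):
    """Cyclic rotation whose first character is s[p+1] (1-based)."""
    return s[p:] + s[:p]


def prefix_balances(s):
    """f(1), ..., f(n) for the string s."""
    balances, f = [], 0
    for i, c in enumerate(s, 1):
        if c == "1":
            f += 1 if i % 2 == 1 else -1
        balances.append(f)
    return balances


s = "110101101011"
p = find_rotation_point(s)   # 6
rotated = rotate(s, p)       # "101011110101"
```

On the example from the slides this reproduces p = 6 and s' = 101011110101, and every value in `prefix_balances(rotated)` is non-negative.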

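  11. Additional Notations • Bo(i): the substring following the (i-1)-th odd-1, up to and including the i-th odd-1. • Be(i): the substring following the (i-1)-th even-1, up to and including the i-th even-1. • Example: S = 11010110100011. [Figure: S with the point p and the blocks Bo(2), Bo(3), and Be(1) marked.]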

  12. The Algorithm • Starting point: [Figure: the fold around positions p-2, p-1, p, and p+1.] • There are 4 cases: • (a) |Be(i)| = 2 and |Bo(j)| = 2: i ← i+2, j ← j+2 (there are 3 contacts)

  13. The Algorithm (Cont.) • (b) |Be(i)| > 2 and |Bo(j)| > 2: i ← i+2, j ← j+2 (there are 3 contacts)

  14. The Algorithm (Cont.) • (c) |Be(i)| > 2 and |Bo(j)| = 2: i ← i+1, j ← j+2 (there are 2 contacts)

  15. The Algorithm (Cont.) • (d) |Be(i)| = 2 and |Bo(j)| > 2: i ← i+2, j ← j+1 (there are 2 contacts)
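The four cases amount to a small lookup on the two block lengths. The sketch below only encodes the bookkeeping from the slides (how far i and j advance and how many contacts the step yields); it does not construct the fold itself. The function name `fold_step` and the returned tuple layout are assumptions.

```python
def fold_step(be_len, bo_len):
    """Return (case, advance_i, advance_j, contacts) for block lengths
    |Be(i)| = be_len and |Bo(j)| = bo_len."""
    # Two 1's of the same parity are at least two positions apart,
    # so a block can never be shorter than 2.
    if be_len < 2 or bo_len < 2:
        raise ValueError("block lengths must be at least 2")
    if be_len == 2 and bo_len == 2:
        return ("a", 2, 2, 3)
    if be_len > 2 and bo_len > 2:
        return ("b", 2, 2, 3)
    if be_len > 2 and bo_len == 2:
        return ("c", 1, 2, 2)
    return ("d", 2, 1, 2)  # be_len == 2 and bo_len > 2
```

Returning the case label alongside the increments makes it easy to tally how many case (c) and case (d) steps occur, which the analysis relies on.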

  16. The Algorithm (Cont.) • Contacts count: • Cases (a) and (b): 3 contacts for every 2 odd-1's. • A case (c) paired with a case (d): 4 contacts for every 3 odd-1's. • Unpaired case (c): 1 contact for every odd-1. • Hence cases (a), (b), and paired (c)-(d) together give at least 4 contacts for every 3 odd-1's.

  17. Analysis • Theorem: The algorithm finds at least M/3 contacts, i.e. it is a 1/3-approximation (M = 2 · min{O[S], E[S]} = 2 · O[S] = 2 · E[S]). • The 'k' assumption: assume there are k more case (c) folds than case (d) folds. If not, we count the contacts of the even-1's instead. • Therefore: E[p-2, p-3, … i*] = O[p+1, p+2, … j*] - k. • O[p+1, p+2, … j*] is the number of odd-1's in contacts; O[p-2, p-3, … i*] is the number of odd-1's not necessarily in contacts.

  18. Analysis (Cont.) • O[S] = O[p+1, p+2, … j*] + O[p-2, p-3, … i*] • By Lemma 2.2 (cyclic move): O[p-2, p-3, … i*] <= E[p-2, p-3, … i*]

  19. Analysis (Cont.) • O[S] <= O[p+1, p+2, … j*] + E[p-2, p-3, … i*] • By Lemma 2.2 (cyclic move): O[p-2, p-3, … i*] <= E[p-2, p-3, … i*] • By the 'k' assumption (previous page): E[p-2, p-3, … i*] = O[p+1, p+2, … j*] - k

  20. Analysis (Cont.) • O[S] <= 2 · O[p+1, p+2, … j*] - k • By Lemma 2.2 (cyclic move): O[p-2, p-3, … i*] <= E[p-2, p-3, … i*] • By the 'k' assumption (previous page): E[p-2, p-3, … i*] = O[p+1, p+2, … j*] - k

  21. Analysis (Cont.) • By Lemma 2.2 (cyclic move): O[p-2, p-3, … i*] <= E[p-2, p-3, … i*] • By the 'k' assumption (previous page): E[p-2, p-3, … i*] = O[p+1, p+2, … j*] - k • Rearranging O[S] <= 2 · O[p+1, p+2, … j*] - k: O[p+1, p+2, … j*] >= O[S]/2 + k/2 • Total contacts: Contacts = (4/3) · (O[p+1, p+2, … j*] - 2k) + 2k >= (4/3) · (O[S]/2 - 3k/2) + 2k = 2 · O[S]/3, i.e. one third of 2 · O[S].
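The final chain of inequalities can be checked with exact arithmetic. The sketch below plugs the smallest value of O[p+1, p+2, … j*] permitted by O[S] <= 2 · O[p+1, p+2, … j*] - k into the contact count; the function name and parameters are hypothetical, and this is a numeric sanity check of the algebra, not a proof.

```python
from fractions import Fraction


def contacts_lower_bound(odd_total, k):
    """Contact count when O[p+1..j*] takes its smallest allowed value.

    odd_total stands for O[S]; k is the surplus of case (c) over case (d) folds.
    """
    # Smallest O[p+1..j*] satisfying O[S] <= 2 * O[p+1..j*] - k.
    odd_in_contacts = Fraction(odd_total + k, 2)
    # 4 contacts per 3 odd-1's in cases (a), (b), paired (c)-(d),
    # plus 1 contact per odd-1 for the 2k odd-1's in unpaired case (c) folds.
    return Fraction(4, 3) * (odd_in_contacts - 2 * k) + 2 * k
```

The k terms cancel, so the bound equals 2 · O[S]/3 regardless of k.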

  22. Analysis (Cont.) • Time complexity: • The point p can be found in O(n) time. • Going over the string in both directions and folding it takes time proportional to the total length of the string, O(n). • The total time complexity of the algorithm is O(n).
