Counting the maximum number of base pairs
• But how to count? An RNA can be very long, and there may be many possible ways to form base pairs, e.g., in
  ……ACGGUACGUC……
  conflicting pairs: A-U, A-U; G-C, G-C; etc.
• Even the number of non-conflicting combinations of base pairs is exponentially large.
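To get a feel for how quickly the number of non-conflicting combinations grows, here is a minimal counting sketch. It rests on assumptions not stated on the slide: "non-conflicting" is taken to mean that each base is in at most one pair and that pairs do not cross, and the allowed pairs are A-U, C-G and G-U. The name `count_structures` is hypothetical.

```python
from functools import lru_cache

# Assumed pairing rule: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {"AU", "UA", "CG", "GC", "GU", "UG"}


def count_structures(seq: str) -> int:
    """Count non-crossing sets of base pairs where each base pairs at most once."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # An empty or single-base range admits only the structure with no pairs.
        if i >= j:
            return 1
        total = count(i + 1, j)  # base i stays unpaired
        for k in range(i + 1, j + 1):  # base i pairs with base k
            if seq[i] + seq[k] in PAIRS:
                # Pairs inside (i, k) and pairs after k are independent.
                total += count(i + 1, k - 1) * count(k + 1, j)
        return total

    return count(0, len(seq) - 1)


if __name__ == "__main__":
    # Print the count for repeats of "AU" of increasing length.
    for repeats in (1, 2, 4, 8, 12):
        seq = "AU" * repeats
        print(len(seq), count_structures(seq))
```

Running the loop on longer and longer inputs shows why enumerating every combination is not a practical way to find the maximum.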
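Ab initio structure prediction (cont')
[Figure: four ways to fold the subsequence from position i to position j]
(1) head paired with tail
(2) tail is unpaired
(3) head is unpaired
(4) two subfolds, split at a position k between i and j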
Looking at shorter (e.g., very short) subsequences in a long sequence ACGGU…ACGUC
• For subsequences of length 1:
  A, C, G, G, U, …, A, C, G, U, C
  # of base pairs: 0, 0, 0, 0, 0, …, 0, 0, 0, 0, 0
• For subsequences of length 2:
  AC, CG, GG, GU, …, AC, CG, GU, UC
  # of base pairs: 0, 1, 0, 1, …, 0, 1, 1, 0
• For subsequences of length 3:
  ACG, CGG, GGU, …, UAC, ACG, CGU, GUC
  # of base pairs: ? e.g., for GUC take the maximum over
  (1) G-C + U --> 1+0 = 1 (head-tail paired)
  (2) G + UC --> 0+0 = 0 (head unpaired)
  (3) GU + C --> 1+0 = 1 (tail unpaired)
  (4) GU + C --> 1+0 = 1 (split)
  (5) G + UC --> 0+0 = 0 (split)
  so GUC has at most 1 base pair.
Examine a slightly longer sequence
…..ACGGUACGU….. (from position i to position j) ==> max of {cases 1, 2, 3, 4}
1. Head-tail paired: count = 1 + max count in subsequence CGGUACG (i+1 to j-1)
2. Head unpaired: count = max count in subsequence CGGUACGU (i+1 to j)
3. Tail unpaired: count = max count in subsequence ACGGUACG (i to j-1)
4. Split (why needed, and where to split?): try every split position k between i and j. E.g., when k = i+2, ACGGUACGU ==> ACG + GUACGU, and count = max count in ACG + max count in GUACGU
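The four cases translate directly into a recursive function. The following is a minimal sketch, not the slides' own code. It assumes the pairing rule implied by the counts on the slides (A-U, C-G and G-U all count, and adjacent bases may pair); `max_pairs` and `best` are hypothetical names.

```python
from functools import lru_cache

# Assumed pairing rule: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {"AU", "UA", "CG", "GC", "GU", "UG"}


def max_pairs(seq: str) -> int:
    """Maximum number of non-conflicting base pairs in seq."""

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> int:
        # Fewer than two bases: nothing can pair.
        if i >= j:
            return 0
        candidates = [
            best(i + 1, j),  # case 2: head unpaired
            best(i, j - 1),  # case 3: tail unpaired
        ]
        if seq[i] + seq[j] in PAIRS:
            candidates.append(1 + best(i + 1, j - 1))  # case 1: head-tail paired
        for k in range(i + 1, j):
            # case 4: split into seq[i..k] and seq[k+1..j]
            candidates.append(best(i, k) + best(k + 1, j))
        return max(candidates)

    return best(0, len(seq) - 1)


if __name__ == "__main__":
    for example in ("GUC", "ACGGUACGU"):
        print(example, max_pairs(example))
```

Memoizing `best` matters: without it the same subsequence would be solved again and again, and the running time would blow up exponentially.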
Ab initio structure prediction (cont')
• Maximizing the number of base pairs (Nussinov et al., 1978)
• Simple model: every base pair (i, j) contributes a score of 1
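Written as a recurrence, the four cases (head paired with tail, head unpaired, tail unpaired, two subfolds) might look as follows. This is a reconstruction under assumptions: C_{i,j} denotes the maximum number of base pairs in the subsequence from i to j, and δ(i, j) is assumed to be 1 when bases i and j can pair and 0 otherwise.

```latex
C_{i,j} = \max
\begin{cases}
C_{i+1,j-1} + \delta(i,j) & \text{(1) head paired with tail} \\
C_{i+1,j}                 & \text{(2) head unpaired} \\
C_{i,j-1}                 & \text{(3) tail unpaired} \\
\max\limits_{i<k<j} \left( C_{i,k} + C_{k+1,j} \right) & \text{(4) two subfolds}
\end{cases}
\qquad
C_{i,i} = 0, \quad C_{i,i-1} = 0 .
```

The boundary value C_{i,i-1} = 0 stands for an empty subsequence and is needed when j = i+1 in case (1).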
Table of C(i,j) for GGGAAAUCC (C(i,j) = 0 when i = j)

     G  G  G  A  A  A  U  C  C
G    0  0  0  0  0  0  1  2  3
G       0  0  0  0  0  1  2  3
G          0  0  0  0  1  2  2
A             0  0  0  1  1  1
A                0  0  1  1  1
A                   0  1  1  1
U                      0  0  0
C                         0  0
C                            0

Subsequences marked in the figure: GGGAAAUCC, GAAAUC, AAUC, AU
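A table like this can be filled diagonal by diagonal, shortest subsequences first, and then walked backwards to recover one optimal set of pairs. The sketch below assumes the pairing rule A-U, C-G, G-U with no minimum loop length; `fill_table` and `traceback` are hypothetical names, and the traceback returns only one of possibly several optimal structures.

```python
# Assumed pairing rule: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {"AU", "UA", "CG", "GC", "GU", "UG"}


def fill_table(seq):
    """Return C where C[i][j] is the max number of pairs in seq[i..j] (0-based)."""
    n = len(seq)
    C = [[0] * n for _ in range(n)]  # diagonal and lower triangle stay 0
    for span in range(1, n):  # span = j - i, so shorter subsequences come first
        for i in range(n - span):
            j = i + span
            best = max(C[i + 1][j], C[i][j - 1])  # head or tail unpaired
            if seq[i] + seq[j] in PAIRS:
                best = max(best, C[i + 1][j - 1] + 1)  # head-tail paired
            for k in range(i + 1, j):  # split into two subfolds
                best = max(best, C[i][k] + C[k + 1][j])
            C[i][j] = best
    return C


def traceback(seq, C):
    """Recover one optimal list of (i, j) pairs from a filled table."""
    pairs = []
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if C[i][j] == C[i + 1][j]:
            stack.append((i + 1, j))
        elif C[i][j] == C[i][j - 1]:
            stack.append((i, j - 1))
        elif seq[i] + seq[j] in PAIRS and C[i][j] == C[i + 1][j - 1] + 1:
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if C[i][j] == C[i][k] + C[k + 1][j]:
                    stack.extend([(i, k), (k + 1, j)])
                    break
    return sorted(pairs)


if __name__ == "__main__":
    seq = "GGGAAAUCC"
    table = fill_table(seq)
    for i, row in enumerate(table):
        # Print only the upper triangle, indented to line up the columns.
        print(seq[i], "   " * i + "".join(f"{v:3d}" for v in row[i:]))
    print(traceback(seq, table))
```

Filling by increasing span guarantees that every entry the recurrence needs is already in the table when it is read.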
Example 2: ACGGUU
Subsequence of length 0: empty sequence
  0 pairs
Subsequences of length 1: A, C, G, G, U, U
  0, 0, 0, 0, 0, 0 pairs
Subsequences of length 2: AC, CG, GG, GU, UU
  0, 1, 0, 1, 0 pairs
Subsequences of length 3: ACG, CGG, GGU, GUU
  1, 1, 1, 1 pairs
Subsequences of length 4: ACGG, CGGU, GGUU
  1, 2, 2 pairs
Subsequences of length 5: ACGGU, CGGUU
  2, 2 pairs
Subsequence of length 6: ACGGUU
  3 pairs
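The length-by-length listing can be reproduced mechanically. The sketch below builds the counts from the shortest subsequences upward and stores them by subsequence text, since the count depends only on the letters. As before, the pairing rule A-U, C-G, G-U is an assumption read off the slides' counts, and `counts_by_length` is a hypothetical name.

```python
# Assumed pairing rule: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {"AU", "UA", "CG", "GC", "GU", "UG"}


def counts_by_length(seq):
    """Map each length to a list of (subsequence, max pair count), left to right."""
    best = {"": 0}  # max pair count, keyed by subsequence text
    rows = {}
    for length in range(1, len(seq) + 1):
        rows[length] = []
        for start in range(len(seq) - length + 1):
            sub = seq[start:start + length]
            if sub not in best:
                # Head unpaired, tail unpaired.
                options = [best[sub[1:]], best[sub[:-1]]]
                # Head paired with tail.
                if length >= 2 and sub[0] + sub[-1] in PAIRS:
                    options.append(1 + best[sub[1:-1]])
                # Every split into a left part and a right part.
                options += [best[sub[:k]] + best[sub[k:]] for k in range(1, length)]
                best[sub] = max(options)
            rows[length].append((sub, best[sub]))
    return rows


if __name__ == "__main__":
    for length, row in counts_by_length("ACGGUU").items():
        subs = ", ".join(sub for sub, _ in row)
        counts = ", ".join(str(count) for _, count in row)
        print(f"length {length}: {subs} -> {counts}")
```

Every shorter piece a lookup needs is itself a contiguous subsequence of the input, so it has always been computed in an earlier pass.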