Genome Rearrangements Unoriented Blocks
Quick Review • Looking at evolutionary change through reversals • Find the shortest possible series of reversals that transforms genome A into genome B • It has been shown that this results in an NP-hard problem
Oriented Blocks • Transform 1 2 3 4 5 into 5 2 1 3 4 • Series of reversals: 1 2 3 4 5 → 1 2 5 4 3 → 1 2 5 3 4 → 5 2 1 3 4
Unoriented Blocks • Orientation of the blocks in the genomes is unknown • First chromosome: 2 1 3 7 5 4 8 6 • Second chromosome: 1 2 3 4 5 6 7 8
Definitions • unoriented permutation - a mapping from {1,2,…,n} to a set L of n labels. • reversal – reverses the order of a segment of consecutive labels.
Definitions (cont.) • reversal distance – if p1, p2, …, pt is a shortest series of reversals such that α p1 p2 … pt = β, then t is the reversal distance of α with respect to β, denoted by dβ(α)
Example 1 The figure shows two chromosomes with homologous blocks • Upper chromosome: 2 1 3 7 5 4 8 6 • Lower chromosome: 1 2 3 4 5 6 7 8 • Assign labels 1 through 8 to the blocks in the lower chromosome • Transfer the labels to the upper chromosome, giving equal labels to homologous blocks • We obtain a starting permutation in the upper chromosome, and our goal is to sort it into the lower one, the identity
Example 1 (cont.) 2 1 3 7 5 4 8 6 → 1 2 3 7 5 4 8 6 → 1 2 3 4 5 7 8 6 → 1 2 3 4 5 7 6 8 → 1 2 3 4 5 6 7 8
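As a minimal sketch of the reversal operation, the Python below replays the four steps of Example 1. The function name apply_reversal and the 0-based, inclusive index pairs in steps are assumptions made for illustration; they are not notation from the slides.

```python
def apply_reversal(perm, i, j):
    """Return a copy of perm with the segment perm[i..j] reversed.

    i and j are 0-based and inclusive (an illustrative convention).
    """
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


start = [2, 1, 3, 7, 5, 4, 8, 6]

# Index pairs chosen to reproduce the four reversals of Example 1.
steps = [(0, 1), (3, 5), (6, 7), (5, 6)]

trace = [start]
for i, j in steps:
    trace.append(apply_reversal(trace[-1], i, j))

for perm in trace:
    print(" ".join(map(str, perm)))
```

Each printed row corresponds to one permutation in the series, ending with the identity.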
Best Solution? • How do we know that this is the shortest series of reversals? • To decide what the reversal distance should be, we look at the breakpoints
Breakpoints • A breakpoint of an unoriented permutation α is a pair of labels adjacent in α but not in the target. • In the case of the identity, this means adjacent labels that are not consecutive.
Example 2 Assume the identity is the target… Breakpoints with oriented blocks: L 5 2 1 3 4 R Breakpoints with unoriented blocks: L 5 2 1 3 4 R
Example 2 (cont.) L 2 1 3 7 5 4 8 6 R • b(α) denotes the number of breakpoints of α • A reversal can remove at most two breakpoints, hence d(α) ≥ b(α) / 2, where d(α) is the reversal distance • Here b(α) = 7, so d(α) ≥ 3.5; since d(α) is an integer, d(α) ≥ 4 for the above example
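The breakpoint count and the resulting lower bound can be sketched in a few lines of Python. This assumes the target is the identity and that the frame labels are L = 0 and R = n + 1; the function names are hypothetical.

```python
import math


def count_breakpoints(perm):
    """Count adjacent pairs in L perm R that are not consecutive labels.

    Assumes the identity as target, with L = 0 and R = n + 1.
    """
    ext = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def reversal_lower_bound(perm):
    """A reversal removes at most two breakpoints, so d >= ceil(b / 2)."""
    return math.ceil(count_breakpoints(perm) / 2)


alpha = [2, 1, 3, 7, 5, 4, 8, 6]
print(count_breakpoints(alpha))     # 7
print(reversal_lower_bound(alpha))  # 4
```

The same function applied to L 5 2 1 3 4 R from Example 2 counts the unoriented breakpoints of that permutation.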
Strips L 4 5 3 2 1 R If we have two adjacent labels that do not make a breakpoint, they must be of the form: …x(x+1) or …x(x-1)
Strips (cont.) • strip – a sequence of consecutive labels surrounded by breakpoints but with no internal breakpoints • Two types of strips: increasing and decreasing
Special Rules • A single label surrounded by breakpoints is said to be a strip that is both increasing and decreasing • L and R are always considered part of an increasing strip, even if they are by themselves • L and R are considered a single element for the purpose of defining strips. If 0, 1, … is a strip and …, n, n+1 is a strip, we consider these two sequences as a single strip. They are linked by the common element L = R.
Example 3 L 1 2 8 7 3 5 6 4 R • Increasing strips: (R, L, 1, 2), (5, 6) • Decreasing strips: (8, 7) • Both: (3), (4)
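A possible way to compute the strip decomposition of Example 3 is sketched below in Python. It assumes the identity as target, writes L as 0 and R as n + 1, and follows the special rules above: single labels are reported as "both", and the strips containing R and L are joined into one increasing strip. The function name strips is hypothetical.

```python
def strips(perm):
    """Split L perm R into strips and classify each one.

    Returns a list of (kind, labels) pairs, where kind is "increasing",
    "decreasing" or "both". L is written as 0 and R as n + 1.
    """
    n = len(perm)
    ext = [0] + list(perm) + [n + 1]

    # Start a new run at every breakpoint.
    runs = [[ext[0]]]
    for a, b in zip(ext, ext[1:]):
        if abs(a - b) == 1:
            runs[-1].append(b)
        else:
            runs.append([b])

    # L and R count as a single element: join the strip ending in R
    # with the strip starting at L.
    if len(runs) > 1:
        runs = [runs[-1] + runs[0]] + runs[1:-1]

    result = []
    for run in runs:
        if 0 in run or n + 1 in run:
            kind = "increasing"          # strips with L or R are increasing
        elif len(run) == 1:
            kind = "both"                # single label between breakpoints
        elif run[1] > run[0]:
            kind = "increasing"
        else:
            kind = "decreasing"
        result.append((kind, run))
    return result


for kind, labels in strips([1, 2, 8, 7, 3, 5, 6, 4]):
    print(kind, labels)
```

For Example 3 the first strip printed is [9, 0, 1, 2], which is (R, L, 1, 2) with R = 9 and L = 0.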
Theorem 1 If label k belongs to a decreasing strip and k - 1 belongs to an increasing strip, then there is a reversal that removes at least one breakpoint. Example: L 4 5 2 3 1 7 6 R, with k - 1 = 5 and k = 6
Proof • Labels k – 1 and k must belong to different strips, since only single elements are said to be both increasing and decreasing. • The above statement implies that each one is the last element in its strip (each is followed by a breakpoint).
Proof (cont.) Two possible schemes: (1) … (k - 1) … k … and (2) … k … (k - 1) … In both cases, reversing the segment between the breakpoint that follows k - 1 and the breakpoint that follows k brings k and k - 1 together, reducing the number of breakpoints by at least one.
Example 4 L 4 5 2 3 1 7 6 R → L 4 5 6 7 1 3 2 R (k - 1 = 5, k = 6; the reversed segment is 2 3 1 7 6)
Observations • All permutations have at least one increasing strip (the one containing L and R) • Not every permutation has a decreasing strip • If there is a decreasing strip, the previous proof shows that there is a breakpoint-removing reversal
Theorem 2 If label k belongs to a decreasing strip and k + 1 belongs to an increasing strip, then there is a reversal that removes at least one breakpoint. Example: L 5 4 2 3 1 6 7 R, with k = 5 and k + 1 = 6
Proof Two possible schemes: (1) … (k + 1) … k … and (2) … k … (k + 1) … In both cases, reversing the segment between the breakpoint that precedes k and the breakpoint that precedes k + 1 brings k and k + 1 together, reducing the number of breakpoints by at least one.
Example 5 L 5 4 2 3 1 6 7 R → L 1 3 2 4 5 6 7 R (k = 5, k + 1 = 6; the reversed segment is 5 4 2 3 1)
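The reversals used in Theorems 1 and 2 can be sketched as follows. This is an illustrative Python version, assuming the identity as target with L = 0 and R = n + 1; the helper names framed, breakpoints, unite_with_predecessor and unite_with_successor are hypothetical.

```python
def framed(perm):
    """Return L perm R with L = 0 and R = n + 1."""
    return [0] + list(perm) + [len(perm) + 1]


def breakpoints(ext):
    """Number of adjacent pairs in ext that are not consecutive labels."""
    return sum(1 for a, b in zip(ext, ext[1:]) if abs(a - b) != 1)


def unite_with_predecessor(perm, k):
    """Theorem 1: cut after k - 1 and after k, then reverse between the cuts."""
    ext = framed(perm)
    i, j = ext.index(k - 1), ext.index(k)
    lo, hi = min(i, j) + 1, max(i, j)
    ext[lo:hi + 1] = ext[lo:hi + 1][::-1]
    return ext[1:-1]


def unite_with_successor(perm, k):
    """Theorem 2: cut before k and before k + 1, then reverse between the cuts."""
    ext = framed(perm)
    i, j = ext.index(k), ext.index(k + 1)
    lo, hi = min(i, j), max(i, j) - 1
    ext[lo:hi + 1] = ext[lo:hi + 1][::-1]
    return ext[1:-1]


ex4 = [4, 5, 2, 3, 1, 7, 6]
after4 = unite_with_predecessor(ex4, 6)   # Example 4, k = 6
print(after4, breakpoints(framed(ex4)), breakpoints(framed(after4)))

ex5 = [5, 4, 2, 3, 1, 6, 7]
after5 = unite_with_successor(ex5, 5)     # Example 5, k = 5
print(after5, breakpoints(framed(ex5)), breakpoints(framed(after5)))
```

In Example 4 the breakpoint count drops from 5 to 4, and in Example 5 from 4 to 2.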
The Result • The two proofs just explained show that, as long as we have decreasing strips, we can always reduce the number of breakpoints. • Notice that this also applies to single-element strips • What about when there are no decreasing strips?
Theorem 3 Let α be a permutation with a decreasing strip. If all reversals that remove breakpoints from α leave no decreasing strips, then there is a reversal that removes two breakpoints from α.
Proof • Let k be the smallest label involved in a decreasing strip. • p is the reversal uniting k and k - 1 • k – 1 must be to the left of k, otherwise p leaves a decreasing strip. … (k – 1) … k …
Proof (cont.) • Let ℓ be the largest label involved in a decreasing strip. • σ is the reversal uniting ℓ and ℓ + 1 • ℓ + 1 must be to the right of ℓ, otherwise σ leaves a decreasing strip … ℓ… (ℓ + 1) …
Proof (cont.) • Observe that k must be inside the interval reversed by σ, otherwise σ would leave k ’s decreasing strip intact. • Likewise, ℓ must belong to the interval of p … (k – 1) ℓ … k (ℓ + 1) …
Proof (cont.) … (k – 1) ℓ … k (ℓ + 1) … • We can see that p = σ must be true • The reversal removes two breakpoints because k is united with k – 1 and ℓ is united with ℓ + 1
Example 6 L 7 8 3 5 4 6 1 2 R • Reversal that removes breakpoints: L 7 8 3 5 4 6 1 2 R → L 7 8 3 4 5 6 1 2 R (k - 1 = 3, ℓ = 5, k = 4, ℓ + 1 = 6; the reversed segment is 5 4)
Sorting a Permutation • We can use an algorithm that sorts a permutation using at most 2 * d(α) reversals (that is, at most twice as many reversals as the minimum possible) • The algorithm assumes that the target is the identity (1, 2, 3, 4, …)
General Idea • A main loop looks at the current permutation and selects the best possible reversal to apply • Update the current permutation and report the reversal applied • The loop stops when the current permutation is the identity
Choosing the Reversal • If there is a decreasing strip, look for a reversal that reduces the number of breakpoints and leaves a decreasing strip. • If no such reversal exists, there is a reversal that encompasses all the decreasing strips and removes two breakpoints. • If there are no decreasing strips, select a reversal that cuts two breakpoints.
Sorting Algorithm
Algorithm: Sorting Unoriented Permutation
input: permutation α
output: series of reversals that sort α
list ← empty
while α ≠ I do
    if α has a decreasing strip then
        k ← smallest label in a decreasing strip
        p ← reversal that cuts after k and after k-1
        if αp has no decreasing strip then
            ℓ ← largest label in a decreasing strip
            p ← reversal that cuts before ℓ and before ℓ+1
    else
        p ← reversal that cuts the first two breakpoints
    α ← αp
    list ← list + p
return list
Trace (a dot marks a breakpoint):
α = L 1 2 . 8 7 . 3 . 5 6 . 4 . R, list ← empty
k ← 3, p ← (8 7 3), αp = L 1 2 3 . 7 8 . 5 6 . 4 . R, α ← αp, list ← (8 7 3)
k ← 4, p ← (7 8 5 6 4), αp = L 1 2 3 4 . 6 5 . 8 7 . R, α ← αp, list ← (8 7 3), (7 8 5 6 4)
k ← 5, p ← (6 5), αp = L 1 2 3 4 5 6 . 8 7 . R, α ← αp, list ← (8 7 3), (7 8 5 6 4), (6 5)
k ← 7, p ← (8 7), αp = L 1 2 3 4 5 6 7 8 R, α ← αp, list ← (8 7 3), (7 8 5 6 4), (6 5), (8 7)
Another Example
α = L . 2 1 . 3 . 7 . 5 4 . 8 . 6 . R, list ← empty
k ← 1, p ← (2 1), αp = L 1 2 3 . 7 . 5 4 . 8 . 6 . R, α ← αp, list ← (2 1)
k ← 4, p ← (7 5 4), αp = L 1 2 3 4 5 . 7 8 . 6 . R, α ← αp, list ← (2 1), (7 5 4)
k ← 6, p ← (7 8 6), αp = L 1 2 3 4 5 6 . 8 7 . R, α ← αp, list ← (2 1), (7 5 4), (7 8 6)
k ← 7, p ← (8 7), αp = L 1 2 3 4 5 6 7 8 R, list ← (2 1), (7 5 4), (7 8 6), (8 7)
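The algorithm Sorting Unoriented Permutation can be sketched in Python as below. This is one possible reading of the slide, not a definitive implementation. It assumes the identity as target and L = 0, R = n + 1. It also assumes that a single label between two breakpoints counts as a decreasing strip when choosing k and ℓ, which is what the traces suggest (k = 3 is chosen first in the earlier trace). The name sort_by_reversals and the output format (each reversal as a tuple of the labels it reverses) are hypothetical.

```python
def sort_by_reversals(perm):
    """Return the reversals, as tuples of reversed labels, that sort perm."""
    n = len(perm)
    ext = [0] + list(perm) + [n + 1]      # frame with L = 0 and R = n + 1
    identity = list(range(n + 2))

    def decreasing_labels(e):
        # A label lies in an increasing strip if its left neighbour is one
        # smaller or its right neighbour is one larger. Every other label,
        # including a single-label strip, is treated as decreasing here.
        return [e[i] for i in range(1, n + 1)
                if e[i - 1] != e[i] - 1 and e[i + 1] != e[i] + 1]

    def reverse(e, lo, hi):
        return e[:lo] + e[lo:hi + 1][::-1] + e[hi + 1:]

    reversals = []
    while ext != identity:
        dec = decreasing_labels(ext)
        if dec:
            k = min(dec)
            i, j = ext.index(k - 1), ext.index(k)
            lo, hi = min(i, j) + 1, max(i, j)        # cut after k-1 and after k
            if not decreasing_labels(reverse(ext, lo, hi)):
                ell = max(dec)
                i, j = ext.index(ell), ext.index(ell + 1)
                lo, hi = min(i, j), max(i, j) - 1    # cut before ell and ell+1
        else:
            bps = [i for i in range(n + 1) if abs(ext[i] - ext[i + 1]) != 1]
            lo, hi = bps[0] + 1, bps[1]              # cut first two breakpoints
        reversals.append(tuple(ext[lo:hi + 1]))
        ext = reverse(ext, lo, hi)
    return reversals


print(sort_by_reversals([1, 2, 8, 7, 3, 5, 6, 4]))
print(sort_by_reversals([2, 1, 3, 7, 5, 4, 8, 6]))
```

Under these assumptions the two calls reproduce the reversal lists of the two traces above.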
But is it Optimal? • It has been shown that d(α) ≥ b(α) / 2 • For the previous example, b(α) = 7, so d(α) ≥ 4 • Although the algorithm produces the optimal result in this instance, it is not guaranteed to do so. The algorithm may produce a list containing more reversals than are actually necessary to solve the problem.
Theorem 4 The number of iterations in algorithm Sorting Unoriented Permutation is less than or equal to the number of breakpoints in the initial permutation
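Proof • We must prove that, on average, each iteration removes at least one breakpoint. • An iteration that removes no breakpoints can occur only immediately after one that removed two, or as the very first iteration. • In the first case the pair of iterations still averages one breakpoint each; in the second, the final reversal always removes two breakpoints and compensates. The average of one breakpoint per iteration is therefore kept intact.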