A Simpler 1.5-Approximation Algorithm for Sorting by Transpositions Tzvika Hartman Weizmann Institute
Genome Rearrangements • During evolution, genomes undergo large-scale mutations which change gene order (reversals, transpositions, translocations). • Given 2 genomes, GR algs infer the most economical sequence of rearrangement events which transform one genome into the other.
Genome Rearrangements Model • Chromosomes are viewed as ordered lists of genes. • Unichromosomal genome, every gene appears once. • Genomes are represented by unsigned permutations of genes. • Circular genomes (e.g., bacteria & mitochondria) are represented by circular perms.
Sorting by Transpositions • A transposition exchanges 2 consecutive segments of a perm. • Example: 1 2 [3 4 5] [6 7] 8 9 → 1 2 [6 7] [3 4 5] 8 9. • Sorting by transpositions: finding a shortest sequence of transpositions which sorts the perm.
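A minimal sketch of a single transposition on a linear perm stored as a Python list. The function name `apply_transposition` and the 0-based cut points `i < j < k` are assumptions made for illustration; the slide only fixes the operation itself. The example reproduces the slide's perm 1 2 3 4 5 6 7 8 9 becoming 1 2 6 7 3 4 5 8 9.

```python
def apply_transposition(perm, i, j, k):
    """Exchange the consecutive segments perm[i:j] and perm[j:k] (0-based cuts)."""
    if not 0 <= i < j < k <= len(perm):
        raise ValueError("need 0 <= i < j < k <= len(perm)")
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


identity = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Cut before 3, before 6 and before 8: segments (3 4 5) and (6 7) trade places.
shuffled = apply_transposition(identity, 2, 5, 7)
print(shuffled)  # [1, 2, 6, 7, 3, 4, 5, 8, 9]

# One more transposition undoes it, so this perm is sorted by 1 transposition.
print(apply_transposition(shuffled, 2, 4, 7) == identity)  # True
```

The three cut points are the only degrees of freedom, which is why a transposition is often described as "cutting" the perm at 3 places.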
Previous work • 1.5-approximation algs for sorting by transpositions [BafnaPevzner98, Christie99]. • An alg that sorts every perm of size n in at most 2n/3 transpositions [Eriksson et al 01]. • Complexity of the problem is still open.
Main Results • The problem of sorting circular permutations by transpositions is equivalent to sorting linear perms by transpositions. • A new and simple 1.5-approximation alg for sorting by transpositions, which runs in quadratic time.
Linear & Circular Perms • A transposition “cuts” the perm at 3 points. • Linear transposition t: A B C D → A C B D. • Circular transposition t: (A B C) → (A C B). • Circular transpositions can be represented by exchanging any 2 of the 3 segments.
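The last bullet can be checked on a small case. The sketch below compares circular perms by rotating each one so that its smallest element comes first; the helper name `canonical_rotation` and the three segments are hypothetical, chosen only for illustration.

```python
def canonical_rotation(circ):
    """Rotate a circular perm (a list) so that its smallest element comes first."""
    start = circ.index(min(circ))
    return circ[start:] + circ[:start]


# Three hypothetical segments obtained by cutting a circular perm at 3 points.
A, B, C = [1, 5], [2, 6, 3], [4]

exchange_bc = A + C + B  # exchange B and C
exchange_ab = B + A + C  # exchange A and B
exchange_ac = C + B + A  # exchange A and C

forms = {tuple(canonical_rotation(p)) for p in (exchange_bc, exchange_ab, exchange_ac)}
print(forms)  # {(1, 5, 4, 2, 6, 3)}
```

All three exchanges give the same circular order, because (A C B), (B A C) and (C B A) are rotations of one another.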
Linear & Circular Equivalence • Thm: Sorting linear perms by transpositions is computationally equivalent to sorting circular perms. • Pf sketch: Circularize the linear perm (π_1 . . . π_n) by adding an element π_{n+1} = n+1 and closing the circle. • Every linear transposition is equivalent to a circular transposition that exchanges the 2 segments that do not include n+1.
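Breakpoint Graph [BafnaPevzner98] • Perm: π = (1 6 5 4 7 3 2). • Replace each element j by 2j-1, 2j: (1 2 11 12 9 10 7 8 13 14 5 6 3 4). • Circular breakpoint graph G(π): a vertex for every element of the extended perm. • Black edges join the elements at positions 2i and 2i+1 of the extended perm. • Grey edges join the values 2i and 2i+1. • Indices are circular: 2n+1 is identified with 1. [Figure: the breakpoint graph drawn on a circle.]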
Breakpoint Graph (Cont.) • Unique decomposition into cycles. • c_odd(π): # of odd cycles in G(π). • Define Δc_odd(π,t) = c_odd(t · π) – c_odd(π). • Lemma [BP98]: ∀ t and π, Δc_odd(π,t) ∈ {0, 2, -2}.
Effect on Graph : Example • Perm: (1 3 2). • After extension: (1 2 5 6 3 4). • Breakpoint graph: [figure omitted]. • # of cycles increased by 2.
Effect on Graph : Example • Perm: (6 5 4 3 2 1). • After extension: (11 12 9 10 7 8 5 6 3 4 1 2). • Breakpoint graph: [figure omitted]. • # of cycles remains 2.
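The cycle counts in the two examples above can be recomputed with a short sketch. It assumes the usual reading of the construction: black edges join the elements at circular positions 2i and 2i+1 of the extended perm, grey edges join the values 2i and 2i+1 (with 2n+1 wrapping to 1), and the length of a cycle is its number of black edges. The names `cycle_lengths` and `c_odd` are illustrative.

```python
def cycle_lengths(perm):
    """Sorted cycle lengths (in black edges) of the circular breakpoint graph."""
    # Replace each element j by 2j-1, 2j.
    ext = [v for j in perm for v in (2 * j - 1, 2 * j)]
    m = len(ext)  # 2n vertices

    # Black edges: elements at 1-based positions 2i and 2i+1, wrapping around.
    black = {}
    for p in range(1, m, 2):  # 0-based index p is 1-based position p + 1 = 2i
        a, b = ext[p], ext[(p + 1) % m]
        black[a], black[b] = b, a

    def grey(v):
        # Grey edges: values 2i and 2i+1, with 2n+1 identified with 1.
        return v % m + 1 if v % 2 == 0 else (v - 2) % m + 1

    seen, lengths = set(), []
    for start in ext:
        if start in seen:
            continue
        length, v = 0, start
        while v not in seen:  # alternate black edge, then grey edge
            w = black[v]
            seen.update((v, w))
            length += 1
            v = grey(w)
        lengths.append(length)
    return sorted(lengths)


def c_odd(perm):
    """Number of odd cycles in the breakpoint graph of perm."""
    return sum(1 for length in cycle_lengths(perm) if length % 2 == 1)


print(cycle_lengths([1, 3, 2]))           # [3]
print(cycle_lengths([1, 2, 3]))           # [1, 1, 1]
print(cycle_lengths([6, 5, 4, 3, 2, 1]))  # [3, 3]
```

Under these assumptions (1 3 2) has a single 3-cycle and the sorted perm (1 2 3) has three 1-cycles, which matches the increase by 2 in the first example. The perm (6 5 4 3 2 1) has two 3-cycles.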
Breakpoint Graph (Cont.) • Max # of odd cycles, n, is attained by the id perm, thus: • Lower bound [BP98]: For all π, d(π) ≥ [n - c_odd(π)]/2. • Goal: increase # of odd cycles in G. • t is a k-transposition if Δc_odd(π,t) = k. • A cycle that admits a 2-transposition is called oriented.
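One way to see where the lower bound comes from, written out as a short derivation. It uses only two facts from the slides: a transposition changes the number of odd cycles by at most 2, and the id perm has n odd cycles. The intermediate perms π_0, ..., π_d are notation introduced here for illustration.

```latex
% A shortest sorting sequence: \pi = \pi_0 \to \pi_1 \to \dots \to \pi_d = \mathrm{id},
% where d = d(\pi) and \pi_i = t_i \cdot \pi_{i-1}.
\begin{align*}
n - c_{odd}(\pi)
  &= c_{odd}(\mathrm{id}) - c_{odd}(\pi)
   = \sum_{i=1}^{d} \Delta c_{odd}(\pi_{i-1}, t_i)
   \le 2\, d(\pi) \\
\Longrightarrow\quad d(\pi) &\ge \frac{n - c_{odd}(\pi)}{2}
\end{align*}
% Example: a perm with n = 6 and 2 odd cycles needs at least (6 - 2)/2 = 2 transpositions.
```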
Simple Permutations • A perm is simple if its breakpoint graph contains only short (≤ 3) cycles. • The theory is much simpler for simple perms. • Thm: Every perm can be transformed into a simple one, while maintaining the lower bound. Moreover, the sorting sequence can be mimicked. • Cor: We can focus only on simple perms.
3-Cycles • 2 possible configurations of 3-cycles: non-oriented and oriented. [Figures: a non-oriented 3-cycle and an oriented 3-cycle.]
(0,2,2)-Sequence of Transpositions • A (0,2,2)-sequence is a sequence of 3 transpositions: the 1st is a 0-transposition and the next two are 2-transpositions. • A series of (0,2,2)-sequences preserves a 1.5 approximation ratio. • Throughout the alg, we show that there is always a 2-transposition or a (0,2,2)-sequence.
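A short derivation of why these two kinds of moves keep the ratio at 1.5, assuming the alg applies only single 2-transpositions and (0,2,2)-sequences until the perm is sorted. The counters x and y are introduced here for illustration.

```latex
% x = number of single 2-transpositions, y = number of (0,2,2)-sequences.
\begin{align*}
\text{gain in odd cycles:}\quad
  & 2x + 4y = n - c_{odd}(\pi) \\
\text{transpositions used:}\quad
  & x + 3y \;\le\; \tfrac{3}{4}\,(2x + 4y)
    = \tfrac{3}{4}\,\bigl(n - c_{odd}(\pi)\bigr) \\
  & = 1.5 \cdot \frac{n - c_{odd}(\pi)}{2} \;\le\; 1.5\, d(\pi)
\end{align*}
```

The last step is the lower bound d(π) ≥ [n - c_odd(π)]/2: a (0,2,2)-sequence spends 3 transpositions on a gain of 4 odd cycles, where the lower bound allows for at least 2.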
Interleaving Cycles • 2 cycles interleave if their black edges appear alternately along the circle. • Lemma: If G contains 2 interleaving 3-cycles, then there is a (0,2,2)-sequence.
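A minimal sketch of the interleaving test. It assumes each cycle is given as the set of circular positions of its black edges (black edge i being the one between positions 2i and 2i+1 of the extended perm); this representation and the name `interleave` are illustrative choices, not taken from the slides.

```python
def interleave(cycle_a, cycle_b):
    """True if the black edges of the two cycles alternate along the circle."""
    if not cycle_a or len(cycle_a) != len(cycle_b) or set(cycle_a) & set(cycle_b):
        return False
    labelled = sorted([(p, "a") for p in cycle_a] + [(p, "b") for p in cycle_b])
    owners = [owner for _, owner in labelled]
    # On a circle, each edge must be followed by an edge of the other cycle,
    # including the wrap-around from the last position back to the first.
    return all(owners[i] != owners[(i + 1) % len(owners)] for i in range(len(owners)))


# Under the edge numbering assumed above, the two 3-cycles of the perm
# (6 5 4 3 2 1) occupy black edges {1, 3, 5} and {2, 4, 6}.
print(interleave({1, 3, 5}, {2, 4, 6}))  # True
print(interleave({1, 2, 3}, {4, 5, 6}))  # False
```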
Shattered Cycles • 2 pairs of black edges intersect if they appear alternately along the circle. • Cycle A is shattered by cycles B and C if every pair of black edges in A intersects with a pair in B or with a pair in C. • Lemma: If G contains a shattered cycle, then there is a (0,2,2)-sequence.
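The two definitions above can be sketched as predicates on black-edge positions. As an assumption for illustration, each cycle is a set of distinct circular positions of its black edges; the function names and the example positions are hypothetical.

```python
from itertools import combinations


def pairs_intersect(pair_a, pair_b):
    """True if the two pairs of black-edge positions alternate along the circle."""
    lo, hi = sorted(pair_a)
    inside = [lo < p < hi for p in pair_b]
    # Exactly one element of pair_b lies strictly between the elements of pair_a.
    return inside[0] != inside[1]


def is_shattered(cycle_a, cycle_b, cycle_c):
    """True if every pair of black edges of A intersects a pair of B or of C."""
    other_pairs = list(combinations(sorted(cycle_b), 2))
    other_pairs += list(combinations(sorted(cycle_c), 2))
    return all(
        any(pairs_intersect(pair_a, other) for other in other_pairs)
        for pair_a in combinations(sorted(cycle_a), 2)
    )


# Hypothetical positions on a circle with 9 black edges.
print(is_shattered({1, 4, 7}, {2, 3, 5}, {6, 8, 9}))  # True
print(is_shattered({1, 2, 3}, {4, 5, 6}, {7, 8, 9}))  # False
```

In the first call the pairs (1,4) and (4,7) of A are intersected by the pair (2,5) of B, and the pair (1,7) is intersected by the pair (6,8) of C, although A interleaves with neither B nor C.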
Shattered Cycles (Cont.) • Lemma: If G contains no 2-cycles, no oriented cycles and no interleaving cycles, then there is a shattered cycle.
The Algorithm • While G contains a 2-cycle, apply a 2-transposition [Christie99]. • If G contains an oriented 3-cycle, apply a 2-transposition on it. • If G contains a pair of interleaving 3-cycles, apply a (0,2,2)-sequence. • If G contains a shattered unoriented 3-cycle, apply a (0,2,2)-sequence. • Repeat until perm is sorted.
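The case analysis above can be sketched as a priority dispatch. This is only a skeleton: the four boolean flags are hypothetical stand-ins for tests on the breakpoint graph, which are not implemented here, and the function name `choose_step` is an assumption.

```python
def choose_step(has_2_cycle, has_oriented_3_cycle, has_interleaving_pair, has_shattered_cycle):
    """Return (move, transpositions used, gain in odd cycles), or None if no case applies."""
    if has_2_cycle:
        return ("2-transposition", 1, 2)
    if has_oriented_3_cycle:
        return ("2-transposition", 1, 2)
    if has_interleaving_pair:
        return ("(0,2,2)-sequence", 3, 4)
    if has_shattered_cycle:
        return ("(0,2,2)-sequence", 3, 4)
    return None


# An oriented 3-cycle takes priority over an interleaving pair.
print(choose_step(False, True, True, False))  # ('2-transposition', 1, 2)
print(choose_step(False, False, False, True))  # ('(0,2,2)-sequence', 3, 4)
```

Every branch gains at least 4 odd cycles per 3 transpositions, which is the accounting behind the 1.5 ratio.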
Conclusions • We introduced 2 new ideas which simplify the theory and the alg: • Working with circular perms simplifies the case analysis. • Simple perms avoid the complication of dealing with long cycles (similarly to the HP theory for sorting by reversals).
Open Problems • Complexity of sorting by transpositions. • Models which allow several rearrangement operations, such as trans-reversals, reversals and translocations (both signed & unsigned).
Acknowledgements • Ron Shamir.