Simple Algorithm for Sorting the Fibonacci String Rotations
Manolis Christodoulakis, King’s College London
Joint work with Costas S. Iliopoulos and Yoan José Pinzón Ardila
Our Goal
• What makes Fibonacci strings a best-case input for the Burrows-Wheeler Transform (BWT)?
• Relationship between different rotations of a Fibonacci string
• What is their lexicographic order?
• Side effect: we can deduce the symbol stored at any position of any Fibonacci string in constant time (without using […], provided that the f_n values are known)
SOFSEM 2006
Fibonacci Strings & Numbers
• The n-th Fibonacci string: F_n = F_{n-1} F_{n-2} for n ≥ 2, with F_0 = b, F_1 = a
• The n-th Fibonacci number: f_n = f_{n-1} + f_{n-2} for n ≥ 2, with f_0 = 1, f_1 = 1

n    F_n      f_n
0    b        1
1    a        1
2    ab       2
3    aba      3
4    abaab    5
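The two recurrences can be sketched side by side in Python. The names `fib_string` and `fib_number` are illustrative choices, not taken from the talk; the loop also checks that the length of F_n equals f_n for the small cases listed in the table.

```python
def fib_string(n):
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    prev, cur = "b", "a"  # F_0, F_1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def fib_number(n):
    """Return f_n, with f_0 = f_1 = 1 and f_n = f_{n-1} + f_{n-2}."""
    a, b = 1, 1  # f_0, f_1
    for _ in range(n):
        a, b = b, a + b
    return a


for n in range(6):
    # The length of F_n equals f_n.
    assert len(fib_string(n)) == fib_number(n)
    print(n, fib_string(n), fib_number(n))
```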
Notation
• The i-th rotation of a string x = x[0..n-1]: R_i(x) = x[i..n-1] x[0..i-1], where i is taken modulo n.
• rank(i,x) = the rank of R_i(x) in the sorted list of all rotations of x
• rot(ρ,x) = the (index of the) rotation whose rank is ρ
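A brute-force reading of this notation, as a minimal sketch: the helper names are hypothetical, ranks are assumed to be counted from 0, and the sort is a plain comparison sort of all rotations rather than anything specific to Fibonacci strings.

```python
def rotation(x, i):
    """Return R_i(x) = x[i..n-1] x[0..i-1], with i taken modulo n."""
    i %= len(x)
    return x[i:] + x[:i]


def sorted_rotation_indices(x):
    """Indices of all rotations of x, in lexicographic order of the rotations."""
    return sorted(range(len(x)), key=lambda i: rotation(x, i))


def rank(i, x):
    """0-based rank of R_i(x) among the sorted rotations of x."""
    return sorted_rotation_indices(x).index(i % len(x))


def rot(rho, x):
    """Index of the rotation of x whose 0-based rank is rho."""
    return sorted_rotation_indices(x)[rho]


x = "abaababa"  # F_5
print(rotation(x, 1))         # baababaa
print(rot(0, x), rank(0, x))  # 7 3
```

By construction `rank` and `rot` are inverse to each other, which is the property the closed formulas later in the talk rely on.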
Burrows-Wheeler Transform (BWT)
• M. Burrows and D.J. Wheeler, 1994
• Purpose: to make a string more compressible
• BWT Algorithm:
  • Create a list of all rotations
  • Sort them
  • Output the last symbol of every rotation
  • Output the rank of the 0-th rotation
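The four steps above can be sketched directly in Python. This is the naive sort-all-rotations version, assuming 0-based ranks and no end-of-string sentinel; the function name `bwt` is an illustrative choice.

```python
def bwt(x):
    """Naive BWT: return (last column, 0-based rank of the 0-th rotation)."""
    n = len(x)
    # Steps 1 and 2: list all rotations (by start index) and sort them.
    order = sorted(range(n), key=lambda i: x[i:] + x[:i])
    # Step 3: the last symbol of rotation i is x[i-1] (cyclically).
    last_column = "".join(x[(i - 1) % n] for i in order)
    # Step 4: the position of rotation 0 in the sorted list.
    return last_column, order.index(0)


print(bwt("abaababa"))  # ('bbbaaaaa', 3)
```

Sorting full rotations costs more than necessary in general; it is used here only because it mirrors the definition line by line.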
BWT on Fibonacci Strings
• F_5 = abaababa, f_5 = 8

Rotations by index         Rotations in sorted order
R_0(F_5) = abaababa        R_7(F_5) = aabaabab
R_1(F_5) = baababaa        R_2(F_5) = aababaab
R_2(F_5) = aababaab        R_5(F_5) = abaabaab
R_3(F_5) = ababaaba        R_0(F_5) = abaababa
R_4(F_5) = babaabaa        R_3(F_5) = ababaaba
R_5(F_5) = abaabaab        R_6(F_5) = baabaaba
R_6(F_5) = baabaaba        R_1(F_5) = baababaa
R_7(F_5) = aabaabab        R_4(F_5) = babaabaa
Properties of Fibonacci Strings
• The number of ‘b’ in F_n is f_{n-2} (n ≥ 2)
  • Proof: By induction.
• C.S. Iliopoulos, D.W. Moore and W.F. Smyth, 1997:
  F_n = F_{n-2} F_{n-3} … F_1 u_n, where u_n = ba (n odd) and u_n = ab (n even)
• Let’s call this the IMS formula.
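Both properties are easy to check numerically for small n. The sketch below assumes n ≥ 2 and uses hypothetical helper names (`fib_string`, `ims`); it is a sanity check, not a proof.

```python
def fib_string(n):
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    prev, cur = "b", "a"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def ims(n):
    """Right-hand side of the IMS formula: F_{n-2} F_{n-3} ... F_1 u_n (n >= 2)."""
    u = "ba" if n % 2 == 1 else "ab"
    return "".join(fib_string(k) for k in range(n - 2, 0, -1)) + u


for n in range(2, 15):
    F = fib_string(n)
    # The number of 'b' symbols in F_n is f_{n-2} = |F_{n-2}|.
    assert F.count("b") == len(fib_string(n - 2))
    # F_n agrees with the IMS factorisation.
    assert F == ims(n)

print(ims(5))  # abaababa
```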
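Similarities in Rotations
• R_0(F_n) differs from R_{f_{n-2}}(F_n) in 2 symbols
  Proof:
  R_0(F_n) = F_{n-2} F_{n-3} … F_1 u_n   (IMS formula)
  hence R_{f_{n-2}}(F_n) = F_{n-3} … F_1 u_n F_{n-2}   (1)
  R_0(F_n) = F_{n-1} F_{n-2} = F_{n-3} … F_1 u_{n-1} F_{n-2}   (2)
  (1) and (2) differ only in u_n versus u_{n-1}, i.e. ‘ab’ versus ‘ba’.
• R_i(F_n) differs from R_{i+f_{n-2}}(F_n) in 2 symbols
  Proof:
  R_i(F_n) = R_i(R_0(F_n))
  R_{i+f_{n-2}}(F_n) = R_i(R_{f_{n-2}}(F_n))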
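The two-symbol claim can be checked by counting mismatches between each pair of rotations. A minimal sketch with hypothetical helper names, tested here only for 3 ≤ n ≤ 11:

```python
def fib_string(n):
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    prev, cur = "b", "a"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def rotation(x, i):
    """Return R_i(x), with i taken modulo len(x)."""
    i %= len(x)
    return x[i:] + x[:i]


def hamming(u, v):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(u, v))


for n in range(3, 12):
    F = fib_string(n)
    shift = len(fib_string(n - 2))  # f_{n-2}
    for i in range(len(F)):
        assert hamming(rotation(F, i), rotation(F, i + shift)) == 2

F5 = fib_string(5)
print(rotation(F5, 0), rotation(F5, 3))  # abaababa ababaaba
```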
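Relative Order of Rotations
• R_i(F_n) < R_{i+f_{n-2}}(F_n) for n odd, i ≠ f_{n-1}-1
  Proof:
  R_0(F_n) = F_{n-3} … F_1 u_{n-1} F_{n-2} = F_{n-3} … F_1 ab F_{n-2}
  R_{f_{n-2}}(F_n) = F_{n-3} … F_1 u_n F_{n-2} = F_{n-3} … F_1 ba F_{n-2}
  For i = f_{n-1}-1 the differing pair is split between the two ends, and the order is reversed:
  R_i(F_n) = b F_{n-2} F_{n-3} … F_1 a
  R_{i+f_{n-2}}(F_n) = a F_{n-2} F_{n-3} … F_1 b
• Similarly, R_i(F_n) > R_{i+f_{n-2}}(F_n) for n even, i ≠ f_{n-1}-1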
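The inequality and its single exceptional index can be checked by direct comparison. In this sketch the helper names are hypothetical, and the behaviour at the exceptional index for even n is an assumption (that it mirrors the odd case) which the assertions then test for 3 ≤ n ≤ 11.

```python
def fib_string(n):
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    prev, cur = "b", "a"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def rotation(x, i):
    """Return R_i(x), with i taken modulo len(x)."""
    i %= len(x)
    return x[i:] + x[:i]


def is_smaller_expected(n, i, fn1):
    """True if R_i(F_n) < R_{i+f_{n-2}}(F_n) is expected.

    Odd n: smaller except at i = f_{n-1} - 1. Even n: the reverse.
    """
    return (n % 2 == 1) != (i == fn1 - 1)


for n in range(3, 12):
    F = fib_string(n)
    fn1, fn2 = len(fib_string(n - 1)), len(fib_string(n - 2))
    for i in range(len(F)):
        smaller = rotation(F, i) < rotation(F, i + fn2)
        assert smaller == is_smaller_expected(n, i, fn1)

print("order relation holds for n = 3..11")
```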
Sorted List of Rotations
• We proved (n odd): R_i(F_n) < R_{i+f_{n-2}}(F_n), i ≠ f_{n-1}-1   (3)
• We will now prove that there is no j s.t. R_i(F_n) < R_j(F_n) < R_{i+f_{n-2}}(F_n)
  Proof: (constructive)
  • Start at i = f_n-1 and construct the partial list
    R_i < R_{i+f_{n-2}} < R_{i+2f_{n-2}} < R_{i+3f_{n-2}} < … < R_{i+kf_{n-2}} < …
  • for as long as i + k f_{n-2} ≢ f_{n-1}-1 (mod f_n), i.e. for k < f_n-1
  • I.e. the list contains all f_n rotations: the list is complete!
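The constructive step can be replayed in code: start at index f_n-1, keep adding f_{n-2} modulo f_n, and stop once the exceptional index f_{n-1}-1 has been reached. This is a sketch for odd n only; `chain` is a hypothetical name, and the result is compared against a plain sort of all rotations.

```python
def fib_string(n):
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    prev, cur = "b", "a"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def chain(n):
    """Rotation indices f_n-1, f_n-1+f_{n-2}, ... (mod f_n), for odd n >= 3."""
    fn = len(fib_string(n))
    fn1 = len(fib_string(n - 1))
    fn2 = len(fib_string(n - 2))
    current = fn - 1
    indices = [current]
    # Relation (3) lets us extend the list unless we sit at f_{n-1} - 1.
    while current != fn1 - 1:
        current = (current + fn2) % fn
        indices.append(current)
    return indices


for n in (3, 5, 7, 9, 11):
    F = fib_string(n)
    brute = sorted(range(len(F)), key=lambda i: F[i:] + F[:i])
    assert chain(n) == brute  # the chain is the full sorted list

print(chain(5))  # [7, 2, 5, 0, 3, 6, 1, 4]
```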
Identify Rotation (i) by Rank (ρ)
• Therefore, for n odd:
  rot(ρ,F_n) = (f_n - 1 + ρ f_{n-2}) mod f_n
             = (ρ f_{n-2} - 1) mod f_n
• Similarly, for n even, the sorted list is constructed bottom-up, giving
  rot(ρ,F_n) = (-(ρ+1) f_{n-2} - 1) mod f_n
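A sketch of the closed form, checked against brute-force sorting for small n. Ranks are assumed to start at 0; `rot_closed` and `fib_numbers` are hypothetical names, and the table `f` stands for the precomputed f_n values.

```python
def fib_string(n):
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    prev, cur = "b", "a"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def fib_numbers(limit):
    """Return a list f with f[k] = f_k for k = 0..limit (f_0 = f_1 = 1)."""
    f = [1, 1]
    while len(f) <= limit:
        f.append(f[-1] + f[-2])
    return f


def rot_closed(rho, n, f):
    """Index of the rotation of F_n with 0-based rank rho (n >= 2)."""
    if n % 2 == 1:
        return (rho * f[n - 2] - 1) % f[n]
    return (-(rho + 1) * f[n - 2] - 1) % f[n]


f = fib_numbers(12)
for n in range(2, 13):
    F = fib_string(n)
    brute = sorted(range(len(F)), key=lambda i: F[i:] + F[:i])
    assert brute == [rot_closed(rho, n, f) for rho in range(f[n])]

print([rot_closed(rho, 5, f) for rho in range(8)])  # [7, 2, 5, 0, 3, 6, 1, 4]
```

Python's `%` returns a non-negative result for a positive modulus, so the negative intermediate values in the even case need no extra handling.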
Identify Rank (ρ) of a Rotation (i)
• This is simply the inverse of the previous function
• n odd: rank(i,F_n) = ((i+1) f_{n-2}) mod f_n
• n even: rank(i,F_n) = ((i+1) f_{n-2} - 1) mod f_n
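The inverse direction can be sketched the same way. `rank_closed` is a hypothetical name, ranks are assumed 0-based, and the check below compares against a plain sort for 2 ≤ n ≤ 12.

```python
def fib_string(n):
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    prev, cur = "b", "a"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def fib_numbers(limit):
    """Return a list f with f[k] = f_k for k = 0..limit (f_0 = f_1 = 1)."""
    f = [1, 1]
    while len(f) <= limit:
        f.append(f[-1] + f[-2])
    return f


def rank_closed(i, n, f):
    """0-based rank of R_i(F_n) among the sorted rotations (n >= 2)."""
    if n % 2 == 1:
        return ((i + 1) * f[n - 2]) % f[n]
    return ((i + 1) * f[n - 2] - 1) % f[n]


f = fib_numbers(12)
for n in range(2, 13):
    F = fib_string(n)
    brute = sorted(range(len(F)), key=lambda i: F[i:] + F[:i])
    for rho, i in enumerate(brute):
        assert rank_closed(i, n, f) == rho

print([rank_closed(i, 5, f) for i in range(8)])  # [3, 6, 1, 4, 7, 2, 5, 0]
```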
Symbols of Fibonacci Strings
• F_n[i] = ?
• Observe that F_n[i] = R_i(F_n)[0]
• In the sorted list of rotations, the first f_{n-1} rotations start with ‘a’, the rest with ‘b’
• Thus F_n[i] can be deduced from rank(i,F_n):
  if rank(i,F_n) < f_{n-1} then F_n[i] = a, else F_n[i] = b (ranks counted from 0).
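Put together, the symbol lookup needs only a few arithmetic operations once the f_n values are available. The sketch below assumes 0-based ranks and a precomputed table `f`; `fib_symbol` is a hypothetical name, and the loop rebuilds each F_n symbol by symbol to compare with the recursive definition.

```python
def fib_string(n):
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    prev, cur = "b", "a"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def fib_numbers(limit):
    """Return a list f with f[k] = f_k for k = 0..limit (f_0 = f_1 = 1)."""
    f = [1, 1]
    while len(f) <= limit:
        f.append(f[-1] + f[-2])
    return f


def fib_symbol(i, n, f):
    """Return F_n[i] from the rank of R_i(F_n), without building F_n (n >= 2)."""
    rank = ((i + 1) * f[n - 2]) % f[n]
    if n % 2 == 0:
        rank = (rank - 1) % f[n]
    # The first f_{n-1} sorted rotations start with 'a', the rest with 'b'.
    return "a" if rank < f[n - 1] else "b"


f = fib_numbers(15)
for n in range(2, 16):
    rebuilt = "".join(fib_symbol(i, n, f) for i in range(f[n]))
    assert rebuilt == fib_string(n)

print("".join(fib_symbol(i, 5, f) for i in range(8)))  # abaababa
```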
BWT & Fibonacci: The Quick Way
• The first f_{n-2} symbols of BWT are ‘b’
  Proof: (n odd)
  • We proved the first f_{n-2} rotations have index (ρ·f_{n-2} - 1) mod f_n for 0 ≤ ρ < f_{n-2}
  • The last symbol of these rotations is F_n[(ρ·f_{n-2} - 1 + f_n - 1) mod f_n]
  • which for 0 ≤ ρ < f_{n-2} is ‘b’
• The next f_{n-1} symbols of BWT are ‘a’
  Proof: Consequence of previous lemma
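The consequence is that the BWT of F_n can be written down without sorting anything. In the sketch below, `bwt_fibonacci` is a hypothetical name; the rank of the 0-th rotation is not stated on this slide and is taken from the rank formula at i = 0 (f_{n-2} for odd n, f_{n-2}-1 for even n), with ranks assumed 0-based. The result is checked against the naive transform for 2 ≤ n ≤ 12.

```python
def fib_string(n):
    """Return F_n, with F_0 = 'b', F_1 = 'a' and F_n = F_{n-1} F_{n-2}."""
    prev, cur = "b", "a"
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, cur + prev
    return cur


def bwt_naive(x):
    """Sort all rotations; return (last column, 0-based rank of rotation 0)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i:] + x[:i])
    return "".join(x[(i - 1) % n] for i in order), order.index(0)


def bwt_fibonacci(n):
    """BWT of F_n (n >= 2) without sorting: f_{n-2} b's, then f_{n-1} a's."""
    fn1, fn2 = len(fib_string(n - 1)), len(fib_string(n - 2))
    last_column = "b" * fn2 + "a" * fn1
    # rank(0, F_n) from the closed form, evaluated at i = 0.
    index = fn2 if n % 2 == 1 else fn2 - 1
    return last_column, index


for n in range(2, 13):
    assert bwt_fibonacci(n) == bwt_naive(fib_string(n))

print(bwt_fibonacci(5))  # ('bbbaaaaa', 3)
```

For brevity the lengths f_{n-1} and f_{n-2} are obtained here by building the shorter strings; with a table of Fibonacci numbers no string needs to be built at all.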