Sorting and searching in the presence of memory faults (without redundancy)
Irene Finocchi, Giuseppe F. Italiano
DISP, University of Rome “Tor Vergata”
{finocchi,italiano}@disp.uniroma2.it
The problem
• Large, inexpensive and error-prone memories
• Classical algorithms may not be correct in the presence of (even very few) memory faults
An example: merging two ordered lists
[Figure: two ordered lists A and B of Θ(n) keys each are merged into Out; a single corrupted key (80) in A causes Θ(n^2) inversions in the output]
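The merging example can be reproduced in a few lines. The sketch below is only an illustration: the list sizes, the position of the corrupted key, and the helper names `merge` and `count_inversions` are assumptions, not taken from the slides. A single fault placed at the head of A makes a classical merge emit all of B first.

```python
def merge(a, b):
    """Classical two-way merge; correct only if both inputs are sorted."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def count_inversions(seq):
    """Quadratic-time inversion count, good enough for a small demo."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


n = 10
a = list(range(1, n + 1))          # 1..10
b = list(range(n + 1, 2 * n + 1))  # 11..20
a[0] = 80                          # assumed fault: the first key of A is corrupted

out = merge(a, b)
print(out)
print(count_inversions(out))       # n*n - 1 inversions caused by one fault
```

With these inputs every key of B, and then the corrupted key, precedes the n - 1 uncorrupted keys of A, which is where the quadratic number of inversions comes from.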
Faulty-memory model
• Memory fault = the correct value stored in a memory location gets altered (destructive faults)
• Fault appearance: at any time, at any memory location, simultaneously
• Faulty Random Access Machine:
  • O(1) words of reliable memory
  • Corrupted values indistinguishable from correct ones
• Fault-tolerant algorithms = able to get a correct output (at least) on the set of uncorrupted values
Related work
The liar model: comparison questions answered by a possibly lying adversary [Ulam 77, Renyi 76]
• At most k lies:
  • Ω(n log n + k n) [Lakshmanan et al., IEEE TOC 91]
  • O(n log n) for k = O(log n / log log n) [Ravikumar, COCOON 02]
• Probabilistic model: Θ(n log (n/q)), correct with probability (1-q) [Feige et al., SICOMP 94]
• Linearly bounded model: exponential lower bound [Borgstrom & Kosaraju, STOC 93]
Lies ≈ transient failures: algorithms can exploit query replication strategies
Why not data replication?
Data replication can be quite inefficient in certain highly dynamic scenarios, especially if objects to be replicated are large and complex
What can we do without data replication?
E.g., with respect to sorting:
• Q1. Can we sort the correct values in the presence of, e.g., polynomially many memory faults?
• Q2. How many faults can we tolerate in the worst case if we wish to maintain optimal time and space?
A fault tolerant algorithm
Q1. Can we sort (at least) the correct values in O(n log n) time and optimal space in the presence of, e.g., polynomially many memory faults?
We show an algorithm resilient up to O((n log n)^(1/3)) memory faults
• Based on mergesort
• Main difficulty: merging step
A hierarchy of disorder
• ordered
• faithfully ordered = ordered except for the corrupted keys (strongly fault tolerant)
  Example: 1 80 2 3 4 5 6 7 8 9 10
• k-unordered = ordered except for k (correct or corrupted) keys (k-weakly fault tolerant)
  Example (3-unordered): 1 80 2 3 4 9 5 7 8 6 10
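The two definitions can be checked mechanically on the example sequences. The following is a minimal sketch: the function names are hypothetical, and it assumes an omniscient view in which the set of corrupted positions is known, which a real algorithm in this model does not have. The smallest k for which a sequence is k-unordered equals its length minus the length of a longest non-decreasing subsequence.

```python
from bisect import bisect_right


def is_faithfully_ordered(seq, corrupted):
    """True if the keys at non-corrupted positions are non-decreasing.

    `corrupted` is a set of positions; knowing it is an assumption of this
    demo, since corrupted values are indistinguishable from correct ones.
    """
    kept = [x for i, x in enumerate(seq) if i not in corrupted]
    return all(x <= y for x, y in zip(kept, kept[1:]))


def min_unordered_k(seq):
    """Smallest k such that seq is k-unordered.

    Computed as len(seq) minus the longest non-decreasing subsequence,
    found with the standard patience-sorting technique.
    """
    tails = []  # tails[i] = smallest possible last key of a subsequence of length i + 1
    for x in seq:
        pos = bisect_right(tails, x)
        if pos == len(tails):
            tails.append(x)
        else:
            tails[pos] = x
    return len(seq) - len(tails)


print(is_faithfully_ordered([1, 80, 2, 3, 4, 5, 6, 7, 8, 9, 10], {1}))
print(min_unordered_k([1, 80, 2, 3, 4, 9, 5, 7, 8, 6, 10]))
```

On the second example the keys 80, 9 and 6 are the ones to drop, matching the "3-unordered" label.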
Solving a relaxation
Hierarchy: ordered, faithfully ordered, k-unordered
Idea of our merging algorithm: solve the k-weakly FT merging problem (for k not too large) and use it to reconstruct a faithful order
The merging algorithm: a big picture
• Input: A and B
• k-weaklyFT-merge (very fast): merges A and B into C, which is k-unordered, but k is not so large
• purify (very fast): splits C into S (faithfully ordered, long) and D (disordered, short)
• naïf-mergesort (slow, but D is short...): sorts D into E (faithfully ordered, short)
• stronglyFT-merge (slow in general, fast on unbalanced sequences): merges S and E into F (faithfully ordered)
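To make the role of the purify step concrete, here is a deliberately simplified sketch. It is not the authors' procedure: it assumes no fault occurs while it runs, and the name `purify_sketch` is hypothetical. It scans C with a stack and, whenever the next key is smaller than the top of the stack, discards both keys. Each discarded pair is an inversion, so on a k-unordered input it contains at least one of the k out-of-place keys, and at most 2k keys end up in D.

```python
def purify_sketch(c):
    """Split c into (s, d): s is non-decreasing, d holds the discarded keys.

    Simplified, fault-free illustration only; a resilient version must also
    cope with keys being corrupted during the scan.
    """
    s, d = [], []
    for x in c:
        if s and x < s[-1]:
            # Inverted pair: at least one of the two keys is out of place,
            # so both are moved to the short, disordered sequence.
            d.append(s.pop())
            d.append(x)
        else:
            s.append(x)
    return s, d


s, d = purify_sketch([1, 80, 2, 3, 4, 9, 5, 7, 8, 6, 10])
print(s)  # long, ordered part
print(d)  # short, disordered part
```

The sketch also shows why the remaining steps are needed: correct keys such as 2 or 5 can land in D, so D still has to be sorted and merged back into S.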
The merging algorithm: a big picture
• k-weaklyFT-merge, O(n): merges A and B into C, which is O(αδ)-unordered, α ≤ δ
• purify, O(n + αδ^2): splits C into S (faithfully ordered, long) and D, with |D| = O(αδ)
• naïf-mergesort, O(αδ^2): sorts D into E (faithfully ordered, short)
• stronglyFT-merge, O(n + αδ^2): merges S and E into F (faithfully ordered)
• Running time O(n + αδ^2), strongly fault tolerant
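The overall running time is the sum of the four stage costs listed on the slide. Written out, with α and δ as used there:

```latex
\begin{align*}
T_{\mathrm{merge}}
  &= \underbrace{O(n)}_{\text{k-weaklyFT-merge}}
   + \underbrace{O(n + \alpha\delta^{2})}_{\text{purify}}
   + \underbrace{O(\alpha\delta^{2})}_{\text{na\"if-mergesort on } D}
   + \underbrace{O(n + \alpha\delta^{2})}_{\text{stronglyFT-merge}} \\
  &= O(n + \alpha\delta^{2}) \;\subseteq\; O(n + \delta^{3})
  \qquad \text{since } \alpha \le \delta .
\end{align*}
```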
Summing up
By plugging the merging algorithm into mergesort, we can sort in time O(n log n + δ^3) and thus:
We obtain an O(n log n) strongly fault tolerant sorting algorithm that is resilient up to O((n log n)^(1/3)) memory faults and uses O(n) space
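The resilience bound follows from asking when the additive fault term stays within the optimal sorting time. With δ denoting the bound on the number of memory faults:

```latex
\[
O(n \log n + \delta^{3}) = O(n \log n)
\iff \delta^{3} = O(n \log n)
\iff \delta = O\!\left((n \log n)^{1/3}\right).
\]
```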
A (polynomial) lower bound
Q2. How many faults can we tolerate in the worst case maintaining space and running time optimal?
No more than O((n log n)^(1/2))
To prove this, we first prove a lower bound on fault tolerant merging:
• If δ ≤ n^(2/(3-2ε)), for some ε ∈ [0,1/2], then Ω(n + δ^(2-ε)) comparisons are necessary for merging two faithfully ordered n-length lists
We use an adversary-based argument
Adversary-based argument: big picture
• Paul (the merging algorithm) asks comparison questions of the form “x<y?”
• Carole (the adversary, Delphi’s Oracle) must answer consistently
• If Paul asks fewer than δ^(2-ε) / 2 questions, then he cannot determine the correct faithful order unambiguously
• Carole’s power: she doesn’t need to choose the sequences in advance and can play with memory faults
• Carole’s limits: if challenged at any time, she must exhibit two input sequences and at most δ memory faults that prove that her answers were consistent with the memory image
Carole’s strategy
A and B: n-length faithfully ordered sequences, each split into δ consecutive blocks of n/δ keys
• A = A_1 A_2 ... A_(δ-1) A_δ
• B = B_1 B_2 ... B_(δ-1) B_δ
Carole answers as if the sorted sequence were: A_1 B_1 A_2 B_2 … B_(δ-1) A_δ B_δ
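Carole's answering rule can be sketched as a comparison oracle. The code below is an illustration under stated assumptions: keys are identified by a hypothetical (sequence name, 0-based index) pair, δ is assumed to divide n, and only the answering rule is modelled, not the bookkeeping of memory faults.

```python
def carole_rank(seq_name, index, n, delta):
    """Position of a key in the order A_1 B_1 A_2 B_2 ... A_delta B_delta.

    Assumes delta divides n, so every block holds n // delta keys.
    """
    block_size = n // delta
    block, offset = divmod(index, block_size)
    # Within the same block number, every key of A precedes every key of B.
    return (block, 0 if seq_name == "A" else 1, offset)


def carole_answers(x, y, n, delta):
    """Answer to Paul's question "x < y?" for keys x and y."""
    return carole_rank(*x, n, delta) < carole_rank(*y, n, delta)


n, delta = 12, 3  # three blocks of four keys per sequence
print(carole_answers(("A", 0), ("B", 0), n, delta))  # A_1 vs B_1
print(carole_answers(("B", 3), ("A", 4), n, delta))  # B_1 vs A_2
print(carole_answers(("A", 4), ("B", 3), n, delta))  # A_2 vs B_1
```

Answering from a fixed interleaving keeps all answers mutually consistent, which is what lets Carole later exhibit inputs and faults that justify them.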
Sparse sets
A set S of consecutive subsequences is sparse if the number of comparisons between elements in S is at most δ/2
[Figure: the interleaved order A_1 B_1 A_2 B_2 A_3 B_3 … B_(δ-1) A_δ B_δ, with a set S of consecutive subsequences highlighted]
If δ ≤ n^(2/(3-2ε)), for some ε ∈ [0,1/2], and Paul asks fewer than δ^(2-ε) / 2 questions, then there exists a sparse set S containing two elements a and b, with a in a block A_i of S and b in a block B_j of S, that have not been directly compared by Paul
We prove that both the order a<b and the order b<a can be consistent with Carole’s answers
Paul’s dilemma: a<b or b<a?
a and b are not directly compared, but why can’t Paul deduce their order by transitivity?
[Figure: chains of ≤ comparisons relating a and b to other keys x and y]
• “x<y?” is a possibly “useful” comparison if both x and y ∈ S
• For each such question asked by Paul, Carole asserts that she answered after corrupting x (if x ≠ a,b) and y (if y ≠ a,b)
• All possibly useful elements are corrupted!
• How many memory faults are introduced by Carole?
  • 1 or 2 faults, only if x and y are both in S
  • S is sparse (at most δ/2 comparisons)
  • So at most 2(δ/2) = δ faults
Implications
• If δ ≤ n^(2/(3-2ε)), for some ε ∈ [0,1/2], then Paul must ask at least δ^(2-ε) / 2 questions to merge A and B
• If δ ≤ n^(2/(3-2ε)): Ω(n + δ^(2-ε)) comparisons for merging, hence Ω(n log n + δ^(2-ε)) comparisons for sorting
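What if δ is large?
If δ ≤ n^(2/(3-2ε)), Ω(n log n + δ^(2-ε)) comparisons for sorting:
• ε = 0: δ ≤ n^(2/3), Ω(n log n + δ^2)
• ε = 1/6: δ ≤ n^(3/4), Ω(n log n + δ^(11/6))
• ε = 1/2: δ ≤ n, Ω(n log n + δ^(3/2))
[Figure: axis for δ from 0 to n, marked at (n log n)^(1/2), (n log n)^(6/11), n^(2/3), (n log n)^(2/3), n^(3/4) and n]
An O(n log n) strongly fault tolerant sorting algorithm cannot be resilient to ω((n log n)^(1/2)) memory faults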
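The exponents that appear on this slide all come from the single parametrized bound, so they can be recomputed with exact rational arithmetic. In the sketch below, the function name and the choice of sample values for ε are assumptions made for illustration.

```python
from fractions import Fraction


def lower_bound_exponents(eps):
    """Exponents derived from the bound for a given eps in [0, 1/2].

    Returns (a, b, c) where the bound applies for delta <= n^a, it reads
    Omega(n log n + delta^b), and delta^b outgrows n log n once delta
    exceeds (n log n)^c.
    """
    eps = Fraction(eps)
    max_delta_exp = 2 / (3 - 2 * eps)
    comparisons_exp = 2 - eps
    threshold_exp = 1 / comparisons_exp
    return max_delta_exp, comparisons_exp, threshold_exp


for eps in (Fraction(0), Fraction(1, 6), Fraction(1, 2)):
    print(eps, *lower_bound_exponents(eps))
```

For ε = 0, 1/6 and 1/2 this yields the pairs of marks shown on the axis: (n log n)^(1/2) with n^(2/3), (n log n)^(6/11) with n^(3/4), and (n log n)^(2/3) with n.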
Open questions
• Closing the gap:
  • our algorithm is resilient to O((n log n)^(1/3)) memory faults
  • no optimal algorithm can be resilient to ω((n log n)^(1/2)) memory faults
• Can randomization help (e.g., to tolerate more than (n log n)^(1/2) memory faults)?
• External fault-tolerant sorting