Chapter 11
Topics
• Merge-sort
• Quick-sort
• A lower bound on sorting
• Bucket-sort and Radix-sort
• Sets and Unions
• Selection
Merge Sort (Figure: merge-sort tree for the sequence 7 2 9 4)
Divide-and-Conquer • Divide-and-conquer is a general algorithm design paradigm: • Divide: if the input size is smaller than a certain threshold, solve the problem directly using a straightforward method and return the solution obtained; otherwise divide the input into two or more disjoint subsets • Recur: recursively solve the sub-problems associated with the subsets • Conquer: take the solutions to the sub-problems and merge them into a solution to the original problem
Merge-Sort • Merge-sort is a sorting algorithm based on the divide-and-conquer paradigm • Like heap-sort • It uses a comparator • It has O(n log n) running time • Unlike heap-sort • It does not use an auxiliary priority queue • It accesses data in a sequential manner (suitable for sorting data stored on a disk)
Using Divide-and-Conquer for Sorting • A merge-sort on an input sequence S with n elements using a comparator consists of three steps based on the divide-and-conquer paradigm: • Divide: partition S into two sequences S1 and S2 of about n/2 elements each • S1 contains the first ⌈n/2⌉ elements and S2 contains the remaining ⌊n/2⌋ elements • Recur: recursively sort S1 and S2 • Conquer: merge S1 and S2 into a sorted sequence
Merge-Sort Algorithm
Algorithm mergeSort(S, C)
  Input: sequence S with n elements, comparator C
  Output: sequence S sorted according to C
  if S.size() > 1
    (S1, S2) ← partition(S, n/2)
    mergeSort(S1, C)
    mergeSort(S2, C)
    S ← merge(S1, S2)
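As a concrete illustration, here is a minimal C++ sketch of the mergeSort pseudocode (illustrative, not the book's code); it uses std::list as the doubly linked sequence the slides assume, and std::list::merge for the conquer step:

#include <functional>
#include <iostream>
#include <iterator>
#include <list>

// Minimal sketch of the mergeSort pseudocode above (illustrative names).
template <typename T, typename Comp = std::less<T>>
void mergeSortList(std::list<T>& S, Comp less = Comp{}) {
    if (S.size() <= 1) return;                     // base case: already sorted
    // Divide: move the second half of S into S2; S keeps the first half (S1)
    std::list<T> S2;
    auto mid = std::next(S.begin(), S.size() / 2);
    S2.splice(S2.begin(), S, mid, S.end());
    // Recur: sort the two halves
    mergeSortList(S, less);
    mergeSortList(S2, less);
    // Conquer: merge the sorted halves back into S (O(n))
    S.merge(S2, less);
}

int main() {
    std::list<int> s{7, 2, 9, 4};
    mergeSortList(s);
    for (int x : s) std::cout << x << ' ';         // prints: 2 4 7 9
}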
Merging Two Sorted Sequences • The conquer step of merge-sort consists of merging two sorted sequences A and B into a sorted sequence S containing the union of the elements of A and B • Merging two sorted sequences, each with n/2 elements and implemented by means of a doubly linked list, takes O(n) time
Merging Two Sorted Sequences
Algorithm merge(A, B)
  Input: sequences A and B with n/2 elements each
  Output: sorted sequence of A ∪ B
  S ← empty sequence
  while ¬A.empty() ∧ ¬B.empty()
    if A.front() < B.front()
      S.addBack(A.front()); A.eraseFront();
    else
      S.addBack(B.front()); B.eraseFront();
  while ¬A.empty()
    S.addBack(A.front()); A.eraseFront();
  while ¬B.empty()
    S.addBack(B.front()); B.eraseFront();
  return S
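A direct C++ transcription of the merge pseudocode, again as a sketch with assumed names (mergeSorted is not from the slides); each step moves one front element in O(1) time on a doubly linked list, so merging n elements takes O(n):

#include <iostream>
#include <list>

// Repeatedly move the smaller front element of A or B to the back of S,
// then drain whichever list is still non-empty.
template <typename T>
std::list<T> mergeSorted(std::list<T> A, std::list<T> B) {
    std::list<T> S;
    while (!A.empty() && !B.empty()) {
        if (A.front() < B.front()) { S.push_back(A.front()); A.pop_front(); }
        else                       { S.push_back(B.front()); B.pop_front(); }
    }
    while (!A.empty()) { S.push_back(A.front()); A.pop_front(); }
    while (!B.empty()) { S.push_back(B.front()); B.pop_front(); }
    return S;
}

int main() {
    std::list<int> a{2, 7}, b{4, 9};
    for (int x : mergeSorted(a, b)) std::cout << x << ' ';  // prints: 2 4 7 9
}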
Merge-Sort Tree • An execution of merge-sort is depicted by a binary tree T • Each node of T represents a recursive call of the merge-sort algorithm and stores • the unsorted sequence before the execution of the call • the sorted sequence at the end of the execution • The children of a node v are associated with the recursive calls that process the subsequences S1 and S2 • The external nodes are associated with the individual elements of S • The root is the initial call (Figure: merge-sort tree for the sequence 7 2 9 4)
Execution Example: merge-sort of the sequence 7 2 9 4 3 8 6 1 (tree figures omitted; the final output is 1 2 3 4 6 7 8 9)
(1) Partition
(2) Recursive call, partition
(3) Recursive call, partition
(4) Recursive call, base case
(5) Recursive call, base case
(6) Merge
(7) Recursive call, …, base case, merge
(8) Merge
(9) Recursive call, …, merge, merge
(10) Merge
Analysis of Merge-Sort • The height h of the merge-sort tree is O(log n) • at each recursive call we divide the sequence in half • The overall amount of work done at the nodes of depth i is O(n) • we partition and merge 2^i sequences of size n/2^i • we make 2^(i+1) recursive calls • The total running time of merge-sort is O(n log n)
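The same bound follows from the standard merge-sort recurrence; the sketch below uses constants b and c (not named in the slides) for the base case and for the linear divide-and-merge work at each call:

T(n) = \begin{cases} b & \text{if } n \le 1 \\ 2\,T(n/2) + c\,n & \text{otherwise} \end{cases}

Unrolling i levels gives T(n) = 2^i\,T(n/2^i) + i\,c\,n; the recursion bottoms out when n/2^i = 1, i.e. at i = \log_2 n, so T(n) = b\,n + c\,n\log_2 n = O(n \log n).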
Summary of Sorting Algorithms (comparison table omitted; see Chapter 8)
Quick-Sort (Figure: quick-sort tree for the sequence 7 4 9 6 2)
Quick-Sort • Quick-sort is based on the divide-and-conquer paradigm, where most of the work is done before the recursive calls • Divide: pick a random element x (called the pivot; in practice it is common to choose the last element of S) and partition S into • L, storing the elements less than x • E, storing the elements equal to x • G, storing the elements greater than x • Recur: recursively sort L and G • Conquer: put the elements back into S by concatenating the elements of L, followed by the elements of E, followed by the elements of G
Quick-Sort (Figure: S is split around the pivot x into L, E, and G, then recombined)
Partition • We partition an input sequence as follows: • we remove, in turn, each element y from S and • we insert y into L, E, or G, depending on the result of the comparison with the pivot x • Each insertion and removal is at the beginning or at the end of a sequence, and hence takes O(1) time • Thus, the partition step of quick-sort takes O(n) time
Partition
Algorithm partition(S, p)
  Input: sequence S, position p of the pivot
  Output: subsequences L, E, G of the elements of S less than, equal to, or greater than the pivot, resp.
  L, E, G ← empty sequences
  x ← S.erase(p)
  while ¬S.empty()
    y ← S.eraseFront()
    if y < x
      L.insertBack(y)
    else if y = x
      E.insertBack(y)
    else { y > x }
      G.insertBack(y)
  return L, E, G
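The following C++ sketch (assumed names, not the book's code) follows the same scheme on an std::list: the pivot is taken as the last element, each remaining element is moved into L, E, or G in O(1) time, and the sorted pieces are concatenated with splice:

#include <iostream>
#include <list>

template <typename T>
void quickSortList(std::list<T>& S) {
    if (S.size() <= 1) return;                    // base case
    T x = S.back();                               // pivot: last element of S
    S.pop_back();
    std::list<T> L, E, G;
    E.push_back(x);
    while (!S.empty()) {                          // partition step, O(n)
        T y = S.front();
        S.pop_front();
        if (y < x)      L.push_back(y);
        else if (x < y) G.push_back(y);
        else            E.push_back(y);
    }
    quickSortList(L);                             // recur on L and G
    quickSortList(G);
    S.splice(S.end(), L);                         // conquer: concatenate L, E, G
    S.splice(S.end(), E);
    S.splice(S.end(), G);
}

int main() {
    std::list<int> s{7, 4, 9, 6, 2};
    quickSortList(s);
    for (int x : s) std::cout << x << ' ';        // prints: 2 4 6 7 9
}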
Quick-Sort Tree • An execution of quick-sort is depicted by a binary tree • Each node represents a recursive call of quick-sort and stores • the unsorted sequence before the execution and its pivot • the sorted sequence at the end of the execution • The root is the initial call (Figure: quick-sort tree for the sequence 7 4 9 6 2)
Execution Example: quick-sort of the sequence 7 2 9 4 3 7 6 1 (tree figures omitted; the final output is 1 2 3 4 6 7 7 9)
(1) Pivot selection
(2) Partition, recursive call, pivot selection
(3) Partition, recursive call, base case
(4) Recursive call, …, base case, join
(5) Recursive call, pivot selection
(6) Partition, …, recursive call, base case
(7) Join, join
Worst-case Running Time • The worst case for quick-sort occurs when the pivot is the unique minimum or maximum element • One of L and G has size n - 1 and the other has size 0 • The running time is proportional to the sum n + (n - 1) + … + 2 + 1 • Thus, the worst-case running time of quick-sort is O(n^2)
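As a quick check on the worst-case bound, the series above is an arithmetic sum:

n + (n - 1) + \cdots + 2 + 1 = \sum_{i=1}^{n} i = \frac{n(n+1)}{2} = O(n^2)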
Expected Running Time • Consider a recursive call of quick-sort on a sequence of size s • Good call: the sizes of L and G are each less than 3s/4 • Bad call: one of L and G has size greater than 3s/4 • A call is good with probability 1/2, since 1/2 of the possible pivots cause good calls (Figure: of the 16 possible pivot positions 1 to 16, the middle half are good pivots and the first and last quarters are bad pivots; a good call of 7 2 9 4 3 7 6 1 9 partitions it into 2 4 3 1 and 7 9 7 1 1, while a bad call leaves almost all elements on one side)
In-Place Quick-Sort • Quick-sort can be implemented to run in-place • In the partition step, we use replace operations to rearrange the elements of the input sequence such that • The elements less than the pivot have rank less than h • The elements equal to the pivot have rank between h and k • The elements greater than the pivot have rank greater than k • The recursive calls consider • elements with rank less than h • elements with rank greater than k
In-Place Quick-Sort Algorithm
Algorithm inPlaceQuickSort(S, l, r)
  Input: sequence S, ranks l and r
  Output: sequence S with the elements of rank between l and r rearranged in increasing order
  if l ≥ r
    return
  i ← a random integer between l and r
  x ← S.elemAtRank(i)
  (h, k) ← inPlacePartition(x)
  inPlaceQuickSort(S, l, h - 1)
  inPlaceQuickSort(S, k + 1, r)
In-Place Partitioning • Perform the partition using two indices to split S into L and E ∪ G (a similar method can split E ∪ G into E and G) • Repeat until j and k cross: • scan j to the right until finding an element > x • scan k to the left until finding an element < x • swap the elements at indices j and k (Figure: the indices j and k scan the sequence 3 2 5 1 0 7 3 5 9 2 7 9 8 9 7 6 9 with pivot x = 6, swapping out-of-place elements as they go)
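A C++ sketch of in-place quick-sort in this two-index style (illustrative; it takes the last element as pivot instead of a random one and does not keep a separate E region for duplicates of the pivot):

#include <iostream>
#include <utility>
#include <vector>

// Sort S[a..b] in place. Two indices l and r scan toward each other and swap
// out-of-place elements; the pivot (last element of the range) is finally
// swapped into its correct position.
void inPlaceQuickSort(std::vector<int>& S, int a, int b) {
    if (a >= b) return;                       // 0 or 1 element: already sorted
    int p = S[b];                             // pivot value
    int l = a, r = b - 1;
    while (l <= r) {
        while (l <= r && S[l] <= p) ++l;      // scan right for an element > pivot
        while (r >= l && S[r] >= p) --r;      // scan left for an element < pivot
        if (l < r) std::swap(S[l], S[r]);     // put both on the correct side
    }
    std::swap(S[l], S[b]);                    // move pivot into its final place
    inPlaceQuickSort(S, a, l - 1);            // recur on the left part (<= pivot)
    inPlaceQuickSort(S, l + 1, b);            // recur on the right part (>= pivot)
}

int main() {
    std::vector<int> s{3, 2, 5, 1, 0, 7, 3, 5, 9, 2, 7, 9, 8, 9, 7, 6, 9};
    inPlaceQuickSort(s, 0, static_cast<int>(s.size()) - 1);
    for (int x : s) std::cout << x << ' ';
}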
Summary of Sorting Algorithms (comparison table omitted)
Comparison-Based Sorting • Many sorting algorithms are comparison-based • they sort by making comparisons between pairs of objects • examples: selection-sort, insertion-sort, heap-sort, merge-sort, quick-sort, … • One can derive a lower bound on the running time of any algorithm that uses comparisons to sort n elements x1, x2, …, xn (each comparison is a question of the form "is xi < xj?", answered yes or no)
Counting Comparisons • To derive the bound, one counts comparisons • Each internal node of the decision tree corresponds to a comparison • Each possible run of the algorithm corresponds to a root-to-external-node path in the decision tree
Decision Tree Height • The height of the decision tree is a lower bound on the running time • Every possible input permutation must lead to a separate external node • if not, some input …4…5… would have the same output ordering as …5…4…, which would be wrong • Since there are n! = 1·2·…·n leaves, the height is at least log(n!)
The Lower Bound • Any comparison-based sorting algorithm takes at least log(n!) time • Therefore, any such algorithm takes time at least log(n!) ≥ (n/2) log(n/2) • That is, any comparison-based sorting algorithm must run in Ω(n log n) time (asymptotically, one function is greater than or equal to the other)
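A short derivation of the quoted bound (standard material, not spelled out on the slide): the decision tree has at least n! external nodes, so its height is at least \log_2(n!), and keeping only the largest n/2 factors of n! gives

\log_2(n!) \ge \log_2\big((n/2)^{n/2}\big) = \frac{n}{2}\,\log_2\frac{n}{2} = \Omega(n \log n)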
Bucket-Sort and Radix-Sort (Figure: a bucket array B with cells 0 to 9; the entries (1, c), (3, a), (3, b), (7, d), (7, g), (7, e) sit in buckets 1, 3, and 7)
Bucket-Sort • Let S be a sequence of n (key, element) entries with keys in the range [0, N - 1] • Bucket-sort uses the keys as indices into a bucket array B that has cells indexed from 0 to N - 1 • it is not based on comparisons • Phase 1: each entry with key k is placed into the bucket B[k], which is a sequence of entries with key k • Phase 2: after all entries of S have been inserted into the buckets, the entries are moved back into S in sorted order by enumerating the contents of the buckets B[i], for i = 0, …, N - 1, in order
Bucket-Sort Algorithm
Algorithm bucketSort(S, N)
  Input: sequence S of (key, element) items with keys in the range [0, N - 1]
  Output: sequence S sorted by non-decreasing keys
  B ← array of N empty sequences
  for each entry e in S do
    k ← e.getKey()
    remove e from S and insert it at the end of the sequence B[k]
  for i ← 0 to N - 1 do
    for each entry e in sequence B[i] do
      remove e from B[i] and insert it at the end of S
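A C++ sketch of bucketSort for integer keys in [0, N - 1] (assumed names such as Entry are not from the slides); phase 1 scatters the entries into the buckets and phase 2 gathers them back in key order, for O(n + N) total time:

#include <iostream>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>;        // (key, element)

void bucketSort(std::vector<Entry>& S, int N) {
    std::vector<std::vector<Entry>> B(N);          // N empty buckets
    for (auto& e : S) B[e.first].push_back(e);     // Phase 1: scatter by key
    S.clear();
    for (int i = 0; i < N; ++i)                    // Phase 2: gather in key order
        for (auto& e : B[i]) S.push_back(e);
}

int main() {
    std::vector<Entry> s{{7, "d"}, {1, "c"}, {3, "a"}, {7, "g"}, {3, "b"}, {7, "e"}};
    bucketSort(s, 10);
    for (auto& e : s) std::cout << '(' << e.first << ',' << e.second << ") ";
    // prints: (1,c) (3,a) (3,b) (7,d) (7,g) (7,e)
}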