Heapsort Algorithm

Presentation Transcript


  1. Heapsort Algorithm CS 583 Analysis of Algorithms CS583 Fall'06: Heapsort

  2. Outline
     • Sorting Problem
     • Heaps
       • Definition
       • Maintaining heap property
       • Building a heap
     • Heapsort Algorithm

  3. Sorting Problem
     • Sorting is usually performed not on isolated values but on records.
     • Each record contains a key, which is the value to be sorted.
     • The remainder of the record is called satellite data.
     • When a sorting algorithm permutes the keys, it must permute the satellite data as well.
     • If the satellite data is large for each record, we often permute pointers to the records instead of the records themselves.
     • This level of detail is usually irrelevant in the study of algorithms, but it is important when converting an algorithm to a program.
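The point about permuting pointers rather than whole records can be illustrated with a small sketch. The record layout below (a `key` field plus a `payload` field standing in for satellite data) is a made-up example, and Python's built-in `sorted` is used only to show the idea, not any particular sorting algorithm.

```python
# Hypothetical records: "key" is the sort key, "payload" is the satellite data.
records = [
    {"key": 30, "payload": "record C"},
    {"key": 10, "payload": "record A"},
    {"key": 20, "payload": "record B"},
]

# Sort positions (acting as "pointers") by key; the records themselves do not move.
order = sorted(range(len(records)), key=lambda i: records[i]["key"])

# Follow the permuted positions to read the records in key order.
payloads_in_key_order = [records[i]["payload"] for i in order]
```

Only the small list `order` is rearranged, so the cost of moving data does not depend on how large each record is.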

  4. Sorting Problem: Importance
     • Sorting is arguably the most fundamental problem in the study of algorithms, for the following reasons:
       • The need to sort information is often a key part of an application. For example, sorting financial reports by security ID.
       • Algorithms often use sorting as a key subroutine. For example, in order to match a security against benchmarks, the set of benchmarks needs to be sorted by some key elements.
       • There is a wide variety of sorting algorithms, and they use a rich set of techniques.

  5. Heaps
     • The heapsort algorithm sorts in place and its running time is O(n lg n).
     • It combines the better attributes of insertion sort (sorting in place) and merge sort (O(n lg n) running time).
     • It is based on a data structure called a heap.
     • The (binary) heap data structure is an array object that can be viewed as a nearly complete binary tree.
     • An array A that represents a heap is an object with two attributes:
       • length[A], which is the number of elements in the array, and
       • heap-size[A], which is the number of elements in the heap stored within the array A, so heap-size[A] <= length[A].

  6. Heaps: Example
     • A = {10, 8, 6, 5, 7, 3, 2}

                10
              /    \
             8      6
            / \    / \
           5   7  3   2

     • The root of the tree is A[1]. The children of a node i are determined as follows:
       Left(i)
           return 2i
       Right(i)
           return 2i+1

  7. Heaps: Example (cont.)
     • The child formulas are proven by induction:
       • Base case: the root's left child is node 2 = 2*1.
       • Assume the formula holds for node n, so Left(n) = 2n and Right(n) = 2n+1.
       • The left child of node (n+1) immediately follows the right child of node n: Left(n+1) = (2n + 1) + 1 = 2(n+1).
     • The parent of a node i is calculated from i = 2p or i = 2p+1, where p is the parent node. Hence:
       Parent(i)
           return floor(i/2)
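The index arithmetic above can be tried out directly. The following is a minimal sketch, assuming the slides' 1-based indexing is emulated in Python by leaving index 0 as an unused placeholder; the lowercase function names are illustrative.

```python
# 1-based heap indexing as on the slides; index 0 is an unused placeholder.
def left(i: int) -> int:
    return 2 * i


def right(i: int) -> int:
    return 2 * i + 1


def parent(i: int) -> int:
    return i // 2  # integer division gives floor(i/2)


A = [None, 10, 8, 6, 5, 7, 3, 2]  # A[1..7] is the example heap

# Node 2 holds the value 8; its children are nodes 4 and 5.
children_of_2 = (A[left(2)], A[right(2)])
```

Both children map back to the same parent, which is exactly what the floor in Parent(i) provides: `parent(left(i)) == parent(right(i)) == i`.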

  8. Max-Heaps
     • In a max-heap, for every node i other than the root:
       • A[Parent(i)] >= A[i]
     • For the heapsort algorithm, we use max-heaps.
     • The height of the heap is defined to be the number of edges on the longest path from the root to a leaf, and it is Θ(lg n) since the heap is a nearly complete binary tree.
     • We will consider the following basic procedures on the heap:
       • Max-Heapify to maintain the max-heap property.
       • Build-Max-Heap to produce a max-heap from an unordered input array.
       • Heapsort to sort an array in place.
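As a quick illustration of the max-heap property and the height claim, here is a small sketch. It assumes the same 1-based layout with an unused slot at index 0; `is_max_heap` and `heap_height` are hypothetical helper names, not procedures from the slides.

```python
def is_max_heap(A, heap_size):
    """Check A[Parent(i)] >= A[i] for every non-root node i in A[1..heap_size]."""
    return all(A[i // 2] >= A[i] for i in range(2, heap_size + 1))


def heap_height(n):
    """Height of an n-element heap, floor(lg n), for n >= 1."""
    return n.bit_length() - 1


example = [None, 10, 8, 6, 5, 7, 3, 2]  # index 0 is a placeholder
example_is_heap = is_max_heap(example, 7)
example_height = heap_height(7)
```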

  9. Maintaining the Heap Property
     • The Max-Heapify procedure takes an array A and an index i into the array.
     • It is assumed that the subtrees rooted at Left(i) and Right(i) are already max-heaps, but A[i] may be smaller than its children.
     • The procedure lets the value of A[i] "float down" in the max-heap so that the subtree rooted at index i becomes a max-heap.

  10. Max-Heapify: Algorithm
      Max-Heapify(A, i)
          l = Left(i)
          r = Right(i)
          if l <= heap-size[A] and A[l] > A[i]
              largest = l
          else
              largest = i
          if r <= heap-size[A] and A[r] > A[largest]
              largest = r
          if largest != i
              <exchange A[i] with A[largest]>
              Max-Heapify(A, largest)
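The pseudocode translates almost line for line into Python. This is a sketch under two assumptions that are not in the slides: 1-based indexing is emulated with an unused slot at index 0, and the heap size is passed as an explicit `heap_size` argument instead of being stored as an attribute of the array.

```python
def max_heapify(A, i, heap_size):
    """Let A[i] float down so the subtree rooted at i becomes a max-heap.

    Assumes the subtrees rooted at 2i and 2i+1 are already max-heaps.
    """
    l, r = 2 * i, 2 * i + 1
    # Pick the largest of A[i] and its children that lie inside the heap.
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)


# The root (4) violates the max-heap property; both subtrees are max-heaps.
A = [None, 4, 10, 6, 5, 7, 3, 2]
max_heapify(A, 1, 7)
```

In this run the value 4 is exchanged first with 10 and then with 7, ending in a leaf position.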

  11. It takes (1) to find A[largest], plus the time to run the procedure recursively on at most 2n/3 elements. (This is the maximum size of a child tree. It occurs when the last row of the tree is exactly half full.) Assume there n nodes and x levels in the tree that has half of the last row. This means: n = 1 + 2 + ... + 2^(x-1) + 2^x/2 2^x – 1 + 2^x/2 = n 2^(x-1) = a => 2a + a = n+1 => 2^(x-1) = (n+1)/3 Max-Heapify: Analysis CS583 Fall'06: Heapsort

  12. Max-Heapify: Analysis (cont.)
      Max subtree size = (half of the non-root elements through level x-1) + (elements at the last level)
      = (2^x - 2)/2 + 2^x/2
      = 2^(x-1) - 1 + 2^(x-1)
      = (n+1)/3 + (n+1)/3 - 1
      = (2n - 1)/3 <= 2n/3
      Therefore the running time of Max-Heapify is described by the following recurrence:
      T(n) <= T(2n/3) + Θ(1)
      According to the master theorem (a = 1, b = 3/2, f(n) = Θ(1)), the solution is T(n) = Θ(lg n). Since T(n) describes the worst-case scenario, the running time of the algorithm is O(lg n).
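The 2n/3 bound on the size of a child subtree can be checked by brute force for small heaps. The sketch below is only a sanity check under the 1-based indexing of the slides; `subtree_size` is a hypothetical helper that counts nodes by following the Left/Right index formulas.

```python
def subtree_size(i, n):
    """Number of nodes in the subtree rooted at index i of an n-node heap."""
    if i > n:
        return 0
    return 1 + subtree_size(2 * i, n) + subtree_size(2 * i + 1, n)


# The left subtree of the root is never smaller than the right one,
# so it is the largest subtree Max-Heapify can recurse into.
# The comparison is done in integers: size <= 2n/3  <=>  3*size <= 2*n.
bound_holds = all(3 * subtree_size(2, n) <= 2 * n for n in range(2, 300))

# A heap with a half-full last row: 1 + 2 + 4 + 4 = 11 nodes.
left_size_for_11 = subtree_size(2, 11)
```

For n = 11 the left subtree holds 7 of the 11 nodes, which is as close to 2n/3 as an integer count can get.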

  13. Building a Heap
      • We can use the procedure Max-Heapify in a bottom-up manner to convert the whole array A[1..n] into a max-heap.
      • Note that the elements A[floor(n/2)+1..n] are all leaves. The last element that is not a leaf is the parent of the last node, that is, node floor(n/2).
      • The procedure Build-Max-Heap goes through all non-leaf nodes, from the last one up to the root, and runs Max-Heapify on each of them.

  14. Build-Max-Heap: Algorithm
      Build-Max-Heap(A, n)
      1  heap-size[A] = n
      2  for i = floor(n/2) downto 1
      3      Max-Heapify(A, i)
      • Invariant:
        • At the start of each iteration of the loop in lines 2-3, each node i+1, ..., n is the root of a max-heap.
      • Proof.
        • Initialization: i = floor(n/2). Each node in floor(n/2)+1, ..., n is a leaf and hence is the root of a trivial max-heap.

  15. Build-Max-Heap: Correctness
      • Maintenance: the children of node i are numbered higher than i, and by the loop invariant they are roots of max-heaps.
        • This is exactly the condition required by Max-Heapify.
        • Moreover, Max-Heapify preserves the property that nodes i+1, ..., n are roots of max-heaps.
        • Decrementing i by 1 reestablishes the loop invariant for the next iteration.
      • Termination: i = 0, hence each node 1, 2, ..., n is the root of a max-heap. In particular, node 1 is, so the whole array is a max-heap.
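The bottom-up construction and its loop invariant can be exercised with a short sketch. As before, the 1-based layout with an unused slot at index 0 and the explicit `heap_size` argument are assumptions of this example rather than part of the slides, and `is_max_heap` is a hypothetical checker.

```python
def max_heapify(A, i, heap_size):
    """Float A[i] down; the subtrees at 2i and 2i+1 must already be max-heaps."""
    l, r = 2 * i, 2 * i + 1
    largest = l if l <= heap_size and A[l] > A[i] else i
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)


def build_max_heap(A, n):
    """Turn A[1..n] into a max-heap by heapifying non-leaves bottom-up."""
    for i in range(n // 2, 0, -1):  # i = floor(n/2) downto 1
        max_heapify(A, i, n)


def is_max_heap(A, n):
    return all(A[i // 2] >= A[i] for i in range(2, n + 1))


A = [None, 2, 5, 3, 10, 7, 6, 8]  # unordered input in A[1..7]
build_max_heap(A, 7)
```

After the call, `is_max_heap(A, 7)` holds and `A[1]` is the maximum of the input.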

  16. Build-Max-Heap: Performance
      • Each call to Max-Heapify takes O(lg n) time and there are floor(n/2) = O(n) such calls.
      • Therefore the running time of Build-Max-Heap is O(n lg n).
      • To derive a tighter bound, we observe that the running time of Max-Heapify depends on the node's height.
      • An n-element heap has height floor(lg n). There are at most ceil(n/2^(h+1)) nodes of any height h. To see this for a complete tree, assume the nodes of height h are at depth x, so the tree has x + h + 1 levels. Then we have:

  17. Build-Max-Heap: Performance (cont.)
      1 + 2 + ... + 2^x + ... + 2^(x+h) = n
      2^(x+h+1) = n + 1
      2^x = (n+1)/2^(h+1) = ceil(n/2^(h+1))
      The time required by Max-Heapify when called on a node of height h is O(h). Hence the total cost is:
      Σ_{h=0}^{floor(lg n)} ceil(n/2^(h+1)) O(h) = O(n Σ_{h=0}^{floor(lg n)} h/2^h)
      A.8: for |x| < 1, Σ_{k=0}^{∞} k x^k = x/(1-x)^2
      With x = 1/2: Σ_{h=0}^{∞} h/2^h = (1/2) / (1 - 1/2)^2 = 2
      Thus, the running time of Build-Max-Heap can be bounded as
      O(n Σ_{h=0}^{floor(lg n)} h/2^h) = O(n Σ_{h=0}^{∞} h/2^h) = O(n)
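The two ingredients of this bound can be checked numerically. This is a rough sanity check rather than a proof: `heapify_work_bound` is a hypothetical helper that evaluates the sum of ceil(n/2^(h+1)) * h over all heights, and the constant 2 used in the comparison is simply a convenient linear bound for the tested range.

```python
def heapify_work_bound(n):
    """Sum over heights h = 0..floor(lg n) of ceil(n / 2^(h+1)) * h."""
    total = 0
    for h in range(n.bit_length()):  # n.bit_length() == floor(lg n) + 1
        nodes_at_height = -(-n // 2 ** (h + 1))  # integer ceil(n / 2^(h+1))
        total += nodes_at_height * h
    return total


# Partial sums of h / 2^h approach 2, matching x/(1-x)^2 at x = 1/2.
series_partial_sum = sum(h / 2 ** h for h in range(60))

# The height-weighted node count grows linearly in n.
linear_for_tested_n = all(heapify_work_bound(n) <= 2 * n for n in range(1, 2000))
```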

  18. The Heapsort Algorithm
      The heapsort algorithm first runs Build-Max-Heap on A[1..n]. Since the maximum element of the array is then at A[1], it can be put into its correct final position by exchanging it with A[n]. After the heap size is decremented, A[1..(n-1)] can be made a max-heap again by calling Max-Heapify on the root, and the process repeats on the smaller heap.
      Heapsort(A, n)
      1  Build-Max-Heap(A, n)
      2  for i = n downto 2
      3      <swap A[1] with A[i]>
      4      heap-size[A] = heap-size[A] - 1
      5      Max-Heapify(A, 1)
      Step 1 takes O(n) time. Loop 2 is repeated (n-1) times, with step 5 dominating each iteration at O(lg n). Hence the running time of heapsort is O(n) + O(n lg n) = O(n lg n).
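Putting the pieces together, a self-contained sketch of the whole algorithm might look as follows. Several choices here are assumptions of this example and not part of the slides: the input is copied into a padded list so that the 1-based index formulas apply unchanged, the heap size is a local variable, and Max-Heapify is written as a loop instead of a recursion.

```python
def max_heapify(A, i, heap_size):
    """Iterative variant of Max-Heapify: float A[i] down within A[1..heap_size]."""
    while True:
        l, r = 2 * i, 2 * i + 1
        largest = i
        if l <= heap_size and A[l] > A[largest]:
            largest = l
        if r <= heap_size and A[r] > A[largest]:
            largest = r
        if largest == i:
            return
        A[i], A[largest] = A[largest], A[i]
        i = largest


def heapsort(values):
    """Return the items of `values` in ascending order using heapsort."""
    A = [None] + list(values)  # index 0 is an unused placeholder
    n = len(values)
    # Build-Max-Heap: heapify the non-leaf nodes bottom-up.
    for i in range(n // 2, 0, -1):
        max_heapify(A, i, n)
    heap_size = n
    for i in range(n, 1, -1):
        A[1], A[i] = A[i], A[1]  # move the current maximum to its final slot
        heap_size -= 1
        max_heapify(A, 1, heap_size)
    return A[1:]


result = heapsort([10, 8, 6, 5, 7, 3, 2])
```

The sort itself runs in place on the padded list; the extra copy exists only to keep the slides' indexing, and a 0-based version would use 2i+1 and 2i+2 for the children instead.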
