Data Structure & Algorithm Lecture 5 Heap Sort & Binary Tree JJCAO
Recitation • Bubble Sort: O(n^2) • Insertion Sort: O(n^2) • Selection Sort: O(n^2) • Merge Sort: O(n lg n)
Importance of Sorting Why don’t CS profs ever stop talking about sorting? • Computers spend more time sorting than anything else, historically 25% on mainframes. • Sorting is the best studied problem in computer science, with a variety of different algorithms known. • Most of the interesting ideas we will encounter in the course can be taught in the context of sorting, such as divide-and-conquer, randomized algorithms, and lower bounds. You should have seen most of the algorithms - we will concentrate on the analysis.
Efficiency of Sorting • Sorting is important because once a set of items is sorted, many other problems become easy. • Large-scale data processing would be impossible if sorting took Ω(n^2) time.
Heap Sort • Running time: roughly n log n • like Merge Sort • unlike Insertion Sort • In place • like Insertion Sort • unlike Merge Sort • Uses a heap
Binary Tree • Depth of a node: number of edges on the path to the root • Leaf: a node whose subtrees are empty
Implementing Binary Trees Relationships (left). Finding the minimum (center) & maximum (right) elements
Complete Binary Trees • Height of a node: number of edges on the longest path to a leaf • Height of a tree = height of its root • Lemma: A complete binary tree of height h has 2^(h+1) - 1 nodes • Proof: By induction on h. h = 0: a single leaf, 2^1 - 1 = 1 node. h > 0: the tree consists of two complete trees of height h-1 plus the root. Total: (2^h - 1) + (2^h - 1) + 1 = 2^(h+1) - 1
Complete Binary Trees • A binary tree is complete if every internal node has exactly two children and all leaves are at the same depth:
Almost Complete Binary Trees An almost complete binary tree is a complete tree possibly missing some nodes on the right side of the bottom level:
(Binary) Heaps - ADT • An almost complete binary tree • Each node contains a key • Keys satisfy the heap property: each node’s key >= its children’s keys
Max heap Min heap
Heapify Example Heapify(A,i) – fix Heap properties given a violation at position i
Heapify • Heapify on a node of height h takes roughly dh steps, for some constant d • The height of the tree is log n, so Heapify on the root node takes about d log n steps.
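The Heapify step can be sketched in Python as follows. This is a minimal illustration for a max-heap, not the lecture's own code: it assumes 0-based list indexing (the slides use 1-based arrays), and the name `max_heapify` and the explicit `heap_size` argument are choices made here. Each loop iteration moves one level down the tree, which is where the roughly dh cost comes from.

```python
def max_heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at the children of i are already max-heaps.
    Indices are 0-based: the children of i are 2*i + 1 and 2*i + 2.
    """
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < heap_size and a[left] > a[largest]:
            largest = left
        if right < heap_size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return  # heap property holds at i, nothing left to fix
        a[i], a[largest] = a[largest], a[i]
        i = largest  # continue one level further down


# Only the root violates the heap property in this example.
a = [1, 14, 10, 8, 7, 9, 3, 2, 4]
max_heapify(a, 0, len(a))
print(a)  # [14, 8, 10, 4, 7, 9, 3, 2, 1]
```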
Build-Heap (After BuildHeap, A[1] stores the max element) • We make about n/2 calls to Heapify • Each call to Heapify costs <= d log n • TOTAL: d(n/2) log n • But we can do better and show a cost of cn, i.e. a total running time linear in n.
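A minimal Python sketch of Build-Heap, assuming 0-based indexing and a helper called `sift_down` (an illustrative name playing the role of Heapify). The usual argument for the linear bound is that most nodes sit near the bottom of the tree: at most ceil(n / 2^(h+1)) nodes have height h, each costing about dh, and summing over all heights gives O(n).

```python
def sift_down(a, i, n):
    """Restore the max-heap property below index i within a[0:n]."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Turn the list a into a max-heap in place."""
    n = len(a)
    # Indices n//2 .. n-1 are leaves, so only the first n//2 nodes need work.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)


def is_max_heap(a):
    """Check that every node's key is <= its parent's key."""
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))


data = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(data)
print(data)  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```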
Heap-Sort • Running time: at most d n lg n for some d > 0 • Cost annotations: Build-Heap is O(n); the main loop runs O(n) times; each Heapify call is O(lg n)
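The slide's pseudocode is not reproduced in this transcript, so the following is a hedged Python sketch of the standard in-place heap sort, not necessarily the exact version shown in the lecture. It assumes 0-based indexing; `sift_down` and `heap_sort` are illustrative names.

```python
def sift_down(a, i, n):
    """Restore the max-heap property below index i within a[0:n]."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heap_sort(a):
    """Sort the list a in place in ascending order and return it."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # Build-Heap: O(n) overall
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):      # O(n) iterations
        a[0], a[end] = a[end], a[0]      # current max goes to its final slot
        sift_down(a, 0, end)             # O(lg n): re-heapify a[0:end]
    return a


print(heap_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

Only a constant number of extra variables is used, which is the sense in which the sort is in place.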
Several Sort Algorithms http://www.sorting-algorithms.com
Heapsort • Heapsort is an excellent algorithm, but a good implementation of quicksort usually beats it in practice. • Nevertheless, the heap data structure itself has many uses: • Priority queue (most popular)
Review - What is a Heap? • an almost complete tree-like structure • usually based on an array • fast access to the largest (or smallest) data item.
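The array-based layout can be sketched as below. The index formulas are the standard convention rather than something stated on the slide, and this sketch assumes 0-based Python lists; with the 1-based arrays used in the pseudocode the formulas become i // 2, 2*i and 2*i + 1.

```python
def parent(i):
    return (i - 1) // 2


def left(i):
    return 2 * i + 1


def right(i):
    return 2 * i + 2


def is_max_heap(a):
    """True if every non-root element is <= its parent."""
    return all(a[parent(i)] >= a[i] for i in range(1, len(a)))


heap = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
print(heap[0])            # 16: the largest item sits at the root
print(is_max_heap(heap))  # True
```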
Priority Queue ADT • Priority Queue: a set of elements S, each with a key • Operations: • insert(S,x): insert element x into S, i.e. S <- S ∪ {x} • max(S): return the element of S with the largest key • extract-max(S): remove and return the element of S with the largest key
Implementing PQs by Heaps
Heap-Maximum(A)
    if heap-size[A] >= 1
        return A[1]
Running time: constant
Heap Extract-Max
Heap-Extract-Max(A)
    if heap-size[A] < 1
        error “heap underflow”
    max <- A[1]
    A[1] <- A[heap-size[A]]
    heap-size[A] <- heap-size[A] - 1
    Heapify(A, 1)
    return max
Running time: d lg n + c <= d’ lg n for a suitable constant d’, when heap-size[A] = n
Heap-Insert
Heap-Insert(A, key)
    heap-size[A] <- heap-size[A] + 1
    i <- heap-size[A]
    while i > 1 and A[parent(i)] < key    // stop at the root: A[1] has no parent
        A[i] <- A[parent(i)]
        i <- parent(i)
    A[i] <- key
Running time: d lg n when heap-size[A] = n
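The three heap operations above can be combined into one small class. This is a sketch under stated assumptions, not the lecture's implementation: it uses a 0-based Python list, so the root is index 0 and the insert loop stops at `i > 0` (the counterpart of `i > 1` in 1-based pseudocode). The class name `MaxHeapPQ` and its method names are illustrative.

```python
class MaxHeapPQ:
    """Max-priority queue backed by a binary heap stored in a list."""

    def __init__(self):
        self._a = []

    def __len__(self):
        return len(self._a)

    def maximum(self):
        """Return the largest key without removing it: O(1)."""
        if not self._a:
            raise IndexError("heap underflow")
        return self._a[0]

    def insert(self, key):
        """Add key, shifting smaller ancestors down: O(lg n)."""
        a = self._a
        a.append(key)  # grow the heap by one slot
        i = len(a) - 1
        while i > 0 and a[(i - 1) // 2] < key:
            a[i] = a[(i - 1) // 2]
            i = (i - 1) // 2
        a[i] = key

    def extract_max(self):
        """Remove and return the largest key: O(lg n)."""
        a = self._a
        if not a:
            raise IndexError("heap underflow")
        top = a[0]
        last = a.pop()  # shrink the heap by one slot
        if a:
            a[0] = last
            self._sift_down(0)
        return top

    def _sift_down(self, i):
        a, n = self._a, len(self._a)
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < n and a[left] > a[largest]:
                largest = left
            if right < n and a[right] > a[largest]:
                largest = right
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest


pq = MaxHeapPQ()
for k in [3, 1, 4, 1, 5, 9, 2, 6]:
    pq.insert(k)
print(pq.maximum())      # 9
print(pq.extract_max())  # 9
print(pq.extract_max())  # 6
```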
Priority Queue Sorting
PQ-Sort(A)
    S <- ∅
    for i <- 1 to n                      // O(n) iterations
        Heap-Insert(S, A[i])             // O(lg n), more precisely O(lg(S.size))
    for i <- n downto 1                  // O(n) iterations
        SortedA[i] <- Extract-Max(S)     // O(lg n)
Running time: at most d n lg n for some d > 0
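PQ-Sort can be tried out with Python's standard `heapq` module. Note that `heapq` is a min-heap, so this sketch negates the keys to simulate Extract-Max; that trick assumes numeric keys and is a substitution made here, not part of the slide. The function name `pq_sort` is illustrative.

```python
import heapq


def pq_sort(items):
    """Return a new ascending list, built by n inserts and n extract-max calls."""
    s = []
    for x in items:                      # n inserts, O(lg n) each
        heapq.heappush(s, -x)            # negate: smallest -x is largest x
    out = [None] * len(items)
    for i in range(len(items) - 1, -1, -1):
        out[i] = -heapq.heappop(s)       # fill from the back with the current max
    return out


print(pq_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Unlike Heap-Sort, this version is not in place: it builds a separate queue S and a separate output array.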
Comparison of Special-Purpose Structures • Max-priority queues can be used to schedule jobs on a shared computer. The queue keeps track of the jobs to be performed and their relative priorities. When a job is finished or interrupted, the scheduler selects the highest-priority job from among those pending by calling EXTRACT-MAX. The scheduler can add a new job to the queue at any time by calling INSERT.
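A toy version of such a scheduler, as a hedged sketch: the class `JobScheduler`, the job names, and the priorities are all hypothetical. It relies on Python's standard `heapq` (a min-heap), so priorities are negated to get max-first behaviour, and an insertion counter breaks ties in arrival order.

```python
import heapq
import itertools


class JobScheduler:
    """Pending jobs in a max-priority queue keyed by priority."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: earlier jobs first

    def insert(self, name, priority):
        """INSERT: add a new job at any time."""
        heapq.heappush(self._heap, (-priority, next(self._counter), name))

    def extract_max(self):
        """EXTRACT-MAX: remove and return the highest-priority pending job."""
        if not self._heap:
            raise IndexError("no pending jobs")
        _, _, name = heapq.heappop(self._heap)
        return name


sched = JobScheduler()
sched.insert("backup", 1)
sched.insert("compile", 5)
sched.insert("email", 3)
first = sched.extract_max()   # "compile"
sched.insert("hotfix", 9)     # a new job arrives while others are pending
second = sched.extract_max()  # "hotfix"
print(first, second)
```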
Homework 4 • Hw04-GuiQtScribble • Deadline: 22:00, Oct. ?, 2011