Learn about Priority Queues, Binary Heaps, their properties, operations, and applications in data structures. Explore implementation details and optimizations for efficient data manipulation.
Recall Queues • FIFO: First-In, First-Out • Some contexts where this seems right? • Some contexts where some things should be allowed to skip ahead in the line?
Queues that Allow Line Jumping • Need a new ADT • Operations: insert an item, remove the "best" item (insert / deleteMin)
Priority Queue ADT • PQueue data: collection of data with priority • PQueue operations: insert, deleteMin • PQueue property: for two elements in the queue, x and y, if x has a lower priority value than y, x will be deleted before y
Applications of the Priority Queue • Select print jobs in order of decreasing length • Forward packets on routers in order of urgency • Select most frequent symbols for compression • Sort numbers, picking the minimum first • Anything greedy
Potential Implementations (insert / deleteMin) • Unsorted array: insert O(1) (worst O(N) if the array is full; should say why, might reject on full instead); deleteMin O(N) to find the value • Unsorted linked list: insert O(1); deleteMin O(N) to find the value • Sorted array: insert O(log N) to find the location with binary search, but O(N) to move values; deleteMin O(1) to find the value, but O(N) to move values (or O(1) if stored in reverse order) • Sorted linked list: insert O(N) to find the location, O(1) to do the insert; deleteMin O(1) • Binary heap: insert O(log N) worst, close to O(1) on average (1.67 levels); deleteMin O(log N); plus good memory usage
Recall From Lists, Queues, Stacks • Use an ADT that corresponds to your needs • The right ADT is efficient; an overly general ADT provides functionality you aren't using but are paying for anyway • Heaps provide O(log n) worst case for both insert and deleteMin, and O(1) average insert
Binary Heap Properties • Structure Property • Ordering Property
Tree Review (figure: tree T with nodes A through N, rooted at A) • root(T): A • leaves(T): D, E, F, J..N, I • children(B), parent(H), siblings(E): read off the figure • ancestors(F): its parent, or a parent's ancestor • descendents(G): its child, or a child's descendent • subtree(C): the node itself plus all of its descendents
More Tree Terminology • depth(B): # edges on the path from the root to the node • height(G): # edges on the longest path from the node to a leaf • degree(B): # children of the node • branching factor(T): MAX # children of a node (for a binary search tree this is 2)
Brief interlude, a definition: a perfect binary tree is a binary tree with all leaf nodes at the same depth; all internal nodes have 2 children. A perfect tree of height h has 2^(h+1) - 1 nodes, 2^h - 1 non-leaves, and 2^h leaves. (The figure shows a 15-node example of height 3.)
Heap Structure Property • A binary heap is a complete binary tree. Complete binary tree – binary tree that is completely filled, with the possible exception of the bottom level, which is filled left to right. Examples: Since they have this regular structure property, we can take advantage of that to store them in a compact manner.
Representing Complete Binary Trees in an Array (implicit array implementation: root at index 1, then level by level, left to right). From node i: • left child: 2*i • right child: (2*i) + 1 • parent: floor(i/2)
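The index arithmetic above can be sketched directly; this is a minimal illustration (names `left`, `right`, `parent` are mine, not from the slides), assuming the 1-based layout with index 0 unused:

```python
# Sketch: 1-based array view of a complete binary tree.
# Index 0 is unused so the parent/child arithmetic stays simple.
heap = [None, 'A', 'B', 'C', 'D', 'E', 'F', 'G']

def left(i):   return 2 * i
def right(i):  return 2 * i + 1
def parent(i): return i // 2

# Children of B (index 2) are D and E; parent of G (index 7) is C.
print(heap[left(2)], heap[right(2)])   # D E
print(heap[parent(7)])                 # C
```

Wasting index 0 is a deliberate trade: it keeps all three formulas pure shifts/divides.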
Why this approach to storage? • Uses less space (only values, no pointers) • *2, /2, and + are usually faster operations than dereferencing a pointer • An array MAY get better locality in general than a linked structure • Can get to the parent easily (and without an extra pointer)
Heap Order Property. Heap order property: for every non-root node X, the value in the parent of X is less than (or equal to) the value in X. This is the order for a MIN heap; the same idea with "greater" gives a max heap. This is a PARTIAL order (different from a BST): each node's value is less than all of its descendants, with no distinction between left and right. (The figure shows a valid heap rooted at 10 and a second tree that is not a heap.)
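The order property is easy to check mechanically; a small sketch (function name `is_min_heap` is mine), using values from the slide's figure:

```python
# Sketch: check the min-heap order property on a 1-based array.
def is_min_heap(heap, size):
    # Every non-root node must be >= its parent.
    return all(heap[i // 2] <= heap[i] for i in range(2, size + 1))

a = [None, 10, 20, 80, 40, 60, 85, 99, 50, 700]  # a valid min-heap
b = [None, 10, 20, 80, 40, 60, 85, 99, 50, 15]   # 15 below 40 breaks order
print(is_min_heap(a, 9))  # True
print(is_min_heap(b, 9))  # False
```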
Heap Operations • findMin: return the root (how?) • insert(val): percolate up • deleteMin: percolate down. Is the resulting tree unique?
Heap – Insert(val) Basic Idea: • Put val at the "next" leaf position • Percolate up by repeatedly exchanging the node with its parent until no longer needed. How long does this take? The max # of exchanges is O(log N); on "average" an insert only needs to move up 1.67 levels, so we get O(1)
Insert: percolate up (figure: insert 15 into the heap, then insert 90 — no swaps, even though 99 is larger! — then insert 7). Optimization: bubble up an empty hole instead of swapping, to reduce the # of moves
Insert Code (optimized) int percolateUp(int hole, Object val) { while (hole > 1 && val < Heap[hole/2]) { Heap[hole] = Heap[hole/2]; hole /= 2; } return hole; } void insert(Object o) { assert(!isFull()); size++; newPos = percolateUp(size, o); Heap[newPos] = o; } runtime: Θ(log n) worst case; Θ(1) on average, since you only have to go up a few levels (Code in book)
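A runnable sketch of the same hole-percolation idea, in Python (the class name and fixed capacity are my assumptions, not from the book's code):

```python
# Sketch (1-based array, fixed capacity) of the optimized insert:
# percolate a "hole" up instead of doing full swaps.
class MinHeap:
    def __init__(self, capacity=100):
        self.heap = [None] * (capacity + 1)
        self.size = 0

    def insert(self, val):
        self.size += 1
        hole = self.size
        # Slide the hole up past any parent larger than val.
        while hole > 1 and val < self.heap[hole // 2]:
            self.heap[hole] = self.heap[hole // 2]
            hole //= 2
        self.heap[hole] = val

h = MinHeap()
for v in [10, 20, 80, 40, 60, 85, 99, 700, 50, 15]:
    h.insert(v)
print(h.heap[1])  # 10
```

Each loop iteration moves one value down into the hole rather than swapping, halving the number of writes.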
Heap – DeleteMin Basic Idea: • Remove the root (that is always the min!) • Put the "last" leaf node at the root • Find the smallest child of the node • Swap the node with its smallest child if needed • Repeat the last two steps until no swaps are needed. Max # of exchanges? O(log N), and there is a good chance the value goes all the way to the bottom
DeleteMin: percolate down. Max # of exchanges? O(log N), and unlike insert there is a good chance the percolating value goes all the way back to the bottom (it started at the bottom). Could also use the percolate-empty-hole optimization going down
DeleteMin Code (Optimized) int percolateDown(int hole, Object val) { while (2*hole <= size) { left = 2*hole; right = left + 1; if (right <= size && Heap[right] < Heap[left]) target = right; else target = left; if (Heap[target] < val) { Heap[hole] = Heap[target]; hole = target; } else break; } return hole; } Object deleteMin() { assert(!isEmpty()); returnVal = Heap[1]; size--; newPos = percolateDown(1, Heap[size+1]); Heap[newPos] = Heap[size + 1]; return returnVal; } runtime: Θ(log n) worst and average case (code in book)
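The same logic as a runnable Python sketch (the standalone function shape and the example heap are mine; the slides' version lives inside the heap class):

```python
# Sketch of the optimized deleteMin on a 1-based array heap:
# percolate the hole down, then drop the old last element into it.
def delete_min(heap, size):
    min_val = heap[1]
    last = heap[size]      # value that must be re-placed
    size -= 1
    hole = 1
    while 2 * hole <= size:
        left = 2 * hole
        right = left + 1
        # target = the smaller of the hole's children
        target = right if right <= size and heap[right] < heap[left] else left
        if heap[target] < last:
            heap[hole] = heap[target]   # pull child up into the hole
            hole = target
        else:
            break
    heap[hole] = last
    return min_val, size

heap = [None, 10, 20, 15, 50, 60, 85, 99, 700]  # a valid min-heap
m, size = delete_min(heap, 8)
print(m)        # 10
print(heap[1])  # 15
```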
Building a Heap (figure: the array [12, 5, 11, 3, 10, 6, 9, 4, 8, 1, 7, 2] viewed as a complete tree; red nodes need to percolate down)
Building a Heap • Adding the items one at a time is O(n log n) in the worst case • I promised O(n) for today
Working on Heaps • What are the two properties of a heap? • Structure Property • Order Property • How do we work on heaps? • Fix the structure • Fix the order
BuildHeap: Floyd's Method. Add the elements arbitrarily to form a complete tree ([12, 5, 11, 3, 10, 6, 9, 4, 8, 1, 7, 2] in the example). Pretend it's a heap and fix the heap-order property! (Figure: red nodes need to percolate down)
Buildheap pseudocode private void buildHeap() { for ( int i = currentSize/2; i > 0; i-- ) percolateDown( i ); } runtime:
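The pseudocode above can be made runnable; a Python sketch (function names are mine), using the slides' example array:

```python
# Sketch of Floyd's buildHeap on a 1-based array: percolate down
# every internal node, from the last internal node up to the root.
def percolate_down(heap, size, hole):
    val = heap[hole]
    while 2 * hole <= size:
        left, right = 2 * hole, 2 * hole + 1
        target = right if right <= size and heap[right] < heap[left] else left
        if heap[target] < val:
            heap[hole] = heap[target]
            hole = target
        else:
            break
    heap[hole] = val

def build_heap(heap, size):
    # Leaves (indices > size // 2) are trivially heaps already.
    for i in range(size // 2, 0, -1):
        percolate_down(heap, size, i)

a = [None, 12, 5, 11, 3, 10, 6, 9, 4, 8, 1, 7, 2]  # the slides' example
build_heap(a, 12)
print(a[1:])  # [1, 3, 2, 4, 5, 6, 9, 12, 8, 10, 7, 11]
```

The result matches the final heap on the "Finally…" slide.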
BuildHeap: Floyd's Method (figures: a sequence of slides applies percolate-down to each internal node in turn, from the last internal node up to the root, gradually transforming the array [12, 5, 11, 3, 10, 6, 9, 4, 8, 1, 7, 2] into a heap)
Finally… the heap is [1, 3, 2, 4, 5, 6, 9, 12, 8, 10, 7, 11]. runtime: O(n) — the runtime is bounded by the sum of the heights of the nodes, which is linear (count how many nodes there are at height 1, at height 2, and so on up to the root). See text, Thm. 6.1 p. 194 for the detailed proof.
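A sketch of that counting argument (following the structure of Thm. 6.1): a complete tree with n nodes has at most ceil(n / 2^(h+1)) nodes at height h, and percolating down from height h costs O(h), so the total work is

```latex
\sum_{h=0}^{\lfloor \log_2 n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
\;=\; O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h+1}} \right)
\;=\; O(n), \qquad \text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2 .
```

The key point is that half the nodes are leaves and cost nothing, a quarter cost at most one level, and so on; the geometric decay beats the linear growth in h.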
More Priority Queue Operations • decreaseKey: given a pointer to an object in the queue, reduce its priority value. Solution: change the priority and percolateUp • increaseKey: given a pointer to an object in the queue, increase its priority value. Solution: change the priority and percolateDown. Why do we need a pointer, and not simply the data value? Because it's hard to find an element in a priority queue
More Priority Queue Operations • remove(objPtr): given a pointer to an object in the queue, remove the object from the queue. Solution: set its priority to negative infinity, percolate it up to the root, and deleteMin • findMax: the min-heap order gives no help here; the maximum lies among the leaves, so finding it takes a linear scan
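One common way to supply the "pointer" these operations need (a design assumption of mine, not from the slides) is an auxiliary dictionary mapping each key to its array index, kept up to date during percolation:

```python
# Sketch (hypothetical helper): a min-heap that tracks each key's
# index, so decreaseKey can locate an element in O(1).
class IndexedMinHeap:
    def __init__(self):
        self.heap = [None]   # 1-based array
        self.pos = {}        # key -> index in self.heap

    def _up(self, i):
        while i > 1 and self.heap[i] < self.heap[i // 2]:
            self.heap[i], self.heap[i // 2] = self.heap[i // 2], self.heap[i]
            self.pos[self.heap[i]] = i
            self.pos[self.heap[i // 2]] = i // 2
            i //= 2

    def insert(self, key):
        self.heap.append(key)
        self.pos[key] = len(self.heap) - 1
        self._up(len(self.heap) - 1)

    def decrease_key(self, old, new):
        i = self.pos.pop(old)   # O(1) lookup instead of a search
        self.heap[i] = new
        self.pos[new] = i
        self._up(i)             # a smaller key can only move up

h = IndexedMinHeap()
for k in [40, 60, 20, 85]:
    h.insert(k)
h.decrease_key(85, 5)   # 85's priority drops to 5; it percolates to the root
print(h.heap[1])        # 5
```

This assumes distinct keys; real implementations usually map object identities, not values.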
Facts about Heaps Observations: • Finding a child/parent index is a multiply/divide by two • Operations jump widely through the heap • Each percolate step looks at only two new nodes • Inserts are at least as common as deleteMins. Realities: • Division/multiplication by powers of two are equally fast • Looking at only two new pieces of data per step: bad for cache! • With huge data sets, disk accesses dominate. Analogy: working at the library, writing a research paper
Cycles to access (the library analogy): • CPU • Cache: your desk/table at the library • Memory: go to the stacks, bring back an armful of books • Disk: go to the front desk, borrow a cart, take the elevator, go to the stacks, bring back a cartload of books
A Solution: d-Heaps • Each node has d children • Still representable by an array • Good choices for d: choose a power of two for efficiency; fit one set of children in a cache line; fit one set of children on a memory page/disk block. How does the height compare to a binary heap? It is less: about log_d n instead of log_2 n
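The binary index arithmetic generalizes to d children per node; a sketch (formulas for the 1-based layout, function names mine):

```python
# Sketch of d-heap index arithmetic (1-based array, d children per node).
def children(i, d):
    first = d * (i - 1) + 2       # generalizes left = 2*i for d = 2
    return range(first, first + d)

def parent(i, d):
    return (i - 2) // d + 1       # generalizes parent = i // 2 for d = 2

d = 4
# In a 4-heap, node 1's children are 2..5 and node 2's are 6..9.
print(list(children(1, d)))  # [2, 3, 4, 5]
print(parent(6, d))          # 2
```

With d a power of two, the multiply and divide are still shifts, which is one reason the slide suggests powers of two.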
One More Operation • Merge two heaps • Add the items from one into another? • O(n log n) • Start over and build it from scratch? • O(n) Can do in Θ(log n) worst case time. Next lecture!
CSE 326: Data Structures. Priority Queues: Leftist Heaps & Skew Heaps
New Heap Operation: Merge. Given two heaps, merge them into one heap • first attempt: insert each element of the smaller heap into the larger. runtime: Θ(n log n) worst, Θ(n) average • second attempt: concatenate the binary heaps' arrays and run buildHeap. runtime: Θ(n) worst. How about O(log n) time? In fact, if we want to do better than O(n) we need to give up the array representation (it forces copying elements) and use pointers; all data structures with an efficient merge use pointers. (We expect this will make the other operations slightly less efficient.)
Leftist Heaps Idea: Focus all heap maintenance work in one small part of the heap Leftist heaps: • Most nodes are on the left • All the merging work is done on the right Actually TRIES to be unbalanced.
Definition: Null Path Length. The null path length (npl) of a node x = the number of nodes between x and a null in its subtree, OR npl(x) = the min distance to a descendant with 0 or 1 children • npl(null) = -1 • npl(leaf) = 0 • npl(single-child node) = 0. Equivalent definitions: • npl(x) is the height of the largest complete subtree rooted at x • npl(x) = 1 + min{npl(left(x)), npl(right(x))}
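The recursive definition translates directly into code; a minimal sketch (the `Node` class and example tree are mine):

```python
# Sketch: null path length via npl(x) = 1 + min(npl(left), npl(right)),
# with npl(null) = -1, as in the slide's definition.
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def npl(x):
    if x is None:
        return -1
    return 1 + min(npl(x.left), npl(x.right))

leaf = Node(4)                         # npl 0
single = Node(2, left=Node(5))         # one child -> npl 1 + min(0, -1) = 0
root = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(npl(root))  # 1
```

Note how a node with a single child gets npl 0: the missing child contributes -1 to the min.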
Leftist Heap Properties • Heap-order property: the parent's priority value is ≤ its children's priority values; result: the minimum element is at the root • Leftist property: for every node x, npl(left(x)) ≥ npl(right(x)); result: the tree is at least as "heavy" on the left as the right. (Are leftist trees complete? Balanced? No, no.)
Are These Leftist? (figure: three trees annotated with npl values; two are leftist, one is not) Every subtree of a leftist tree is leftist!
Right Path in a Leftist Tree is Short (#1) Claim: the right path is as short as any path in the tree. Proof (by contradiction): pick a shorter path of length D1 < D2, where D2 is the length of the right path, and say it diverges from the right path at node x, entering left subtree L while the right path continues into right subtree R. Then npl(L) ≤ D1 - 1, because of the path of length D1 - 1 down to a null, and npl(R) ≥ D2 - 1, because every node on the right path is leftist. So npl(left(x)) ≤ D1 - 1 < D2 - 1 ≤ npl(right(x)): the leftist property at x is violated!
Right Path in a Leftist Tree is Short (#2) Claim: if the right path has r nodes, then the tree has at least 2^r - 1 nodes. Proof (by induction): Base case: r = 1. The tree has at least 2^1 - 1 = 1 node. Inductive step: assume the claim is true for all r' < r; prove it for a tree whose right path has r nodes. 1. The right subtree has a right path of r - 1 nodes, so it has ≥ 2^(r-1) - 1 nodes (by induction). 2. The left subtree also has a right path of length at least r - 1 (by the previous slide), so it also has ≥ 2^(r-1) - 1 nodes (by induction). Total tree size: (2^(r-1) - 1) + (2^(r-1) - 1) + 1 = 2^r - 1.
Why do we have the leftist property? Because it guarantees that: • the right path is really short compared to the number of nodes in the tree • a leftist tree of N nodes has a right path of at most lg(N+1) nodes. Idea: perform all the work on the right path
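The slides defer merge to the next lecture, but the "work on the right path" idea can be sketched with the standard recursive leftist-heap merge (this is the textbook algorithm, not taken from these slides; class and function names are mine):

```python
# Sketch of the standard leftist-heap merge: recurse down the right
# paths, then swap children wherever npl(left) >= npl(right) fails.
class LNode:
    def __init__(self, val):
        self.val = val
        self.left = self.right = None
        self.npl = 0

def merge(a, b):
    if a is None:
        return b
    if b is None:
        return a
    if b.val < a.val:
        a, b = b, a                  # keep the smaller root on top
    a.right = merge(a.right, b)      # all work happens on the right path
    lnpl = a.left.npl if a.left else -1
    rnpl = a.right.npl if a.right else -1
    if lnpl < rnpl:                  # restore the leftist property
        a.left, a.right = a.right, a.left
        lnpl, rnpl = rnpl, lnpl
    a.npl = rnpl + 1                 # npl = 1 + min of children = 1 + right
    return a

def insert(root, val):
    return merge(root, LNode(val))

root = None
for v in [5, 1, 7, 3]:
    root = insert(root, v)
print(root.val)  # 1
```

With this merge, insert is a merge with a one-node heap and deleteMin is `merge(root.left, root.right)`; since each call steps down one right path, the bound from the previous slides gives the O(log n) runtime.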