CS 225 Data Structures & Software Principles. Section 8 Priority Queues and Disjoint Sets. Discussion Topics . Priority Queues Operations on Priority Queues Insert, FindMin, DeleteMin Heap Representation More about heaps (BuildHeap, HeapSort) Disjoint Sets Operations on Disjoint Sets
CS 225 Data Structures & Software Principles • Section 8: Priority Queues and Disjoint Sets
Discussion Topics • Priority Queues • Operations on Priority Queues • Insert, FindMin, DeleteMin • Heap Representation • More about heaps (BuildHeap, HeapSort) • Disjoint Sets • Operations on Disjoint Sets • Up Tree Representation • Practice Problems for Exam II
Priority Queues • Definition: A Priority Queue is a set ADT of pairs (K,I), where K is a key drawn from a linearly ordered set of key values and I is the associated information of type Element. It supports the following operations: • MakeEmptySet( ) • IsEmptySet(S) • Insert(K,I,S): Add pair (K,I) to set S • FindMin(S): Return an element I such that (K,I) ∈ S and K is minimal with respect to the ordering • DeleteMin(S): Delete an element (K,I) from S such that K is minimal and return I
Priority Queues • Elements have an intrinsic priority • Insert and Remove them in order of their priority, independent of the insertion sequence • Item with lowest key value has highest priority • Priority Queues are structures that are optimized for finding the minimal (= highest priority) element in the tree using FindMin( )
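The ADT operations can be tried out with the C++ standard library. The following is a minimal sketch, assuming `std::priority_queue` with `std::greater` stands in for the ADT; the wrapper name `MinPQ` and its member names are illustrative and are not part of the course code.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper: a min-priority queue of (key, info) pairs.
class MinPQ {
public:
    bool isEmpty() const { return pq_.empty(); }

    // Insert(K, I, S)
    void insert(int key, const std::string& info) { pq_.push({key, info}); }

    // FindMin(S): returns the info stored with a minimal key.
    const std::string& findMin() const {
        assert(!pq_.empty());
        return pq_.top().second;
    }

    // DeleteMin(S): removes a pair with minimal key and returns its info.
    std::string deleteMin() {
        assert(!pq_.empty());
        std::string info = pq_.top().second;
        pq_.pop();
        return info;
    }

private:
    using Entry = std::pair<int, std::string>;
    // std::greater turns the default max-heap into a min-heap on (key, info).
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq_;
};
```

Inserting keys 9, 5, 8 in that order and then calling `deleteMin()` three times returns the items in key order 5, 8, 9, independent of the insertion sequence.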
Priority Queues • Partially ordered tree • A Key tree of elements such that the priority of each node is greater than or equal to that of each of its children • Highest priority element of the tree is at the root • Must reorder elements when the highest priority element is deleted or when a new element is inserted
Priority Queues • A node’s key value is <= the key value of its children • No conclusion can be drawn about the relative order of the items in the left and right sub-trees of a node • Example tree (level order): 5 | 8 9 | 16 12 10 56 | 18 20
Priority Queues • Priority Queues can be implemented • using balanced trees • as a heap • A double-ended Priority Queue is one which supports FindMax and DeleteMax operations along with FindMin and DeleteMin operations
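A balanced tree makes the double-ended variant straightforward, because both the smallest and the largest key sit at the ends of the sorted order. The sketch below assumes `std::multiset` (typically a red-black tree) as the balanced tree; the class name `DoubleEndedPQ` is hypothetical, and only keys are stored to keep it short.

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical double-ended priority queue built on std::multiset.
class DoubleEndedPQ {
public:
    bool isEmpty() const { return keys_.empty(); }
    void insert(int key) { keys_.insert(key); }

    int findMin() const { assert(!keys_.empty()); return *keys_.begin(); }
    int findMax() const { assert(!keys_.empty()); return *keys_.rbegin(); }

    int deleteMin() {
        assert(!keys_.empty());
        int k = *keys_.begin();
        keys_.erase(keys_.begin());   // erase by iterator removes one copy only
        return k;
    }

    int deleteMax() {
        assert(!keys_.empty());
        auto last = std::prev(keys_.end());
        int k = *last;
        keys_.erase(last);
        return k;
    }

private:
    std::multiset<int> keys_;   // duplicates allowed, kept in sorted order
};
```

All four operations are logarithmic or better here, at the cost of more memory per element than an array-based heap.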
Heaps • An implicitly represented complete partially ordered Key tree • Efficient structure for implementing a priority queue • Run-time Analysis • FindMin(S): O(1) • Insert(K,I,S): O(log n) worst-case • DeleteMin(S):O(log n) • Operations on heaps to be discussed • Insert • DeleteMin
Heaps: Array Implementation • Heaps are implemented using arrays • implicit representation of the complete partially ordered tree • Getting around • LeftChild(i) = 2i • RightChild(i) = 2i + 1 • Parent(i) = ⌊i/2⌋ • Example: the tree 5 | 8 9 | 16 12 10 56 | 18 20 (level order) is stored as heap[1..9] = 5 8 9 16 12 10 56 18 20
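The index arithmetic can be written down directly. This is a small sketch assuming 1-based indexing as on the slide; the helper `isMinHeap` is an illustrative addition that checks the partial-order property on a plain array.

```cpp
#include <cassert>
#include <cstddef>

// 1-based heap index arithmetic (index 0 is unused or holds a sentinel).
constexpr std::size_t leftChild(std::size_t i)  { return 2 * i; }
constexpr std::size_t rightChild(std::size_t i) { return 2 * i + 1; }
constexpr std::size_t parent(std::size_t i)     { return i / 2; }  // integer division rounds down

// Returns true if a[1..n] satisfies the min-heap order property.
template <typename T>
bool isMinHeap(const T* a, std::size_t n) {
    assert(a != nullptr);
    for (std::size_t i = 2; i <= n; ++i)
        if (a[i] < a[parent(i)]) return false;   // child smaller than parent
    return true;
}
```

For the example keys 5 8 9 16 12 10 56 18 20 stored at indices 1 to 9, `parent(9)` is 4, so key 20 hangs below key 16.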
Heaps: Optimization • heap[0] = Sentinel • Sentinel is a key (priority) value less than any value allowed in the heap • Now swapping with parent at the root node is handled without any special consideration • Example (sentinel value -1): heap[0..9] = -1 5 8 9 16 12 10 56 18 20
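To see what the sentinel buys, here is a minimal sketch of an insert that never tests for the root. It assumes non-negative integer keys and a sentinel of -1, and uses `std::vector` rather than the course `Array` class; the struct name `SentinelHeap` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal heap of non-negative ints; heap[0] holds the sentinel.
struct SentinelHeap {
    std::vector<int> heap{-1};   // slot 0 = sentinel, smaller than any valid key

    void insert(int key) {
        assert(key >= 0);               // valid keys must exceed the sentinel
        heap.push_back(key);            // open a hole at the next leaf
        std::size_t i = heap.size() - 1;
        // No "i != 1" test: at i == 1 the parent slot is the sentinel,
        // which is never greater than key, so the loop stops by itself.
        while (heap[i / 2] > key) {
            heap[i] = heap[i / 2];      // move the parent down into the hole
            i /= 2;
        }
        heap[i] = key;                  // write the new key once
    }

    int findMin() const {
        assert(heap.size() > 1);
        return heap[1];
    }
};
```

The trade-off is one wasted array slot and a restriction on the key range in exchange for one fewer comparison per loop iteration.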
Heap declaration heap.h
template <class Etype>
class MinHeap {
private:
    int numElements;        // number of elements in heap
    Array<Etype> theHeap;   // array used to implement heap
public:
    MinHeap();
    MinHeap(Array<Etype> const & theData);
    int IsEmpty() const { return numElements == 0; }
    void insert(Etype const & insElem);
    Etype deleteMin();
    Etype const & findMin() const;
};
Heaps: Insert • Step 1: Append the new element in its natural position as a new leaf node in the tree and call that node x • Step 2: if( x’s parent’s key > x’s key ) • Swap contents of x and x’s parent and reassign x to x’s parent • Repeat from Step 2 else • STOP
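Heaps: Insert • Given the tree (level order): 5 | 8 9 | 16 12 10 56 | 20 18 22 44 13
Heaps: Insert • Insert node with key 7 as the next leaf: 5 | 8 9 | 16 12 10 56 | 20 18 22 44 13 7
Heaps: Insert • After swapping 7 with its parent 10: 5 | 8 9 | 16 12 7 56 | 20 18 22 44 13 10 • After swapping 7 with its parent 9: 5 | 8 7 | 16 12 9 56 | 20 18 22 44 13 10 • The parent 5 is not greater than 7, so STOP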
Insert – swapping values heap.cpp
template <typename Etype>
void MinHeap<Etype>::insert(Etype const & insElem)
{
    // increase capacity, if necessary
    if (theHeap.size() == numElements)
        theHeap.setBounds(1, 2 * theHeap.size());
    numElements = numElements + 1;
    // make a new "bubble" at next cell to the right
    int i = numElements;
    while ((i != 1) && (theHeap[i / 2] > insElem))
    {
        // swap values of parent and child
        theHeap[i] = theHeap[i / 2];
        theHeap[i / 2] = insElem;
        i = i / 2;
    }
    // once we exit loop, the "bubble" is where the new element goes
    theHeap[i] = insElem;
}
Heaps: Implementation • In practice, there is no “exchanging” of values as the proper position of an item is located by searching up the tree (insertion) or down the tree (deletion). • Instead a “hole” is moved up or down the tree, by shifting the items along a path down or up single edges of the tree. • The item is written only once, at the last step.
Insert heap.cpp
template <typename Etype>
void MinHeap<Etype>::insert(Etype const & insElem)
{
    // increase capacity, if necessary
    if (theHeap.size() == numElements)
        theHeap.setBounds(1, 2 * theHeap.size());
    numElements = numElements + 1;
    // make a new "bubble" at next cell to the right
    int i = numElements;
    while ((i != 1) && (theHeap[i / 2] > insElem))
    {
        theHeap[i] = theHeap[i / 2];   // move "bubble" upwards
        i = i / 2;
    }
    // once we exit loop, the "bubble" is where the new element goes
    theHeap[i] = insElem;
}
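Heaps: DeleteMin • Given the tree (level order): 5 | 8 9 | 16 12 10 56 | 20 18 22 44 13 • Remove the root 5 and move the last leaf, 13, to the root: 13 | 8 9 | 16 12 10 56 | 20 18 22 44
Heaps: DeleteMin • After swapping 13 with its smaller child 8: 8 | 13 9 | 16 12 10 56 | 20 18 22 44 • After swapping 13 with its smaller child 12: 8 | 12 9 | 16 13 10 56 | 20 18 22 44 • 13 is smaller than its children 22 and 44, so STOP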
Heaps: DeleteMin • Step 1: minimum_value = contents of the root node. • Find the rightmost leaf on the lowest level of the tree and call that node x • Copy its contents to the root node and delete the node x • Step 2: Set x to point to root node • Step 3: if (x’s key > key of at least one of x’s children) • Swap contents of x with the minimum child and have x point to that child • Repeat from Step 3 else • Return minimum_value. STOP
Delete_Min heap.cpp
template <typename Etype>
Etype MinHeap<Etype>::deleteMin()
{
    Assert(numElements > 0, "deleteMin: The heap is empty!");
    Etype valueToReturn = theHeap[1];
    Etype swapValue = theHeap[numElements];
    numElements = numElements - 1;
    int i = 1;
    bool stillSwappingDownward = true;
    while (stillSwappingDownward)
    {
        int indexOfMin = i;
        // if the left child exists and it is less than swapValue
        if ((2 * i <= numElements) && (theHeap[2 * i] < swapValue))
            indexOfMin = 2 * i;
        // if the right child exists and it is less than both swapValue
        // and the left child
        if (((2 * i) + 1 <= numElements) &&
            (theHeap[(2 * i) + 1] < swapValue) &&
            (theHeap[(2 * i) + 1] < theHeap[2 * i]))
            indexOfMin = (2 * i) + 1;
        // if at least one child is less than swapValue
        if (indexOfMin != i)
        {
            theHeap[i] = theHeap[indexOfMin];
            i = indexOfMin;
        }
        else
            stillSwappingDownward = false;
    }
    theHeap[i] = swapValue;
    return valueToReturn;
}
Find_Min heap.cpp
// returns minimum element of the heap without deleting it
template <typename Etype>
Etype const & MinHeap<Etype>::findMin() const
{
    Assert(numElements > 0, "findMin: The heap is empty!");
    return theHeap[1];
}
More on Heaps: BuildHeap • Initialize a Heap in O(n) time instead of doing n Insertions into the heap taking O(n log n) time • Converts an array into a heap • for(i = n/2 to 1) PercolateDown(i) • n/2 represents the first element from the right end of the array that has children
BuildHeap
template <typename Etype>
MinHeap<Etype>::MinHeap(Array<Etype> const & theData)
    : numElements(theData.size()), theHeap(theData)
{
    for (int start = numElements / 2; start >= 1; --start)
    {
        // swap theHeap[start] downward; the "bubble" uses its own index i
        // so that the loop counter start is not disturbed
        int i = start;
        Etype swapValue = theHeap[i];
        bool stillSwappingDownward = true;
        while (stillSwappingDownward)
        {
            int indexOfMin = i;
            // if the left child exists and it is less than swapValue
            if ((2 * i <= numElements) && (theHeap[2 * i] < swapValue))
                indexOfMin = 2 * i;
            // if the right child exists and it is less than both swapValue
            // and the left child
            if (((2 * i) + 1 <= numElements) &&
                (theHeap[(2 * i) + 1] < swapValue) &&
                (theHeap[(2 * i) + 1] < theHeap[2 * i]))
                indexOfMin = (2 * i) + 1;
            // if at least one child is less than swapValue
            if (indexOfMin != i)
            {
                theHeap[i] = theHeap[indexOfMin];
                i = indexOfMin;
            }
            else
                stillSwappingDownward = false;
        }
        theHeap[i] = swapValue;
    }
}
More on Heaps • HeapSort: sort using a Heap • Build a heap using BuildHeap given an array (or even just n Inserts) and then empty it • Running time complexity: O(n log n) • Two types of Heaps • MaxHeaps • MinHeaps
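The HeapSort recipe (build a heap, then empty it) can be sketched as follows. This is an illustrative version on `std::vector<int>` with a 1-based min-heap, not the course implementation; the function names `percolateDown` and `heapSort` are assumptions, and the result goes into a separate output vector rather than being sorted in place.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sifts the value at index i down within the 1-based min-heap a[1..n].
void percolateDown(std::vector<int>& a, std::size_t i, std::size_t n) {
    assert(i >= 1 && i <= n);
    int value = a[i];
    while (2 * i <= n) {
        std::size_t child = 2 * i;
        if (child + 1 <= n && a[child + 1] < a[child]) ++child;  // pick the smaller child
        if (!(a[child] < value)) break;                          // heap order restored
        a[i] = a[child];                                         // move the child up into the hole
        i = child;
    }
    a[i] = value;
}

// Returns the elements of data in ascending order.
std::vector<int> heapSort(const std::vector<int>& data) {
    std::size_t n = data.size();
    std::vector<int> a(n + 1);                       // slot 0 is unused
    for (std::size_t k = 0; k < n; ++k) a[k + 1] = data[k];

    for (std::size_t i = n / 2; i >= 1; --i)         // BuildHeap: O(n)
        percolateDown(a, i, n);

    std::vector<int> sorted;
    sorted.reserve(n);
    while (n > 0) {                                  // n DeleteMins: O(n log n) in total
        sorted.push_back(a[1]);
        a[1] = a[n];                                 // last leaf replaces the root
        --n;
        if (n > 0) percolateDown(a, 1, n);
    }
    return sorted;
}
```

An in-place variant would use a MaxHeap instead and swap each deleted maximum into the slot freed at the end of the array.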
Draw the heap representation of the following priority queue, with a sentinel. Assume all valid keys are non-negative. 7 19 9 20 40 27
The array below is a priority queue. 1 1 5 5 9 9 13 13 8 8 20 20 14 14 14 14 1 2 3 4 5 6 7 2a) What key is the right child of the root? 2b) What key is the left child of key 8? 2c) What key is the parent of key 13? 3) Draw the tree & heap representation after doing the following series of actions to the above priority queue: DeleteMin(), Insert(2), Insert(3)
Discussion Topics • Priority Queues • Operations on Priority Queues • Heap Representation • More about heaps (BuildHeap, HeapSort) • Disjoint Sets • Operations on Disjoint Sets • MakeSet, Union, Find • Up-Tree Representation
Disjoint Sets • Disjoint Sets • We have a fixed set U of Elements Xi • U is divided into a number of disjoint subsets S1, S2, S3, …, Sk • Si ∩ Sj = ∅ for all i ≠ j • S1 ∪ S2 ∪ S3 ∪ … ∪ Sk = U
Disjoint Sets • Operations on Disjoint Sets • MakeSet(X): Return a new set with only item X in it • Union(S,T): Return the set S ∪ T, which replaces S and T in the database • Find(X): Return that set S such that X ∈ S • If each element can belong to only one set (definition of disjoint), a tree structure known as an up-tree can be used to represent disjoint sets • Up-trees have pointers up the tree from children to parents
Up-Trees • Properties • Each node has a single pointer to its parent • at the root this field is empty • Nodes can have any number of children • Sets are identified by their root nodes • [Figure: two up-trees representing the disjoint sets {A,C,D,E,G,H,J} and {B,F}]
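The properties above translate into a very small node type. The following sketch is illustrative only: the names `UpTreeNode` and `findRoot` are hypothetical, and which node is the root of each set is an assumption, since only the set contents are given.

```cpp
#include <cassert>

// Hypothetical up-tree node: the only link is the pointer to the parent.
struct UpTreeNode {
    char label;
    UpTreeNode* parent;   // empty (nullptr) at the root

    explicit UpTreeNode(char l) : label(l), parent(nullptr) {}
};

// Follows parent pointers to the root, which identifies the set.
UpTreeNode* findRoot(UpTreeNode* node) {
    assert(node != nullptr);
    while (node->parent != nullptr)
        node = node->parent;
    return node;
}
```

For the set {B,F}, assuming B is the root, setting `f.parent = &b` makes `findRoot(&f)` and `findRoot(&b)` return the same node, which is how two elements are recognized as belonging to the same set.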
Up-Trees: Union • Union(S, T): Point the root of one tree to the root of the other • If we make root of S point to the root of T, we say we are merging S into T • Optimization: Prevent the linear growth of the height of the tree by always merging the “smaller” tree into the larger one • By height: worst case optimization • By size (# of nodes): avg case optimization • Increases depth of fewer nodes ⇒ minimizes expected depth of a node
Up-Trees: Union Optimization • Called the balanced merging strategy • Why avoid linear growth of the tree’s height? • Find takes time proportional to the height of the tree in the worst case • Implementation: Each node has an additional Count field that is used, if the node is the root, to store the # of nodes in the tree (or height) • Running time: O(1) if we assume S and T are roots, else running time of Find(X)
Up-Trees: Union by Size • [Figure: the same two up-trees merged in two ways; (a) Incorrect, (b) Correct]
Up-Trees: Find • Find(X): which set does an element belong to? Follow the pointers up the tree to the root • But where is X? • LookUp(X): Get the location of the node X • If we assume we know the location of X ⇒ LookUp takes constant time • Then running time of Find(X): O(log n), assuming balanced merging • If we cannot directly access the node ⇒ LookUp takes logarithmic time • E.g. use a balanced tree to index nodes • Then running time of Find(X): still O(log n) • log time to LookUp node + log time to search up the up-tree
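The two-stage cost of Find (LookUp, then the walk up the tree) can be made concrete with a small sketch. It assumes elements are named by strings and indexed through `std::map`, which is typically a balanced tree; the class `NamedDisjointSets` and its members are hypothetical. The union shown is the plain one, so the logarithmic bound on the upward walk only holds if balanced merging is added.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical up-tree forest whose elements are named by strings.
class NamedDisjointSets {
public:
    // MakeSet: adds a new singleton set and returns its node index.
    int makeSet(const std::string& name) {
        assert(index_.count(name) == 0);
        int id = static_cast<int>(parent_.size());
        parent_.push_back(-1);          // -1 marks a root
        index_[name] = id;
        return id;
    }

    // LookUp: O(log n) through the balanced-tree index.
    int lookUp(const std::string& name) const {
        auto it = index_.find(name);
        assert(it != index_.end());
        return it->second;
    }

    // Find: LookUp the node, then walk up to the root.
    int find(const std::string& name) const {
        int i = lookUp(name);
        while (parent_[i] != -1) i = parent_[i];
        return i;
    }

    // Plain union (no balancing): the root of b's tree points to the root of a's tree.
    void unite(const std::string& a, const std::string& b) {
        int ra = find(a);
        int rb = find(b);
        if (ra != rb) parent_[rb] = ra;
    }

private:
    std::map<std::string, int> index_;  // name -> node index
    std::vector<int> parent_;           // parent_[i] = parent of node i
};
```

When the elements are already small integers, the index is unnecessary and the node location is simply the array position.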
Up-Trees: Find Optimization • Find(X) would take less time in a shallow, bushy tree than it would in a tall, skinny tree • Use balanced merging strategy to prevent the growth in the tree’s height ⇒ ensures height is at worst logarithmic in size • However, since any number of nodes can have the same parent, we can restructure our up-tree to make it bushier…
Up-Trees: Find Optimization • Path Compression: After doing a Find, make every node on that path to the root point directly to the root. • Any subsequent Find on any one of these nodes or their descendants will take less time since the node is now closer to the root. • Minor “problem” when combined with Union-by-height: compression can shrink a tree without updating its stored height • Treat height as an estimate ⇒ Union-by-rank • New running time of Find: almost constant time, O(log* n) amortized cost per operation
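Union-by-rank and path compression fit together as in the sketch below. It is a minimal illustration, not the course code: the class name `RankedSets` is hypothetical, elements are numbered 0 to n-1, and a root is marked by pointing to itself rather than by the 0 or -size convention used elsewhere in these slides. Find is written iteratively in two passes.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical forest over elements 0..n-1 with union by rank and path compression.
class RankedSets {
public:
    explicit RankedSets(std::size_t n) : parent_(n), rank_(n, 0) {
        for (std::size_t i = 0; i < n; ++i) parent_[i] = i;   // each root points to itself
    }

    std::size_t find(std::size_t x) {
        assert(x < parent_.size());
        std::size_t root = x;
        while (parent_[root] != root) root = parent_[root];   // pass 1: locate the root
        while (parent_[x] != root) {                          // pass 2: repoint the path
            std::size_t next = parent_[x];
            parent_[x] = root;
            x = next;
        }
        return root;
    }

    void unite(std::size_t a, std::size_t b) {
        std::size_t ra = find(a);
        std::size_t rb = find(b);
        if (ra == rb) return;
        if (rank_[ra] < rank_[rb]) std::swap(ra, rb);   // ra now has the larger rank
        parent_[rb] = ra;
        if (rank_[ra] == rank_[rb]) ++rank_[ra];        // rank is only an upper bound on height
    }

    // Exposed for illustration so the effect of compression can be observed.
    std::size_t parentOf(std::size_t x) const { return parent_[x]; }

private:
    std::vector<std::size_t> parent_;
    std::vector<unsigned> rank_;
};
```

The rank is never decreased when compression flattens a tree, which is exactly why it is treated as an estimate of the height rather than the height itself.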
Up-Trees: Array Implementation • If we assume all elements of the universe to be integers from 1 to N, then we can represent the Up-Trees as one Array of size N • a[i] = parent of i • a[root] = 0 or: a[root] = -size • Example (two up-trees with roots 4 and 9):
i:    1 2 3 4 5 6 7 8 9
a[i]: 2 4 9 0 2 4 2 6 0
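As a worked example, the parent array from this slide can be searched directly. The sketch assumes the values 2 4 9 0 2 4 2 6 0 belong to indices 1 to 9 with a[root] = 0, and pads slot 0 so that `std::vector` indices line up; the names `kParents` and `findInArray` are illustrative.

```cpp
#include <cassert>
#include <vector>

// a[i] = parent of i, a[root] = 0; slot 0 is padding so elements are 1..9.
const std::vector<int> kParents = {0, 2, 4, 9, 0, 2, 4, 2, 6, 0};

// Walks up from elem to the root of its up-tree and returns the root.
int findInArray(const std::vector<int>& a, int elem) {
    assert(elem >= 1 && elem < static_cast<int>(a.size()));
    while (a[elem] != 0)
        elem = a[elem];
    return elem;
}
```

Here `findInArray(kParents, 8)` follows 8 → 6 → 4 and returns 4, while `findInArray(kParents, 3)` follows 3 → 9 and returns 9, so elements 8 and 3 are in different sets.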
Simple Implementation DisjointSets declaration arrayBasicDisjoint.h
class DisjointSets
{
public:
    DisjointSets();
    DisjointSets(int numElements);
    void makeSet();
    // named setUnion because "union" is a reserved word in C++
    void setUnion(int elem1, int elem2);
    int find(int searchElem);
private:
    Array<int> sets;
};
Simple Implementation Find arraydisjoint.cpp
// Find returns the "name" of the set containing searchElem.
int DisjointSets::find(int searchElem)
{
    Assert((searchElem >= 1) && (searchElem <= sets.size()),
           "DisjointSets::find(...) : illegal argument for parameter");
    int index = searchElem;
    while (sets[index] != 0)
        index = sets[index];
    return index;
}
Simple Implementation Union arraydisjoint.cpp
// named setUnion because "union" is a reserved word in C++
void DisjointSets::setUnion(int first, int second)
{
    if ((first < 1) || (first > sets.size()))
        Warn("DisjointSets::setUnion(...) : illegal argument for first parameter");
    else if ((second < 1) || (second > sets.size()))
        Warn("DisjointSets::setUnion(...) : illegal argument for second parameter");
    else
    {
        int rootFirst = find(first);
        int rootSecond = find(second);
        if (rootFirst != rootSecond)
            sets[rootSecond] = rootFirst;
    }
}
Efficient Implementation Find (with path compression) arrayByHeightDisjoint.cpp
int DisjointSets::find(int elem)
{
    int index = elem;
    if (sets[index] <= 0)   // root node
        return index;
    else                    // sets[index] > 0; not root node
    {
        sets[index] = find(sets[index]);
        return sets[index];
    }
}
Efficient Implementation Union (by size) arrayBySizeUnion.cpp
// named setUnion because "union" is a reserved word in C++;
// a root stores the negated size of its tree
void DisjointSets::setUnion(int first, int second)
{
    if ((first < 1) || (first > sets.size()))
        Warn("DisjointSets::setUnion(...) : illegal argument for first parameter");
    else if ((second < 1) || (second > sets.size()))
        Warn("DisjointSets::setUnion(...) : illegal argument for second parameter");
    else
    {
        int rootFirst = find(first);
        int rootSecond = find(second);
        if (rootFirst != rootSecond)   // different sets
        {
            // size of first set >= size of second set
            if (sets[rootFirst] <= sets[rootSecond])
            {
                sets[rootFirst] = sets[rootFirst] + sets[rootSecond];
                sets[rootSecond] = rootFirst;
            }
            else   // size of second set > size of first set
            {
                sets[rootSecond] = sets[rootSecond] + sets[rootFirst];
                sets[rootFirst] = rootSecond;
            }
        }
    }
}
Draw the array representation of the following disjoint sets (currently depicted as up-trees). 3 7 1 2 6 4 8 5 • What set “name” does Find(4) return?
Draw the up-tree representation of the disjoint set shown below.
i:    1 2 3 4 5 6 7 8 9
a[i]: 9 7 6 0 1 8 1 4 0
• Change the above representation to support union by size. • Draw the new tree & array representation after a Union(5,6) by size, NO path compression. • Draw the new tree & array after a Union(5,6) followed by a Find(2), using path compression
Summary • Priority Queues • Operations on Priority Queues • Heap as a Priority Queue • Array Implementation • FindMin: O(1), Insert: O(lg n), DeleteMin: O(lg n) • More about heaps • BuildHeap: O(n), HeapSort: O(n lg n) • Disjoint Sets • Operations on Disjoint Sets • Up-Tree as a Disjoint Set • Array Implementation • Union: O(log* n), Find: O(log* n), amortized, with balanced merging and path compression
Practice Problems • You have the following two standard node classes:
class ListNode
{
public:
    int element;
    ListNode* next;
    ListNode* prev;
};
class TreeNode
{
public:
    int element;
    TreeNode* left;
    TreeNode* right;
};
Write a function treeToList() which takes as a parameter a pointer to a TreeNode which is the root of a binary search tree. This function should return a ListNode pointer to a list of all the elements in the binary search tree ordered from lowest to highest.