
CS 221

CS 221. Analysis of Algorithms: Ordered Dictionaries and Search Trees. Portions of these slides come from Michael Goodrich and Roberto Tamassia, Algorithm Design: Foundations, Analysis and Internet Examples, 2002, John Wiley and Sons.

eliana-odom


  1. CS 221 Analysis of Algorithms Ordered Dictionaries and Search Trees

  2. Portions of these slides come from • Michael Goodrich and Roberto Tamassia, Algorithm Design: Foundations, Analysis and Internet Examples, 2002, John Wiley and Sons • its authors, Michael Goodrich and Roberto Tamassia • the book's publisher, John Wiley & Sons • and www.wikipedia.org

  3. Reading material • Goodrich and Tamassia, 2002 • Chapter 2, section 2.5, pages 114-137 • see also section 2.6 • Chapter 3, section 3.1, pages 141-151 • Wikipedia: • http://en.wikipedia.org/wiki/AVL_trees

  4. In the previous episode… • …we defined a data structure which we called a dictionary. It was… • a container to hold multiple objects or, in Goodrich and Tamassia’s terminology, “items” • each item = a (key, element) pair • element = a “piece” of data • think: name, address, phone number • key = a value we associate with the element to help us find, retrieve, or delete that element • think: RDBMS autoincrement key, student ID#

  5. Dictionaries • Up until now we have looked at • unordered dictionaries • containers for (k,e) pairs, but… • in no particular order • Logfiles • Hash Tables

  6. Dictionaries • A terminology note • for purposes of our discussion: • A linear unordered dictionary = logfile • A linear ordered dictionary = lookup table
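As a reminder of what the unordered case looks like, here is a minimal sketch of a logfile. The class name `LogFile`, the snake_case method names, and the use of `None` in place of NO_SUCH_KEY are assumptions made for illustration; they are not taken from the slides or the textbook.

```python
class LogFile:
    """Unordered dictionary: (key, element) items kept in arrival order."""

    def __init__(self):
        self._items = []

    def size(self):
        return len(self._items)

    def is_empty(self):
        return not self._items

    def insert_item(self, key, element):
        # O(1): there is no order to maintain, so just append.
        self._items.append((key, element))

    def find_element(self, key):
        # O(n): in the worst case every item is inspected.
        for k, e in self._items:
            if k == key:
                return e
        return None  # stand-in for NO_SUCH_KEY

    def remove_element(self, key):
        # O(n): linear scan, then delete the first match.
        for i, (k, e) in enumerate(self._items):
            if k == key:
                del self._items[i]
                return e
        return None


log = LogFile()
log.insert_item(1002, "Ada")
log.insert_item(1001, "Grace")
print(log.find_element(1001))
```

Insertion is cheap here, but every lookup is a linear scan, which is the cost the ordered structures on the following slides try to avoid.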

  7. Game Time • Twenty Questions • One person thinks of an object that can be any person, place or thing… • and does not disclose the selected object until it is specifically identified by the other players… • All other players take turns asking Yes/No questions in an attempt to identify the mystery object

  8. Game Time • Twenty Questions • An efficient problem solving strategy is to ask questions for which the answers will optimally narrow the size of the problem space (possible solutions) • for example, • Q: Is it a person? • A: Yes ….we just eliminated all places and non-human objects from the solution set

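  9. Game Time • Twenty Questions • Size of problem? • N = ??? (large, effectively unbounded) • The Yes/No attack makes this a binary search problem… • So, what size of problem space can we effectively search? • 2^20 (about one million objects), because each answer can at best halve the set of candidates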

  10. Game Time • Twenty Questions • Something to think about… • N is conceivably much larger than 2^20 • So, how is it that we can usually solve this problem in 20 steps or fewer… • i.e. correctly identify the mystery object?
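The arithmetic behind the 2^20 figure can be checked with a short sketch. The function name `questions_needed` is hypothetical; it assumes every question splits the remaining candidates as evenly as possible.

```python
def questions_needed(n):
    """Fewest yes/no questions that always isolate one of n candidates.

    Each answer can at best halve the candidates, so the result is
    ceil(log2(n)), computed here with integer arithmetic only.
    """
    if n < 1:
        raise ValueError("need at least one candidate")
    return (n - 1).bit_length()


for n in (2, 1000, 2**20, 2**20 + 1):
    print(n, questions_needed(n))
```

Twenty questions are enough for up to 2^20 candidates; one more candidate pushes the worst case to twenty-one.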

  11. Dictionaries • Ordered Dictionaries • suppose the items in a dictionary are ordered (sorted) by key • for example, low to high • Would that make a difference to the running time of • size() • isEmpty() • findElement() • insertItem() • removeItem()?

  12. Dictionaries • Ordered Dictionaries • suppose we implement an ordered dictionary as a linear data structure or, more specifically, a vector • items are stored in the vector in key order • we gain considerable efficiency because we can visit D[x], where x is a rank, in O(1) time • Could we achieve the same findElement() time if the ordered dictionary were implemented as a linked list?

  13. Binary Search • Binary search performs operation findElement(k) on a dictionary implemented by means of an array-based sequence, sorted by key • similar to the high-low game • at each step, the number of candidate items is halved • terminates after O(log n) steps • Example: findElement(7) on keys 0 1 3 4 5 7 8 9 11 14 16 18 19 (ranks 0 to 12; l = low, h = high, m = middle) • step 1: l=0, h=12, m=6 (key 8); 7 < 8, so h=5 • step 2: l=0, h=5, m=2 (key 3); 7 > 3, so l=3 • step 3: l=3, h=5, m=4 (key 5); 7 > 5, so l=5 • step 4: l=m=h=5 (key 7); found
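The search described on this slide can be sketched as follows. This is an iterative version under stated assumptions: the table is a Python list of (key, element) pairs sorted by key, and `None` stands in for NO_SUCH_KEY. The function name `find_element` is a snake_case rendering of findElement, not code from the textbook.

```python
NO_SUCH_KEY = None  # assumed sentinel for "key not present"


def find_element(table, k):
    """Binary search over a list of (key, element) pairs sorted by key."""
    low, high = 0, len(table) - 1
    while low <= high:
        mid = (low + high) // 2
        key, element = table[mid]
        if k == key:
            return element
        if k < key:
            high = mid - 1  # discard the upper half
        else:
            low = mid + 1   # discard the lower half
    return NO_SUCH_KEY


keys = [0, 1, 3, 4, 5, 7, 8, 9, 11, 14, 16, 18, 19]
table = [(key, "element-%d" % key) for key in keys]
print(find_element(table, 7))
print(find_element(table, 6))
```

Each pass through the loop discards half of the remaining ranks, which is where the O(log n) bound comes from.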

  14. Binary Search • Lookup tables are not very efficient for dynamic data (lots of insertItem and removeElement operations), because each update may shift O(n) items to keep the vector sorted • Lookup tables are efficient for dictionaries where the predominant access is findElement, with relatively few inserts or removes • credit card authorizations, code translation tables,…
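To make the trade-off concrete, here is a minimal lookup-table sketch built on Python's standard `bisect` module. The class name `LookupTable` and its method names are assumptions for illustration; `None` again stands in for NO_SUCH_KEY.

```python
import bisect


class LookupTable:
    """Ordered dictionary stored as parallel lists sorted by key."""

    def __init__(self):
        self._keys = []
        self._elements = []

    def find_element(self, key):
        # O(log n): binary search for the leftmost position of key.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._elements[i]
        return None

    def insert_item(self, key, element):
        # O(log n) to locate the rank, then O(n) to shift later items.
        i = bisect.bisect_left(self._keys, key)
        self._keys.insert(i, key)
        self._elements.insert(i, element)

    def remove_element(self, key):
        # O(log n) to locate the rank, then O(n) to close the gap.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._keys.pop(i)
            return self._elements.pop(i)
        return None


codes = LookupTable()
for code, meaning in [(404, "Not Found"), (200, "OK"), (301, "Moved")]:
    codes.insert_item(code, meaning)
print(codes.find_element(301))
```

Lookups stay logarithmic, while `list.insert` and `list.pop` in the middle of the list are the linear-time steps that make frequent updates expensive.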

  15. Binary Search Tree • Binary tree for holding (k,e) items, such that… • each internal node v stores an element e with key k • keys stored in the left subtree of v are <= key(v) • keys stored in the right subtree of v are >= key(v) • external nodes store no elements… • they are only placeholders (NULL_NODE)

  16. Binary Search Tree • Every key in the left subtree of a node is no greater than the node's key • Every key in the right subtree of a node is no less than the node's key • External (leaf) nodes hold no items • Example keys (figure): 58, 31, 90, 62, 25, 42, 12, 36, 75
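The ordering property can be checked mechanically. The sketch below is illustrative only: the `Node` class, the `insert` helper, and the use of `None` for external nodes are assumptions, and the tree shape is simply whatever results from inserting the example keys in the order listed, which may or may not match the original figure.

```python
import math


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left    # None plays the role of an external node
        self.right = right


def insert(node, key):
    """Plain (unbalanced) BST insertion; returns the subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def is_bst(node, low=-math.inf, high=math.inf):
    """True if every key lies within the bounds set by its ancestors."""
    if node is None:
        return True
    if not (low <= node.key <= high):
        return False
    return is_bst(node.left, low, node.key) and is_bst(node.right, node.key, high)


root = None
for key in [58, 31, 90, 62, 25, 42, 12, 36, 75]:
    root = insert(root, key)
print(is_bst(root))
```

Passing bounds down the tree matters: comparing each node only with its immediate children would miss a key that violates the ordering with respect to a more distant ancestor.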

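  17. Search • Algorithm findElement(k, v) • if T.isExternal(v) • return NO_SUCH_KEY • if k < key(v) • return findElement(k, T.leftChild(v)) • else if k = key(v) • return element(v) • else { k > key(v) } • return findElement(k, T.rightChild(v)) • Example (figure): findElement(4) on the tree with root 6, children 2 and 9, and grandchildren 1, 4 and 8; the search goes left at 6 (4 < 6), right at 2 (4 > 2), and stops at 4 (=)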
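A direct transcription of this pseudocode into runnable form might look like the sketch below. It assumes a hypothetical `Node` class in which `None` replaces the external placeholder nodes, so `T.isExternal(v)` becomes a test for `None`.

```python
NO_SUCH_KEY = None  # assumed sentinel for a failed search


class Node:
    def __init__(self, key, element, left=None, right=None):
        self.key = key
        self.element = element
        self.left = left    # None stands for an external node
        self.right = right


def find_element(k, v):
    """Recursive BST search starting at node v."""
    if v is None:                       # reached an external node
        return NO_SUCH_KEY
    if k < v.key:
        return find_element(k, v.left)
    if k == v.key:
        return v.element
    return find_element(k, v.right)     # k > v.key


# Tree with root 6, children 2 and 9, and grandchildren 1, 4 and 8.
root = Node(6, "six",
            Node(2, "two", Node(1, "one"), Node(4, "four")),
            Node(9, "nine", Node(8, "eight")))
print(find_element(4, root))
print(find_element(7, root))
```

The search visits one node per level, so its cost is proportional to the height of the tree.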

  18. removeElement(k) – simple case • To perform operation removeElement(k), we search for key k • Assume key k is in the tree, and let v be the node storing k • If node v has a leaf child w, we remove v and w from the tree with operation removeAboveExternal(w) • Example: remove 4 • before (figure): keys 6, 2, 9, 1, 4, 8, 5, where v stores 4 and w is its leaf child • after: keys 6, 2, 9, 1, 5, 8, with 5 taking the place of 4

  19. removeElement(k) – more complicated case • We consider the case where the key k to be removed is stored at a node v whose children are both internal • we find the internal node w that follows v in an inorder traversal • we copy key(w) into node v • we remove node w and its left child z (which must be a leaf) by means of operation removeAboveExternal(z) • Example: remove 3 • before (figure): keys 1, 3, 2, 8, 6, 9, 5, where v stores 3, w stores 5 and z is the left (leaf) child of w • after: keys 1, 5, 2, 8, 6, 9, with 5 now stored at v
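Both removal cases can be sketched in one function. This is an illustrative version, not the textbook's removeAboveExternal: it assumes a hypothetical `Node` class holding keys only, with `None` in place of external nodes, so "v has a leaf child" becomes "one of v's child links is `None`".

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None    # None stands for an external node
        self.right = None


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def remove(node, k):
    """Remove key k from the subtree rooted at node; return the new root."""
    if node is None:
        return None
    if k < node.key:
        node.left = remove(node.left, k)
    elif k > node.key:
        node.right = remove(node.right, k)
    elif node.left is None:      # simple case: splice the node out
        return node.right
    elif node.right is None:     # simple case, mirrored
        return node.left
    else:
        # Both children internal: find the inorder successor w,
        # copy its key into this node, then remove w from the right subtree.
        w = node.right
        while w.left is not None:
            w = w.left
        node.key = w.key
        node.right = remove(node.right, w.key)
    return node


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


root = None
for key in [1, 3, 2, 8, 6, 9, 5]:
    root = insert(root, key)
root = remove(root, 3)
print(inorder(root))
```

The successor w never has an internal left child, so removing it always falls into the simple case.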

  20. Binary Search Tree Performance • Consider a dictionary with n items implemented by means of a binary search tree of height h • the space used is O(n) • methods findElement, insertItem and removeElement take O(h) time • The height h is O(n) in the worst case and O(log n) in the best case
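The gap between the worst and best case is easy to observe. The sketch below inserts the same 15 keys in two different orders into a plain, unbalanced BST; the `Node`, `insert`, `height` and `build` names are assumptions for illustration, and height is counted as the longest path from a node down to an external (`None`) node.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def height(node):
    """Longest path from node down to an external node (None)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


sorted_order = list(range(1, 16))
level_order = [8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15]

print(height(build(sorted_order)))   # degenerates into a chain
print(height(build(level_order)))    # stays as shallow as possible
```

Inserting already-sorted keys produces a chain whose height equals n, while the second order yields a tree whose height is about log2(n + 1).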

  21. Balanced Trees • When a path in a tree gets very long relative to other paths in the tree… • the tree is unbalanced • In fact, in its extreme form an unbalanced tree is a linear list. • So, to achieve optimal performance… • you need to keep the tree balanced

  22. AVL Trees • we want to maintain a balanced tree • recall: • height of a node v = length of the longest path from v to an external node • We want to maintain the principle that • for every internal node v, the heights of the children of v differ by at most 1 • Height-Balance Property

  23. AVL Trees • h(right_subtree) - h(left_subtree) = Balance Factor • balanced: |h(right_subtree) - h(left_subtree)| ∈ {0,1} • Tree with Balance Factor ∉ {-1,0,1} • Unbalanced Tree • Must be rebalanced • Balance Factor exists for every node v • except (trivially) external nodes

  24. AVL Trees • If Balance Factor = -1, 0 or 1 • tree is balanced • does not need to be restructured • If Balance Factor = -2 or 2 • tree is unbalanced • needs to be restructured • restructuring is done by a process called rotation
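The balance factor defined on these slides can be computed directly from subtree heights. The following is a minimal sketch with hypothetical names (`Node`, `height`, `balance_factor`, `is_height_balanced`); it recomputes heights on every call for clarity, whereas a practical AVL tree would cache the height in each node.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left    # None stands for an external node
        self.right = right


def height(node):
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    """h(right_subtree) - h(left_subtree), as defined on the slides."""
    return height(node.right) - height(node.left)


def is_height_balanced(node):
    """True if every internal node has a balance factor of -1, 0 or 1."""
    if node is None:
        return True
    return (abs(balance_factor(node)) <= 1
            and is_height_balanced(node.left)
            and is_height_balanced(node.right))


chain = Node(1, right=Node(2, right=Node(3)))   # leans entirely to the right
bushy = Node(2, Node(1), Node(3))

print(balance_factor(chain), is_height_balanced(chain))
print(balance_factor(bushy), is_height_balanced(bushy))
```

The right-leaning chain has a balance factor of 2 at its root and so needs restructuring; the three-node tree with the same keys is balanced.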

  25. AVL Trees • Rotation • Four types, in two symmetric pairs • Left Single Rotation • Right Single Rotation • Left Double Rotation • Right Double Rotation • Since left and right are symmetric, we only consider single and double rotation

  26. AVL Trees • Rotation • if BF = 2

  27. AVL Trees • Binary search trees that maintain the Height-Balance Property are called • AVL trees • the name comes from the inventors • G.M. Adelson-Velsky and E.M. Landis, in a paper entitled “An algorithm for the organization of information”

  28. AVL Trees Unbalanced Tree Balanced Tree from:http://en.wikipedia.org/wiki/AVL_trees

  29. AVL Trees • h(right_subtree) - h(left_subtree) = Balance Factor (BF) • If BF ∈ {-1,0,1} then the tree is balanced (do nothing) • If BF ∉ {-1,0,1} then the tree is unbalanced (must be restructured) • Restructuring is done by rotation from: http://en.wikipedia.org/wiki/AVL_trees

  30. AVL Trees • Rotation • four cases, in two symmetric pairs • left single rotation • right single rotation • left double rotation • right double rotation • since they are symmetric, we only examine single and double from: http://en.wikipedia.org/wiki/AVL_trees

  31. AVL Trees - Insertion • Rotation • If BF > 2, the imbalance occurred further down in the right subtree • Recursively walk down the subtree until |BF| = 2 • If BF < -2, the imbalance occurred further down in the left subtree • Recursively walk down the subtree until |BF| = 2 from: http://en.wikipedia.org/wiki/AVL_trees

  32. AVL Trees - Insertion • Rotation • If BF = 2, the imbalance occurred in the right subtree • Recursively walk down the subtree until |BF| = 2 • If BF = -2, the imbalance occurred in the left subtree • Recursively walk down the subtree until |BF| = 2 from: http://en.wikipedia.org/wiki/AVL_trees

  33. AVL Trees - Insertion • Rotation • If BF = 2, the imbalance occurred in the right subtree • Step down into that subtree to find where the insertion occurred • If BF = -2, the imbalance occurred in the left subtree • Step down into that subtree to find where the insertion occurred from: http://en.wikipedia.org/wiki/AVL_trees

  34. AVL Trees - Insertion • Rotation (shown for a node with BF = 2; the BF = -2 case is the mirror image) • If BF at the right subtree = 1 • insertion occurred on its right side • single rotation required • If BF at the right subtree = -1 • insertion occurred on its left side • double rotation required from: http://en.wikipedia.org/wiki/AVL_trees

  35. AVL Trees - Insertion • Rotation • See • http://en.wikipedia.org/wiki/AVL_trees from:http://en.wikipedia.org/wiki/AVL_trees

  36. AVL Trees - Insertion • Performance • rotations – O(1) • Recall h(T) maintained at O(log n) • insertItem – O(log n) • balanced tree - priceless from:http://en.wikipedia.org/wiki/AVL_trees
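The insertion and rotation rules from the preceding slides can be put together in a compact sketch. This is one common way to write AVL insertion, not necessarily the restructuring routine used in the textbook. All names (`Node`, `rotate_left`, `rotate_right`, `insert`) are assumptions, each node caches its own height, and the balance factor is h(right) - h(left) as on the slides.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1   # a node with two external children has height 1


def height(node):
    return node.height if node is not None else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    return height(node.right) - height(node.left)


def rotate_left(x):
    """Single rotation: the right child y of x becomes the subtree root."""
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def rotate_right(y):
    """Mirror image of rotate_left."""
    x = y.left
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x


def insert(node, key):
    """Insert key and rebalance on the way back up; returns the new root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    update(node)
    bf = balance_factor(node)
    if bf == 2:                                   # right-heavy
        if balance_factor(node.right) < 0:        # inner grandchild: double rotation
            node.right = rotate_right(node.right)
        return rotate_left(node)
    if bf == -2:                                  # left-heavy
        if balance_factor(node.left) > 0:         # inner grandchild: double rotation
            node.left = rotate_left(node.left)
        return rotate_right(node)
    return node


root = None
for key in range(1, 16):       # sorted input, the worst case for a plain BST
    root = insert(root, key)
print(root.key, height(root))
```

Each rotation touches a constant number of links, and rebalancing happens along a single root-to-leaf path, which is consistent with the O(log n) bound for insertItem stated on the slide.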

  37. Bounded-depth Search Trees • Search efficiency in a tree is related to the depth of the tree • We can use depth-bounded trees to create ordered dictionaries whose search and update operations run in O(log n) time

  38. Multi-way Search Trees • Remember Binary Search Trees • any node v can have at most 2 children • what if we get rid of that rule? • Suppose a node could have multiple children (>2) • Terminology: if v has d children, v is a d-node

  39. Multi-way Search Trees • Multi-way Search Tree - T • Each Internal node must have at least two children -- internal node is d-node with d ≥ 2 • Internal nodes store collections of items (k,e) • Each d-node stores d-1 items • Special keys k0 = -∞ and kd = ∞ • External nodes only placeholders
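Searching such a tree generalizes the binary case: at each d-node, locate the key among the d-1 stored keys or pick the one child whose key range could contain it. The sketch below is illustrative only. The class name `MultiWayNode`, the parallel-list layout, and the use of `None` for external nodes and for NO_SUCH_KEY are assumptions; the sentinel keys k0 = -∞ and kd = ∞ are handled implicitly by `bisect_left` rather than stored.

```python
import bisect


class MultiWayNode:
    """A d-node: d-1 sorted keys with elements, and d child links."""

    def __init__(self, items, children=None):
        self.keys = [k for k, _ in items]
        self.elements = [e for _, e in items]
        # With no children given, all d links are external (None).
        self.children = children if children is not None else [None] * (len(items) + 1)


def find_element(node, k):
    while node is not None:
        # i = number of keys in this node that are smaller than k.
        i = bisect.bisect_left(node.keys, k)
        if i < len(node.keys) and node.keys[i] == k:
            return node.elements[i]
        # k lies strictly between keys[i-1] and keys[i]: descend into child i.
        node = node.children[i]
    return None   # reached an external node


root = MultiWayNode(
    [(10, "ten"), (20, "twenty")],
    children=[
        MultiWayNode([(3, "three"), (7, "seven")]),
        MultiWayNode([(15, "fifteen")]),
        MultiWayNode([(25, "twenty-five"), (30, "thirty"), (35, "thirty-five")]),
    ],
)
print(find_element(root, 15))
print(find_element(root, 8))
```

Here the root is a 3-node and its children are a 3-node, a 2-node and a 4-node; every d-node stores d-1 items, as required by the definition.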
