Outline
• Scapegoat Trees (O(log n) amortized time)
• 2-4 Trees (O(log n) worst-case time)
• Red-Black Trees (O(log n) worst-case time)
Review: Skiplists and Treaps
• So far, we have seen treaps and skiplists
• Randomized structures
• Insert/delete/search in O(log n) expected time
• The expectation depends on random choices made by the data structure:
• Coin tosses made by a skiplist
• Random priorities assigned by a treap
Scapegoat trees
• Deterministic data structure
• Lazy data structure
• Only does work when search paths get too long
• Search in O(log n) worst-case time
• Insert/delete in O(log n) amortized time
• Starting with an empty scapegoat tree, a sequence of m insertions and deletions takes O(m log n) time
Scapegoat philosophy
• We follow a simple strategy: if the tree is not optimal, rebuild
• Is this a good binary search tree?
[Figure: a 17-node BST with 15 at the root, 16 as its right child, and the keys 0-14 forming a perfectly balanced subtree rooted at 7 on the left]
• It has 17 nodes and 5 levels
• Any binary tree with 17 nodes has at least 5 levels (a binary tree with 4 levels has at most 2^4 - 1 = 15 nodes)
• This is an "optimal" binary search tree
Scapegoat philosophy
• Rebuilding the tree costs O(n) time
• We cannot do it too often if we want to keep O(log n) amortized time per operation
• How do we know when we need to rebuild the tree?
• Scapegoat trees keep two counters:
• n: the number of items in the tree (its size)
• q: an overestimate of n
• We maintain the following two invariants:
• q/2 ≤ n ≤ q
• No node has depth greater than log_{3/2} q
Search and Delete
• How can we perform a search in a Scapegoat tree?
• Exactly as in an ordinary binary search tree; the depth invariant guarantees the search path has length O(log n)
• How can we delete a value x from a Scapegoat tree?
• Run the standard deletion algorithm for binary search trees
• Decrement n
• If n < q/2, then rebuild the entire tree and set q = n
• How can we insert a value x into a Scapegoat tree?
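Before turning to insertion, here is a minimal Python sketch of find and remove as just described. It is only an illustration: the Node class and the helpers _bst_remove (ordinary BST deletion) and _rebuild (rebuild a subtree into a perfectly balanced one) are assumptions introduced for the example, not something given on the slides.

```python
class Node:
    def __init__(self, x):
        self.x = x
        self.left = self.right = self.parent = None

class ScapegoatTree:
    def __init__(self):
        self.root = None
        self.n = 0   # number of items currently stored
        self.q = 0   # overestimate of n (q/2 <= n <= q)

    def find(self, x):
        # Plain BST search; the depth invariant makes this O(log n) worst case.
        u = self.root
        while u is not None:
            if x < u.x:
                u = u.left
            elif x > u.x:
                u = u.right
            else:
                return u
        return None

    def remove(self, x):
        # Standard BST deletion, then rebuild everything if n drops below q/2.
        if self._bst_remove(x):            # assumed helper: ordinary BST delete
            self.n -= 1
            if 2 * self.n < self.q:
                self._rebuild(self.root)   # assumed helper: rebuild balanced
                self.q = self.n
            return True
        return False
```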
Insert
• How can we insert a value x into a Scapegoat tree?
• To insert the value x into a Scapegoat tree:
• Create a node u and insert it in the normal way
• Increment n and q
• If the depth of u is greater than log_{3/2} q, then:
• Walk up towards the root from u until reaching a node w with size(w) > (2/3) size(w.parent)
• Rebuild the subtree rooted at w.parent
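A matching sketch of insertion, continuing the hypothetical ScapegoatTree class above; _bst_insert (plain BST insertion that returns the new leaf's depth, or -1 for a duplicate), _size and _rebuild are again assumed helpers.

```python
import math

# Method of the ScapegoatTree sketch above.
def add(self, x):
    u = Node(x)
    depth = self._bst_insert(u)           # assumed helper
    if depth < 0:                         # duplicate key: nothing was inserted
        return False
    self.n += 1
    self.q += 1
    if depth > math.log(self.q, 1.5):     # depth invariant violated
        w = u
        # Walk up until size(w) > (2/3) * size(w.parent).
        while 3 * self._size(w) <= 2 * self._size(w.parent):
            w = w.parent
        self._rebuild(w.parent)           # rebuild the subtree rooted at w.parent
    return True
```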
Inserting into a Scapegoat tree (easy case)
[Figure: the tree holds the keys 0-9 (n = q = 10); the new value u = 3.5 is inserted as a leaf at depth 4, after which n = q = 11]
• Create a node u and insert it in the normal way
• Increment n and q
• depth(u) = 4 ≤ log_{3/2} q = 5.913, so nothing needs to be rebuilt
Inserting into a Scapegoat tree (bad case)
[Figure: a badly unbalanced tree holding the keys 0-9; inserting u = 3.5 gives n = q = 11 and places u at depth 6]
• d(u) = 6 > log_{3/2} q = 5.913, so we walk up from u looking for a node w with size(w) > (2/3) size(w.parent):
• size(3.5) = 1 ≤ (2/3)·2 = 1.33
• size(3) = 2 ≤ (2/3)·3 = 2
• size(4) = 3 ≤ (2/3)·6 = 4
• size(2) = 6 > (2/3)·7 = 4.67, so we stop: w.parent (the node holding 5, of size 7) is the scapegoat
Inserting into a Scapegoat tree (bad case)
[Figure: after rebuilding the scapegoat's subtree, n = q = 11 and the rebuilt subtree is perfectly balanced with 3 at its root]
• How can we be sure that a scapegoat node always exists?
Why is there always a scapegoat?
• Lemma: if d(u) > log_{3/2} q then there exists a scapegoat node
• Proof by contradiction:
• Assume (for contradiction) that we do not find a scapegoat node
• Then size(w) ≤ (2/3) size(w.parent) for all nodes w on the path to u
• So the size of the node at depth i on this path is at most n(2/3)^i
• But d > log_{3/2} q ≥ log_{3/2} n, so
• size(u) ≤ n(2/3)^d < n(2/3)^{log_{3/2} n} = n/n = 1
• Contradiction! (since size(u) = 1) So there must be a scapegoat node
Summary
• So far, we know:
• Insert and delete maintain the invariants:
• the depth of any node is at most log_{3/2} q
• q ≤ 2n
• So the depth of any node is at most log_{3/2} 2n ≤ 2 + log_{3/2} n
• So we can search in a scapegoat tree in O(log n) time
• Some issues still to resolve:
• How do we keep track of size(w) for each node w?
• How much time is spent rebuilding subtrees during insertion and deletion?
Keeping track of the size
• There are two possible solutions:
• Solution 1: each node keeps an extra counter storing the size of its subtree
• During insertion, each node on the path to u gets its counter incremented
• During deletion, each node on the path to u gets its counter decremented
• We recompute the sizes bottom-up during a rebuild
• Solution 2: nodes do not store their sizes at all; we compute them only when they are needed
(Not) keeping track of the size
• We only need size(w) while looking for a scapegoat
• Knowing size(w), we can compute size(w.parent) = size(w) + size(sibling(w)) + 1 by traversing the subtree rooted at w's sibling
• So, in O(size(v)) time, we know every size on the path up to the scapegoat node v
• But we do O(size(v)) work when we rebuild v anyway, so this doesn't add anything to the cost of rebuilding
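A sketch of this second solution, refining the walk-up loop from the earlier add() sketch: sizes are computed on demand, and each parent's size is obtained from the child's size by traversing only the sibling's subtree. _sibling is an assumed helper returning the other child of u.parent.

```python
def _size(self, u):
    # Count the subtree rooted at u by traversal; no size field is stored.
    if u is None:
        return 0
    return 1 + self._size(u.left) + self._size(u.right)

def _find_scapegoat(self, u):
    # Walk up from the new leaf u.  The size of the current node is already
    # known, so only the sibling's subtree has to be traversed at each step.
    sz = 1                                        # size of the subtree rooted at u
    while u.parent is not None:
        parent_sz = 1 + sz + self._size(self._sibling(u))
        if 3 * sz > 2 * parent_sz:                # size(u) > (2/3) size(u.parent)
            return u.parent                       # rebuild the subtree rooted here
        u, sz = u.parent, parent_sz
    return u                                      # not reached if the depth invariant was violated
```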
Analysis of deletion
• When deleting, if n < q/2, then we rebuild the whole tree
• This takes O(n) time
• If n < q/2, then since the last time q was set equal to n we have done at least q - n > n/2 deletions
• So the amortized (average) cost of rebuilding (due to deletions) is O(1) per deletion
Analysis of insertion
• If no rebuild is necessary, the cost of an insertion is O(log n)
• After rebuilding the subtree rooted at a node v, both of v's children have (almost) the same size
• So if the subtree rooted at v has size m when it triggers the next rebuild, at least (roughly) m/3 insertions must have happened below v since the previous rebuild
• Rebuilding that subtree costs O(m) operations, which charges O(1) amortized to each of those insertions
• Thus the cost of an insertion is O(log n) amortized
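A compact way of writing the same charging argument (a sketch; the constant 1/3 comes from the 2/3 threshold used to pick the scapegoat, and exact additive constants are glossed over):

```latex
\begin{aligned}
\text{cost of rebuilding at } v &= O(\mathrm{size}(v)) \\
\#\{\text{insertions below } v \text{ since its last rebuild}\} &\gtrsim \tfrac{1}{3}\,\mathrm{size}(v) \\
\Rightarrow\ \text{amortized rebuilding cost per insertion} &= O(1) \\
\Rightarrow\ \text{amortized cost of an insertion} &= O(\log n) + O(1) = O(\log n)
\end{aligned}
```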
Scapegoat trees summary
• Theorem:
• The cost of a search in a scapegoat tree is O(log n) in the worst case
• The cost of an insertion or deletion in a scapegoat tree is O(log n) amortized time per operation
• Scapegoat trees often work even better than expected
• If we get lucky, then no rebuilding is required
Review: Maintaining Sorted Sets
• We have seen the following data structures for implementing a SortedSet:
• Skiplists: find(x)/add(x)/remove(x) in O(log n) expected time per operation
• Treaps: find(x)/add(x)/remove(x) in O(log n) expected time per operation
• Scapegoat trees: find(x) in O(log n) worst-case time per operation, add(x)/remove(x) in O(log n) amortized time per operation
Review: Maintaining Sorted Sets
• No data structures course would be complete without covering:
• 2-4 trees: find(x)/add(x)/remove(x) in O(log n) worst-case time per operation
• Red-black trees: find(x)/add(x)/remove(x) in O(log n) worst-case time per operation
The height of 2-4 Trees
• A 2-4 tree is a tree in which:
• Each internal node has 2, 3, or 4 children
• All the leaves are at the same level
The height of 2-4 Trees
• Lemma: A 2-4 tree of height h ≥ 0 has at least 2^h leaves
• Proof: The number of nodes at level i is at least 2^i (at least 2^0 = 1, 2^1 = 2, 2^2 = 4, 2^3 = 8, ... nodes on levels 0, 1, 2, 3, ...)
• Corollary: A 2-4 tree with n > 0 leaves has height at most log_2 n
• Proof: n ≥ 2^h ⟺ log_2 n ≥ h
Add a leaf to a 2-4 Tree
• To add a leaf w as a child of a node u in a 2-4 tree:
• Add w as a child of u
• While u has 5 children do:
• Split u into two nodes with 2 and 3 children, respectively, and make them children of u.parent
• Set u = u.parent
• If the root was split, create a new root with 2 children
• This runs in O(h) = O(log n) time
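A minimal Python sketch of this splitting loop. The Node24 class, which stores only a children list and a parent pointer, is an assumption made for the example; the separating keys and the exact position at which the new sibling is inserted among the parent's children are omitted for brevity.

```python
class Node24:
    def __init__(self, children=None, parent=None):
        self.children = children if children is not None else []
        self.parent = parent

def add_leaf(root, u, w):
    # Add leaf w as a child of u, then split over-full ancestors; returns the
    # (possibly new) root.
    u.children.append(w)        # simplified: real code keeps children in key order
    w.parent = u
    while len(u.children) == 5:
        # Split u into two nodes with 2 and 3 children, respectively.
        v = Node24(u.children[2:], u.parent)
        for c in v.children:
            c.parent = v
        u.children = u.children[:2]
        if u.parent is None:
            root = Node24([u, v])           # the root was split: make a new root
            u.parent = v.parent = root
            break
        u.parent.children.append(v)         # simplified: should sit right after u
        u = u.parent
    return root
```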
Deleting a leaf from a 2-4 Tree
• To delete a leaf w from a 2-4 tree:
• Remove w from its parent u
• While u has 1 child and u != root:
• If u has a sibling v with 3 or more children, then borrow a child from v
• Else merge u with its sibling v, remove v from u.parent and set u = u.parent
• If u == root and u has 1 child, then set root = u.child[0]
• This runs in O(h) = O(log n) time
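A matching sketch of leaf deletion with borrowing and merging, using the same hypothetical Node24 layout; sibling_of is an assumed helper returning an adjacent sibling, and which end of the sibling's child list is borrowed from is again simplified.

```python
def remove_leaf(root, w):
    # Remove leaf w, then repair under-full ancestors; returns the (possibly new) root.
    u = w.parent
    u.children.remove(w)
    while len(u.children) == 1 and u is not root:
        v = sibling_of(u)                 # assumed helper: an adjacent sibling of u
        if len(v.children) >= 3:
            c = v.children.pop()          # borrow a child from the richer sibling
            c.parent = u
            u.children.append(c)
        else:
            for c in v.children:          # merge u with its sibling v
                c.parent = u
            u.children.extend(v.children)
            u.parent.children.remove(v)
            u = u.parent
    if u is root and len(u.children) == 1:
        root = u.children[0]              # the root lost all but one child
        root.parent = None
    return root
```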
2-4 trees can act as search trees
[Figure: the keys 0-9 are stored in the leaves; the internal nodes store separating values (such as 3-5, 0-1-2, -4- and 6-7-8) that direct searches]
• How? All n keys are stored in the leaves
• Internal nodes store 1, 2, or 3 values to direct searches to the correct subtree
• Searches take O(h) = O(log n) time
• Theorem: A 2-4 tree supports the operations find(x), add(x), and remove(x) in O(log n) time per operation
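A sketch of such a search, under the illustrative (assumed) convention that each internal node stores a sorted list keys with one fewer entry than it has children, keys[i] being the largest key in its i-th subtree, and that each leaf has an empty children list and a single key.

```python
def find(root, x):
    u = root
    while u.children:                             # descend until a leaf is reached
        i = 0
        while i < len(u.keys) and x > u.keys[i]:  # pick the first subtree whose
            i += 1                                # largest key is >= x
        u = u.children[i]
    return u if u.key == x else None              # the leaf stores the actual key
```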
Red-Black Trees
• 2-4 trees are nice, but they aren't binary trees
• How can we make them binary?
• Red-black trees are a binary version of 2-4 trees
Red-Black Trees
• A red-black tree is a binary search tree in which each node is colored red or black, such that:
• Each red node has 2 black children
• The number of black nodes on every root-to-leaf path is the same
• null (external) nodes are considered black
• the root is always black
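To make the properties concrete, here is a small checker sketch in Python; the RBNode layout and the helper names are illustrative assumptions, not part of the slides.

```python
RED, BLACK = 0, 1

class RBNode:
    def __init__(self, x, colour=RED):
        self.x = x
        self.colour = colour
        self.left = self.right = None
        self.parent = None              # used by the fix-up sketches further below

def black_height(u):
    # Black height of the subtree rooted at u, or -1 if a property is violated.
    if u is None:
        return 0                        # null (external) nodes count as black
    hl, hr = black_height(u.left), black_height(u.right)
    if hl < 0 or hr < 0 or hl != hr:
        return -1                       # unequal black counts on some root-to-leaf paths
    if u.colour == RED:
        for c in (u.left, u.right):
            if c is not None and c.colour == RED:
                return -1               # a red node with a red child
        return hl
    return hl + 1

def is_red_black(root):
    # An empty tree is fine; otherwise the root must be black and the counts consistent.
    return root is None or (root.colour == BLACK and black_height(root) >= 0)
```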
Red-Black trees and 2-4 trees
• A red-black tree is an encoding of a 2-4 tree as a binary tree
• Red nodes are "virtual nodes" that allow a black node to have 3 or 4 children
The height of Red-Black Trees
• Recall the red-black tree properties:
• Each red node has 2 black children
• The number of black nodes on every root-to-leaf path is the same
• Theorem: A red-black tree with n nodes has height at most 2 log_2(n + 1)
• A red-black tree is an encoding of a 2-4 tree with n + 1 leaves
• So its black height is at most log_2(n + 1)
• Red nodes can at most double this height
Red-Black Trees
• Adding and removing in a red-black tree simulates adding/deleting in a 2-4 tree
• This results in a lot of cases
• To get fewer cases, we add an extra property:
• If u has a red child then u.left is red
Adding to a red-black tree
• To add a new value to a red-black tree:
• create a new red node u and insert it as usual (as a leaf)
• call insertFixup(u) to restore:
• no red-red edge
• if u has a red child then u.left is red
• Each iteration of insertFixup(u) moves u up in the tree
• It finishes after O(log n) iterations, in O(log n) time
Insertion cases
• Case 1: The new node N is the root
• We color N black
• All the properties are still satisfied
Insertion cases
• Case 2: The parent P of the new node N is black
• All the properties are still satisfied
Insertion cases
• Case 3: The parent P of the new node N and the uncle U are both red
• The red property is not satisfied (N and P are both red)
• P and U become black
• Now the path property is not satisfied (paths through G have gained a black node), so P's parent G becomes red
• Are all the properties satisfied now? G may itself have a red parent
• The process is repeated recursively, with G in the role of N, until reaching case 1 (or another case applies)
Insertion cases
• Case 4: The parent P of the new node N is red but the uncle U is black; P is the left child of G and N is the left child of P
• Rotate right at G, so that P takes G's place
• Change the colors of P and G (P becomes black, G becomes red)
Insertion cases
• Case 5: The parent P of the new node N is red but the uncle U is black; P is the left child of G and N is the right child of P
• Rotate left at P, so that N takes P's place, and we reach case 4 (with the roles of N and P exchanged)
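The five cases translate into the following hedged sketch of insertFixup (the classic form, which does not use the extra left-leaning property mentioned on an earlier slide). It reuses RED/BLACK and the parent pointers from the checker sketch; rotate_left(tree, x) is assumed to promote x.right and rotate_right(tree, x) to promote x.left, both updating parent pointers and tree.root.

```python
def insert_fixup(tree, n):
    # n is the newly inserted red node; repair red-red violations bottom-up.
    while n.parent is not None and n.parent.colour == RED:
        p = n.parent
        g = p.parent                          # exists: a red p cannot be the root
        uncle = g.right if p is g.left else g.left
        if uncle is not None and uncle.colour == RED:
            p.colour = uncle.colour = BLACK   # case 3: recolour and
            g.colour = RED                    # continue from the grandparent
            n = g
        elif p is g.left:
            if n is p.right:                  # case 5: rotate N into the outer position
                rotate_left(tree, p)
                n, p = p, n
            rotate_right(tree, g)             # case 4: P takes G's place,
            p.colour, g.colour = BLACK, RED   # then swap the colours of P and G
        else:                                 # mirror images of cases 4 and 5
            if n is p.left:
                rotate_right(tree, p)
                n, p = p, n
            rotate_left(tree, g)
            p.colour, g.colour = BLACK, RED
    tree.root.colour = BLACK                  # cases 1 and 2 need no further work
```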
Removing from a red-black tree
• To remove a value from a red-black tree:
• remove a node w with 0 or 1 children
• set u = w.parent and make u blacker:
• red becomes black
• black becomes double-black
• call removeFixup(u) to restore:
• no double-black nodes
• if u has a red child then u.left is red
• Each iteration of removeFixup(u) moves u up in the tree
• It finishes after O(log n) iterations, in O(log n) time
Removing simple cases
• If the node N to be removed has two children, we swap it with its successor and remove the successor instead (as in any binary search tree)
• So we can assume N has at most one child
• If N is red, just remove it
• If N's child is red, color the child black and remove N
• All the properties are still satisfied
Removing complex cases
• Both N and its child are black
• We remove N and replace it by its child
• (from now on we will call this child N, and call its new sibling S)
Removal cases
• Case 1: N is the new root
• Everything is done
• All the properties are satisfied
Removal cases
• Case 2: The sibling S is red
• Rotate left at P, so that S takes P's place, and swap the colors of S and P
• Is the path property satisfied?
• We pass to case 4, 5 or 6
Removal cases
• Case 3: N, P, S and the children of S are all black
• We color S red
• Is the path property satisfied? Within P's subtree yes, but every path through P is now one black node short
• So we recursively repeat the checking process with node P
Removal cases
• Case 4: N, S and the children of S are black, but P is red
• We swap the colors of S and P
• Is the path property satisfied?
• Yes, all the properties are satisfied. Why? (Coloring P black restores the missing black node on the paths through N, while the black count on paths through S is unchanged)
Removal cases
• Case 5: N is the left child of P, S and its right child are black, but the left child of S is red
• We rotate right at S
• We swap the colors of S and its new parent (the old left child of S)
• We move on to case 6
Removal cases
• Case 6: N is the left child of P, S is black, and its right child is red
• We rotate left at P
• Set the right child of S to black and swap the colors of P and S
• All the properties are satisfied
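These removal cases combine into the following hedged sketch of removeFixup. Only the direction where N is the left child of P is written out (the mirror case is symmetric), and the rotation helpers are the same assumed ones as in the insertion sketch.

```python
def remove_fixup(tree, p, n):
    # n is the node that is one black short on its paths (it may be a nil
    # sentinel or None); p is its parent.
    while p is not None:
        if n is not p.left:
            raise NotImplementedError("mirror (right-child) case omitted in this sketch")
        s = p.right                               # the sibling S
        if s.colour == RED:                       # case 2: red sibling
            rotate_left(tree, p)
            p.colour, s.colour = RED, BLACK
            s = p.right                           # the new, black sibling
        if colour(s.left) == BLACK and colour(s.right) == BLACK:
            if p.colour == BLACK:                 # case 3: recolour S and go up
                s.colour = RED
                n, p = p, p.parent
                continue
            p.colour, s.colour = BLACK, RED       # case 4: done
            return
        if colour(s.right) == BLACK:              # case 5: S's left (near) child is red
            rotate_right(tree, s)
            s.colour = RED
            s = p.right                           # the old near child is the sibling now
            s.colour = BLACK
        rotate_left(tree, p)                      # case 6
        s.colour, p.colour = p.colour, BLACK
        s.right.colour = BLACK
        return
    # case 1: n is (or has become) the root; nothing more to do

def colour(u):
    return BLACK if u is None else u.colour       # nil nodes count as black
```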