Explore ways to balance trees, including heaps and B-Trees. Learn about time analysis and operations for balanced trees.
Data Structures and Algorithms for Information Processing Lecture 6: Heaps, B-Trees, and B+Trees Lecture 6: Heaps & B-Trees
Homework Policy
• Late homework will normally be penalized 10% per day late
• Each student may turn in one late homework with no penalty (up to one week late)

Grading
• Homeworks (4-5): 50%
• Midterm Exam: 25%
• Final Exam: 25%

Today’s Topics
• Ways to Balance Trees
  • Heaps & Priority Queues
  • B-Trees
• Time Analysis of Trees
  • Binary trees
  • Heaps
  • B-Trees
• See Chapter 10 in Main
• B+ trees

Binary Trees: Worst Case
Inserting nodes that are already sorted leads to worst-case behavior: d = (n - 1) = 5
[Figure: nodes 1 through 6, inserted in sorted order, form a single chain]
How can we use the idea of balanced trees to avoid this kind of situation?

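The worst case above can be reproduced with a few lines of Java. This is a minimal sketch, not code from the lecture: the class name SortedInsertDemo and its helpers insert, depth, and build are hypothetical, and the tree does no rebalancing at all.

```java
// Minimal unbalanced binary search tree of ints (illustrative names only).
class SortedInsertDemo {
    // One tree node: a key and two optional children.
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Standard BST insertion with no rebalancing; duplicates are ignored.
    static Node insert(Node root, int key) {
        if (root == null) return new Node(key);
        if (key < root.key) root.left = insert(root.left, key);
        else if (key > root.key) root.right = insert(root.right, key);
        return root;
    }

    // Depth counted in edges: a single node has depth 0, an empty tree -1.
    static int depth(Node root) {
        if (root == null) return -1;
        return 1 + Math.max(depth(root.left), depth(root.right));
    }

    // Builds a tree by inserting the keys in the order given.
    static Node build(int... keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    public static void main(String[] args) {
        System.out.println(depth(build(1, 2, 3, 4, 5, 6))); // sorted order: prints 5
        System.out.println(depth(build(4, 2, 6, 1, 3, 5))); // mixed order: prints 2
    }
}
```

The same six keys give depth 5 or depth 2 depending only on insertion order, which is the problem balanced trees are meant to remove.
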
Balanced Trees
Trees are “no deeper than they have to be”
Heaps are complete binary trees, which limit the depth to the minimum for any given n nodes, independently of the order of insertion. Heaps are not search trees.
Complete binary trees minimize depth by forcing each row to be full before d is increased
[Figure: example trees with numbered nodes]
Main’s slides on Heaps

B-Trees
• B-Trees are a type of search tree
• Further reduction in depth for a given tree of n nodes
• Two adjustments:
  • nodes have more than two children
  • nodes hold more than a single element

B-Trees
• Can be implemented as a set (no duplicate elements) or as a bag (duplicate elements allowed)
• This example focuses on the set implementation

B-Trees
• Every B-Tree depends on a positive constant, MINIMUM, which determines how many elements are held in a single node
• Rule 1: The root may have as few as one element (or none at all); every other node has at least MINIMUM elements
• Rule 2: The maximum number of elements in a node is twice the value of MINIMUM
• Rule 3: Elements in a node are stored in a partially-filled array, sorted from smallest (element 0) to largest (final position used)
• Rule 4: The number of subtrees below a non-leaf node is always one more than the number of elements in the node
• Rule 5: For any non-leaf node:
  • The element at index I is greater than all the elements in subtree number I of the node
  • The element at index I is less than all the elements in subtree number (I + 1) of the node

B-Trees
[Figure: a node holding 93 and 107, with Subtree Number 0, Subtree Number 1, and Subtree Number 2 below it]
Each element in subtree 0 is less than 93.
Each element in subtree 1 is between 93 & 107.
Each element in subtree 2 is greater than 107.

B-Trees
• Rule 6: Every leaf in a B-Tree has the same depth
• The implication is that B-Trees are always balanced.

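One way to make the six rules concrete is a small checker. The sketch below is illustrative only: the class BTreeRules, its Node type, and the helpers leaf, example, check, and isValid are hypothetical names rather than anything from the lecture, MINIMUM is assumed to be 1, and the array of a node is assumed to hold exactly its elements (no unused slots).

```java
// Hypothetical node type and checker, used only to illustrate Rules 1-6.
class BTreeRules {
    static final int MINIMUM = 1;            // assumed small value for the demo
    static final int MAXIMUM = 2 * MINIMUM;

    static final class Node {
        final int[] data;      // the node's elements
        final Node[] subset;   // the node's subtrees; length 0 for a leaf
        Node(int[] data, Node... subset) { this.data = data; this.subset = subset; }
    }

    static Node leaf(int... data) { return new Node(data); }

    // Returns the common leaf depth below n if every rule holds, else -1.
    // lo and hi are exclusive bounds passed down from ancestors (Rule 5).
    static int check(Node n, boolean isRoot, long lo, long hi) {
        int count = n.data.length;
        if (count > MAXIMUM) return -1;                 // Rule 2
        if (!isRoot && count < MINIMUM) return -1;      // Rule 1
        long prev = lo;
        for (int x : n.data) {                          // Rule 3 (sorted) and Rule 5 (bounds)
            if (x <= prev || x >= hi) return -1;
            prev = x;
        }
        if (n.subset.length == 0) return 0;             // a leaf has depth 0 below itself
        if (n.subset.length != count + 1) return -1;    // Rule 4
        int leafDepth = -2;                             // -2 means "not yet known"
        for (int i = 0; i <= count; i++) {
            long childLo = (i == 0) ? lo : n.data[i - 1];
            long childHi = (i == count) ? hi : n.data[i];
            int d = check(n.subset[i], false, childLo, childHi);
            if (d < 0) return -1;
            if (leafDepth == -2) leafDepth = d;
            else if (d != leafDepth) return -1;         // Rule 6
        }
        return leafDepth + 1;
    }

    static boolean isValid(Node root) {
        return check(root, true, Long.MIN_VALUE, Long.MAX_VALUE) >= 0;
    }

    // A small valid tree: root 6; children {2, 4} and {9}; leaves 1, 3, 5, {7, 8}, 10.
    static Node example() {
        return new Node(new int[] {6},
            new Node(new int[] {2, 4}, leaf(1), leaf(3), leaf(5)),
            new Node(new int[] {9}, leaf(7, 8), leaf(10)));
    }
}
```

With these definitions, BTreeRules.isValid(BTreeRules.example()) should return true, while a tree whose leaves sit at different depths fails the Rule 6 check.
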
B-Tree Example
MINIMUM = 1
[Figure: root node holding 6; its left child holds 2 and 4, with leaf children 1, 3, and 5; its right child holds 9, with leaf children "7 and 8" and 10]
NOTE: Every child of the root node is also a B-Tree!

Set ADT with B-Trees
public class IntBalancedSet {
    // constants
    private static final int MINIMUM = 200;
    private static final int MAXIMUM = 2 * MINIMUM;
    // info about root node
    int dataCount;
    int[] data = new int[MAXIMUM + 1];
    int childCount;
    // info about children
    IntBalancedSet[] subset = new IntBalancedSet[MAXIMUM + 2];
    …
}

[Figure: the example tree (root 6; children holding "2 and 4" and 9; leaves 1, 3, 5, "7 and 8", 10) alongside the root's instance variables]
MINIMUM = 1, MAXIMUM = 2
data: 6 ? ?
dataCount: 1
childCount: 2
subset: [two references to IntBalancedSet instances] null null

Invariant for Set B-Tree
• The elements of the set are stored in a B-Tree, satisfying the 6 rules
• The number of elements in the root is stored in the instance variable dataCount, and the number of subtrees is stored in the instance variable childCount.
• The root’s elements are stored in data[0] through data[dataCount - 1].
• If the root has subtrees, then subset[0] through subset[childCount - 1] are references to those subtrees.

Searching a B-Tree
• Sets use the method contains to find if an element is in the set:
  • Set I equal to the first index where data[I] >= target; if there is no such index, set I = dataCount
  • If I < dataCount and data[I] == target, return true; else if (no children) return false; else return subset[I].contains(target);

Sample Search
[Figure: the example tree (root 6; children holding "2 and 4" and 9; leaves 1, 3, 5, "7 and 8", 10)]
contains(7);
Root: 7 > 6, so I = dataCount = 1
subset[1].contains(7); 9 >= 7, so I = 0; data[I] != 7
subset[0].contains(7); 7 >= 7, so I = 0; data[I] == 7!

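The search steps traced above can be written out in Java roughly as follows. This is a sketch under stated assumptions, not the textbook's implementation: the class name IntBalancedSetSketch and the helpers node and example are hypothetical, MINIMUM is set to 1 so the small example tree fits, and there are no add or remove methods, so the tree is wired together by hand.

```java
// Sketch of B-Tree search using the same instance variables as the lecture's class.
class IntBalancedSetSketch {
    static final int MINIMUM = 1;             // assumed small value for the demo
    static final int MAXIMUM = 2 * MINIMUM;

    int dataCount;
    int[] data = new int[MAXIMUM + 1];
    int childCount;
    IntBalancedSetSketch[] subset = new IntBalancedSetSketch[MAXIMUM + 2];

    // Hypothetical helper: builds a node directly from its elements and subtrees.
    static IntBalancedSetSketch node(int[] elements, IntBalancedSetSketch... children) {
        IntBalancedSetSketch n = new IntBalancedSetSketch();
        n.dataCount = elements.length;
        System.arraycopy(elements, 0, n.data, 0, elements.length);
        n.childCount = children.length;
        System.arraycopy(children, 0, n.subset, 0, children.length);
        return n;
    }

    boolean contains(int target) {
        // Find the first index i with data[i] >= target, or dataCount if none.
        int i = 0;
        while (i < dataCount && data[i] < target) i++;
        if (i < dataCount && data[i] == target) return true;  // found in this node
        if (childCount == 0) return false;                    // leaf: nowhere else to look
        return subset[i].contains(target);                    // recurse into subtree i
    }

    // The example tree: root 6; children {2, 4} and {9}; leaves 1, 3, 5, {7, 8}, 10.
    static IntBalancedSetSketch example() {
        return node(new int[] {6},
            node(new int[] {2, 4},
                node(new int[] {1}), node(new int[] {3}), node(new int[] {5})),
            node(new int[] {9},
                node(new int[] {7, 8}), node(new int[] {10})));
    }
}
```

Calling example().contains(7) follows the same path as the trace: root, then subset[1], then subset[0]. The bounds check i < dataCount guards against reading an unused slot of data when the target is larger than every element in the node.
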
Add/Remove from B-Tree
• Complex two-pass operations
• pp. 500-512
• Covered on next slide set for 2-3 trees

Trees, Logs, Time Analysis
• Heaps and B-Trees are efficient because d is kept small
• How can we relate the depth of a tree and the worst-case time required to search, add, and remove an element?

Trees, Logs, Time Analysis
• The worst-case times for the following operations are all O(d):
  • Adding an element to a binary search tree, heap, or B-Tree
  • Removing an element from a binary search tree, heap, or B-Tree
  • Searching for a specified element in a binary search tree or B-Tree

Trees, Logs, Time Analysis
• How can we relate the depth d to the number of elements n?
• Example: binary trees
  • d is no more than n - 1
  • O(d) is therefore O(n - 1) = O(n) (remember, we can ignore constants)

Time Analysis for Heaps
• Heaps
Level    Nodes to Fill
0        1
1        2
2        4
3        8
…        …
d        2^d

Time Analysis for Heaps
• Minimum nodes to reach depth d in a heap: fill levels 0 through d - 1, then add one node at level d: (1 + 2 + 4 + … + 2^(d-1)) + 1 = 2^d
• The number of nodes in a heap of depth d is at least 2^d

Review Base-2 Logarithms
• For any positive number x, the base-2 logarithm of x is an exponent r such that 2^r = x

Worst-Case For Heaps
• In a heap the number of elements n is at least 2^d, that is, 2^d <= n
• Taking base-2 logarithms of both sides gives d <= log2(n)

Worst-Case For Heaps
• Adding or removing an element in a heap with n elements is O(d), where d is the depth of the tree. Because d is no more than log2(n), the operations are O(log2(n)), which is O(log(n)).
• (see discussion pp. 516-520)

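The bound d <= log2(n) can be checked numerically. The following is a small sketch with hypothetical names (HeapDepthDemo, depth, floorLog2): it computes the depth of a complete binary tree by filling levels of size 1, 2, 4, ... and compares the result with the floor of log2(n).

```java
// Depth of a complete binary tree (the shape of a heap) versus log2(n).
class HeapDepthDemo {
    // Depth in edges of a complete binary tree with n >= 1 nodes,
    // found by filling levels of size 1, 2, 4, ... until n nodes fit.
    static int depth(int n) {
        int d = 0;
        long capacity = 1;    // nodes that fit in levels 0..d
        long levelSize = 1;   // nodes in level d
        while (capacity < n) {
            levelSize *= 2;
            capacity += levelSize;
            d++;
        }
        return d;
    }

    // floor(log2(n)) for n >= 1, using integer arithmetic only.
    static int floorLog2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 7, 8, 1000}) {
            System.out.println(n + " nodes: depth " + depth(n)
                + ", floor(log2 n) = " + floorLog2(n));
        }
    }
}
```

For every n tried here the two values agree, so a heap with 1000 elements has depth 9, compared with a worst-case depth of 999 for an unbalanced binary search tree.
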
Many Databases use B+ Trees
[Figure from Wikipedia]