CSE 326: Data Structures Part Four: Trees Henry Kautz Autumn 2002
Material Weiss Chapter 4: • N-ary trees • Binary Search Trees • AVL Trees • Splay Trees
Tree Jargon • Length of a path = number of edges • Depth of a node N = length of path from root to N • Height of node N = length of longest path from N to a leaf • Depth and height of tree = height of root [Figure: example tree with nodes A to F; the root has depth = 0 and height = 2; the deepest leaf has depth = 2 and height = 0]
Definition and Tree Trivia Recursive Definition of a Tree: A tree is a set of nodes that either a. is empty, or b. has one node called the root from which zero or more trees (subtrees) descend. • A tree with N nodes always has ___ edges • Two nodes in a tree have at most how many paths between them?
Implementation of Trees • Obvious Pointer-Based Implementation: Node with value and pointers to children • Problem? [Figure: example tree with nodes A to F]
1st Child/Next Sibling Representation • Each node has 2 pointers: one to its first child and one to its next sibling [Figure: the example tree with nodes A to F, drawn with child pointers and again with first-child/next-sibling pointers]
Nested List Implementation 1 Tree := ( label {Tree}* ) [Figure: example tree with nodes a, b, c, d]
Nested List Implementation 2 Tree := label || (label {Tree}+ ) [Figure: example tree with nodes a, b, c, d]
Application: Arithmetic Expression Trees Example Arithmetic Expression: A + (B * (C / D) ) • Used in most compilers • No parentheses needed: the tree structure encodes the grouping • Can speed up calculations, e.g. replace the / node with the value of C/D if C and D are known • Calculate by traversing tree (how?) [Figure: tree for the above expression; root + with children A and *; * has children B and /; / has children C and D]
Traversing Trees • Preorder: Root, then Children: + A * B / C D • Postorder: Children, then Root: A B C D / * + • Inorder: Left child, Root, Right child: A + B * C / D [Figure: the expression tree for A + (B * (C / D))]
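The three traversal orders can be checked against the expression tree with a short Python sketch. The `Node` class, the function names, and the sample values bound to A, B, C, D are illustrative assumptions, not part of the slides; the postorder-style `evaluate` is one possible answer to "calculate by traversing tree".

```python
import operator

class Node:
    """Binary expression-tree node (hypothetical helper, not from the slides)."""
    def __init__(self, label, left=None, right=None):
        self.label, self.left, self.right = label, left, right

def preorder(t):
    # Root, then children
    return [] if t is None else [t.label] + preorder(t.left) + preorder(t.right)

def inorder(t):
    # Left child, root, right child
    return [] if t is None else inorder(t.left) + [t.label] + inorder(t.right)

def postorder(t):
    # Children, then root
    return [] if t is None else postorder(t.left) + postorder(t.right) + [t.label]

OPS = {'+': operator.add, '*': operator.mul, '/': operator.truediv}

def evaluate(t, env):
    # Leaves are variables looked up in env; internal nodes are operators.
    # Both operands are evaluated before the operator is applied (postorder).
    if t.left is None and t.right is None:
        return env[t.label]
    return OPS[t.label](evaluate(t.left, env), evaluate(t.right, env))

# Tree for A + (B * (C / D))
expr = Node('+', Node('A'),
            Node('*', Node('B'),
                 Node('/', Node('C'), Node('D'))))

print(' '.join(preorder(expr)))
print(' '.join(postorder(expr)))
print(' '.join(inorder(expr)))
print(evaluate(expr, {'A': 1, 'B': 2, 'C': 6, 'D': 3}))
```

Note that the inorder listing loses the grouping that the tree itself carries, which is why the tree needs no parentheses but the flat inorder string would.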
Example Code for Recursive Preorder
void print_preorder ( TreeNode T)
{
    TreeNode P;
    if ( T == NULL )
        return;
    else {
        print_element(T.Element);      // visit the root first
        P = T.FirstChild;
        while (P != NULL) {            // then each child, left to right
            print_preorder ( P );
            P = P.NextSibling;
        }
    }
}
What is the running time for a tree with N nodes?
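A runnable Python sketch of the same preorder traversal over a first-child/next-sibling tree follows. The `add_child` helper and the particular shape of the sample tree (A with children B, C, D, and B with children E, F) are assumptions made for illustration; the slides' figure is not recoverable. Each node is visited exactly once, so the traversal takes time linear in N.

```python
class TreeNode:
    """N-ary tree node in first-child/next-sibling form."""
    def __init__(self, element):
        self.element = element
        self.first_child = None
        self.next_sibling = None

def add_child(parent, child):
    # Hypothetical helper: append child to the end of parent's sibling chain.
    if parent.first_child is None:
        parent.first_child = child
    else:
        p = parent.first_child
        while p.next_sibling is not None:
            p = p.next_sibling
        p.next_sibling = child
    return child

def preorder(t, out):
    # Visit the root, then recurse into each child from left to right.
    if t is None:
        return out
    out.append(t.element)
    p = t.first_child
    while p is not None:
        preorder(p, out)
        p = p.next_sibling
    return out

# Assumed sample tree: A has children B, C, D; B has children E, F.
a = TreeNode('A')
b = add_child(a, TreeNode('B'))
add_child(a, TreeNode('C'))
add_child(a, TreeNode('D'))
add_child(b, TreeNode('E'))
add_child(b, TreeNode('F'))

print(preorder(a, []))
```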
Binary Trees • Properties. Notation: depth(tree) = MAX {depth(leaf)} = height(root) • max # of leaves = 2^depth(tree) • max # of nodes = 2^(depth(tree)+1) - 1 • max depth = n-1 • average depth for n nodes = Θ(√n) (over all possible binary trees) • Representation: each node holds data, a left pointer, and a right pointer [Figure: binary tree with nodes A to J]
Dictionary & Search ADTs • Operations: create, destroy, insert, find, delete • Dictionary: stores values associated with user-specified keys • keys may be any (homogeneous) comparable type • values may be any (homogeneous) type • implementation: data field is a struct with two parts • Search ADT: keys = values • Example entries: kim chi: spicy cabbage; kreplach: tasty stuffed dough; kiwi: Australian fruit • insert(kohlrabi: upscale tuber) • find(kreplach) returns kreplach: tasty stuffed dough
Naïve Implementations Goal: fast find like sorted array, dynamic inserts/deletes like linked list
Binary Search Tree Dictionary Data Structure • Search tree property: all keys in left subtree smaller than root’s key; all keys in right subtree larger than root’s key • result: easy to find any given key; inserts/deletes by changing links [Figure: BST with keys 8, 5, 11, 2, 6, 10, 12, 4, 7, 9, 14, 13]
In Order Listing • visit left subtree • visit node • visit right subtree • In order listing: 2, 5, 7, 9, 10, 15, 17, 20, 30 [Figure: BST with root 10; 5 and 15 below it; then 2, 9, 20; then 7, 17, 30]
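The in-order listing can be reproduced with a few lines of Python. The `Node` class and the explicit construction of the tree are assumptions for illustration; the tree shape is the one implied by the keys and the search tree property (7 under 9, 17 and 30 under 20).

```python
class Node:
    """BST node (hypothetical helper, not from the slides)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def in_order(root):
    # Visit left subtree, then the node, then the right subtree.
    if root is not None:
        yield from in_order(root.left)
        yield root.key
        yield from in_order(root.right)

tree = Node(10,
            Node(5, Node(2), Node(9, Node(7))),
            Node(15, None, Node(20, Node(17), Node(30))))

print(list(in_order(tree)))
```

Because of the search tree property, an in-order listing of any BST yields its keys in sorted order.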
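Finding a Node
Node find(Comparable x, Node root)
{
    if (root == NULL)
        return root;                   // not found
    else if (x < root.key)
        return find(x, root.left);     // x can only be in the left subtree
    else if (x > root.key)
        return find(x, root.right);    // x can only be in the right subtree
    else
        return root;                   // found
}
runtime:
[Figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30]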
Insert Concept: proceed down tree as in Find; if new key not found, then insert a new node at last spot traversed
void insert(Comparable x, Node root)
{
    // Does not work for empty tree – when root is NULL
    if (x < root.key) {
        if (root.left == NULL)
            root.left = new Node(x);
        else
            insert( x, root.left );
    }
    else if (x > root.key) {
        if (root.right == NULL)
            root.right = new Node(x);
        else
            insert( x, root.right );
    }
}
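One common way around the empty-tree limitation noted in the comment is to have insert return the (possibly new) root of the subtree. The Python sketch below shows that variant next to an iterative find; the `Node` class and the return-the-root convention are assumptions, not the slides' code.

```python
class Node:
    """BST node (hypothetical helper, not from the slides)."""
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, x):
    # Returns the root of the subtree after inserting x.
    # An empty subtree (None) is simply replaced by a new node.
    if root is None:
        return Node(x)
    if x < root.key:
        root.left = insert(root.left, x)
    elif x > root.key:
        root.right = insert(root.right, x)
    # x == root.key: key already present, nothing to do
    return root

def find(root, x):
    # Walk down from the root; returns the node holding x, or None.
    while root is not None and x != root.key:
        root = root.left if x < root.key else root.right
    return root

root = None                      # start from an empty tree
for key in [10, 5, 15, 2, 9, 20, 17, 7, 30]:
    root = insert(root, key)

print(find(root, 17).key)        # key is present
print(find(root, 8))             # key is absent
```

Both operations follow a single root-to-leaf path, so their cost is proportional to the depth of the tree.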
Time to Build a Tree Suppose a1, a2, …, an are inserted into an initially empty BST: • a1, a2, …, an are in increasing order • a1, a2, …, an are in decreasing order • a1 is the median of all, a2 is the median of elements less than a1, a3 is the median of elements greater than a1, etc. • data is randomly ordered
Analysis of BuildTree • Increasing / Decreasing: Θ(n^2), since 1 + 2 + 3 + … + n = Θ(n^2) • Medians: yields perfectly balanced tree, Θ(n log n) • Average case assuming all input sequences are equally likely is Θ(n log n) • equivalently: average depth of a node is Θ(log n) • Total time = sum of depths of nodes
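The gap between the sorted and median cases can be seen by summing node depths directly. In this Python sketch the function names are hypothetical, nodes are plain `[key, left, right]` lists, keys are assumed distinct, and `median_order` emits medians in preorder rather than the slide's exact order (any order in which each subtree's median comes before the rest of that subtree builds the same balanced tree).

```python
def sum_of_depths(keys):
    """Insert distinct keys into an empty BST; return the sum of node depths."""
    root, total = None, 0
    for k in keys:
        if root is None:
            root = [k, None, None]        # root has depth 0
            continue
        node, depth = root, 0
        while True:
            i = 1 if k < node[0] else 2   # index 1 = left child, 2 = right child
            depth += 1
            if node[i] is None:
                node[i] = [k, None, None]
                break
            node = node[i]
        total += depth
    return total

def median_order(keys):
    """Reorder sorted keys so every subtree's median is inserted first."""
    if not keys:
        return []
    mid = len(keys) // 2
    return [keys[mid]] + median_order(keys[:mid]) + median_order(keys[mid + 1:])

n = 127
increasing = list(range(1, n + 1))
print(sum_of_depths(increasing))                 # 0 + 1 + ... + (n-1), quadratic
print(sum_of_depths(median_order(increasing)))   # perfectly balanced tree
```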
Proof that Average Depth of a Node in a BST Constructed from Random Data is Θ(log n) Method: Calculate sum of all depths, divide by number of nodes • D(n) = sum of depths of all nodes in a random BST containing n nodes • D(n) = D(left subtree)+D(right subtree) + adjustment for distance from root to subtree + depth of root • D(n) = D(left subtree)+D(right subtree) + (number of nodes in left and right subtrees) + 0 • D(n) = D(L)+D(n-L-1)+(n-1), where L is the number of nodes in the left subtree
Random BST, cont. • D(n) = D(L)+D(n-L-1)+(n-1) • For random data, all subtree sizes are equally likely • Averaging over L gives a recurrence that looks just like the Quicksort average case equation!
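The averaging step can be written out as follows. This is a sketch of the standard derivation, assuming each left-subtree size L in 0, ..., n-1 occurs with probability 1/n and that D(0) = 0; the slide's own equation did not survive extraction.

```latex
\begin{align*}
D(n) &= \frac{1}{n}\sum_{L=0}^{n-1}\bigl[D(L) + D(n-L-1)\bigr] + (n-1) \\
     &= \frac{2}{n}\sum_{j=0}^{n-1} D(j) + (n-1)
        && \text{(both sums range over the same values)} \\
D(n) &= \Theta(n \log n)
        && \text{(same form as the Quicksort recurrence)} \\
\frac{D(n)}{n} &= \Theta(\log n)
        && \text{(average depth of a node)}
\end{align*}
```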
Random Input vs. Random Trees • Inputs: 1,2,3; 3,2,1; 1,3,2; 3,1,2; 2,1,3; 2,3,1 • For three items, the shallowest tree is twice as likely as any other; the effect grows as n increases. For n=4, probability of getting a shallow tree > 50% [Figure: the five possible tree shapes on three keys, each listed with the inputs that produce it]
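Both claims can be checked by brute force over all insertion orders. In this Python sketch, `shape` and `height` are hypothetical helpers: a tree shape is represented as a nested `(left, right)` tuple, and "shallow" for n=4 is taken to mean the minimum possible height of 2.

```python
from collections import Counter
from itertools import permutations

def shape(keys):
    """Shape of the BST built by inserting keys in order, as nested tuples."""
    if not keys:
        return None
    root, rest = keys[0], keys[1:]
    # Smaller keys form the left subtree, larger keys the right subtree,
    # each in their original relative insertion order.
    return (shape([k for k in rest if k < root]),
            shape([k for k in rest if k > root]))

def height(s):
    # Convention from the slides: an empty tree has height -1.
    return -1 if s is None else 1 + max(height(s[0]), height(s[1]))

counts3 = Counter(shape(list(p)) for p in permutations([1, 2, 3]))
balanced = ((None, None), (None, None))
print(len(counts3), counts3[balanced])     # number of shapes, hits on the balanced one

perms4 = list(permutations([1, 2, 3, 4]))
shallow4 = sum(1 for p in perms4 if height(shape(list(p))) == 2)
print(shallow4, len(perms4))               # shallow trees out of all inputs
```

For n=3 the balanced shape arises from two inputs (2,1,3 and 2,3,1) while each of the other four shapes arises from one; for n=4, 16 of the 24 inputs give a tree of height 2.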
Deletion • Why might deletion be harder than insertion? [Figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30]
FindMin/FindMax
Node min(Node root)
{
    if (root.left == NULL)
        return root;               // nothing smaller below this node
    else
        return min(root.left);
}
How many children can the min of a node have?
[Figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30]
Successor Find the next larger node in this node’s subtree. • not next larger in entire tree
Node succ(Node root)
{
    if (root.right == NULL)
        return NULL;               // no larger key in this subtree
    else
        return min(root.right);    // smallest key in the right subtree
}
How many children can the successor of a node have?
[Figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30]
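Deletion - Leaf Case • Delete(17) [Figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30; the leaf 17 is removed]
Deletion - One Child Case • Delete(15) [Figure: BST with keys 10, 5, 15, 2, 9, 20, 7, 30; 15 is replaced by its only child, 20]
Deletion - Two Child Case • Delete(5) • replace node with value guaranteed to be between the left and right subtrees: the successor • Could we have used the predecessor instead? [Figure: BST with keys 10, 5, 20, 2, 9, 30, 7]
Deletion - Two Child Case • Delete(5) • always easy to delete the successor: it always has either 0 or 1 children! [Figure: BST with keys 10, 5, 20, 2, 9, 30, 7; the successor of 5 is 7]
Deletion - Two Child Case • Delete(5) • Finally copy data value from deleted successor into original node [Figure: BST with keys 10, 7, 20, 2, 9, 30; 7 now sits where 5 was]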
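The three deletion cases can be sketched in Python as below, replaying Delete(17), Delete(15), and Delete(5) on the example tree. The `Node` class, the return-the-root convention, and the helper names are assumptions; this is one straightforward way to code the successor-based approach, not necessarily the course's implementation.

```python
class Node:
    """BST node (hypothetical helper, not from the slides)."""
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, x):
    if root is None:
        return Node(x)
    if x < root.key:
        root.left = insert(root.left, x)
    elif x > root.key:
        root.right = insert(root.right, x)
    return root

def find_min(root):
    while root.left is not None:
        root = root.left
    return root

def delete(root, x):
    # Returns the root of the subtree with x removed.
    if root is None:
        return None
    if x < root.key:
        root.left = delete(root.left, x)
    elif x > root.key:
        root.right = delete(root.right, x)
    elif root.left is not None and root.right is not None:
        # Two children: copy the successor's key here, then delete the
        # successor, which has at most one (right) child.
        succ = find_min(root.right)
        root.key = succ.key
        root.right = delete(root.right, succ.key)
    else:
        # Leaf or one child: splice the node out.
        root = root.left if root.left is not None else root.right
    return root

def in_order(root):
    return [] if root is None else in_order(root.left) + [root.key] + in_order(root.right)

root = None
for key in [10, 5, 15, 2, 9, 20, 17, 7, 30]:
    root = insert(root, key)

root = delete(root, 17)          # leaf case
after_leaf = in_order(root)
root = delete(root, 15)          # one-child case: 20 moves up
after_one_child = in_order(root)
root = delete(root, 5)           # two-child case: successor 7 replaces 5
print(in_order(root))
```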
Lazy Deletion • Instead of physically deleting nodes, just mark them as deleted • simpler • physical deletions done in batches • some adds just flip deleted flag • extra memory for deleted flag • many lazy deletions slow finds • some operations may have to be modified (e.g., min and max) [Figure: BST with keys 10, 5, 15, 2, 9, 20, 17, 7, 30]
Dictionary Implementations • BSTs look good for shallow trees, i.e. when the depth D is small (Θ(log n)); otherwise they are as bad as a linked list!
CSE 326: Data Structures Part 3: Trees, continued Balancing Act Henry Kautz Autumn Quarter 2002
Beauty is Only Θ(log n) Deep • Binary Search Trees are fast if they’re shallow, e.g. complete • Problems occur when one branch is much longer than the other • How to capture the notion of a “sort of” complete tree?
Balance • balance = height(left subtree) - height(right subtree) • convention: height of a “null” subtree is -1 • zero everywhere: perfectly balanced • small everywhere: balanced enough, depth is Θ(log n) • Precisely: Maximum depth is 1.44 log n
AVL Tree Dictionary Data Structure • Binary search tree properties • Balance b of every node satisfies -1 ≤ b ≤ 1 • Tree re-balances itself after every insert or delete • What is the balance of each node in this tree? [Figure: AVL tree with keys 8, 5, 11, 2, 6, 10, 12, 4, 7, 9, 13, 14, 15]
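AVL Tree Data Structure • Each node stores data, height, and children [Figure: AVL tree with node heights; 10 (height 3) has children 5 (height 1) and 15 (height 2); 5 has children 2 and 9 (height 0 each); 15 has children 12 (height 0) and 20 (height 1); 20 has children 17 and 30 (height 0 each)]
Not An AVL Tree [Figure: the same tree after 18 is added below 17; heights become 10: 4, 15: 3, 20: 2, 17: 1, 18: 0, so node 15 has subtrees of height 0 and 2]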
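The balance condition can be checked mechanically on the two example trees. In this Python sketch, `Node`, `height`, `balance`, and `is_avl` are hypothetical names; heights are recomputed recursively for clarity rather than stored in each node as in the slides, and the check covers only the balance condition, not the search order.

```python
class Node:
    """Binary tree node (hypothetical helper, not from the slides)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(n):
    # Convention: the height of a null subtree is -1.
    return -1 if n is None else 1 + max(height(n.left), height(n.right))

def balance(n):
    return height(n.left) - height(n.right)

def is_avl(n):
    # Every node must have balance -1, 0, or +1.
    return n is None or (abs(balance(n)) <= 1 and is_avl(n.left) and is_avl(n.right))

def example(extra=None):
    # The AVL example tree; 'extra' is an optional subtree hung below 17.
    seventeen = Node(17, None, extra)
    return Node(10,
                Node(5, Node(2), Node(9)),
                Node(15, Node(12), Node(20, seventeen, Node(30))))

good = example()
bad = example(Node(18))          # 18 becomes the right child of 17

print(height(good), balance(good), is_avl(good))
print(height(bad), balance(bad.right), is_avl(bad))
```

In the second tree the violation is at node 15, whose left subtree has height 0 and right subtree has height 2.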
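Bad Case #1 • Insert(small) • Insert(middle) • Insert(tall) [Figure: chain S (height 2), M (height 1), T (height 0), each the right child of the one before]
Single Rotation • Basic operation used in AVL trees: a right child could legally have its parent as its left child. [Figure: the chain S, M, T (heights 2, 1, 0) is rotated so that M (height 1) becomes the root with children S and T (height 0 each)]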
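General Case: Insert Unbalances [Figure: before the insert, a (height h+1) has subtree X (height h-1) and child b (height h), whose subtrees Y and Z both have height h-1. The insert raises Z to height h, b to h+1, and a to h+2. After a single rotation, b (height h+1) is the root with children a (height h, subtrees X and Y) and Z (height h).]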
Properties of General Insert + Single Rotation • Restores balance to a lowest point in tree where imbalance occurs • After rotation, height of the subtree (in the example, h+1) is the same as it was before the insert that imbalanced it • Thus, no further rotations are needed anywhere in the tree!
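A single rotation is only a few pointer changes. The Python sketch below applies it to Bad Case #1, using keys 1, 2, 3 for small, middle, tall; `Node`, `height`, and `rotate_left` are hypothetical names, and heights are recomputed rather than stored.

```python
class Node:
    """BST node (hypothetical helper, not from the slides)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(n):
    return -1 if n is None else 1 + max(height(n.left), height(n.right))

def rotate_left(a):
    # a's right child b moves up; a becomes b's left child.
    # b's old left subtree (keys between a and b) becomes a's right subtree.
    b = a.right
    a.right = b.left
    b.left = a
    return b                     # new root of this subtree

# Bad Case #1: insert small, middle, tall -> right-leaning chain
chain = Node(1, None, Node(2, None, Node(3)))
before = height(chain)

root = rotate_left(chain)
print(before, height(root))
print(root.left.key, root.key, root.right.key)
```

The caller must store the returned node back into the parent's child pointer (or the tree's root), since the subtree has a new root after the rotation.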
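Bad Case #2 • Insert(small) • Insert(tall) • Insert(middle) • Why won’t a single rotation (bringing T up to the top) fix this? [Figure: S (height 2) with child T (height 1), whose child is M (height 0)]
Double Rotation [Figure: first M is rotated above T, giving the chain S, M, T (heights 2, 1, 0); then M is rotated above S, giving M (height 1) with children S and T (height 0 each)]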
General Double Rotation • Initially: insert into X unbalances tree (root height goes to h+3) • “Zig zag” to pull up c: restores root height to h+2, left subtree height to h [Figure: nodes a, b, c with subtrees W, X, Y, Z, shown before and after the double rotation that makes c the root of the subtree]
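A double rotation is two single rotations in sequence. The Python sketch below uses Bad Case #2 with keys 1, 2, 3 for small, middle, tall, and also shows that a single rotation alone leaves the height unchanged; all names are hypothetical and heights are recomputed rather than stored.

```python
class Node:
    """BST node (hypothetical helper, not from the slides)."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(n):
    return -1 if n is None else 1 + max(height(n.left), height(n.right))

def rotate_left(a):
    b = a.right
    a.right = b.left
    b.left = a
    return b

def rotate_right(a):
    b = a.left
    a.left = b.right
    b.right = a
    return b

def double_rotate_right_left(a):
    # Zig-zag case: first rotate the right child so the chain is straight,
    # then rotate at a itself.
    a.right = rotate_right(a.right)
    return rotate_left(a)

def zig_zag():
    # Bad Case #2: insert small (1), tall (3), middle (2)
    return Node(1, None, Node(3, Node(2), None))

single = rotate_left(zig_zag())               # brings 3 to the top
double = double_rotate_right_left(zig_zag())  # brings 2 to the top

print(single.key, height(single))
print(double.key, height(double))
```

After the single rotation the middle key ends up two levels below the new root, so the tree is still unbalanced; the double rotation puts the middle key at the root.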