The complexity and correctness of algorithms (with binary trees as an example)

An algorithm is an unambiguous sequence of simple, mechanically executable instructions.

Trees • A tree is a data structure used to represent different kinds of data and to help solve a number of algorithmic problems. • Examples: game trees (e.g., for chess), UNIX directory trees, sorting trees, etc. • We will study at least two kinds of trees: binary search trees and red-black trees.

Trees • Trees have nodes. They also have edges that connect the nodes. • Between any two nodes there is always exactly one path. [Figure: a tree with its nodes and edges labeled.]

Trees • Trees are rooted. Once the root is chosen (by the user), every node has a specific level: the root is at level 0, its children are at level 1, and so on. • Trees have internal nodes and leaves. Every node except the root has a parent, and every node has zero or more children; a node with no children is a leaf. [Figure: a rooted tree with levels 0 to 3, marking the root, the internal nodes, a parent and child, and the leaves.]

Binary trees • A binary tree is a tree such that each node has at most 2 children.

Binary trees • Recursive definition (important): (1) An empty tree is a binary tree. (2) A node with two child subtrees, each of which is itself a binary tree, is a binary tree. (3) [often implicit] Only what you get from (1) by a finite number of applications of (2) is a binary tree. • Values stored at tree nodes are called keys. [Figure: example binary tree with keys 26, 200, 28, 190, 213, 18, 12, 24, 27, 56.]

Binary search trees • A binary search tree (BST) is a binary tree with the following properties: • The key of a node is always greater than the keys of the nodes in its left subtree. • The key of a node is always smaller than the keys of the nodes in its right subtree.

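Binary search trees • Stored keys must satisfy the binary search tree property: • If y is in the left subtree of x, then key[y] ≤ key[x]. • If y is in the right subtree of x, then key[y] ≥ key[x]. • With distinct keys, both inequalities are strict. [Figure: example binary search tree with keys 26, 200, 28, 190, 213, 18, 12, 24, 27, 56.]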

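Binary search trees: examples [Figure: three example trees, each with its root marked: one with keys 14, 10, 16, 8, 11, 15, 18; one with keys 14, 10, 16, 8, 11, 15; and one with keys C, A, D.]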

Binary search trees • Data structures that can support dynamic set operations. • Search, minimum, maximum, predecessor, successor, insert, and delete. • Can be used to build • Dictionaries. • Priority Queues.

BST – Representation • Represented by a linked data structure of nodes. • root[T] points to the root of tree T. • Each node contains fields: • key • left – pointer to left child: root of left subtree. • right – pointer to right child: root of right subtree. • p – pointer to parent (optional); p[root[T]] = NIL.
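A minimal Python sketch of this linked representation, assuming `None` plays the role of NIL. The class names `Node` and `Tree` are illustrative choices, not names from the slides; the field names follow the slide.

```python
class Node:
    """One BST node with the fields listed on the slide."""

    def __init__(self, key, left=None, right=None, p=None):
        self.key = key      # the stored key
        self.left = left    # root of the left subtree, or None (NIL)
        self.right = right  # root of the right subtree, or None (NIL)
        self.p = p          # parent pointer (optional); None for the root


class Tree:
    """Wrapper holding the root pointer, so root[T] becomes T.root."""

    def __init__(self, root=None):
        self.root = root


# Build a three-node tree by hand.
T = Tree(Node(26))
T.root.left = Node(18, p=T.root)
T.root.right = Node(200, p=T.root)

print(T.root.p is None)         # True: p[root[T]] = NIL
print(T.root.left.p is T.root)  # True
```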

Inorder traversal of a BST
The binary search tree property allows the keys of a binary search tree to be printed recursively, in (monotonically increasing) order.

Inorder-Tree-Walk(p)
    if p ≠ NIL
        then Inorder-Tree-Walk(left[p])
             print key[p]
             Inorder-Tree-Walk(right[p])

• Can you prove the correctness of in-order traversal? • How long does an in-order walk take? [Figure: example binary search tree with keys 26, 200, 28, 190, 213, 18, 12, 24, 27, 56.]
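The walk translates almost line for line into Python. This is a sketch under a few assumptions: `None` stands for NIL, the `Node` class is a hypothetical minimal one, and the walk hands each key to a `visit` callback instead of printing, so the result can be inspected. The example tree shape is made up for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder_tree_walk(p, visit):
    """Visit the keys of the subtree rooted at p in increasing order."""
    if p is not None:
        inorder_tree_walk(p.left, visit)   # everything smaller first
        visit(p.key)                       # then the node itself
        inorder_tree_walk(p.right, visit)  # then everything larger


# A hand-built BST (illustrative shape):
#         14
#       /    \
#     10      16
#    /  \    /  \
#   8   11  15  18
root = Node(14, Node(10, Node(8), Node(11)), Node(16, Node(15), Node(18)))

keys = []
inorder_tree_walk(root, keys.append)
print(keys)  # [8, 10, 11, 14, 15, 16, 18]
```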

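Correctness of inorder traversal • Must prove that it prints all elements, in order, and that it terminates. • By induction on the size of the tree. Size = 0: easy (nothing is printed). • Size ≥ 1: both subtrees are strictly smaller than the whole tree, so the induction hypothesis applies to each recursive call. • Prints left subtree in order by induction. • Prints root, which comes after all elements in left subtree (still in order). • Prints right subtree in order (all elements come after root, so still in order). • Each recursive call terminates by induction, so the whole walk terminates.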

Notice how we used the recursive definition of a tree in our inductive proof. We exploit the recursive structure of a tree, and this approach - which is general to all recursive definitions, and not restricted to trees - is called structural induction.
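The same recursive structure that supports structural induction also supports recursive functions: one case for the empty tree, one case for a node with two subtrees. As an illustration that is not taken from the slides, the sketch below computes the size and height of a tree this way. It assumes `None` represents the empty tree and counts height in nodes (empty tree 0, single node 1); other texts count edges instead.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def size(t):
    """Number of nodes, defined by cases on the recursive definition."""
    if t is None:                             # rule 1: the empty tree
        return 0
    return 1 + size(t.left) + size(t.right)   # rule 2: node + two subtrees


def height(t):
    """Number of nodes on the longest root-to-leaf path."""
    if t is None:
        return 0
    return 1 + max(height(t.left), height(t.right))


# Illustrative tree: 14 with children 10 (8, 11) and 16 (15, -).
root = Node(14, Node(10, Node(8), Node(11)), Node(16, Node(15)))
print(size(root))    # 6
print(height(root))  # 3
```

A proof that `size` terminates and counts every node exactly once has the same two-case shape as the function itself.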

Searching in a BST

Tree-Search(x, k)
    if x = NIL or k = key[x]
        then return x
    if k < key[x]
        then return Tree-Search(left[x], k)
        else return Tree-Search(right[x], k)

Time complexity is proportional to h, where h is the height of the tree. [Figure: example binary search tree with keys 26, 200, 28, 190, 213, 18, 56, 12, 24, 27, with labels Height 4, Height 3, Height 2, Height 1.]
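A Python sketch of the search, assuming `None` for NIL and a hypothetical minimal `Node` class. Each call descends one level, which is why the running time is bounded by the height of the tree.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_search(x, k):
    """Return the node with key k in the subtree rooted at x, or None."""
    if x is None or k == x.key:
        return x
    if k < x.key:
        return tree_search(x.left, k)   # k can only be in the left subtree
    return tree_search(x.right, k)      # k can only be in the right subtree


# Illustrative tree: 14 with children 10 (8, 11) and 16 (15, 18).
root = Node(14, Node(10, Node(8), Node(11)), Node(16, Node(15), Node(18)))

hit = tree_search(root, 11)
miss = tree_search(root, 12)
print(hit.key)  # 11
print(miss)     # None
```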

Inserting an element in a BST • Change the dynamic set represented by a BST. • Ensure the binary search tree property holds after the change. • Insertion is easier than deletion.

Tree-Insert(T, z)    (z is a new node with key[z] set and left[z] = right[z] = NIL)
    y ← NIL
    x ← root[T]
    while x ≠ NIL
        do y ← x
           if key[z] < key[x]
               then x ← left[x]
               else x ← right[x]
    p[z] ← y
    if y = NIL
        then root[T] ← z
        else if key[z] < key[y]
            then left[y] ← z
            else right[y] ← z

[Figure: example binary search tree with keys 26, 200, 28, 190, 213, 18, 56, 12, 24, 27.]

Analysis of Insertion

Tree-Insert(T, z)
 1  y ← NIL
 2  x ← root[T]
 3  while x ≠ NIL
 4      do y ← x
 5         if key[z] < key[x]
 6             then x ← left[x]
 7             else x ← right[x]
 8  p[z] ← y
 9  if y = NIL
10      then root[T] ← z
11      else if key[z] < key[y]
12          then left[y] ← z
13          else right[y] ← z

• Initialization (lines 1-2): constant time. • The while loop in lines 3-7 searches for the place to insert z, maintaining parent y. This takes time proportional to h. • Lines 8-13 insert the value: constant time. • TOTAL: C1 + C2 + C3*h, where C1 and C2 are the two constant-time parts and C3*h is the cost of the loop, i.e., O(h).
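The pseudocode can be sketched in Python as follows, assuming `None` for NIL. The `Node` and `Tree` classes and the `inorder` helper are hypothetical scaffolding added for the example; the insertion order of the keys is also made up.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.p = None


class Tree:
    def __init__(self):
        self.root = None


def tree_insert(T, z):
    """Insert node z into tree T, keeping the BST property."""
    y = None        # trailing pointer: parent of x
    x = T.root
    while x is not None:          # walk down: at most h iterations
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        T.root = z                # the tree was empty
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


def inorder(x):
    """Helper for checking the result: keys in increasing order."""
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)


T = Tree()
for k in [26, 18, 200, 12, 24, 28, 213, 27, 190, 56]:
    tree_insert(T, Node(k))

print(T.root.key)       # 26
print(inorder(T.root))  # [12, 18, 24, 26, 27, 28, 56, 190, 200, 213]
```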

Deletion • Goal: Delete a given node z from a binary search tree. • We need to consider several cases. • Case 1: z has no children. • Delete z by making the parent of z point to NIL, instead of to z. [Figure: example tree with root 15, before and after deleting a leaf z.]

Deletion • Case 2: z has one child. • Delete z by making the parent of z point to z's child, instead of to z. • Update the parent of z's child to be z's parent. [Figure: example tree with root 15, before and after deleting a node z that has one child.]

Deletion • Case 3: z has two children. • z's successor (y) is the minimum node in z's right subtree. • y has either no children or one right child (but no left child). • Why? • Delete y from the tree (via Case 1 or 2). • Replace z's key and satellite data with y's. [Figure: example tree with root 15, before and after deleting a node z that has two children; its successor y is marked.]
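The three cases can be folded into one routine. The sketch below follows the key-copying approach described on the slides (splice out y, then copy y's key into z), assuming `None` for NIL. The classes, the helper names, and the order in which the example keys are inserted are assumptions made for illustration.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.p = key, None, None, None


class Tree:
    def __init__(self):
        self.root = None


def tree_insert(T, z):
    y, x = None, T.root
    while x is not None:
        y, x = x, (x.left if z.key < x.key else x.right)
    z.p = y
    if y is None:
        T.root = z
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x


def tree_delete(T, z):
    """Remove z's key from T; returns the node that was spliced out."""
    # Choose the node y to splice out: z itself (cases 1 and 2),
    # or z's successor, the minimum of its right subtree (case 3).
    if z.left is not None and z.right is not None:
        y = tree_minimum(z.right)
    else:
        y = z
    # y has at most one child; link that child (or None) to y's parent.
    child = y.left if y.left is not None else y.right
    if child is not None:
        child.p = y.p
    if y.p is None:
        T.root = child
    elif y is y.p.left:
        y.p.left = child
    else:
        y.p.right = child
    if y is not z:
        z.key = y.key  # case 3: copy the key (and any satellite data)
    return y


def inorder(x):
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)


T = Tree()
nodes = {k: Node(k) for k in [15, 5, 16, 3, 12, 20, 10, 13, 18, 23, 6, 7]}
for n in nodes.values():
    tree_insert(T, n)

tree_delete(T, nodes[13])  # case 1: no children
tree_delete(T, nodes[16])  # case 2: one child (20)
tree_delete(T, nodes[5])   # case 3: two children, successor is 6
print(inorder(T.root))     # [3, 6, 7, 10, 12, 15, 18, 20, 23]
```

Copying the key means the node object that held 5 now holds 6, so any outside references to the spliced-out node become stale; that trade-off is inherent in this variant.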

Successor node • The successor of node x is the node y such that key[y] is the smallest key greater than key[x]. • The successor of the largest key is NIL. • Finding the successor involves two cases. • If node x has a non-empty right subtree, then x's successor is the minimum in the right subtree of x. • If node x has an empty right subtree, then: • As long as we move to the left up the tree (move up through right children), we are visiting smaller keys. • x's successor y is the node that x is the predecessor of (x is the maximum in y's left subtree). • In other words, x's successor y is the lowest ancestor of x whose left child is also an ancestor of x (counting x as an ancestor of itself). • We can define the predecessor node similarly.
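Both cases fit in a few lines once nodes carry parent pointers. This is a sketch assuming `None` for NIL; the classes, the insertion helper, and the example keys are illustrative assumptions rather than part of the slides.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.p = key, None, None, None


def insert(root, z):
    """Insert node z below root (may be None); returns the root."""
    y, x = None, root
    while x is not None:
        y, x = x, (x.left if z.key < x.key else x.right)
    z.p = y
    if y is None:
        return z
    if z.key < y.key:
        y.left = z
    else:
        y.right = z
    return root


def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x


def tree_successor(x):
    """Return the node with the smallest key greater than x.key, or None."""
    if x.right is not None:
        return tree_minimum(x.right)      # case 1: non-empty right subtree
    y = x.p
    while y is not None and x is y.right:
        x, y = y, y.p                     # case 2: climb while x is a right child
    return y


root = None
nodes = {k: Node(k) for k in [15, 5, 16, 3, 12, 20, 10, 13, 18, 23, 6, 7]}
for n in nodes.values():
    root = insert(root, n)

print(tree_successor(nodes[15]).key)  # 16
print(tree_successor(nodes[13]).key)  # 15
print(tree_successor(nodes[23]))      # None
```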

Exercise: Sorting using BSTs

Sort(A)
    for i ← 1 to n
        do Tree-Insert(T, A[i])
    Inorder-Tree-Walk(root[T])

• What are the worst-case and best-case running times? • In practice, how would this compare to other sorting algorithms?
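For experimenting with the exercise, here is a small self-contained Python sketch of the procedure. It assumes `None` for NIL, inserts keys directly rather than pre-built nodes, and returns a list instead of printing. The `height` helper (counting nodes on the longest path) is an addition, not part of the exercise; it makes it easy to see how the input order shapes the tree.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def build(a):
    """Insert the elements of a one by one; returns the root (or None)."""
    root = None
    for key in a:
        z, y, x = Node(key), None, root
        while x is not None:
            y, x = x, (x.left if key < x.key else x.right)
        if y is None:
            root = z
        elif key < y.key:
            y.left = z
        else:
            y.right = z
    return root


def inorder(x, out):
    if x is not None:
        inorder(x.left, out)
        out.append(x.key)
        inorder(x.right, out)


def bst_sort(a):
    out = []
    inorder(build(a), out)
    return out


def height(x):
    return 0 if x is None else 1 + max(height(x.left), height(x.right))


print(bst_sort([56, 12, 200, 24, 27]))  # [12, 24, 27, 56, 200]
```

Calling `height(build(a))` on differently ordered inputs of the same size is one way to explore the first question empirically.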

Wrap-up • Determining the complexity of algorithms is usually the first step in obtaining better algorithms. • Or in realizing we cannot do any better. • What should you know? • Inductive proofs on trees. • Binary search trees and operations on BSTs. • Height of trees and how it influences efficiency.
