15-211 Fundamental Data Structures and Algorithms

15-211 Fundamental Data Structures and Algorithms Splay Trees Ananda Guna February 8, 2005

Announcements • Homework 2 available! • Due Monday, Feb. 16, 11:59pm! • More involved than hw1? Perhaps. • Start early! • It is very important that you go to recitations: • We will discuss splaying

Splay Trees (self adjusting trees)

Balanced BST’s • O(log n) performance • Best, average and worst cases • Need to store balance information per node • expensive • Difficult to implement • Insertions and deletions are difficult • Even for easy inputs we still have O(log n) behavior • No cost savings

Binary search trees • Simple binary search trees can have bad behavior for some insertion sequences. • Average case O(log N), worst case O(N). • AVL trees maintain a balance invariant to prevent this bad behavior. • Accomplished via rotations during insert. • Splay trees achieve amortized running time of O(log N). • Accomplished via rotations during find.

Amortized running time • The analysis that allows us to conclude that we spend the same (or less) amount of time over a sequence of operations is called amortized analysis. • If we say that the amortized running time of a sequence of operations is O(f(N)): • Some operations might take more than O(f(N)), others less. • But the average over the entire sequence is O(f(N)).

Splay trees • Splay trees provide a guarantee that any sequence of M operations (starting from an empty tree) will require O(M log N) time. • Hence, each operation has amortized cost of O(log N). • It is possible that a single operation requires O(N) time. • But there are no bad sequences of operations on a splay tree.

A basic observation • Let’s suppose that a node requires O(N) time to find. • If and when this happens, we must move it somewhere closer to the root so that if we access it again, it will definitely require less than O(N) time. • If we don’t do this, then O(log N) amortized behavior is not possible.

Danny Sleator • Co-inventor of splay trees (with Robert Tarjan). • Winner of the Paris Kanellakis Award. • See his splay tree demo at http://www.cs.cmu.edu/~sleator • (And definitely a procrastinator.)

The basic idea • Every time a node is accessed, move it to the root. • The move can be accomplished by performing AVL rotations. • Practical benefits: • In practice, nodes are often accessed multiple times. • AVL rotations make trees more balanced.

Using rotations • Suppose we perform a find operation, which accesses node n. • Then perform AVL rotations, starting from n, to move it up to the root. • Doing this requires O(d) time, where d is the depth of n. • But a subsequent access of n will require only O(1) time, and the tree will be better balanced, too.

A simple example (move 5 to root)
0
 \
  1
   \
    2
     \
      3
       \
        4
         \
          5

Example Ctd.. This is the "move-to-root heuristic".
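The move-to-root heuristic can be sketched in a few lines. This is a minimal illustration, not the course's reference code: the names `Node`, `rotate_left`, `rotate_right`, `move_to_root` and `height` are assumptions made for this sketch. It rebuilds the chain 0..5 from the simple example and moves 5 to the root with single rotations.

```python
class Node:
    """Plain BST node (hypothetical helper class for this sketch)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rotate_right(y):
    """Lift y.left above y and return the new subtree root."""
    x = y.left
    y.left = x.right
    x.right = y
    return x

def rotate_left(x):
    """Lift x.right above x and return the new subtree root."""
    y = x.right
    x.right = y.left
    y.left = x
    return y

def move_to_root(root, key):
    """Search for key, then rotate it up one level at a time on the way back."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root                      # key not present
        root.left = move_to_root(root.left, key)
        return rotate_right(root) if root.left.key == key else root
    if root.right is None:
        return root                          # key not present
    root.right = move_to_root(root.right, key)
    return rotate_left(root) if root.right.key == key else root

def height(t):
    """Number of edges on the longest root-to-leaf path (-1 for an empty tree)."""
    return -1 if t is None else 1 + max(height(t.left), height(t.right))

# Build the right chain 0 -> 1 -> 2 -> 3 -> 4 -> 5 from the slide.
chain = None
for k in range(5, -1, -1):
    chain = Node(k, right=chain)

root = move_to_root(chain, 5)
print(root.key, height(root))
```

On this particular input, 5 ends up at the root but the tree is exactly as deep as before: 0 becomes the left child of 5 and the chain 0..4 hangs below it. Single rotations alone move the accessed node up without necessarily shortening the rest of the access path.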

Example 2 (move 2 to root)
        6
       / \
      1   7
     / \
    0   3
       / \
      2   5
         /
        4

Work sheet

Move to Root • Move-to-root() is an example of a "self-adjusting" heuristic • Every time we search for a node x, we do move-to-root(x) • Works well if we search for only a few different nodes • Idea is to try to maintain O(log n) performance over a sequence of operations.

Important Concept - Amortized Time • Question: Does applying move-to-root() on every search give amortized time bounded by O(log n) for any access pattern? • Answer:

Bottom Up Splaying

Basic Cases Zig-Zag:
  z            z              x
   \            \            / \
    y   ===>     x   ===>   z   y
   /              \
  x                y

Basic Cases Zig-Zig:
      z            y            x
     /            / \            \
    y   ===>     x   z   ===>     y
   /                               \
  x                                 z

Basic Cases • Zig
    y          x
   /   ===>     \
  x              y

Example
      6
     / \
    1   7
   / \
  0   2
       \
        3
         \
          5
         /
        4

Work area

Splaying, case 1 • There are four cases to consider. • Case 1: The node is already the root. • Nothing to do.

Splaying, case 2 • Case 2: Accessed node’s parent is the root. • Perform a single rotation. (Figure: before and after the rotation, with subtrees X, Y, Z.)

Splaying, cases 3 and 4 • The next two cases cover the situation in which the accessed node has a grandparent.

Splaying, case 3 • Case 3: Zig-zag (left). • Perform an AVL double rotation. (Figure: before and after, showing nodes a, b and subtrees X, Y1, Y2, Z.)

Splaying, case 4 • Case 4: Zig-zig (left). • Special rotation. (Figure: before and after, showing nodes a, b and subtrees W, X, Y, Z.)

Symmetry • And there are symmetric cases for zig-zag and zig-zig to the right.
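The four cases, including the symmetric ones, can be sketched as a bottom-up splay on nodes with parent pointers. This is an illustrative sketch under assumptions: the names `Node`, `rotate_up`, `splay` and `preorder` are chosen for this example and do not come from the slides. A zig-zig rotates the parent first and then the accessed node; a zig-zag rotates the accessed node twice.

```python
class Node:
    """BST node with a parent pointer (hypothetical helper for this sketch)."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Single rotation that lifts x above its parent p."""
    p, g = x.parent, x.parent.parent
    if p.left is x:                       # x is a left child: rotate right
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:                                 # x is a right child: rotate left
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:                     # re-hang x under the old grandparent
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Move x to the root using the zig, zig-zig and zig-zag cases."""
    while x.parent is not None:           # case 1: x is the root, nothing to do
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                  # case 2: zig (parent is the root)
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                  # case 4: zig-zig, parent first
            rotate_up(x)
        else:
            rotate_up(x)                  # case 3: zig-zag, double rotation
            rotate_up(x)
    return x

def preorder(t):
    """Keys in root, left, right order; enough to pin down the tree shape."""
    return [] if t is None else [t.key] + preorder(t.left) + preorder(t.right)

# Right chain 0 -> 1 -> ... -> 6, then splay the deepest node.
nodes = [Node(k) for k in range(7)]
for a, b in zip(nodes, nodes[1:]):
    a.right, b.parent = b, a

root = splay(nodes[6])
print(preorder(root))   # [6, 1, 0, 3, 2, 5, 4]
```

Rotating the parent first in the zig-zig case is what distinguishes splaying from plain move-to-root: on a long chain it roughly halves the depth of the nodes along the access path instead of leaving the chain intact.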

Splay tree example • Insert {0, 1, 2, 3, 4, 5, 6}, then find 6. (Figure: zig-zig right. The right chain 0-1-2-3-4-5-6 becomes the chain 0-1-2-3-6, with 5 the left child of 6 and 4 the left child of 5.)

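Splay tree example, cont’d (Figure: zig-zig right, since 6 is the right child of 3 and 3 is the right child of 2. The chain becomes 0-1-6; 3 is the left child of 6, with children 2 and 5, and 4 is the left child of 5.)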

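Splaying example, cont’d (Figure: zig-zig right, since 6 is the right child of 1 and 1 is the right child of 0. 6 becomes the root; 1 is its left child, with children 0 and 3.)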

Result of splaying
      6
     /
    1
   / \
  0   3
     / \
    2   5
       /
      4
• Access of 6 required N nodes visited and modified. • But now accessing 5 requires only about N/2 nodes visited and modified. • That access will also bring all nodes to within about N/4 of the root. (Try it!) • And all nodes are shallower. • Every access will tend to improve the tree for future operations.

Operations on splay trees • Search – find the element and splay the resulting node • Insertion – insert the element and splay the node just inserted

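Deletion • Splay the node we want to delete (say x) to the root, so that x has left subtree A and right subtree B. • Remove x, then find the rightmost node of A (say y) and splay y to the root of A. • Attach the parts: y now has no right child, so B becomes its right subtree.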
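Search, insertion and deletion as described above can be sketched on top of a splay-by-key routine. This is a sketch under stated assumptions, not the course's implementation: all names (`Node`, `splay`, `find`, `insert`, `delete`, `inorder`) are hypothetical, and `splay` here is a recursive variant that pairs rotations from the top of the search path, so intermediate tree shapes can differ from the bottom-up diagrams even though the accessed key still ends at the root.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y

def splay(t, key):
    """Bring key (or the last node on its search path) to the root of t."""
    if t is None or t.key == key:
        return t
    if key < t.key:
        if t.left is None:
            return t
        if key < t.left.key:                      # zig-zig (left)
            t.left.left = splay(t.left.left, key)
            t = rotate_right(t)
        elif key > t.left.key:                    # zig-zag (left)
            t.left.right = splay(t.left.right, key)
            if t.left.right is not None:
                t.left = rotate_left(t.left)
        return t if t.left is None else rotate_right(t)
    if t.right is None:
        return t
    if key > t.right.key:                         # zig-zig (right)
        t.right.right = splay(t.right.right, key)
        t = rotate_left(t)
    elif key < t.right.key:                       # zig-zag (right)
        t.right.left = splay(t.right.left, key)
        if t.right.left is not None:
            t.right = rotate_right(t.right)
    return t if t.right is None else rotate_left(t)

def find(t, key):
    """Search: splay, then report whether key is now at the root."""
    t = splay(t, key)
    return t, (t is not None and t.key == key)

def insert(t, key):
    """Ordinary BST insert, then splay the inserted node to the root."""
    if t is None:
        return Node(key)
    cur = t
    while cur.key != key:
        side = "left" if key < cur.key else "right"
        if getattr(cur, side) is None:
            setattr(cur, side, Node(key))
        cur = getattr(cur, side)
    return splay(t, key)

def delete(t, key):
    """Splay x to the root, splay the rightmost node of A, then attach B."""
    t = splay(t, key)
    if t is None or t.key != key:
        return t                                  # key absent: nothing removed
    a, b = t.left, t.right
    if a is None:
        return b
    y = a
    while y.right is not None:                    # rightmost node of A
        y = y.right
    a = splay(a, y.key)                           # y is now the root of A
    a.right = b                                   # y had no right child
    return a

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)
```

A short usage example: after `t = insert(t, k)` the root key is `k`, and after `t = delete(t, k)` the new root is the largest key smaller than `k` when such a key exists.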

Splaying summary • Splaying has the effect of moving the accessed node to the root. • It also reduces the depth of almost all of the nodes along the access path.

Properties of Splaying • Theorem: A sequence of M splay operations on a tree of N nodes takes time O(M log N). (Assuming M > N) • Proof: beyond the scope of this course. • Theorem: sequentially splaying all the nodes in the tree takes O(N) time. • Proof: beyond the scope of this course.

Analysis of splay trees • The analysis of the running time of splay trees is quite difficult. • Any single find or insert might take O(N) time. • But any sequence of M operations, starting from an empty tree, will take only O(M log N) time. • In practice, splay trees work extremely well. • http://www.cs.technion.ac.il/~itai/ds2/framesplay/splay.html

BST’s vs Hash Tables • Question: Hash tables have expected O(1) performance for lookups, inserts, and deletes. What is the use of search trees that can be O(log N) at best? • Answer:

Some nice things about BST’s • Find the maximum or minimum element • Find the successor (predecessor) of a given element in the set. • Searching and computing the "rank" in that ordering. • The RANK of an element in an ordered list is the number of elements before it in the list. • Splitting and joining (example from perl) • Prefix matching
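A few of these order-based queries can be sketched on a plain (unbalanced) BST. The names `Node`, `insert`, `minimum`, `maximum`, `successor` and `rank` are assumptions for this illustration, and the per-node `size` field is an added augmentation that the slides do not mention; it is what lets `rank` run in time proportional to the depth of the tree.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.size = 1                     # number of nodes in this subtree

def size(t):
    return 0 if t is None else t.size

def insert(t, key):
    """Plain BST insert that keeps subtree sizes up to date."""
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    elif key > t.key:
        t.right = insert(t.right, key)
    t.size = 1 + size(t.left) + size(t.right)
    return t

def minimum(t):
    """Smallest key: follow left links (t must be non-empty)."""
    while t.left is not None:
        t = t.left
    return t.key

def maximum(t):
    """Largest key: follow right links (t must be non-empty)."""
    while t.right is not None:
        t = t.right
    return t.key

def successor(t, key):
    """Smallest key strictly greater than key, or None if there is none."""
    best = None
    while t is not None:
        if key < t.key:
            best, t = t.key, t.left       # candidate found; look for a smaller one
        else:
            t = t.right
    return best

def rank(t, key):
    """Number of keys that come before key in sorted order."""
    r = 0
    while t is not None:
        if key <= t.key:
            t = t.left
        else:
            r += 1 + size(t.left)         # t and its whole left subtree precede key
            t = t.right
    return r

tree = None
for k in [50, 20, 70, 10, 30, 60, 80]:
    tree = insert(tree, k)

print(minimum(tree), maximum(tree), successor(tree, 30), rank(tree, 60))
```

Each query follows a single root-to-leaf path, so its cost is bounded by the depth of the tree. A hash table does not keep keys in order, so it has no comparable shortcut for these queries.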

Thursday • Dynamic Programming • Start HW2
