470 likes | 488 Views
Dive into the world of data structures to learn about arrays, linked lists, binary search trees, heaps, and hashing. Understand the operations, complexities, and when to choose the best structure. No absolute answers, just insights to help you decide!
E N D
Advanced Data Structure Hackson Leung 2009-01-10
Outline • Review • General Array • Linked List • Sorted Array • Main dish • Binary Search Tree • Heap (aka Priority Queue) • Hash Table • Dessert?
Review Something you should have known
Review • Data Structure – way to store data • GOOD or BAD? • No absolute answer • Operations Supported • Find an element T inside • Find the minimum element inside • Insert a new element T in it • Remove the Tth element inside • Remove the minimum element inside
Review • GOOD or BAD? • No absolute answer • WHY? • Bringing small money is troublesome • Most shops accept money • Octopus is convenient in payment • Not many shops accept it • Money OR Octopus? Depends • Which structure is the best? • Also Depends!
Review • GOOD or BAD? • No absolute answer IF we consider many • Definite IF only consider one operation • For one operation, time complexity • Still remember that? • O(1), O(lgN), O(N), O(NlgN), O(N2)...
Review - Array • If you don’t know that, you won’t be here • Find an element T • Search the whole array, O(N) • Find the minimum • Search the whole array, O(N) • Insert an element T • A[cnt] = T; cnt = cnt + 1; O(1) • Remove the Tth element • O(1). Why? • Remove the minimum element • O(N). Why?
Review – Linked List • Find an element T • Search the whole list, O(N) • Find the minimum • Search the whole list, O(N) • Insert an element T • O(1). Why? • Remove the element with pointer T • O(1), with better implementation. • Remove the minimum element • O(N). Why?
Review – Sorted Array • Find an element T • Binary search, O(lgN) • Find the minimum • Still need to search?!?! No way! O(1) • Insert an element T • Where to add? O(lgN) • Maintain sorted, O(N) shift • Remove the Tth element • Delete + Shift, O(N) • Remove the minimum element • Delete + Shift, O(N) (Can it be O(1)???)
Review – Summery • If we perform a few number of operations, fine • If we perform a lot...... • NONE ARE EFFICIENT!
Binary Search Tree I suppose you have learnt tree before
Binary Search Tree • Make use of Binary Tree • Binary Tree? At most 2 children! • One more property • Left Subtree < Node < Right Subtree 11 8 15 4 9 20
Binary Search Tree • Insert 11, 8, 15, 9, 20, 4 11 8 15 4 9 20
Binary Search Tree • Insert 11, 8, 15, 9, 20, 4 11 8 15 4 9 20
Binary Search Tree • Find 9 11 8 15 4 9 20
Binary Search Tree • Find 14 11 8 15 4 9 20
Binary Search Tree • Remove • Leaf Node Very easy • Single Child Still easy • Push that child upward • 2 Children? • Find Max(Min) at Left(Right) Subtree • Replace to that node • For that removal of node... • Leaf? • Single? • 2 Children?
Binary Search Tree • Remove • Leaf Node Very easy • Single Child Still easy • Push that child upward • 2 Children? • Replace by Min(Max) at Left(Right) Subtree • Either leaf of single child case • Declare that node to be deleted instead of actually deleting (Lazy Deletion)
Binary Search Tree • Remove • Declare that node to be deleted instead of actually deleting (Lazy Deletion) • Delete 11 11 8 15 4 9 20
Binary Search Tree • Remove • Declare that node to be deleted instead of actually deleting (Lazy Deletion) • Delete 11 8 15 4 9 20
Binary Search Tree • Summary • Insert is similar to Find • Find Minimum: Left Most • Remove Min: Find Min + Remove • Complexities • Find: O(lgN) • Find Min: O(lgN) • Remove Min: O(lgN) • Insert: O(lgN) • Remove: O(lgN)
Binary Search Tree • World of lgN! • It is supposed to be lgN only. • Upper bounds...? • Example...? • Reason...? • Solution • AVL Tree, Red Black Tree • By using rotations • Rarely used + difficult to implement
Heap Priority is given to...
Heap (Priority Queue) • Binary Tree (Usually complete) • Queue only supports • Enqueue (Insert) • Dequeue (Remove Min) • One property • Node value is greater(smaller) than all its descendants • Max(Min) Heap
Heap (Priority Queue) • How to implement? Array! • Suppose 1-based array • For an node at Tth position • Parent = T/2 • Left = 2T • Right = 2T+1
Heap (Priority Queue) • Enqueue • Just insert at the end • Shift up until property is satisfied
Heap (Priority Queue) • Dequeue • Replace top element with the last • Shift down until property is satisfied
Heap (Priority Queue) • Build a heap • Each node’s height: lgN • For each node, do shift down • O(NlgN)??? • For a binary tree with N nodes • There are at most ceil(N/2h+1) nodes with height h • Complexity in terms of h • = O(2N) • Therefore it is O(N)
Heap (Priority Queue) • Summary • Find is not supported • O(N) search whole array • Is remove supported? • Any subtree is also a heap • Remove Min on that subtree • Complexity • Find: O(N) • Find Min: O(1) • Remove Min: O(lgN) • Insert: O(lgN) • Remove: O(lgN)
Hash Table Memory is finite, but input range is NOT
Hash Table • Simple Problem • Mark Six result: 6 integers in [1..49] • Check if each number matches my bet • Most efficient approach? • Answer • Use boolean array with 49 cells • Check if that cell is set for each number • O(1)
Hash Table • Simple Problem...? • Mark Six result: 6 integers in [1..232] • Check if each number matches my bet • Most efficient approach? • Answer...? • Use boolean array with 232 cells • 4GB memory already • Even O(1) checking... Too much memory required
Hash Table • Suppose we are not checking numbers but strings instead... • Suppose we are not checking numbers but images instead... • Solution • Compress the range by Hash Function • Convert the range into an integer
Hash Table • Idea • We need to store values between 0 to 99, but we have only 10 cells • We can compress the range [0, 99] to [0, 9] by taking the modulo 10. It is called Hash Value • Insert, Find and Deleting are O(1)
Hash Table • Problem • What if inserting both 11 and 31? • Collision problem • Solutions • Chaining (Open Hashing) • Open Addressing (Close Hashing)
Hash Table - Chaining • Each cell becomes a linked list • On average, Insert/Find/Remove takes O(1+α), where α is the Load Factor = # stored / # cells • For random Hash Function, usually average case
Hash Table – Open Addressing • Find until a blank cell is reached • Remove must use Lazy Deletion or Find may fail in some cases
Hash Table – Open Addressing • Previous method is called Linear Probing • Quadratic Probing is also popular • Table size MUST be prime • In OI, Linear Probing is enough • Complexities • Find: O(1 / 1-α) • Insert: O(1 / 1-α) • Remove: O(ln(1 / 1- α)/α + 1/α)
Hash Table – Summary • Find Min and Remove Min is usually not supported unless you search the whole tree or array • Complexities (Chaining) • Find: O(1 + α) • Insert: O(1 + α) • Remove: O(1 + α) • Complexities (Open Addressing) • Find: O(1 / 1-α) • Insert: O(1 / 1-α) • Remove: O(ln(1 / 1- α)/α + 1/α) • If α is small (<50%), almost O(1)
Summary Compare what we have learnt
Summary • Choice of data structures are depending on the problem nature • What operations are intensive is the critical factor of choosing them
Extras Read only if you have digested main dishes
Extra Topics • Double Hashing • Use of two hash functions • Binary Indexed Tree • IOI 2001 Mobiles • Segment Tree • IOI 1998 Picture • Quad Tree • Rarely used
Tasks Practise makes Perfect
Tasks • HKOJ 1020 – Left Join • HKOJ 1021 – Inner Join • HKOJ 1019 – Addition II • HKOJ 1090 – Dilligent • HKOJ 3061 – Tappy World • NOI 2004 Day 1 - Cashier
Q & A Curiosity?