I/O-efficient Algorithms and Data Structures
Lars Arge, Professor and center director, Aarhus University
ALMADA summer school, August 5, 2013
I/O-Model
• Parameters
  - N = # elements in problem instance
  - B = # elements that fit in a disk block
  - M = # elements that fit in main memory
  - T = output size in searching problems
• We often assume that M > B^2
• I/O: movement of a block between memory and disk
[Figure: processor P, main memory M, block I/O, disk D]
Fundamental Bounds
                 Internal     External
• Scanning:      N            N/B
• Sorting:       N log N      (N/B) log_{M/B} (N/B)
• Permuting:     N            min{N, (N/B) log_{M/B} (N/B)}
• Searching:     log_2 N      log_B N
• Note:
  - Linear I/O: O(N/B)
  - Permuting not linear
  - Permuting and sorting bounds are equal in all practical cases
  - B factor VERY important: N/B < (N/B) log_{M/B} (N/B) ≪ N
  - Cannot sort optimally with search tree
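To get a feel for how far apart the bounds above are, the following sketch evaluates them numerically. The function names are illustrative, constant factors are ignored, and the sample values of N, M and B are assumptions chosen only for the arithmetic.

```python
import math

def scan_ios(n, b):
    """Linear I/O: number of blocks touched by a scan, ceil(N/B)."""
    return math.ceil(n / b)

def sort_ios(n, m, b):
    """Sorting bound (N/B) * log_{M/B}(N/B), constants ignored."""
    blocks = n / b
    # Guard with max(1, ...) so tiny inputs still cost one pass.
    return blocks * max(1.0, math.log(blocks, m / b))

def permute_ios(n, m, b):
    """Permuting bound min{N, sort(N)}."""
    return min(n, sort_ios(n, m, b))

def search_ios(n, b):
    """Searching bound log_B N."""
    return math.log(n, b)

if __name__ == "__main__":
    N, M, B = 2**30, 2**20, 2**10   # assumed sample values
    print(scan_ios(N, B), sort_ios(N, M, B), permute_ios(N, M, B), search_ios(N, B))
```

With these sample values the sorting bound is only about twice the scanning bound, while N itself is larger by a factor of roughly B/2, which is the point of the "B factor VERY important" note.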
External Sort
• Merge sort:
  - Create N/M memory-sized sorted runs
  - Merge runs together M/B at a time
    ⇒ O(log_{M/B} (N/M)) phases using O(N/B) I/Os each
• Distribution sort similar (but harder – split elements)
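The merge sort above can be simulated in memory to see the phase count. This is a minimal sketch, not an external implementation: Python lists stand in for disk-resident runs, `heapq.merge` stands in for the M/B-way merge, and the function name is an assumption.

```python
import heapq

def external_merge_sort(data, m, b):
    """Simulate external merge sort.

    Returns (sorted list, number of merge phases).
    m = memory size in elements, b = block size in elements.
    """
    # Run formation: sort memory-sized chunks of M elements each.
    runs = [sorted(data[i:i + m]) for i in range(0, len(data), m)]
    fan_in = max(2, m // b)          # merge M/B runs at a time
    phases = 0
    while len(runs) > 1:
        # One phase: every group of fan_in runs is merged into one run.
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
        phases += 1
    return (runs[0] if runs else []), phases
```

For 1000 elements with M = 16 and B = 4 this forms 63 runs and needs three 4-way merge phases (63 → 16 → 4 → 1), in line with the log_{M/B} (N/M) phase bound.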
Lower bounds
• Permuting N elements according to a given permutation takes Ω(min{N, (N/B) log_{M/B} (N/B)}) I/Os in “indivisibility” model
• Sorting N elements takes Ω((N/B) log_{M/B} (N/B)) I/Os in comparison model
• Proved using counting arguments
Cache-Oblivious Model
• N, B, and M as in I/O-model
• Assumed that M > B^2 (tall cache assumption)
• M and B not used in algorithm description
• Block transfers by optimal paging strategy
• Analyze in two-level model
  - Efficient on one level ⇒ efficient on all levels (of fully associative hierarchy using LRU)
Cache-Oblivious Distribution Sort
1. Break into √N subarrays of size √N
2. Recursively sort each subarray
3. Distribute into buckets of size ~√N
4. Recursively sort each bucket
• Distribution performed in O(N/B) I/Os ⇒ O((N/B) log_{M/B} (N/B)) I/Os in total
Sorting summary
• External merge or distribution sort takes O((N/B) log_{M/B} (N/B)) I/Os
  - Merge sort based on M/B-way merging
  - Distribution sort based on √(M/B)-way distribution and (complicated) split elements finding
• Also cache-oblivious using √N-way distribution
• Optimal in comparison model
• Ω(min{N, (N/B) log_{M/B} (N/B)}) lower bound in stronger indivisibility model
  - Holds even for permuting
Distribution sweeping
• Combination of distribution and plane sweep
• Used to solve batched geometric problems
• General idea:
  - Divide plane into M/B slabs with O(N/(M/B)) endpoints each
  - Sweep plane top-down while finding solutions involving objects in different slabs
  - Distribute data to M/B slabs and recurse in each slab
• Examples:
  - Orthogonal line segment intersection
  - Rectangle intersection
  - Homework
Outline
• Thursday:
  - Sorting upper bound (merge- and distribution-sort)
  - Sorting (permuting) lower bound
  - Cache-oblivious sorting
  - Batched geometric problems: Distribution sweeping
• Today:
  - Searching (B-trees)
  - Batched searching (Buffer-trees)
  - I/O-efficient priority queues
  - I/O-efficient list ranking
External Search Trees
• Binary search tree:
  - Standard method for search among N elements
  - We assume elements in leaves
  - Search traces at least one root-leaf path
• If nodes stored arbitrarily on disk
  ⇒ Search in O(log_2 N) I/Os
  ⇒ Rangesearch in O(log_2 N + T) I/Os
External Search Trees
• BFS blocking:
  - Block height O(log_2 N) / O(log_2 B) = O(log_B N)
  - Output elements blocked
  ⇒ Rangesearch in Θ(log_B N + T/B) I/Os
• Optimal: O(N/B) space and Θ(log_B N + T/B) query
Cache-Oblivious Search Tree
• “Van Emde Boas” recursive layout
[Figure: tree split at half height into a top subtree A and bottom subtrees B1, B2, ...; recursive memory layout: A, B1, B2, ...]
Cache-Oblivious Search Tree
• Search analysis:
  - Conceptually stop recursion when recursive subtrees have size between √B and B, so each recursive subtree fits in a block
  - Search path “hits” O(log_B N) blocks
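The recursive layout can be made concrete for a complete binary tree. The sketch below assumes nodes are numbered in heap (BFS) order starting at 1 and that the top subtree gets the rounded-down half of the height; both choices are assumptions for illustration, and the function name is hypothetical.

```python
def veb_layout(height, root=1):
    """Return the nodes of a complete binary tree in van Emde Boas order.

    Nodes use heap numbering: the children of node v are 2v and 2v+1.
    """
    if height <= 0:
        return []
    if height == 1:
        return [root]
    top_h = height // 2              # height of the top subtree A
    bottom_h = height - top_h        # height of each bottom subtree B_i
    order = veb_layout(top_h, root)  # lay out A first
    first = root << top_h            # leftmost root among the bottom subtrees
    for i in range(1 << top_h):      # then B_1, B_2, ... left to right
        order += veb_layout(bottom_h, first + i)
    return order
```

For height 4 the top subtree {1, 2, 3} comes first, followed by the four height-2 subtrees rooted at 4, 5, 6 and 7, each stored contiguously. No B or M appears in the code, which is what makes the layout cache-oblivious.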
Dynamic External Search Trees?
• Maintaining BFS blocking during updates?
  - Balance normally maintained in search trees using rotations
• Seems very difficult to maintain BFS blocking during rotation
  - Also need to make sure output (leaves) is blocked!
[Figure: rotation exchanging nodes x and y]
B-trees
• BFS-blocking naturally corresponds to tree with fan-out Θ(B)
• B-trees balanced by allowing node degree to vary
  - Rebalancing performed by splitting and merging nodes
(a,b)-tree
• T is an (a,b)-tree (a ≥ 2 and b ≥ 2a-1)
  - All leaves on the same level and contain between a and b elements
  - Except for the root, all nodes have degree between a and b
  - Root has degree between 2 and b
• (a,b)-tree uses linear space and has height O(log_a N)
• Choosing a,b = Θ(B), each node/leaf stored in one disk block
  ⇒ O(N/B) space and O(log_B N + T/B) query
[Figure: example (2,4)-tree]
(a,b)-Tree Insert
• Insert:
  Search and insert element in leaf v
  WHILE v has b+1 elements/children DO
    Split v: make nodes v’ and v’’ with ⌈(b+1)/2⌉ ≤ b and ⌊(b+1)/2⌋ ≥ a elements
    insert element (ref) in parent(v) (make new root if necessary)
    v = parent(v)
• Insert touches O(log_a N) nodes
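The insertion procedure can be sketched in memory as follows. This is an illustrative sketch under assumptions, not the lecture's code: the class and helper names are hypothetical, elements live in the leaves, each internal node keeps routing keys equal to the smallest element of each child except the first, and the split loop is expressed through recursion that returns the new sibling to the parent.

```python
import bisect

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys            # leaf: elements; internal: routing keys
        self.children = children    # None for a leaf

class ABTree:
    def __init__(self, a=2, b=4):
        assert a >= 2 and b >= 2 * a - 1
        self.a, self.b = a, b
        self.root = Node([])

    def insert(self, x):
        split = self._insert(self.root, x)
        if split:                   # root was split: make a new root
            key, right = split
            self.root = Node([key], [self.root, right])

    def _insert(self, v, x):
        if v.children is None:      # leaf: insert the element itself
            bisect.insort(v.keys, x)
            size = len(v.keys)
        else:                       # internal: descend, then absorb a child split
            i = bisect.bisect_right(v.keys, x)
            split = self._insert(v.children[i], x)
            if split:
                key, right = split
                v.keys.insert(i, key)
                v.children.insert(i + 1, right)
            size = len(v.children)
        return self._split(v) if size > self.b else None

    def _split(self, v):
        """Split an overfull node; return (routing key, new right sibling)."""
        if v.children is None:
            mid = len(v.keys) // 2
            right = Node(v.keys[mid:])
            v.keys = v.keys[:mid]
            return right.keys[0], right
        mid = len(v.children) // 2
        right = Node(v.keys[mid:], v.children[mid:])
        up = v.keys[mid - 1]        # routing key that moves to the parent
        v.keys, v.children = v.keys[:mid - 1], v.children[:mid]
        return up, right

def leaves(v):
    """Collect all elements in left-to-right leaf order."""
    if v.children is None:
        return list(v.keys)
    return [x for c in v.children for x in leaves(c)]
```

A node with b+1 entries is split into halves of ⌊(b+1)/2⌋ and ⌈(b+1)/2⌉ entries, both at least a because b ≥ 2a-1. Splits only ever add a new root at the top, so all leaves stay on the same level.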
(2,4)-Tree Insert
[Figure: example insertion into a (2,4)-tree]
(a,b)-Tree Delete
• Delete:
  Search and delete element from leaf v
  WHILE v has a-1 elements/children DO
    Fuse v with sibling v’:
      move children of v’ to v
      delete element (ref) from parent(v) (delete root if necessary)
    If v has > b (and ≤ a+b-1 < 2b) children, split v
    v = parent(v)
• Delete touches O(log_a N) nodes
(2,4)-Tree Delete
[Figure: example deletion from a (2,4)-tree]
(a,b)-Tree
• (a,b)-tree properties:
  - If b = 2a-1, every update can cause many rebalancing operations
  - If b ≥ 2a, an update only causes O(1) rebalancing operations amortized
  - If b > 2a, only O(1/(b/2 - a)) rebalancing operations amortized
    * Both somewhat hard to show
  - If b = 4a, easy to show that an update causes O((1/a) log_a N) rebalance operations amortized
    * After split during insert a leaf contains 4a/2 = 2a elements
    * After fuse during delete a leaf contains between 2a and 5a elements (split if more than 3a ⇒ between 3/2 a and 5/2 a)
[Figure: (2,3)-tree under alternating insert and delete]
B-tree Summary
• B-trees: (a,b)-trees with a,b = Θ(B)
  - O(N/B) space
  - O(log_B N + T/B) query
  - O(log_B N) update
• B-trees with elements in the leaves sometimes called B+-tree
• Construction in O((N/B) log_{M/B} (N/B)) I/Os
  - Sort elements and construct leaves
  - Build tree level-by-level bottom-up
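The bottom-up construction can be sketched as below. This is a simplified in-memory illustration with assumed names: every node is a plain list holding at most b entries, an internal node stores the smallest key of each of its children, and the last node on a level may be underfull, which a real B-tree construction would have to repair.

```python
def bulk_load(elements, b):
    """Build a B-tree-like structure level by level, bottom-up.

    Returns the list of levels, leaves first and root level last.
    """
    keys = sorted(elements)                       # step 1: sort the elements
    level = [keys[i:i + b] for i in range(0, len(keys), b)]   # construct leaves
    levels = [level]
    while len(level) > 1:
        # Each parent stores the minimum key of up to b consecutive children.
        mins = [node[0] for node in level]
        level = [mins[i:i + b] for i in range(0, len(mins), b)]
        levels.append(level)
    return levels
```

For 100 elements and b = 4 the levels have 25, 7, 2 and 1 nodes. After the initial sort each level is produced by a single scan of the level below, so the sort dominates the cost.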
Generalized B-tree
• B-tree with branching parameter b and leaf parameter k (b, k ≥ 8)
  - All leaves on same level and contain between k/4 and k elements
  - Except for the root, all nodes have degree between b/4 and b
  - Root has degree between 2 and b
• B-tree with leaf parameter k = Ω(B)
  - O(N/B) space
  - Height O(log_b (N/k))
  - O(1/k) amortized leaf rebalance operations
  - O((1/(b·k)) log_b (N/k)) amortized internal node rebalance operations
B-tree Sorting
• Normally we can sort N elements in O(N log N) time with a search tree
  - Insert all elements one-by-one (construct tree)
  - Output in sorted order using in-order traversal
• Same algorithm using B-tree uses O(N log_B N) I/Os
  - A factor of O(B log_B (M/B)) non-optimal
• We would like to have a dynamic data structure to use in algorithms
  ⇒ O((1/B) log_{M/B} (N/B)) I/Os per operation
Buffer-tree Technique
• Main idea: Logically group nodes together and add buffers
  - Insertions done in a “lazy” way: elements inserted in buffers
  - When a buffer runs full, elements are pushed one level down
  - Buffer-emptying in O(M/B) I/Os
    ⇒ every block touched a constant number of times on each level
    ⇒ inserting N elements (N/B blocks) costs O((N/B) log_{M/B} (N/B)) I/Os
[Figure: tree with fan-out M/B; each node has a buffer of M elements; blocks of size B]
Basic Buffer-tree
• Definition:
  - B-tree with branching parameter M/B and leaf parameter B
  - Size M buffer in each internal node
• Updates:
  - Add time-stamp to insert/delete element
  - Collect B elements in memory before inserting in root buffer
  - Perform buffer-emptying when buffer runs full
[Figure: node buffer of M elements, stored as m blocks of size B]
Basic Buffer-tree
• Note:
  - Buffer can be larger than M during recursive buffer-emptying
    * Elements distributed in sorted order ⇒ at most M elements in buffer unsorted
  - Rebalancing needed when “leaf-node” buffer emptied
    * Leaf-node buffer-emptying only performed after all full internal node buffers are emptied
Basic Buffer-tree
• Internal node buffer-empty:
  - Load first M (unsorted) elements into memory and sort them
  - Merge elements in memory with rest of (already sorted) elements
  - Scan through sorted list while
    * Removing “matching” insert/deletes
    * Distributing elements to child buffers
  - Recursively empty full child buffers
• Emptying buffer of size X takes O(X/B + M/B) = O(X/B) I/Os
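The scan at the heart of an internal buffer-emptying can be sketched in memory. The representation is an assumption made for illustration: each buffered operation is a tuple (key, time stamp, kind) with kind 'ins' or 'del', a single `sorted` call replaces the load-sort-merge steps, and `routers` are the hypothetical keys separating the children.

```python
import bisect

def empty_buffer(buffer, routers):
    """Cancel matching insert/delete pairs and distribute the rest.

    buffer:  list of (key, time, kind) with kind in {'ins', 'del'}.
    routers: sorted routing keys; child i receives keys k with
             routers[i-1] <= k < routers[i].
    Returns one list of surviving operations per child buffer.
    """
    ops = sorted(buffer)                 # by key, then by time stamp
    survivors = []
    for op in ops:
        prev = survivors[-1] if survivors else None
        if prev and prev[0] == op[0] and prev[2] == 'ins' and op[2] == 'del':
            survivors.pop()              # insert cancelled by a later delete
        else:
            survivors.append(op)
    children = [[] for _ in range(len(routers) + 1)]
    for op in survivors:                 # one sequential scan
        children[bisect.bisect_right(routers, op[0])].append(op)
    return children
```

Because the survivors are visited in sorted order, each child buffer receives its elements as one sorted run, which is why at most the first M elements of a buffer are ever unsorted.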
Basic Buffer-tree
• Buffer-empty of leaf node with K elements in leaves:
  - Sort buffer as previously
  - Merge buffer elements with elements in leaves
  - Remove “matching” insert/deletes, obtaining K’ elements
  - If K’ < K then
    * Add K-K’ “dummy” elements and insert in “dummy” leaves
    Otherwise
    * Place K elements in leaves
    * Repeatedly insert block of elements in leaves and rebalance
• Delete dummy leaves and rebalance when all full buffers emptied
Basic Buffer-tree
• Invariant: Buffers of nodes on path from root to emptied leaf-node are empty
  ⇒
• Insert rebalancing (splits) performed as in normal B-tree
• Delete rebalancing: v’ buffer emptied before fuse of v
  - Necessary buffer emptyings performed before next dummy-block delete
  - Invariant maintained
[Figures: split of v into v’ and v’’; fuse of v with sibling v’]
Basic Buffer-tree
• Analysis:
  - Not counting rebalancing, a buffer-emptying of node with X ≥ M elements (full) takes O(X/B) I/Os
    ⇒ total full node emptying cost O((N/B) log_{M/B} (N/B)) I/Os
  - Delete rebalancing buffer-emptying (non-full) takes O(M/B) I/Os
    ⇒ cost of one split/fuse O(M/B) I/Os
  - During N updates
    * O(N/B) leaf split/fuse
    * O(((N/B)/(M/B)) log_{M/B} (N/B)) internal node split/fuse
  ⇒ Total cost of N operations: O((N/B) log_{M/B} (N/B)) I/Os
Basic Buffer-tree
• Emptying all buffers after N insertions:
  - Perform buffer-emptying on all nodes in BFS-order
    ⇒ resulting full-buffer emptyings cost O((N/B) log_{M/B} (N/B)) I/Os
    ⇒ empty O((N/B)/(M/B)) non-full buffers using O(M/B) I/Os each ⇒ O(N/B) I/Os
• N elements can be sorted using buffer tree in O((N/B) log_{M/B} (N/B)) I/Os
Buffer-tree Summary
• Batching of operations on B-tree using M-sized buffers
  - O((1/B) log_{M/B} (N/B)) I/O updates amortized
  - All buffers emptied in O((N/B) log_{M/B} (N/B)) I/Os
• One-dim. rangesearch operations can also be supported in O((1/B) log_{M/B} (N/B) + T/B) I/Os amortized
  - Search elements handled lazily like updates
  - All elements in relevant sub-trees reported during buffer-emptying
  - Buffer-emptying in O(X/B + T’/B), where T’ is the number of reported elements
Orthogonal Line Segment Intersection
• Sweep plane top-down while maintaining search tree T on vertical segments crossing sweep line (by x-coordinates)
  - Top endpoint of vertical segment: Insert in T
  - Bottom endpoint of vertical segment: Delete from T
  - Horizontal segment: Perform range query with x-interval on T
  ⇒ O((N/B) log_{M/B} (N/B) + T/B) I/Os using buffer-tree
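The sweep itself is easy to sketch in internal memory. In the version below a sorted Python list with `bisect` stands in for the search tree T (the buffer tree in the external setting), and the tuple formats for the input segments are assumptions. At equal y-coordinates, top endpoints are processed before queries and bottom endpoints after them, so touching segments count as intersecting.

```python
import bisect

def segment_intersections(verticals, horizontals):
    """Report intersection points of axis-parallel segments.

    verticals:   list of (x, y_low, y_high)
    horizontals: list of (y, x_left, x_right)
    Returns a list of (x, y) points in sweep order.
    """
    events = []
    for x, y_low, y_high in verticals:
        events.append((-y_high, 0, x, None))    # top endpoint: insert
        events.append((-y_low, 2, x, None))     # bottom endpoint: delete
    for y, x_left, x_right in horizontals:
        events.append((-y, 1, x_left, x_right)) # horizontal: range query
    events.sort(key=lambda e: (e[0], e[1]))     # top-down, insert < query < delete

    active = []     # x-coordinates of vertical segments crossing the sweep line
    out = []
    for neg_y, kind, x, x_right in events:
        if kind == 0:
            bisect.insort(active, x)
        elif kind == 2:
            active.pop(bisect.bisect_left(active, x))
        else:
            lo = bisect.bisect_left(active, x)
            hi = bisect.bisect_right(active, x_right)
            out.extend((vx, -neg_y) for vx in active[lo:hi])
    return out
```

In the external version the queries are not answered immediately: they are inserted into the buffer tree like updates and reported lazily during buffer-emptying.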
Buffered Priority Queue
• Basic buffer tree can be used in external priority queue
• To delete minimal element:
  - Empty all buffers on leftmost path
  - Delete elements in leftmost leaf and keep in memory
  - Deletion of next M minimal elements free
  - Inserted elements checked against minimal elements in memory
• O((M/B) log_{M/B} (N/B)) I/Os every O(M) deletes ⇒ O((1/B) log_{M/B} (N/B)) I/Os amortized
List Ranking
• Problem:
  - Given N-vertex linked list stored in array
  - Compute rank (number in list) of each vertex
• One of the simplest graph problems one can think of
• Straightforward O(N) internal algorithm
  - Also uses O(N) I/Os in external memory
• Much harder to get O((N/B) log_{M/B} (N/B)) external algorithm
[Figure: example 10-vertex linked list stored in an array]
List Ranking
• We will solve more general problem:
  - Given N-vertex linked list with edge-weights stored in array
  - Compute sum of weights (rank) from start for each vertex
• List ranking: All edge weights one
• Note: Weight stored in array entry together with edge (next vertex)
[Figure: example list with all edge weights 1]
List Ranking
• Algorithm:
  1. Find and mark independent set of vertices
  2. “Bridge-out” independent set: Add new edges
  3. Recursively rank resulting list
  4. “Bridge-in” independent set: Compute rank of independent set
• Step 1, 2 and 4 in O((N/B) log_{M/B} (N/B)) I/Os
• Independent set of size αN for 0 < α ≤ 1
  ⇒ O((N/B) log_{M/B} (N/B)) I/Os
[Figure: example list before and after bridging out an independent set; bridged edges get weight 2]
List Ranking: Bridge-out/in
• Obtain information (edge or rank) of successor
  - Make copy of original list
  - Sort original list by successor id
  - Scan original and copy together to obtain successor information
  - Sort modified original list by id
  ⇒ O((N/B) log_{M/B} (N/B)) I/Os
List Ranking: Independent Set
• Easy to design randomized algorithm:
  - Scan list and flip a coin for each vertex
  - Independent set is vertices with heads whose successor has tails
    ⇒ Independent set of expected size N/4
• Deterministic algorithm:
  - 3-color vertices (no vertex has the same color as its predecessor/successor)
  - Independent set is vertices with most popular color
    ⇒ Independent set of size at least N/3
• 3-coloring ⇒ O((N/B) log_{M/B} (N/B)) I/O algorithm
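The four-step algorithm with the randomized independent set can be sketched in memory as follows. Dictionaries replace the sort-and-scan steps of the external version, and several conventions are assumptions for illustration: the head of the list gets rank 0, the head is never placed in the independent set, a missing successor counts as tails, and the coin flips are repeated if the set comes out empty.

```python
import random

def list_rank(succ, weight, rng=None):
    """succ: vertex -> successor (None at the tail).
    weight: vertex -> weight of the edge to its successor.
    Returns vertex -> sum of edge weights from the head.
    """
    if not succ:
        return {}
    rng = rng or random.Random(0)
    pred = {s: v for v, s in succ.items() if s is not None}
    if len(succ) <= 2:
        # Base case: walk the short list from its head.
        v = next(u for u in succ if u not in pred)
        rank = {v: 0}
        while succ[v] is not None:
            rank[succ[v]] = rank[v] + weight[v]
            v = succ[v]
        return rank
    # Step 1: independent set = heads whose successor shows tails.
    indep = set()
    while not indep:
        coin = {v: rng.random() < 0.5 for v in succ}
        indep = {v for v in succ
                 if v in pred and coin[v] and not coin.get(succ[v], False)}
    # Step 2: bridge out; the predecessor's edge skips the removed vertex.
    new_succ = {v: s for v, s in succ.items() if v not in indep}
    new_weight = {v: weight.get(v, 0) for v in new_succ}
    for v in indep:
        p = pred[v]
        new_succ[p] = succ[v]
        new_weight[p] = weight[p] + weight.get(v, 0)
    # Step 3: recursively rank the contracted list.
    rank = list_rank(new_succ, new_weight, rng)
    # Step 4: bridge in, using the original edge weights.
    for v in indep:
        rank[v] = rank[pred[v]] + weight[pred[v]]
    return rank
```

Two adjacent vertices can never both be selected, since the first requires its successor to show tails and the second requires itself to show heads. The result does not depend on the coin flips; only the number of recursion levels does.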
List Ranking: 3-coloring
• Algorithm:
  - Consider forward and backward lists (heads/tails in two lists)
  - Color forward lists (except tail) alternately red and blue
  - Color backward lists (except tail) alternately green and blue
  ⇒ 3-coloring
List Ranking: Forward List Coloring
• Identify heads and tails
• For each head, insert red element in priority-queue (priority = position)
• Repeatedly:
  - Extract minimal element from queue
  - Access and color corresponding element in list
  - Insert opposite color element corresponding to successor in queue
• Scan of list
• O(N) priority-queue operations
  ⇒ O((N/B) log_{M/B} (N/B)) I/Os
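A small in-memory sketch of the forward list coloring follows, with `heapq` standing in for the external priority queue. The input format is an assumption: `succ[i]` is the array position of the successor of the vertex stored at position i, or None at the end of the linked list. An edge is "forward" when it leads to a larger position, and the tail of each forward list is left uncolored, as on the slide.

```python
import heapq

def color_forward_lists(succ):
    """Color all forward lists (except their tails) alternately red and blue."""
    n = len(succ)
    pred = [None] * n
    for i, s in enumerate(succ):
        if s is not None:
            pred[s] = i

    def forward_out(i):          # edge leaving i goes to a larger position
        return succ[i] is not None and succ[i] > i

    def forward_in(i):           # edge entering i comes from a smaller position
        return pred[i] is not None and pred[i] < i

    color = [None] * n
    # Heads of forward lists: leave on a forward edge, do not arrive on one.
    heap = [(i, 'red') for i in range(n) if forward_out(i) and not forward_in(i)]
    heapq.heapify(heap)
    while heap:
        i, c = heapq.heappop(heap)   # positions come out in increasing order
        color[i] = c
        nxt = succ[i]
        if forward_out(nxt):         # successor is not the tail of this list
            heapq.heappush(heap, (nxt, 'blue' if c == 'red' else 'red'))
    return color
```

Every queue entry is addressed to a position larger than the one just processed, so the positions are extracted in increasing order and the list array is accessed as a single scan.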
List Ranking Summary
• Simplest graph problem: Traverse linked list
  - Very easy O(N) algorithm in internal memory
  - Much more difficult O((N/B) log_{M/B} (N/B)) algorithm in external memory
    * Finding independent set via 3-coloring
    * Bridging vertices in/out
• Permuting bound best possible
  - Also true for other graph problems
List Ranking Summary
• External list ranking algorithm similar to PRAM algorithm
  - Sometimes external algorithms by “PRAM algorithm simulation”
• Forward list coloring algorithm example of “time forward processing”
  - Use external priority-queue to send information “forward in time” to vertices to be processed later
• PRAM algorithm ideas and time forward processing used in many graph algorithms
References
• External Memory Geometric Data Structures (sections 1-5). Lecture notes by L. Arge.
• I/O-Efficient Graph Algorithms (sections 1-2). Lecture notes by N. Zeh.
• Cache-Oblivious Algorithms. M. Frigo, C. Leiserson, H. Prokop, and S. Ramachandran. Proc. FOCS '99.
Summary
• We discussed
  - Sorting
  - Hardness of permuting (sorting)
  - Batched geometric problems
  - Searching: B-trees, buffer-trees, priority-queue
  - Graph algorithms: List-ranking
  - Cache-oblivious algorithms
• Important techniques
  - Multiway merging/distribution
  - Distribution sweeping
  - Multiway-trees, buffered data structures
  - Time-forward processing (and PRAM simulation)
Thanks large@cs.au.dk www.madalgo.au.dk