
I/O-efficient Algorithms and Data Structures

Lars Arge, Professor and center director, Aarhus University. ALMADA summer school, August 1, 2013.


Presentation Transcript


  1. I/O-efficient Algorithms and Data Structures
     Lars Arge, Professor and center director, Aarhus University
     ALMADA summer school, August 1, 2013

  2. Massive Data
     • Pervasive use of computers and sensors
     • Increased ability to acquire/store/process data
       → Massive data collected everywhere
     • Society increasingly “data driven”
       → Access/process data anywhere, any time
     • Nature/Science special issues (2/06, 9/08, 2/11):
       • Scientific data size growing exponentially, while quality and availability are improving
       • Paradigm shift: science will be about mining data
     • Obviously not only in the sciences. Economist 02/10:
       • From 150 billion gigabytes five years ago to 1200 billion today
       • Managing the data deluge is difficult; doing so will transform business/public life

  3. Random Access Machine Model
     • Standard theoretical model of computation:
       • Infinite memory
       • Uniform access cost
     • Simple model crucial for success of computer industry
     [Figure: RAM]

  4. Hierarchical Memory
     • Modern machines have complicated memory hierarchy
       • Levels get larger and slower further away from CPU
       • Data moved between levels using large blocks
     [Figure: memory hierarchy; labels: L1, L2, RAM]

  5. Slow I/O
     • Disk access is 10^6 times slower than main memory access
       “The difference in speed between modern CPU and disk technologies is analogous to the difference in speed in sharpening a pencil using a sharpener on one’s desk or by taking an airplane to the other side of the world and using a sharpener on someone else’s desk.” (D. Comer)
     • Disk systems try to amortize the large access time by transferring large contiguous blocks of data
     • Important to store/access data to take advantage of blocks (locality)
     [Figure: disk drive; labels: read/write head, read/write arm, track, magnetic surface]

  6. Scalability Problems
     • Most programs developed in RAM-model
       • Run on large datasets because OS moves blocks as needed
     • Modern OSes utilize sophisticated paging and prefetching strategies
       • But if a program makes scattered accesses, even a good OS cannot take advantage of block access
       → Scalability problems!
     [Figure: plot of running time against data size]

  7. External Memory Model
     N = # of items in the problem instance
     B = # of items per disk block
     M = # of items that fit in main memory
     T = # of items in output
     I/O: Move block between memory and disk
     We assume (for convenience) that M > B^2
     [Figure: processor P, main memory M, disk D, block I/O between M and D]

  8. Fundamental Bounds
                     Internal     External
     • Scanning:     N            N/B
     • Sorting:      N log N      (N/B) log_{M/B}(N/B)
     • Permuting:    N            min{N, (N/B) log_{M/B}(N/B)}
     • Searching:    log_2 N      log_B N
     • Note:
       • Linear I/O: O(N/B)
       • Permuting not linear
       • Permuting and sorting bounds are equal in all practical cases
       • B factor VERY important: N/B < (N/B) log_{M/B}(N/B) << N
       • Cannot sort optimally with search tree
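To get a feel for the external bounds, the following minimal sketch evaluates them without constant factors. N and B are the values used in the block-access example of the next slide; the memory size M = 10^8 items is a hypothetical value chosen only so that M > B^2 holds, and the function names are illustrative.

```python
import math

def scan_ios(N, B):
    """Blocks read when scanning N items: ceil(N/B)."""
    return -(-N // B)

def sort_ios(N, M, B):
    """(N/B) * log_{M/B}(N/B), with the logarithm taken as at least 1."""
    n = N / B
    return n * max(1.0, math.log(n) / math.log(M / B))

def permute_ios(N, M, B):
    """min{N, (N/B) log_{M/B}(N/B)}."""
    return min(N, sort_ios(N, M, B))

def search_ios(N, B):
    """log_B N, the height of a search tree with fanout B."""
    return math.log(N) / math.log(B)

N, B = 256 * 10**6, 8000
M = 10**8  # hypothetical memory size, not taken from the slides

print("scanning :", scan_ios(N, B))
print("sorting  :", round(sort_ios(N, M, B)))
print("permuting:", round(permute_ios(N, M, B)))
print("searching:", round(search_ios(N, B), 2))
```

For these parameters the sorting bound is only slightly above the scanning bound and far below N, which is the sense in which the B factor matters.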

  9. Scalability Problems: Block Access Matters
     • Example: Traversing linked list (List ranking)
       • Array size N = 10 elements
       • Disk block size B = 2 elements
       • Main memory size M = 4 elements (2 blocks)
       [Figure: two layouts of the list in the array. Algorithm 1: N = 10 I/Os. Algorithm 2: N/B = 5 I/Os]
     • The difference between N and N/B is large since the block size is large
       • Example: N = 256 x 10^6, B = 8000, 1 ms disk access time
         N I/Os take 256 x 10^3 sec = 4266 min = 71 hr
         N/B I/Os take 256/8 sec = 32 sec
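The list traversal example can be replayed with a small simulation. This is a sketch under assumptions: main memory is modeled as an LRU cache of M/B blocks, and the two layouts below are hypothetical (the exact layouts in the slide's figure are not recoverable from the transcript); they are chosen so that one scatters consecutive list elements across blocks and the other stores them together.

```python
from collections import OrderedDict

def count_ios(layout, B, M):
    """Count block reads when visiting list elements in rank order 1..N.

    layout[p] is the rank of the element stored at array position p.
    Memory holds M // B blocks and evicts the least recently used one.
    """
    position = {rank: p for p, rank in enumerate(layout)}
    cache = OrderedDict()            # block id -> None, oldest first
    capacity = M // B
    ios = 0
    for rank in range(1, len(layout) + 1):
        block = position[rank] // B
        if block in cache:
            cache.move_to_end(block)     # hit: refresh recency
        else:
            ios += 1                     # miss: one I/O to fetch the block
            cache[block] = None
            if len(cache) > capacity:
                cache.popitem(last=False)
    return ios

# Hypothetical layouts (assumptions, not read off the figure).
scattered = [1, 5, 2, 6, 3, 8, 9, 4, 7, 10]
blocked = [1, 2, 10, 9, 5, 6, 3, 4, 8, 7]

print(count_ios(scattered, B=2, M=4))   # every access misses
print(count_ios(blocked, B=2, M=4))     # one miss per block

# Time estimate from the slide: 1 ms per I/O.
N, B, access_s = 256 * 10**6, 8000, 0.001
print(N * access_s / 3600, "hours for N I/Os")
print((N // B) * access_s, "seconds for N/B I/Os")
```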

  10. Outline
     • Today:
       • Sorting upper bound (merge- and distribution-sort)
       • Sorting (permuting) lower bound
       • Cache-oblivious sorting (if time permits)
       • Batched geometric problems: Distribution sweeping
     • Monday:
       • Searching (B-trees)
       • Batched searching (Buffer-trees)
       • I/O-efficient priority queues
       • I/O-efficient list ranking

  11. Queues and Stacks
     • Queue:
       • Maintain push and pop blocks in main memory
         → O(1/B) I/Os per Push/Pop operation, amortized
     • Stack:
       • Maintain push/pop blocks in main memory
         → O(1/B) I/Os per Push/Pop operation, amortized
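A minimal sketch of the stack, assuming the disk can be modeled as a Python list of blocks and an I/O as one block read or write. The class name `ExternalStack` and the choice of a 2B-item in-memory buffer are illustrative assumptions; the buffer size is what keeps alternating pushes and pops at a block boundary from triggering an I/O each time.

```python
class ExternalStack:
    """Stack that keeps at most 2B items in memory and counts block I/Os."""

    def __init__(self, B):
        self.B = B
        self.buffer = []   # top of the stack, held in main memory
        self.disk = []     # full blocks of B items each (simulated disk)
        self.ios = 0

    def push(self, item):
        self.buffer.append(item)
        if len(self.buffer) > 2 * self.B:
            # Write the B oldest buffered items to disk as one block.
            self.disk.append(self.buffer[:self.B])
            self.buffer = self.buffer[self.B:]
            self.ios += 1

    def pop(self):
        if not self.buffer and self.disk:
            # Read the most recently written block back into memory.
            self.buffer = self.disk.pop()
            self.ios += 1
        if not self.buffer:
            raise IndexError("pop from empty stack")
        return self.buffer.pop()

stack = ExternalStack(B=10)
for i in range(1000):
    stack.push(i)
popped = [stack.pop() for _ in range(1000)]
print(stack.ios, "I/Os for 2000 operations")
```

After any I/O at least B further operations are needed before the next one, which gives the amortized O(1/B) bound. A queue can be handled the same way with separate blocks for the two ends.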

  12. Sorting
     • <M/B sorted lists (queues) can be merged in O(N/B) I/Os
       (M/B blocks in main memory)
     • Unsorted list (queue) can be distributed using <M/B split elements in O(N/B) I/Os

  13. Sorting
     • Merge sort:
       • Create N/M memory sized sorted lists
       • Repeatedly merge lists together Θ(M/B) at a time
       → O(log_{M/B}(N/M)) phases using O(N/B) I/Os each
       → O((N/B) log_{M/B}(N/B)) I/Os
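The merge sort outline can be sketched as follows. This is an in-memory illustration under assumptions: runs are ordinary Python lists, `heapq.merge` stands in for the multiway merge that keeps one block per run in memory, and the I/O figure is an accounting estimate (each pass reads and writes every block once) rather than a measurement. The fan-in of M/B - 1 leaves one block for output.

```python
import heapq
import random

def external_merge_sort(items, M, B):
    """Return (sorted_items, merge_passes, estimated_ios)."""
    n_blocks = -(-len(items) // B)          # ceil(N/B)
    # Run formation: sort memory-sized chunks (one read + one write pass).
    runs = [sorted(items[i:i + M]) for i in range(0, len(items), M)]
    ios = 2 * n_blocks
    fan_in = max(2, M // B - 1)             # runs merged at a time
    passes = 0
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
        ios += 2 * n_blocks                 # each pass scans all data once
        passes += 1
    return (runs[0] if runs else []), passes, ios

data = list(range(1000))
random.Random(1).shuffle(data)
result, passes, ios = external_merge_sort(data, M=100, B=10)
print(passes, "merge passes,", ios, "estimated I/Os")
```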

  14. Sorting
     • Distribution sort (multiway quicksort):
       • Compute Θ(M/B) split elements
       • Distribute unsorted list into Θ(M/B) unsorted lists of equal size
       • Recursively split lists until they fit in memory
       → O(log_{M/B}(N/M)) phases
       → O((N/B) log_{M/B}(N/B)) I/Os if split elements are computed in O(N/B) I/Os
     • Can compute sqrt(M/B) split elements in O(N/B) I/Os
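A sketch of the distribution sort recursion, with one loudly stated substitution: split elements are picked from a random sample, which is an assumption made for brevity and not the deterministic O(N/B) split-element computation the slide refers to. Lists are plain Python lists and no I/Os are counted; the point is the structure (split, distribute, recurse until a list fits in memory).

```python
import bisect
import random

def distribution_sort(items, M, B, rng=None):
    """Multiway quicksort sketch; splitters come from a random sample."""
    rng = rng or random.Random(0)
    if len(items) <= M:
        return sorted(items)                     # fits in memory
    k = max(2, M // B)                           # target number of buckets
    sample = sorted(rng.sample(items, min(len(items), 4 * k)))
    splitters = sorted({sample[i * len(sample) // k] for i in range(1, k)})
    buckets = [[] for _ in range(len(splitters) + 1)]
    for x in items:
        # Bucket i receives x with splitters[i-1] < x <= splitters[i].
        buckets[bisect.bisect_left(splitters, x)].append(x)
    out = []
    for bucket in buckets:
        if len(bucket) == len(items):
            out.extend(sorted(bucket))           # degenerate split (equal keys)
        else:
            out.extend(distribution_sort(bucket, M, B, rng))
    return out

rng = random.Random(42)
data = [rng.randrange(10_000) for _ in range(5000)]
print(distribution_sort(data, M=64, B=8) == sorted(data))
```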

  15. Permuting Lower Bound
     Permuting N elements according to a given permutation takes Ω(min{N, (N/B) log_{M/B}(N/B)}) I/Os in the “indivisibility” model
     • Indivisibility model: Move of elements is the only allowed operation
     • Note:
       • We can allow copies (and destruction of elements)
       • Bound is also a lower bound on sorting
     • Proof:
       • View memory and disk as array of N tracks of B elements
       • Assume all I/Os are track aligned (assumption can be removed)
     [Figure: disk D and main memory M]

  16. Permuting Lower Bound
     • Array contains permutation of N elements at all times
     • We will count how many permutations can be reached (produced) with t I/Os
     • Input:
       • Choose track: N possibilities
       • Rearrange ≤ B elements in track and place among ≤ M-B elements in memory:
         • binom(M,B) · B! possibilities if “fresh” track
         • binom(M,B) otherwise
       → at most (N · binom(M,B))^t · (B!)^{N/B} permutations after t inputs
     • Output: Choose track: N possibilities
     [Figure: disk D and main memory M]

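  17. Permuting Lower Bound
     • Permutation algorithm needs to be able to produce N! permutations:
         (N · binom(M,B))^t · (B!)^{N/B} ≥ N!
       → t · (log N + B log(M/B)) = Ω(N log(N/B))
         (using Stirling's formula and log binom(M,B) = O(B log(M/B)))
       → t = Ω( N log(N/B) / (log N + B log(M/B)) )
     • If log N ≤ B log(M/B) we have t = Ω((N/B) log_{M/B}(N/B))
     • If log N > B log(M/B) we have B < sqrt(N) and thus t = Ω(N log(N/B) / log N) = Ω(N)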

  18. Sorting Lower Bound
     • Similar argument but assuming comparison model in internal memory
     • Initially N! possible orderings
     • Count how many are possible after t I/Os
     → Sorting N elements takes Ω((N/B) log_{M/B}(N/B)) I/Os in comparison model

  19. Summary/Conclusion: Sorting
     • External merge or distribution sort takes O((N/B) log_{M/B}(N/B)) I/Os
       • Merge sort based on M/B-way merging
       • Distribution sort based on sqrt(M/B)-way distribution and (complicated) split elements finding
     • Key is linear I/O M/B-way merging/distribution
     • Optimal in comparison model
     • Ω(min{N, (N/B) log_{M/B}(N/B)}) lower bound in stronger indivisibility model
       • Holds even for permuting


  21. Cache-Oblivious Model
     • N, B, and M as in I/O-model
     • M and B not used in algorithm description
     • Block transfers by optimal paging strategy
     • Analyze in two-level model
       → Efficient on one level, efficient on all levels (of fully associative hierarchy using LRU)
     • Assumed that M > B^2 (tall cache assumption)

  22. I/O-model Distribution Sort
     • O(log_{M/B}(N/B)) levels
     • Distribution in O(N/B) I/Os
     → O((N/B) log_{M/B}(N/B)) algorithm

  23. Cache-Oblivious Distribution Sort
     • General idea: sqrt(N)-way distribution
     • Distribution performed in O(N/B) accesses after
       • breaking the N elements into sqrt(N) subarrays of size sqrt(N)
       • sorting each subarray recursively
     → Recurrence for number of accesses used to sort solves to O((N/B) log_{M/B}(N/B))

  24. Cache-Oblivious Distribution Sort
     • Distribution in O(N/B) accesses (assuming buckets known):
       • Divide subarrays into two equal-sized groups A1 and A2
       • Divide buckets into two equal-sized groups B1 and B2
       • Recursively:
         1. Distribute (relevant) elements from A1 into B1
         2. Distribute (relevant) elements from A2 into B1
         3. Distribute elements from A1 into B2
         4. Distribute elements from A2 into B2
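The four-way recursion can be sketched as below. Assumptions: the subarrays are already sorted, the buckets are given by a hypothetical list `uppers` of increasing upper key limits (with the last limit at least the largest key), and the number of subarrays equals the number of buckets and is a power of two. The recursion order guarantees that when subarray i is matched with bucket j, everything it owed to earlier buckets has already been moved, so a single cursor per subarray suffices.

```python
def distribute(subarrays, uppers):
    """Distribute sorted subarrays into buckets; bucket j takes keys <= uppers[j]
    that did not fit an earlier bucket."""
    m = len(subarrays)
    assert m > 0 and m == len(uppers) and (m & (m - 1)) == 0
    buckets = [[] for _ in uppers]
    cursor = [0] * m                  # next unread position in each subarray

    def rec(i, j, size):
        """Move elements of subarrays i..i+size-1 into buckets j..j+size-1."""
        if size == 1:
            sub = subarrays[i]
            while cursor[i] < len(sub) and sub[cursor[i]] <= uppers[j]:
                buckets[j].append(sub[cursor[i]])
                cursor[i] += 1
            return
        half = size // 2
        rec(i, j, half)                   # A1 -> B1
        rec(i + half, j, half)            # A2 -> B1
        rec(i, j + half, half)            # A1 -> B2
        rec(i + half, j + half, half)     # A2 -> B2

    rec(0, 0, m)
    return buckets

subarrays = [[1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15], [4, 8, 12, 16]]
print(distribute(subarrays, uppers=[4, 8, 12, 16]))
```

No block or memory size appears in the code, which is the cache-oblivious aspect: the recursion eventually reaches subproblems small enough to fit in whatever cache is present.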

  25. Cache-Oblivious Distribution Sort
     • Analysis of distribution:
       • Consider recursive subproblems on B subarrays
         • (sqrt(N)/B)^2 = N/B^2 such problems
       • Each subproblem solved in O(B + X/B) accesses, where X is the number of elements it moves
       → O((N/B^2) · B + N/B) = O(N/B) accesses

  26. Batched Geometric Problems
     • Example: Orthogonal line segment intersection
       • Given set of axis-parallel line segments, report all intersections
     • In internal memory many problems are solved using sweeping

  27. Plane Sweeping
     • Sweep plane top-down while maintaining search tree T on vertical segments crossing sweep line (by x-coordinates)
       • Top endpoint of vertical segment: Insert in T
       • Bottom endpoint of vertical segment: Delete from T
       • Horizontal segment: Perform range query with x-interval on T
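An internal-memory sketch of the sweep. Assumptions: vertical segments are tuples (x, y_top, y_bottom), horizontal segments are tuples (x_left, x_right, y), and a sorted Python list maintained with `bisect` stands in for the search tree T (insertion into a list is linear, so this illustrates the logic rather than the O(N log N + T) running time). Touching endpoints are counted as intersections.

```python
import bisect

def orthogonal_intersections(verticals, horizontals):
    """Return (vertical_index, horizontal_index) pairs that intersect."""
    INSERT, QUERY, DELETE = 0, 1, 2   # order of events at the same y
    events = []
    for vi, (x, y_top, y_bottom) in enumerate(verticals):
        events.append((-y_top, INSERT, vi))
        events.append((-y_bottom, DELETE, vi))
    for hi, (x_left, x_right, y) in enumerate(horizontals):
        events.append((-y, QUERY, hi))
    events.sort()                     # negated y gives a top-down sweep

    active = []                       # sorted (x, index) of segments on the sweep line
    found = []
    for _, kind, idx in events:
        if kind == INSERT:
            bisect.insort(active, (verticals[idx][0], idx))
        elif kind == DELETE:
            active.remove((verticals[idx][0], idx))
        else:
            x_left, x_right, _ = horizontals[idx]
            lo = bisect.bisect_left(active, (x_left, -1))
            hi_pos = bisect.bisect_right(active, (x_right, len(verticals)))
            found.extend((vi, idx) for _, vi in active[lo:hi_pos])
    return found

verticals = [(1, 5, 0), (3, 4, 2), (6, 3, 1)]
horizontals = [(0, 4, 3), (2, 7, 1), (0, 7, 6)]
print(orthogonal_intersections(verticals, horizontals))
```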

  28. Plane Sweeping
     • In internal memory algorithm runs in optimal O(N log N + T) time
     • In external memory algorithm performs badly (>N I/Os) if |T| > M
       • Solution: Distribution sweeping
         • Combination of distribution and plane sweeping

  29. Distribution Sweeping
     • Divide plane into M/B-1 slabs with O(N/(M/B)) endpoints each
     • Sweep plane top-down while reporting intersections between
       • part of horizontal segment spanning slab(s) and vertical segments
     • Distribute data to M/B-1 slabs
       • vertical segments and non-spanning parts of horizontal segments
     • Recurse in each slab

  30. Distribution Sweeping
     • Sweep performed in O(N/B+T’/B) I/Os → O((N/B) log_{M/B}(N/B) + T/B) I/Os
     • Maintain active list of vertical segments for each slab (<B elements of each list in memory)
       • Top endpoint of vertical segment: Insert in active list
       • Horizontal segment: Scan through all relevant active lists
         • Removing “expired” vertical segments
         • Reporting intersections with “non-expired” vertical segments
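One level of the sweep can be sketched as follows, in memory and without I/O counting. Assumptions: `boundaries` is a hypothetical sorted list of slab boundaries so that slab i covers [boundaries[i], boundaries[i+1]), every vertical segment lies strictly inside the outer boundaries, and segments use the tuple formats (x, y_top, y_bottom) and (x_left, x_right, y). Only intersections inside slabs completely spanned by a horizontal segment are reported here; the non-spanning end pieces would be passed to the recursive calls, which are omitted.

```python
import bisect

def sweep_level(verticals, horizontals, boundaries):
    """Report (vertical_index, horizontal_index) for spanned slabs only."""
    n_slabs = len(boundaries) - 1
    active = [[] for _ in range(n_slabs)]     # one active list per slab
    events = [(-y_top, 0, vi) for vi, (_, y_top, _) in enumerate(verticals)]
    events += [(-y, 1, hi) for hi, (_, _, y) in enumerate(horizontals)]
    events.sort()                             # top-down; inserts before queries
    reported = []
    for _, kind, idx in events:
        if kind == 0:
            x = verticals[idx][0]
            slab = bisect.bisect_right(boundaries, x) - 1
            active[slab].append(idx)          # top endpoint: insert
        else:
            x_left, x_right, y = horizontals[idx]
            first = bisect.bisect_left(boundaries, x_left)        # first boundary >= x_left
            last = bisect.bisect_right(boundaries, x_right) - 1   # last boundary <= x_right
            for slab in range(first, last):   # slabs fully spanned by the segment
                # Scan the active list: drop expired segments, report the rest.
                live = [vi for vi in active[slab] if verticals[vi][2] <= y]
                reported.extend((vi, idx) for vi in live)
                active[slab] = live
    return reported

boundaries = [0, 10, 20, 30]
verticals = [(5, 8, 2), (15, 9, 1), (15, 3, 0), (25, 6, 4)]
horizontals = [(8, 22, 5), (0, 30, 2)]
print(sweep_level(verticals, horizontals, boundaries))
```

Expired segments are removed lazily during a scan, so each removal is paid for once and every other scanned segment yields a reported intersection, which is the reason a sweep costs O(N/B + T'/B) I/Os when the lists are stored blockwise.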

  31. Distribution Sweeping
     • Other example: Rectangle intersection
       • Given set of axis-parallel rectangles, report all intersections

  32. Distribution Sweeping
     • Divide plane into M/B-1 slabs with O(N/(M/B)) endpoints each
     • Sweep plane top-down while reporting intersections between
       • part of rectangles spanning slab(s) and other rectangles
     • Distribute data to M/B-1 slabs
       • Non-spanning parts of rectangles
     • Recurse in each slab

  33. Distribution Sweeping
     • Seems hard to perform sweep in O(N/B+T’/B) I/Os
     • Solution: Multislabs (contiguous sets of slabs)
       • Reduce fanout of distribution to Θ(sqrt(M/B))
       • Recursion height still O(log_{M/B}(N/B))
       • Room for block from each multislab (active list) in memory

  34. Distribution Sweeping
     • Sweep while maintaining rectangle active list for each multislab
       • Top side of spanning rectangle: Insert in active multislab list
       • Each rectangle: Scan through all relevant multislab lists
         • Removing “expired” rectangles
         • Reporting intersections with “non-expired” rectangles
     → O((N/B) log_{M/B}(N/B) + T/B) I/Os

  35. Homework
     Design an I/O-efficient algorithm that, given N rectangles in the plane, computes the measure (area) of their union. Make sure to argue for the correctness and I/O-complexity of your algorithm.
     Hint: First find, for each input y-coordinate y, the length of the intersection between the rectangles and a horizontal line through y. Use distribution sweeping with a combine step to do so.


  37. References
     • A. Aggarwal and J.S. Vitter. Input/Output Complexity of Sorting and Related Problems. CACM 31(9), 1988.
     • M.T. Goodrich, J-J. Tsay, D.E. Vengroff, and J.S. Vitter. External-Memory Computational Geometry. Proc. FOCS '93.
     • M. Frigo, C. Leiserson, H. Prokop, and S. Ramachandran. Cache-Oblivious Algorithms. Proc. FOCS '99.

  38. Center for Massive Data Algorithmics
     • High level objectives:
       • Advance algorithmic knowledge in “massive data” processing area
       • Train researchers in world-leading international environment
       • Be catalyst for multidisciplinary/industry collaboration
     • Building on:
       • Strong international team
       • Vibrant international environment
     • Focus areas (among others):
       • I/O-efficient algorithms
       • Streaming algorithms
       • Cache-oblivious algorithms
       • Algorithm engineering
     We are hiring!!

  39. Thanks large@cs.au.dk www.madalgo.au.dk
