350 likes | 640 Views
13. External Sorting. Motivation 2-way External Sort: Memory, passes,cost General External Sort: Memory, passes, cost Optimizations Snowplow Double Buffering Forecasting Using a B+ tree index Bucket Sort Intergalactic Standard Reference Graefe, Implementing Sorting in Database Systems
E N D
13. External Sorting • Motivation • 2-way External Sort: Memory, passes,cost • General External Sort: Memory, passes, cost • Optimizations • Snowplow • Double Buffering • Forecasting • Using a B+ tree index • Bucket Sort • Intergalactic Standard Reference • Graefe, Implementing Sorting in Database Systems • http://portal.acm.org/citation.cfm?id=1132964
Learning Objectives • Derive formula for cost of external merge sort • Derive amount of memory needed to sort a file in 2 passes, using merge or bucket sort • Describe algorithm for generating longer initial runs and identify its best and worst cases • Describe forecasting and why it is useful • Identify when indexes should be used for sorting • Identify the pros and cons of external bucket vs merge sort.
Why sort? • A classic problem in computer science! • Exercises many software and hardware features • Data is often requested in sorted order • e.g., find students in increasing gpa order • Sorting is first step in bulk loading B+ tree index. • Sorting useful for some query processing algoritms (Chapter 14) • Problem: sort 1Gb of data with 1Mb of RAM. • why not virtual memory?
Sort algorithms? • If the data can fit in memory, which sort algorithm is best? • But most DBMS files will not fit in available memory • If the data is larger than memory, try the same alg. • Suppose • for this data, your sort alg. requires 220 random memory accesses • Memory access takes 1 microsec, disk takes 10 millisecs. • How much time is required to do your sort algorithm’s memory accesses? • If there is enough memory to hold the data? • If the data is four times the size of memory?
External Sorts • Definition: When data is larger than memory. • An aside: What is “Memory”? • Physical memory? The DBMS is not the only player • We concluded that most in-memory sort algs won’t be effective for external sorting. • What sort algorithms are best for external sort? • Sort-based • Hash-based
13. Sorting 2-Way External Merge Sort: Memory? • Pass 0: Read a page, sort it, write it. • How many buffer pages needed? • Pass 1, 2, 3, …, etc.: • How many buffer pages needed? INPUT 1 OUTPUT INPUT 2 Main memory buffers Result of Pass k+1 Result of Pass k
2-Way External Merge Sort: Passes? • Assume file is N pages • Run = sorted subfile • What happens in pass Zero? • How many runs are produced? • What is the cost in I/Os? • What happens in pass 1? • What happens in pass i? • How many passes are required? • What is the total cost?
13. Sorting Two-Way External Merge Sort: Cost 6,2 2 Input file 3,4 9,4 8,7 5,6 3,1 PASS 0 1,3 2 1-page runs 3,4 2,6 4,9 7,8 5,6 • Each pass we read + write each page in file. • N pages in the file => the number of passes • So total cost is: • Idea:Divide and conquer: sort subfiles and merge PASS 1 4,7 1,3 2,3 2-page runs 8,9 5,6 2 4,6 PASS 2 2,3 4,4 1,2 4-page runs 6,7 3,5 6 8,9 I/Os PASS 3 1,2 2,3 3,4 8-page runs 4,5 6,6 7,8 9
13. Sorting INPUT 1 . . . . . . INPUT 2 . . . OUTPUT INPUT B-1 Disk Disk B Main memory buffers General External Merge Sort • Suppose we have more than 3 buffer pages. • To sort a file with N pages using B buffer pages: • Pass 0: use B buffer pages. How many sorted runs of B pages each are produced? Cost? • Pass 1,2, …, etc.: merge B-1 runs. • How many runs are created after pass i? Cost of pass i? • How many passes? Total Cost?
13. Sorting Cost of External Merge Sort • Number of passes: • Cost = 2N * (# of passes) • E.g., with 5 buffer pages, to sort 108 page file • Number of passes is (1 + log4 108/5 ) = 4 • Cost is 2(108)*4 • Pass 0: Output is = 22 sorted runs of 5 pages each (last run is only 3 pages) • Pass 1: = 6 sorted runs of 20 pages each (last run is only 8 pages) • Pass 2: 2 sorted runs, 80 pages and 28 pages • Pass 3: Sorted file of 108 pages
How much memory is needed to sort a file in one or two passes? • N = number of data pages, B = memory pages available • To sort in One pass: N B • To Sort in Two passes: 1+logB-1N/B 2 • N/B (B-1)1 • Approximating B-1 by B, this yields • N B • For example, if pages are 4KBytes, a 4GByte file can be sorted in Two Passes with ? buffers.
Sorting in 2 passes: graphical proof Each run, 1xB pages • File is B pages wide • Each run is B pages • File is x pages high • Merge x runs in pass 1 • xB since we must merge x runs in B pages of memory • So N=xB BB or NB The File, N pages x B
Memory required for 2-pass sort Assuming page size of 4K
Can we always sort in 1 or 2 passes? • Assume only one query is active, and there is at least 1 gig of physical RAM. • Yes: DB2,Oracle, SQLServer, MySQL • They allocate all available memory to queries • Tricky to manage memory allocation as queries need more memory during execution • No: Postgres • Memory allocated to each query, for sort and other purposes, is fixed by a config parameter. • Sort memory is typically a few meg, in case there may be many queries executing.
Extremes of Sorting One Pass: N =B, 1-pass sort, cost = N N B N Original Data Sorted Data Quick-sort Two Pass: N = B2, 2-pass sort, cost = 3N B B N B 12..B B Sorted Data Original Data B Runs Merge Quicksort into B runs, length B
Extreme Problems • Most sorting of large data falls between these two extremes • If we apply the intergalactic standard sort-merge algorithm, in every textbook, the cost for any dataset with B<N<B2 will be 3M. • Must we always pay that large price? • Might there be an algorithm that is a hybrid of the two extremes?
Hybrid Sort when N 3B • The key idea of hybrid sort is don’t waste memory. • Here is an example of the hybrid sort algorithm when N is approximately 3B. One+ Pass: M 3B, 2-pass sort, cost = 3M – 2B B B N Sorted Data B B Original Data Runs Merge the runs on disk with the run in memory Quicksort into runs, length B, leaving the last run in memory
B B N Sorted Data 12..K B B 12..K Original Data -1 Merge the runs on disk with the run in memory Quicksort into runs, length B, leaving the last run in memory Runs B-k pages Hybrid Sort in general • Let k = N/B • Arrange R so the last run is B-k pages • Cost is N + 2(N – (B-k)) = 3N -2B + 2(N/B) = 3N – 2( B-N/B) • When N=B, cost is N+1. When N=B2, cost is 3N
13.3.1 Maximizing initial runs • Defn: making initial runs as long as possible. • Why is it helpful to maximize initial runs? • If initial run size is doubled, what is the time savings? • How can you maximize initial runs? • What algorithm is best?
13. Sorting Replacement Selection • Initialize empty priority queues CUR, NEXT • Read B buffer pages of data into CUR • Do • Pop record s with smallest key from CUR to current run • // key of s is now highest key in current run • If key of next input r >= key of s • //Can put in current run • insert r into CUR • else • insert r into NEXT • If CUR is empty • interchange NEXT and CUR and start a new run • Until (input is empty)
13. Sorting 2B More on Replacement Selection • Cf. Knuth, vol 3 [442], page 255. • Theorem: average length of a run in replacement sort is 2B • Worst-Case: • What is min length of a run? • How does this arise? • Best-Case: • What is max length of a run? • How does this arise? • Quicksort is faster, but ...
13. Sorting B How can we prove the Theorem? • Begin with some modeling assumptions • Data to be sorted are real numbers between 0 and 1 • Data appear at a uniform rate and distribution • A snowplow picks up one datum as one falls • Picking up a datum == pop( ) off the queue • Each datum is infinitesimal • Each run begins when the plow passes zero
13. Sorting B B 0 1 B B B 0 0 0 0 1 1 1 1 CURNEXT B 0 1 2B 2B
Snowplow: Conclusion • The figures on the previous page show that • At any time after run 0, the amount of snow = size of memory = B. • After the first run, the volume of snow removed in one circuit is 2B. • Cf. Larson and Graefe [471] • In spite of memory management problems, the snowplow optimization is very effective.
13. Sorting 13.4 I/O for External Merge Sort • What else can we do to improve performance? • We have assumed I/O is done a page at a time • Text suggests reading a blockof pages sequentially. • Pass 0: No problem • Pass 1,2,…: lowers fanin • Sometimes a win
INPUT 1 . . . . . . INPUT 2 . . . OUTPUT INPUT B-1 Disk Disk B Main memory buffers External Sort’s jerky behavior • Recall that each input is one page • What happens after the last record on a page is output?
13. Sorting Double Buffering • To reduce wait time for I/O request to complete, can prefetch into `shadow block’. • This could increase the number of passes • In practice, most files still sorted in 1-2 passes. INPUT 1 INPUT 1' INPUT 2 OUTPUT INPUT 2' OUTPUT' b block size Disk INPUT k Disk INPUT k' B main memory buffers, k-way merge
Forecasting A B C 3 5 14 34 1 33 45 55 4 6 7 9 • Cf. Knuth, vol 3, pg 324-7 • Double Buffering requires Doubling memory • What a huge waste! • Most shadow buffers lie idle, unused, wasted. • How can we forecast which shadow buffers will be needed first? • Forecasting can achieve performance of double buffering with little memory 88 91 93 99 50 65 74 83 56 57 58 59
13. Sorting Sorting Records! • Sorting has become a blood sport! • Parallel sorting is the name of the game ... • www.research.microsoft.com/barc/SortBenchmark
13. Sorting Using B+ Trees for Sorting • Scenario: Table to be sorted has B+ tree index on sorting column(s). • Idea: Can retrieve records in order by traversing leaf pages. • Is this a good idea? • Cases to consider: • B+ tree is clusteredGood idea! • B+ tree is not clusteredCould be a very bad idea!
Clustered B+ Tree Used for Sorting • Cost: root to the left-most leaf, then retrieve all leaf pages (Alternative 1) • If Alternative 2 is used? Additional cost of retrieving data records: each page fetched just once. Index (Directs search) Data Entries ("Sequence set") Data Records • Always better than external sorting!
Unclustered B+ Tree Used for Sorting • Alternative (2) for data entries; each data entry contains rid of a data record. In general, one I/O per data record! Index (Directs search) Data Entries ("Sequence set") Data Records
OUTPUT 1 Disk Bucket Sort • Suppose search key values are 0-K • B pages in memory, N pages in the file • Pass 0: Partition the file into B-1 intervals • Inervals are not runs! • If the interval fits in one page, sort it [0,K/(B-1)) . . . [K/(B-1),2K/(B-1)) OUTPUT 2 INPUT . . . OUTPUT B-1 [(B-2)K/(B-1),K) B Main memory buffers Disk
Bucket sort cost • What happens after pass 0 • ? intervals, each ? long • After pass 1? • ? intervals, each ? long • How much memory is required to sort in two passes? • Each interval is at most one page, or ? • Same as for external merge sort
External Merge Sort vs External Bucket Sort • Approximately the same I/O cost • Same memory requirement for two passes • Same number of passes required to sort • Bucket sort has less CPU cost • Bucketizing is much cheaper than sorting/merging • But bucket sort is subject to skew • Thus merge sort is used in practice