7.11 External Sorting
• Access to secondary storage is orders of magnitude slower than memory access.
• Minimize access to secondary storage (tape or disk).
• It may also be necessary to read data sequentially (tapes).
7.11 External Sorting
• Simple merge example: sorting M records at a time (M = 3), with 4 tapes (Ta1, Ta2, Tb1, Tb2)
Ta1: 81 94 11 ; 96 12 35 ; 17 99 28 ; 58 41 75 ; 15
Ta2: empty
Tb1, Tb2: empty
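7.11 External Sorting
After run construction (runs written alternately to Tb1 and Tb2):
Ta1, Ta2: empty
Tb1: 11 81 94 ; 17 28 99 ; 15
Tb2: 12 35 96 ; 41 58 75
After the first merge pass:
Ta1: 11 12 35 81 94 96 ; 15
Ta2: 17 28 41 58 75 99
Tb1, Tb2: empty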
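The tape states above can be reproduced with a small simulation. This is only a sketch under stated assumptions: each tape is modelled as a Python list of runs, and the helper names `make_runs`, `distribute`, and `merge_pass` are hypothetical, not taken from the slides.

```python
from heapq import merge

def make_runs(records, m):
    """Sort m records at a time; each sorted chunk is one run."""
    return [sorted(records[i:i + m]) for i in range(0, len(records), m)]

def distribute(runs, k):
    """Write runs round-robin onto k tapes (a tape is a list of runs)."""
    tapes = [[] for _ in range(k)]
    for i, run in enumerate(runs):
        tapes[i % k].append(run)
    return tapes

def merge_pass(src):
    """Merge the i-th runs of all source tapes, then write the merged
    runs round-robin onto the same number of destination tapes."""
    longest = max(len(tape) for tape in src)
    merged = [
        list(merge(*(tape[i] for tape in src if i < len(tape))))
        for i in range(longest)
    ]
    return distribute(merged, len(src))

ta1 = [81, 94, 11, 96, 12, 35, 17, 99, 28, 58, 41, 75, 15]
tb = distribute(make_runs(ta1, 3), 2)   # run construction, M = 3
ta = merge_pass(tb)                     # first two-way merge pass

print(tb)  # [[[11, 81, 94], [17, 28, 99], [15]], [[12, 35, 96], [41, 58, 75]]]
print(ta)  # [[[11, 12, 35, 81, 94, 96], [15]], [[17, 28, 41, 58, 75, 99]]]
```

A real tape sort would stream records instead of holding whole runs in memory; the lists here only make the run boundaries easy to see.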
7.11 External Sorting
• Read M records at a time and sort them internally.
• A set of sorted records is called a run.
• Two-way merging requires $\lceil \log_2(N/M) \rceil$ merge passes, plus the initial run-constructing pass.
• Example: 10 million records of 128 bytes each, with 4 MB of internal memory:
  $N = 10 \times 10^6$, $M = 4 \times 10^6 / 128 = 31{,}250$ records
  number of runs = $N/M = 320$
  number of passes = $\lceil \log_2 320 \rceil + 1 = 9 + 1 = 10$
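The arithmetic in this example can be checked with a few lines of integer code. This is a sketch that assumes "4 MB" means 4 × 10^6 bytes, as the slide's own calculation does; the function names are hypothetical. Repeatedly halving the run count (rounding up) gives the same result as the ceiling of the base-2 logarithm without any floating-point rounding.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)

def two_way_passes(n_records, record_bytes, memory_bytes):
    """Return (M, number of runs, total passes) for a two-way merge sort."""
    m = memory_bytes // record_bytes      # records that fit in memory
    runs = ceil_div(n_records, m)         # runs produced by the first pass
    merge_passes = 0
    remaining = runs
    while remaining > 1:                  # each pass halves the run count
        remaining = ceil_div(remaining, 2)
        merge_passes += 1
    return m, runs, merge_passes + 1      # +1 for run construction

print(two_way_passes(10_000_000, 128, 4_000_000))  # (31250, 320, 10)
```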
7.11 External Sorting
After the second merge pass:
Ta1, Ta2: empty
Tb1: 11 12 17 28 35 41 58 75 81 94 96 99
Tb2: 15
After the third merge pass:
Ta1: 11 12 15 17 28 35 41 58 75 81 94 96 99
Ta2: empty
Tb1, Tb2: empty
7.11 External Sorting
• Multiway Merge
• Use k input devices instead of just 2.
• E.g., k = 3 for the previous example:
Ta1: 81 94 11 ; 96 12 35 ; 17 99 28 ; 58 41 75 ; 15
Ta2, Ta3: empty
Tb1, Tb2, Tb3: empty
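7.11 External Sorting
After run construction (runs written in rotation to Tb1, Tb2, Tb3):
Ta1, Ta2, Ta3: empty
Tb1: 11 81 94 ; 41 58 75
Tb2: 12 35 96 ; 15
Tb3: 17 28 99
After the first 3-way merge pass:
Ta1: 11 12 17 28 35 81 94 96 99
Ta2: 15 41 58 75
Ta3: empty
Tb1, Tb2, Tb3: empty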
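With more than two inputs, the smallest of the k current records has to be found at every step. One common way to do this is a min-heap holding one entry per input run; the slides do not say which method they intend, so the following is a sketch under that assumption, with `k_way_merge` as a hypothetical name.

```python
import heapq

def k_way_merge(runs):
    """Merge any number of sorted runs using a min-heap.

    Each heap entry is (value, run index, position in that run), so the
    heap never holds more than one record per input run.
    """
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    out = []
    while heap:
        value, i, pos = heapq.heappop(heap)   # smallest current record
        out.append(value)
        if pos + 1 < len(runs[i]):            # advance within the same run
            heapq.heappush(heap, (runs[i][pos + 1], i, pos + 1))
    return out

# First runs on Tb1, Tb2, Tb3, then the second runs on Tb1 and Tb2.
print(k_way_merge([[11, 81, 94], [12, 35, 96], [17, 28, 99]]))
# [11, 12, 17, 28, 35, 81, 94, 96, 99]
print(k_way_merge([[41, 58, 75], [15]]))
# [15, 41, 58, 75]
```

The two results match the runs shown on Ta1 and Ta2 after the first 3-way merge pass.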
7.11 External Sorting
After the second merge pass:
Ta1, Ta2, Ta3: empty
Tb1: 11 12 15 17 28 35 41 58 75 81 94 96 99
Tb2, Tb3: empty
• A k-way merge requires $\lceil \log_k(N/M) \rceil$ merge passes, plus the initial run-constructing pass.
• For $N = 10 \times 10^6$, $M = 4 \times 10^6 / 128$, and k = 5: number of passes = $\lceil \log_5 320 \rceil + 1 = 4 + 1 = 5$
Skip rest of Chapter 7
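The pass counts for the 13-record example and for the 10-million-record case can be cross-checked by simulation. This is a sketch, not a tape implementation: runs are kept in one Python list, each pass merges groups of k consecutive runs (which is what round-robin distribution over k tapes amounts to), and the names `external_sort` and `merge_passes` are hypothetical.

```python
from heapq import merge

def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)

def merge_passes(runs, k):
    """Number of k-way merge passes needed to reduce `runs` runs to one."""
    passes = 0
    while runs > 1:
        runs = ceil_div(runs, k)
        passes += 1
    return passes

def external_sort(records, m, k):
    """Simulate a k-way merge sort; return (sorted records, total passes)."""
    runs = [sorted(records[i:i + m]) for i in range(0, len(records), m)]
    passes = 1                                # run-constructing pass
    while len(runs) > 1:
        runs = [list(merge(*runs[i:i + k])) for i in range(0, len(runs), k)]
        passes += 1
    return (runs[0] if runs else []), passes

data = [81, 94, 11, 96, 12, 35, 17, 99, 28, 58, 41, 75, 15]
for k in (2, 3):
    result, passes = external_sort(data, 3, k)
    print(k, passes, result == sorted(data))  # 2 4 True, then 3 3 True

print(merge_passes(320, 5) + 1)               # 5
```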