
DATA STRUCTURE



  1. DATA STRUCTURE Instructor: Dai Min Office: XNA602 Fall 2006

  2. CHAPTER 7 Sorting • What is sorting • Insertion Sorting • Insertion sort • Shell sort • Exchange Sorting • Bubble sort • QuickSort • Selection Sorting • Selection Sort • HeapSort • MergeSort • Radix Sort (Bucket Sort)

  3. Sorting Problem • Sorting: For a given list of records (R1, R2, …, Rn), where Ri = Ki + data, arrange the elements so that they are in • ascending order K1 <= K2 <= ... <= Kn or in • descending order K1 >= K2 >= ... >= Kn • An ordered list is a list in which each entry has a key, and the keys are in order. • Stable: if i < j and Ki = Kj and Ri precedes Rj in the sorted list, then the sorting algorithm is stable.

  4. Internal sort vs. external sort • Categories: • Insertion sort and Shell sort • Bubble sort and QuickSort • Selection Sort and HeapSort • MergeSort and Bucket Sort • Some O(n²) sorting schemes • easy to understand and to implement • not very efficient, especially for large data sets

  5. Criteria: To analyze sorting algorithms, we consider both the number of comparisons of keys and the number of times entries must be moved inside a list, particularly in a contiguous list. • # of key comparisons • # of data movements • Both the worst-case and the average performance of a sorting algorithm are of interest. To find the average, we shall consider what would happen if the algorithm were run on all possible orderings of the list.

  6. Sortable Lists • Structure of a data element:
typedef struct {
    int key;              // sort key
    elemtype otherinfo;   // other data fields of the record
} Snode;
• Structure of a sortable list:
typedef struct {
    Snode R[Maxsize+1];   // R[1..length] holds the records to be sorted; R[0] is workspace
    int length;           // number of records
} SList;

  7. 7.1 Insertion Sorting 7.1.1 Insertion Sorting 1) Basic idea: Insertion sorts are based on repeatedly inserting a new element into an already sorted list. • The list is regarded as two parts: a sorted sublist and an un-sorted sublist. Initially, the sorted sublist is [R1] and the un-sorted sublist is [R2, …, Rn]; i references the first element of the un-sorted sublist. • While i <= n, repeat: the first element of the un-sorted sublist, Ri, is inserted into its proper position among the already sorted [R'1, R'2, ..., R'i-1].

  8. At the i-th stage, Ri is inserted into its proper position among the already sorted R'1, R'2, ..., R'i-1. • Compare Ki with each of these elements, starting from the right end, and shift them to the right as necessary. • Use array position 0 to store a copy of Ki to prevent "falling off the left end" in these right-to-left scans.

  9.–10. Example: insertion sort of the six keys 21, 25, 49, 25*, 16, 08 stored in positions 1–6 (25* marks the second key equal to 25; the key being inserted is copied to temp, i.e. position 0). Passes i = 2 and i = 3 move nothing; pass i = 4 inserts 25* before 49; pass i = 5 inserts 16 at the front of the sorted prefix; pass i = 6 inserts 08, giving 08, 16, 21, 25, 25*, 49.

  11.–12. The sorting process when i = 5 in detail: temp = R[0] = 16; with the sorted prefix 21, 25, 25*, 49, the scan runs j = 4, 3, 2, 1, shifting 49, 25*, 25 and 21 one position to the right in turn; at j = 0 the sentinel stops the loop (R[0].key < R[0].key is false) and 16 is written into position j + 1 = 1.

  13. 3) Algorithm:
void InsertSort(SqList &L)
{   int i, j;
    for (i = 2; i <= L.length; i++)
        if (L.R[i].key < L.R[i-1].key)       // out of order: insert R[i] into the sorted prefix
        {   L.R[0] = L.R[i];                 // R[0] is a sentinel
            for (j = i-1; L.R[0].key < L.R[j].key; j--)
                L.R[j+1] = L.R[j];           // shift the record one position to the right
            L.R[j+1] = L.R[0];               // insert at the correct position
        }
}
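For readers who want to run the routine, here is a minimal, self-contained plain-C sketch of the same sentinel-based insertion sort. It assumes a simplified SList that keeps only the key field (the otherinfo member from slide 6 is omitted) and uses a pointer instead of the C++-style reference; the driver uses the keys from the example above.

#include <stdio.h>
#define Maxsize 20

typedef struct { int key; } Snode;                 /* simplified: key only, no otherinfo */
typedef struct { Snode R[Maxsize + 1]; int length; } SList;

/* Same sentinel-based insertion sort as above, written against SList with a pointer. */
void InsertSort(SList *L) {
    for (int i = 2; i <= L->length; i++)
        if (L->R[i].key < L->R[i - 1].key) {
            L->R[0] = L->R[i];                     /* R[0] is the sentinel */
            int j;
            for (j = i - 1; L->R[0].key < L->R[j].key; j--)
                L->R[j + 1] = L->R[j];             /* shift the record to the right */
            L->R[j + 1] = L->R[0];                 /* drop the saved record into place */
        }
}

int main(void) {
    int keys[] = {21, 25, 49, 25, 16, 8};          /* records occupy R[1..6] */
    SList L;
    L.length = 6;
    for (int i = 0; i < L.length; i++) L.R[i + 1].key = keys[i];
    InsertSort(&L);
    for (int i = 1; i <= L.length; i++) printf("%d ", L.R[i].key);   /* 8 16 21 25 25 49 */
    printf("\n");
    return 0;
}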

  14. 4) Analysis: O(n²) • Time Complexity: • Best case (when the list is already in order): • Comparisons of keys: n-1 • Moves of elements: 0 • Worst case (when the list is in reverse order): • Comparisons of keys: about n²/2 • Moves of elements: about n²/2 • Average case: • Comparisons of keys: about n²/4 • Moves of elements: about n²/4 • Space Complexity: O(1) • Insertion sort is stable.

  15. 5) Variations • Binary insertion sort • sequential search --> binary search • reduces the # of comparisons; the # of moves is unchanged • List insertion sort • array --> linked list • still a sequential search, but moves --> 0
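As an illustration of the first variation, a possible C sketch of binary insertion sort on a plain 0-indexed int array (not code from the slides) is:

/* Binary insertion sort on a[0..n-1]: binary search finds the insertion point
   (fewer comparisons), but the records still have to be shifted one by one. */
void binary_insert_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int temp = a[i], low = 0, high = i - 1;
        while (low <= high) {                 /* binary search in the sorted prefix a[0..i-1] */
            int mid = (low + high) / 2;
            if (a[mid] > temp) high = mid - 1;
            else low = mid + 1;               /* equal keys go after: keeps the sort stable */
        }
        for (int j = i - 1; j >= low; j--)    /* shift a[low..i-1] right by one position */
            a[j + 1] = a[j];
        a[low] = temp;
    }
}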

  16. 7.1.2 Shell Sort 1) Basic idea: group the list into several sublists and apply insertion sort in each sublist; continue in this manner, and finally apply insertion sort to all records. • First, group the list into several sublists by interval d1. • Apply insertion sort in each sublist. • Group the list again by interval d2 (d2 < d1 < n), and apply insertion sort in each sublist. • Continue in this manner, until dt = 1.

  17. Example: keys 49 38 65 97 76 13 27 48 55 04 • Take d1 = 5; after grouping (elements 5 apart form one sublist) and sorting each sublist: 13 27 48 55 04 49 38 65 97 76 • Take d2 = 3; after grouping and sorting each sublist: 13 04 48 38 27 49 55 65 97 76 • Take d3 = 1; after the final insertion sort: 04 13 27 38 48 49 55 65 76 97

  18. At the very start of Shell sorting, d is large and the size of each sublist is small, so insertion sort in every sublist is efficient. As d becomes smaller, the size of every sublist grows larger, but the sublists have become roughly ordered, so insertion sort in every sublist is still efficient. • How to determine d: • Shell: d1 = ⌊n/2⌋, di+1 = ⌊di/2⌋, …, dt = 1 • Knuth: di+1 = (di - 1)/3 • choosing the di to be prime numbers also works; in any case, dt = 1
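As a concrete illustration, a minimal C sketch of Shell sort follows. It assumes Shell's original halving sequence d = n/2, n/4, …, 1 (not the 5, 3, 1 sequence used in the slide 17 example) and sorts the keys from that example.

#include <stdio.h>

/* Shell sort on a[0..n-1] with gaps d = n/2, n/4, ..., 1.
   Each pass is an insertion sort over elements that are d apart. */
void shell_sort(int a[], int n) {
    for (int d = n / 2; d >= 1; d /= 2)          /* gap sequence */
        for (int i = d; i < n; i++) {            /* d-sorted insertion */
            int temp = a[i], j;
            for (j = i - d; j >= 0 && a[j] > temp; j -= d)
                a[j + d] = a[j];                 /* shift within the sublist */
            a[j + d] = temp;
        }
}

int main(void) {
    int a[] = {49, 38, 65, 97, 76, 13, 27, 48, 55, 4};
    int n = sizeof a / sizeof a[0];
    shell_sort(a, n);
    for (int i = 0; i < n; i++) printf("%d ", a[i]);   /* 4 13 27 38 48 49 55 65 76 97 */
    printf("\n");
    return 0;
}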

  19. 2) Analysis: • Time Complexity: roughly n^1.25 to 1.6·n^1.25 (empirical; it depends on the gap sequence) • Space Complexity: O(1) • Shell sort is not stable.

  20. 7.2 Exchange Sorting • Exchange sorts systematically interchange pairs of elements that are out of order until no such pairs remain => the list is sorted. • One example of an exchange sort is bubble sort: very inefficient, but quite easy to understand.

  21. 7.2.1 Bubble Sort 1) Basic idea: • On the first pass: compare the first two elements, and interchange them when they are out of order; next compare the second and third elements, …, until the last two elements are compared. • The largest element in the list will "sink" to the end of the list, since it will obviously be moved past all smaller elements. • Some of the smaller items have "bubbled up" toward their proper positions nearer the front of the list. • Scan and compare the list again, leaving out the last item (already in its proper position). • Repeat until no out-of-order pair remains in the list.

  22. 2) Algorithm
void bubble_Sort(int a[], int n)                 // bubble sort
{
    for (int i = n-1, change = 1; i >= 1 && change; i--) {
        change = 0;
        for (int j = 0; j < i; j++)
            if (a[j] > a[j+1]) {                 // out of order: swap the pair
                int t = a[j]; a[j] = a[j+1]; a[j+1] = t;
                change = 1;                      // flag that an exchange happened
            }
    }
}

  23.–24. Example: bubble sort trace on the same six keys used in the earlier examples. On each pass i the largest key remaining in the un-sorted part sinks to its end; the flag change is set to 1 whenever a swap occurs, and the sort terminates after the first pass in which change stays 0.

  25. 3) Analysis: O(n²) • Time Complexity: • Best case (when the list is already in order): only n-1 comparisons and no element moves. • Worst case (when the list is in reverse order): • Comparisons of keys: n(n-1)/2 • Moves of elements: 3n(n-1)/2 (each exchange takes 3 moves) • Space Complexity: O(1) • Bubble sort is stable.

  26. 7.2.2 QuickSort • A more efficient exchange sorting scheme than bubble sort, because a typical exchange involves elements that are far apart, so fewer interchanges are required to correctly position an element.

  27. 1) Basic idea • Quicksort uses a divide-and-conquer strategy • a recursive approach to problem-solving in which • the original problem is partitioned into simpler sub-problems, • each sub-problem is considered independently. • Subdivision continues until the sub-problems obtained are simple enough to be solved directly. • Choose some element called a pivot. Perform a sequence of exchanges so that • all elements that are less than this pivot are to its left and • all elements that are greater than the pivot are to its right. This divides the (sub)list into two smaller sublists, each of which may then be sorted independently in the same way.

  28. QuickSort • If the list has 0 or 1 elements, return. • Else do: • Pick an element in the list to use as the pivot. • Split the remaining elements into two disjoint groups: • SmallerThanPivot = {all elements < pivot} • LargerThanPivot = {all elements > pivot} • Return the list rearranged as: Quicksort(SmallerThanPivot), pivot, Quicksort(LargerThanPivot)
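A compact C sketch of this scheme is shown below. It is an illustrative version (not code from the slides) that picks the first element of the range as the pivot and uses the two-pointer partition whose trace appears on the next two slides.

/* Recursive quicksort on a[low..high], using the first element as the pivot. */
void quick_sort(int a[], int low, int high) {
    if (low >= high) return;                 /* 0 or 1 elements: already sorted */
    int i = low, j = high, pivot = a[low];
    while (i < j) {
        while (i < j && a[j] >= pivot) j--;  /* scan from the right for a smaller key */
        a[i] = a[j];
        while (i < j && a[i] <= pivot) i++;  /* scan from the left for a larger key */
        a[j] = a[i];
    }
    a[i] = pivot;                            /* pivot lands in its final position */
    quick_sort(a, low, i - 1);               /* sort the left sublist */
    quick_sort(a, i + 1, high);              /* sort the right sublist */
}

Calling quick_sort(a, 0, n - 1) sorts the whole array.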

  29.–30. Example: the first partition of the keys from the earlier examples, with the first key 21 taken as the pivot (saved aside in a temporary). The index j scans from the right for a key smaller than the pivot and i scans from the left for a larger one, each found key being copied into the hole left by the other; when i and j meet, the pivot is written into its final position, with the smaller keys 08 and 16 to its left and 25, 25* and 49 to its right.

  31. 2) Analysis • Time complexity: • Best case: the partition always splits the list in half. Partitioning a list with n elements takes O(n). If T(n) is the time taken to sort n elements, then
T(n) <= cn + 2T(n/2) for some constant c
     <= cn + 2(cn/2 + 2T(n/4))
     ...
     <= cn log n + nT(1) = O(n log n) • Worst case: one of the partitioned sublists may be empty (e.g. for an already sorted list), and the running time degrades to O(n²). • Average case: O(n log n)

  32. Space complexity: • Average case and best case: O(log n) • Worst case: O(n) • QuickSort is not stable. • Optimizations for Quicksort: • Better Pivot • First or last entry: Worst case appears for a list already sorted or in reverse order. • Central entry: Poor cases appear only for unusual orders. • Random entry: Poor cases are very unlikely to occur. • Better algorithm for small sublists • Eliminate recursion

  33. 7.3 Selection Sorting • Basic idea: make a number of passes through the list or a part of the list and, on each pass, select one element to be correctly positioned.

  34. 7.3.1 Selection Sort 1) Basic idea: • The list is regarded as two parts: a sorted sublist and an un-sorted sublist. Initially, the sorted sublist is [ ] and the un-sorted sublist is [R1, R2, …, Rn]. • In each pass, the smallest element in the un-sorted sublist is found and then moved to its proper location. • Continue in this manner. After n-1 selections, the list is in order.

  35. At the i-th stage, the sorted sublist is [R'1, R'2, …, R'i-1] and the un-sorted sublist is [Ri, Ri+1, …, Rn]. • Scan the un-sorted sublist to locate the smallest element and record its position. • Interchange this element with the i-th element (Ri).
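The passes just described amount to the following short C sketch (an illustrative version on a 0-indexed int array, not code taken from the slides):

/* Straight selection sort on a[0..n-1]: on pass i, find the smallest key in
   a[i..n-1] and swap it into position i. */
void selection_sort(int a[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;                          /* index of the smallest key seen so far */
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min]) min = j;
        if (min != i) {                       /* at most one swap (3 moves) per pass */
            int t = a[i]; a[i] = a[min]; a[min] = t;
        }
    }
}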

  36.–37. Example, starting from 21, 25, 49, 25*, 16, 08: • i = 1: smallest is 08, interchange 21 and 08 → 08 25 49 25* 16 21 • i = 2: smallest is 16, interchange 25 and 16 → 08 16 49 25* 25 21 • i = 3: smallest is 21, interchange 49 and 21 → 08 16 21 25* 25 49 • i = 4: 25* is already the smallest remaining key, no interchange • i = 5: no interchange • Finish: 08 16 21 25* 25 49 (25* now precedes 25, illustrating that the sort is not stable)

  38. 2) Analysis: O(n²) • Time complexity: • Movement of elements: • Best case: 0 • Worst case: 3(n-1) • Comparisons: • On the first pass through the list, the first item is compared with each of the n-1 elements that follow it; on the second pass, the second element is compared with the n-2 elements following it, etc. • A total of (n-1) + (n-2) + … + 1 = n(n-1)/2 comparisons is thus required for any list. • Space complexity: O(1) • Selection Sort is not stable.

  39. kik2i kik2i+1 kik2i kik2i+1 OR (i=1,2,…...n/2) 7.3.2 Heaps and HeapSort 1) What is a heap? • Definition: A heap is a list (R1,R2….Rn)), in which each entry contains a key, (K1,K2….Kn), and every Ki in the list satisfies:

  40. E.g. (96, 83, 27, 38, 11, 9) is a MaxHeap; (13, 38, 27, 50, 76, 65, 49, 97) is a MinHeap. (The slide shows the two corresponding complete binary trees.) A heap list can be regarded as a complete binary tree stored in an array level by level. The root of the tree must be the largest (smallest) element in the list.

  41. 2) HeapSort • Basic idea: • The list is regarded as two parts: a sorted sublist and an un-sorted sublist. Initially, the sorted sublist is [ ] and the un-sorted sublist is [R1, R2, …, Rn]. • Regard the un-sorted list [R1, R2, …, Rn] as a complete binary tree stored in an array, and convert the complete binary tree to a heap. The root of the tree, R[1], is then the largest (smallest) element in the list. • Swap R[1] with the last element of the un-sorted sublist, R[i], and put it into the sorted sublist. • Convert the un-sorted sublist [R1, R2, …, Ri-1] back into a heap. • Continue in this manner (n-1 times in total), until the un-sorted sublist is empty.

  42. The difficulties: • How to convert a complete binary tree to a heap? —— Build Heap • After elements swapping, how to convert the un-sorted sublist [R1,R2….Ri-1] to a heap? —— Heap Adjust • Solution of the second problem——Percolate-Down

  43. 4) Build Heap • Apply the Percolate-Down algorithm to the subtrees whose roots are numbered from ⌊n/2⌋ down to 1, converting each of these subtrees into a heap. • The Percolate-Down algorithm is applied to nodes ⌊n/2⌋, ⌊n/2⌋-1, …, 1 —— working bottom-up (Down-Up)
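A possible C realization of Percolate-Down and of the heapsort driver described on slides 41–43 is sketched below; it is an illustrative max-heap version on a 1-indexed int array (a[0] is unused), not code reproduced from the course.

/* Percolate-Down for a max-heap stored in a[1..n]; children of node i are 2i and 2i+1. */
void percolate_down(int a[], int i, int n) {
    int temp = a[i];
    for (int child = 2 * i; child <= n; child = 2 * i) {
        if (child < n && a[child + 1] > a[child]) child++;  /* pick the larger child */
        if (a[child] <= temp) break;                        /* heap property restored */
        a[i] = a[child];                                    /* move the child up */
        i = child;
    }
    a[i] = temp;
}

void heap_sort(int a[], int n) {              /* keys in a[1..n] */
    for (int i = n / 2; i >= 1; i--)          /* build heap: Down-Up over internal nodes */
        percolate_down(a, i, n);
    for (int i = n; i > 1; i--) {
        int t = a[1]; a[1] = a[i]; a[i] = t;  /* move the current maximum to the sorted tail */
        percolate_down(a, 1, i - 1);          /* re-heapify the un-sorted part a[1..i-1] */
    }
}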

  44. 5) Analysis • Time complexity: O(n log2 n) • Space complexity: O(1) • Heapsort is not stable. • Heapsort is suitable only for contiguous lists.

  45. 7.4 Merge Sort • Sorting schemes are • internal -- designed for data items stored in main memory • external -- designed for data items stored in secondary memory. • The previous sorting schemes are all internal sorts: • they require direct access to list elements (not possible for sequential files) • they make many passes through the list (not practical for files) • Mergesort can be used both as an internal and an external sort. • Its basic operation is merging, that is, combining two lists that have previously been sorted so that the resulting list is also sorted.

  46. 1) Merge • Given two sorted lists (list[i], …, list[m]) and (list[m+1], …, list[n]), generate a single sorted list (sorted[i], …, sorted[n]). • We chop the list into two sublists of sizes as nearly equal as possible and then sort them separately. Afterward, we carefully merge the two sorted sublists into a single sorted list.
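A direct C sketch of this merge step (illustrative, using 0-based indices lo, mid, hi rather than the slide's i, m, n) might look as follows; taking the element from the left run on equal keys keeps the merge stable.

/* Merge the two sorted ranges list[lo..mid] and list[mid+1..hi] into sorted[lo..hi]. */
void merge(const int list[], int sorted[], int lo, int mid, int hi) {
    int i = lo, j = mid + 1, k = lo;
    while (i <= mid && j <= hi)                 /* take the smaller front element */
        sorted[k++] = (list[i] <= list[j]) ? list[i++] : list[j++];
    while (i <= mid) sorted[k++] = list[i++];   /* copy the leftover left run  */
    while (j <= hi)  sorted[k++] = list[j++];   /* copy the leftover right run */
}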

  47. 2) MergeSort • Basic idea: • If the list is (R1, R2, …, Rn), regard it as n sorted sublists of size 1, and merge adjacent pairs of sublists; we get ⌈n/2⌉ sorted sublists of size (at most) 2 —— one pass of mergesort. • Merge adjacent pairs of the ⌈n/2⌉ sorted sublists to get ⌈n/4⌉ sorted sublists of size (at most) 4. • Continue in this manner, until the last pass merges two sorted sublists into a single sorted list.
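Using the merge() sketch above, the repeated passes can be written as the following bottom-up driver (again an illustrative sketch, matching the len = 1, 2, 4, … passes shown on slide 49):

/* merge() is the sketch from the previous slide: it merges a[lo..mid] and
   a[mid+1..hi] into tmp[lo..hi]. */
void merge(const int list[], int sorted[], int lo, int mid, int hi);

/* Bottom-up mergesort of a[0..n-1]: each pass merges adjacent runs of length len,
   then copies the result back into a. */
void merge_sort(int a[], int tmp[], int n) {
    for (int len = 1; len < n; len *= 2) {               /* one merging pass per run length */
        for (int lo = 0; lo < n; lo += 2 * len) {
            int mid = lo + len - 1;
            int hi  = (lo + 2 * len - 1 < n) ? lo + 2 * len - 1 : n - 1;
            if (mid < hi)
                merge(a, tmp, lo, mid, hi);              /* two adjacent runs: merge them */
            else
                for (int k = lo; k <= hi; k++) tmp[k] = a[k];   /* lone trailing run: copy */
        }
        for (int k = 0; k < n; k++) a[k] = tmp[k];       /* copy the pass result back to a */
    }
}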

  48. Example • Initial keys: [49] [38] [65] [97] [76] [13] [27] • After the first merging pass: [38 49] [65 97] [13 76] [27] • After the second merging pass: [38 49 65 97] [13 27 76] • After the third merging pass: [13 27 38 49 65 76 97]

  49. len=1:  21 25 49 25* 93 62 72 08 37 16 54
len=2:  21 25 | 25* 49 | 62 93 | 08 72 | 16 37 | 54
len=4:  21 25 25* 49 | 08 62 72 93 | 16 37 54
len=8:  08 21 25 25* 49 62 72 93 | 16 37 54
len=16: 08 16 21 25 25* 37 49 54 62 72 93

  50. 3) Analysis • Time complexity: O(n log2 n) • Space complexity: O(n); mergesort requires an auxiliary array, roughly doubling the space. • MergeSort is stable. • Mergesort is also good for sorting linked lists.
