ITEC 2620M Introduction to Data Structures. Instructor: Prof. Z. Yang. Course Website: http://people.math.yorku.ca/~zyang/itec2620m.htm Office: DB 3049.
Key Points • Non-recursive sorting algorithms • Selection Sort • Insertion Sort • Best, Average, and Worst cases
Sorting • Why is sorting important? • Easier to search sorted data sets • Searching and sorting are primary problems of computer science • Sorting in general • Arrange a set of elements by their “keys” in increasing/decreasing order. • Example: How would we sort a deck of cards?
Selection Sort • Find the smallest unsorted value, and move it into position. • What do you do with what was previously there? • “swap” it to where the smallest value was • sorting can work on the original array • Example
Pseudocode • Loop through all elements (have to put n elements into place) • for loop • Loop through all remaining elements and find smallest • initialization, for loop, and branch • Swap smallest element into correct place • single method
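The selection sort pseudocode above can be sketched in Java roughly as follows. The class name `SelectionSortSketch` and the helper `swap` are illustrative names, not taken from the course material, and the outer loop stops one element early because the last element is already in place once all the others are.

```java
import java.util.Arrays;

class SelectionSortSketch {
    // Sorts the array in place, in increasing order.
    static void selectionSort(int[] a) {
        // Outer loop: position i receives the smallest unsorted value.
        for (int i = 0; i < a.length - 1; i++) {
            int smallest = i; // initialization
            // Inner loop: scan the remaining elements for the smallest one.
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[smallest]) { // branch
                    smallest = j;
                }
            }
            // Swap the smallest value into place; the value that was
            // previously there moves to where the smallest value was.
            swap(a, i, smallest);
        }
    }

    // Hypothetical helper: exchanges a[i] and a[j].
    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {5, 2, 9, 1, 7};
        selectionSort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 5, 7, 9]
    }
}
```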
Insertion Sort • Get the next value, and push it over until it is “semi” sorted. • elements in selection sort are in their final position • elements in insertion sort can still move • which do you use to organize a pile of paper? • Example
Pseudocode • Loop through all elements (have to put n elements into place) • for loop • Loop through all sorted elements, and swap until slot is found • while loop • swap method
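A minimal Java sketch of the insertion sort pseudocode above, assuming an `int` array sorted in increasing order. The names `InsertionSortSketch` and `swap` are illustrative only.

```java
import java.util.Arrays;

class InsertionSortSketch {
    // Sorts the array in place, in increasing order.
    static void insertionSort(int[] a) {
        // Elements before index i are already "semi" sorted:
        // in order relative to each other, but they can still move.
        for (int i = 1; i < a.length; i++) {
            int j = i;
            // Push the next value over until its slot is found.
            while (j > 0 && a[j] < a[j - 1]) {
                swap(a, j, j - 1);
                j--;
            }
        }
    }

    // Hypothetical helper: exchanges a[i] and a[j].
    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {4, 3, 8, 1, 6};
        insertionSort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 3, 4, 6, 8]
    }
}
```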
Bubble Sort • Slow and unintuitive…useless • Relay race • pair-wise swaps until smaller value is found • smaller value is then swapped up
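For comparison, here is a sketch of one common formulation of bubble sort, in which each pass carries the smallest remaining value toward the front through pair-wise swaps. The slide gives no pseudocode, so this variant and the names `BubbleSortSketch` and `swap` are assumptions.

```java
import java.util.Arrays;

class BubbleSortSketch {
    // Sorts the array in place, in increasing order.
    static void bubbleSort(int[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            // Walk from the back toward the front; a smaller value is
            // swapped up past its neighbour until it reaches index `pass`.
            for (int j = a.length - 1; j > pass; j--) {
                if (a[j] < a[j - 1]) {
                    swap(a, j, j - 1);
                }
            }
        }
    }

    // Hypothetical helper: exchanges a[i] and a[j].
    static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {3, 6, 1, 5, 2};
        bubbleSort(data);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 3, 5, 6]
    }
}
```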
Cost of Sorting • Selection Sort • Is there a best, worst, and average case? • No: the two for loops always do the same work, whatever the input order • n elements in outer loop • n-1, n-2, n-3, …, 2, 1 elements in inner loop • average n/2 elements for each pass of the outer loop • about n * n/2 compares (exactly n(n-1)/2)
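One way to check the selection sort count is to instrument the two loops and tally the compares. The sketch below assumes the usual formulation, where the inner loop scans only the unsorted remainder; `SelectionCompareCount` and `countCompares` are hypothetical names.

```java
class SelectionCompareCount {
    // Runs selection sort on the array and returns the number of compares.
    static long countCompares(int[] a) {
        long compares = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int smallest = i;
            for (int j = i + 1; j < a.length; j++) {
                compares++; // one compare per inner-loop step
                if (a[j] < a[smallest]) {
                    smallest = j;
                }
            }
            int tmp = a[i];
            a[i] = a[smallest];
            a[smallest] = tmp;
        }
        return compares;
    }

    public static void main(String[] args) {
        // The count is the same for sorted and reversed input: 7+6+...+1 = 28.
        System.out.println(countCompares(new int[]{1, 2, 3, 4, 5, 6, 7, 8})); // prints 28
        System.out.println(countCompares(new int[]{8, 7, 6, 5, 4, 3, 2, 1})); // prints 28
    }
}
```

For n = 8 the count is 8 * 7 / 2 = 28, close to the n * n/2 = 32 estimate; the gap becomes negligible as n grows.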
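Cost of Sorting (Cont’d) • Insertion Sort • Worst – same as selection sort; each next element swaps all the way to the far end of the sorted part • n * n/2 compares • Best – each next element is already in place, no swaps • n * 1 compares • Average – like a linear search, each next element passes about 50% of the sorted values; the sorted part averages n/2 elements, so about n/4 compares per element • n * n/4 compares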
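The best and worst cases for insertion sort can be seen by counting compares on sorted and reversed input. This is a sketch under the assumption that one compare is charged each time the next value is tested against its left neighbour; `InsertionCompareCount` and `countCompares` are hypothetical names.

```java
class InsertionCompareCount {
    // Runs insertion sort on the array and returns the number of compares.
    static long countCompares(int[] a) {
        long compares = 0;
        for (int i = 1; i < a.length; i++) {
            int j = i;
            while (j > 0) {
                compares++; // compare the value with its left neighbour
                if (a[j] >= a[j - 1]) {
                    break; // slot found
                }
                int tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
                j--;
            }
        }
        return compares;
    }

    public static void main(String[] args) {
        // Best case: already sorted, one compare per element.
        System.out.println(countCompares(new int[]{1, 2, 3, 4, 5, 6, 7, 8})); // prints 7
        // Worst case: reversed, every element travels the whole sorted part.
        System.out.println(countCompares(new int[]{8, 7, 6, 5, 4, 3, 2, 1})); // prints 28
    }
}
```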
Value of Sorting • Current cost of sorting is roughly n^2 compares • Toronto phone book • 2 million records • 4 trillion compares to sort • Linear search • 1 million compares on average • Binary search • about 20 compares
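The phone book figures can be reproduced with a few lines of arithmetic. The sketch below takes n = 2,000,000 from the slide and models binary search as repeated halving; `PhoneBookCosts` and `halvingSteps` are hypothetical names. The halving count comes out at 21, which is the "about 20" on the slide.

```java
class PhoneBookCosts {
    // Number of times n records can be halved until one remains,
    // used here as a rough model of binary search compares.
    static int halvingSteps(long n) {
        int steps = 0;
        long size = n;
        while (size > 1) {
            size = (size + 1) / 2; // keep the larger half, rounding up
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        long n = 2_000_000L;
        System.out.println(n * n);           // prints 4000000000000 (n^2 sort compares)
        System.out.println(n / 2);           // prints 1000000 (average linear search)
        System.out.println(halvingSteps(n)); // prints 21 (binary search)
    }
}
```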
Trade-Offs • Write a method, or re-implement each time? • Buy a parking pass, or pay cash each time? • Sort in advance, or do linear search each time? • Trade-offs are an important part of program design • which component should you optimize? • is the cost of optimization worth the savings?
Key Points • Analysis of non-recursive algorithms • Estimation • Complexity Analysis • Big-Oh Notation
Factors in Cost Estimation • Does the program’s execution depend on the input? • Math.max(a, b); • always processes two numbers: constant time • maxValue(anArray); • processes n numbers: time varies with array size
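To make the contrast concrete, here is one possible `maxValue`. The slide only names the method, so this body is an assumption, and it assumes a non-empty array. `Math.max` always handles two numbers, while the loop around it runs once per remaining element.

```java
class CostFactorsSketch {
    // Returns the largest value in a non-empty array.
    // The loop body runs anArray.length - 1 times, so cost grows with n.
    static int maxValue(int[] anArray) {
        int max = anArray[0];
        for (int i = 1; i < anArray.length; i++) {
            max = Math.max(max, anArray[i]); // constant-time step
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(Math.max(3, 8));                    // prints 8
        System.out.println(maxValue(new int[]{4, 11, 2, 9}));  // prints 11
    }
}
```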
Value of Cost Estimation • Constant time programs • run once, always the same… • estimation not really required • Variable time programs • run once • future runs depend on relative size of input • based on what function?
Cost Analysis • Consider the following code: sum = 0; for (i=1; i<=n; i++) for (j=1; j<=n; j++) sum++; • It takes longer when n is larger.
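The nested loop above can be wrapped in a method to see how the work grows with n. `NestedLoopCost` and `countIncrements` are hypothetical names; the loop itself is the one from the slide.

```java
class NestedLoopCost {
    // Runs the nested loop from the slide and returns the final sum,
    // which equals the number of times sum++ executed.
    static long countIncrements(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= n; j++) {
                sum++;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(countIncrements(100)); // prints 10000
        System.out.println(countIncrements(200)); // prints 40000
    }
}
```

Doubling n from 100 to 200 multiplies the work by four, which is the signature of quadratic growth.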
Asymptotic Analysis • “What is the ultimate growth rate of an algorithm as a function of its input size?” • “If the problem size doubles, approximately how much longer will it take?” • Quadratic • Linear (linear search) • Logarithmic (binary search) • Exponential
Big-Oh Notation • Big-Oh represents the “order of” the cost function • ignoring all constants, find the largest function of n in the cost function • Selection Sort • n * n/2 compares + n swaps • O(n^2) • Linear Search • n compares + n increments + 1 initialization • O(n)
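Written out as cost functions, the two examples above reduce as follows. The function names are illustrative; the terms are the ones listed on the slide.

```latex
T_{\text{selection}}(n) = \frac{n^2}{2} + n \;\Rightarrow\; O(n^2)
\qquad
T_{\text{linear}}(n) = n + n + 1 = 2n + 1 \;\Rightarrow\; O(n)
```

In each case the constants (1/2, 2) and the smaller terms (n, 1) are dropped, leaving only the largest function of n.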
Simplifying Conventions • Only focus on the largest function of n • Ignore smaller terms • Ignore constants
Examples • Example 1: • Matrix multiplication: A (n × m) * B (m × n) = C (n × n) • Example 2: selectionSort(a); for (int i = 0; i < n; i++) binarySearch(i,a);
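Example 1 can be sketched with the textbook triple loop, assuming A is n × m and B is m × n as stated. `MatrixMultiplySketch` and `multiply` are hypothetical names. Each of the n * n entries of C takes m multiplications, so the cost is on the order of n * n * m.

```java
import java.util.Arrays;

class MatrixMultiplySketch {
    // Multiplies an n x m matrix by an m x n matrix, giving an n x n result.
    static int[][] multiply(int[][] a, int[][] b) {
        int n = a.length; // rows of A, columns of B (assumed to match)
        int m = b.length; // columns of A, rows of B
        int[][] c = new int[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                for (int k = 0; k < m; k++) {
                    c[i][j] += a[i][k] * b[k][j]; // runs n * n * m times in total
                }
            }
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2, 3}, {4, 5, 6}};      // 2 x 3
        int[][] b = {{7, 8}, {9, 10}, {11, 12}}; // 3 x 2
        System.out.println(Arrays.deepToString(multiply(a, b))); // prints [[58, 64], [139, 154]]
    }
}
```

For Example 2, assuming n is the length of a: the selection sort costs O(n^2), and the loop makes n binary searches at O(log n) each, so the sort dominates and the total is O(n^2).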
Trade-Offs and Limitations • “What is the dominant term in the cost function?” • What if the constant term is larger than n? • What happens if both algorithms have the same complexity? • Selection sort and Insertion sort are both O(n^2) • Constants can matter • Same complexity (obvious) and different complexity (problem size)