Stuff • No stuff… CISC121 - Prof. McLeod
Last Time • Introduced The Analysis of Complexity
Today • Continue: The Analysis of Complexity
“Big O” Notation - Cont. • Common big O notations: • Constant O(1) • Logarithmic O(log(n)) • Linear O(n) • N log n O(n·log(n)) • Quadratic O(n^2) • Cubic O(n^3) • Polynomial O(n^x) • Exponential O(2^n)
“Big O” Notation - Cont. • On the next slide: how these functions grow with n. Assume a rate of 1 instruction per µs (microsecond):
“Big O” Notation - Cont. • (Table of running times for each growth rate did not survive transcription.)
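The lost table can be approximated with a short sketch. The class below (a hypothetical helper, not part of the course code) computes how many microseconds f(n) instructions take at the slide's assumed rate of 1 instruction per µs:

```java
// Sketch reconstructing the lost table: time to execute f(n) instructions
// at 1 instruction per microsecond, for the growth rates on the slide.
public class GrowthRates {

    // microseconds needed to execute f(n) instructions at 1 per microsecond
    public static double micros(String f, int n) {
        switch (f) {
            case "1":       return 1;
            case "log n":   return Math.log(n) / Math.log(2);
            case "n":       return n;
            case "n log n": return n * Math.log(n) / Math.log(2);
            case "n^2":     return (double) n * n;
            case "n^3":     return (double) n * n * n;
            case "2^n":     return Math.pow(2, n);
            default: throw new IllegalArgumentException(f);
        }
    }

    public static void main(String[] args) {
        String[] fs = {"1", "log n", "n", "n log n", "n^2", "n^3", "2^n"};
        for (String f : fs)
            System.out.printf("%-8s n=50: %.0f microseconds%n", f, micros(f, 50));
    }
}
```

Even at n = 50, the O(2^n) row is already about 1.1 × 10^15 µs, i.e. decades, which motivates the next slide.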
“Big O” Notation - Cont. • So, algorithms of O(n^3) or O(2^n) are not practical for anything more than small values of n. • Quadratic, O(n^2) (common for nested “for” loops!) gets pretty ugly for n > 1000 (Bubble sort, for example…). • To figure out the big O notation for more complicated algorithms, it is necessary to understand some mathematical concepts.
Some Math… • Logarithms and exponents - a quick review: • By definition: logb(a) = c, when a = b^c
More Math… • Some rules when using logarithms and exponents - for a, b, c positive real numbers: • logb(a·c) = logb(a) + logb(c) • logb(a/c) = logb(a) - logb(c) • logb(a^c) = c·logb(a) • logb(a) = logc(a)/logc(b) • b^(logc(a)) = a^(logc(b)) • (b^a)^c = b^(ac) • b^a·b^c = b^(a+c) • b^a/b^c = b^(a-c)
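These rules can be spot-checked numerically. A minimal sketch (the class name and sample values are arbitrary choices for illustration):

```java
// Numerical spot-check of the logarithm rules above.
public class LogRules {
    static final double EPS = 1e-9;

    public static boolean close(double x, double y) { return Math.abs(x - y) < EPS; }

    // log base b of a, via the change-of-base rule on the slide
    public static double logb(double b, double a) { return Math.log(a) / Math.log(b); }

    public static void main(String[] args) {
        double a = 7, b = 2, c = 3;
        // logb(a*c) = logb(a) + logb(c)
        assert close(logb(b, a * c), logb(b, a) + logb(b, c));
        // logb(a/c) = logb(a) - logb(c)
        assert close(logb(b, a / c), logb(b, a) - logb(b, c));
        // logb(a^c) = c * logb(a)
        assert close(logb(b, Math.pow(a, c)), c * logb(b, a));
        // b^(logc(a)) = a^(logc(b))
        assert close(Math.pow(b, logb(c, a)), Math.pow(a, logb(c, b)));
        System.out.println("all identities hold");
    }
}
```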
More Math… • Summations: • If a does not depend on i: Σ(i=1 to n) a = a·n
More Math… • Also: Σ(i=1 to n) i = n(n+1)/2
Still More Math… • The Geometric Sum, for any real number a > 0 and a ≠ 1: Σ(i=0 to n) a^i = (a^(n+1) - 1)/(a - 1) • Also, for 0 < a < 1: Σ(i=0 to ∞) a^i = 1/(1 - a)
Still More Math… • Finally, for a > 0, a ≠ 1, and n ≥ 2: Σ(i=1 to n) i·a^i = (a - (n+1)·a^(n+1) + n·a^(n+2))/(1 - a)^2 • Lots of other summation formulae exist (mathematicians just love to collect them!), but these are the ones commonly found in proofs of big O notations.
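The closed forms above can be verified against brute-force loops. A small sketch, with arbitrary sample values:

```java
// Brute-force verification of the summation formulae on the preceding slides.
public class Summations {

    // sum of i for i = 1..n, by loop
    public static long sumI(int n) {
        long s = 0;
        for (int i = 1; i <= n; i++) s += i;
        return s;
    }

    // geometric sum of a^i for i = 0..n, by loop
    public static double geom(double a, int n) {
        double s = 0;
        for (int i = 0; i <= n; i++) s += Math.pow(a, i);
        return s;
    }

    public static void main(String[] args) {
        int n = 100;
        // closed form n(n+1)/2
        assert sumI(n) == (long) n * (n + 1) / 2;
        double a = 3.0;
        // closed form (a^(n+1) - 1)/(a - 1)
        assert Math.abs(geom(a, 10) - (Math.pow(a, 11) - 1) / (a - 1)) < 1e-6;
        System.out.println("summation formulae check out");
    }
}
```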
Properties of Big O Notation • If d(n) is O(f(n)), then a·d(n) is still O(f(n)), for any constant a > 0. • If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n) + e(n) is O(f(n) + g(n)). • If d(n) is O(f(n)) and e(n) is O(g(n)), then d(n)·e(n) is O(f(n)·g(n)). • If d(n) is O(f(n)) and f(n) is O(g(n)), then d(n) is O(g(n)).
Properties of Big O Notation – Cont. • If f(n) is a polynomial of degree d (i.e. f(n) = a0 + a1·n + a2·n^2 + a3·n^3 + … + ad·n^d), then f(n) is O(n^d). • loga(n) is O(logb(n)) for any a, b > 1. That is to say, loga(n) is O(log2(n)) for any a > 1.
“Big O” Notation - Cont. • While correct, it is considered “bad form” to include constants and lower-order terms when presenting a Big O notation, unless the other terms cannot be simplified and are close in magnitude. • For most comparisons between algorithms, the simplest representation of the notation is sufficient. • When simplifying a complex function, always look for the obviously dominant term (for large n) and toss all the others.
Application of Big O to Loops • For example, a single “for” loop:

sum = 0;
for (i = 0; i < n; i++)
    sum += a[i];

• 2 operations before the loop, 6 operations inside the loop. Total is 2 + 6n. This gives the loop O(n).
Application of Big O to Loops - Cont. • Nested “for” loops, where both loop until n:

for (i = 0; i < n; i++) {
    sum[i] = 0;
    for (j = 0; j < n; j++)
        sum[i] += a[i][j];
}

• Outer loop: 1 + 5n + n·(inner loop). Inner loop = 1 + 7n. Total = 1 + 6n + 7n^2, so the algorithm is O(n^2).
Application of Big O to Loops - Cont. • Nested “for” loops where the inner loop goes fewer than n times (used in simple sorts):

for (i = 0; i < n; i++) {
    sum[i] = 0;
    for (j = 0; j < i; j++)
        sum[i] += a[i][j];
}

• Outer loop: 1 + 6n + (7i, for i = 1, 2, 3, …, n-1).
Application of Big O to Loops - Cont. • Or: 1 + 6n + 7·Σ(i=1 to n-1) i = 1 + 6n + 7n(n-1)/2 • Which is O(n) + O(n^2), which is finally just O(n^2).
Application of Big O to Loops - Cont. • Nested for loops are usually O(n^2), but not always!

for (i = 3; i < n; i++) {
    sum[i] = 0;
    for (j = i-3; j <= i; j++)
        sum[i] += a[i][j];
}

• The inner loop runs exactly 4 times for every value of i. So, this is actually O(n), linear, not quadratic.
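The linear growth claimed above can be demonstrated by counting iterations of the inner loop body. A sketch (the counter and class name are added here for illustration):

```java
// Counting experiment: the inner loop runs exactly 4 times per outer pass,
// so total work grows linearly with n, not quadratically.
public class BoundedLoop {

    // total number of inner-loop body executions for a given n
    public static long innerIterations(int n) {
        long count = 0;
        for (int i = 3; i < n; i++)
            for (int j = i - 3; j <= i; j++)
                count++;                    // body runs exactly 4 times per i
        return count;
    }

    public static void main(String[] args) {
        // 4 iterations for each of the (n - 3) outer passes: linear in n
        System.out.println(innerIterations(100));   // 4 * 97 = 388
        System.out.println(innerIterations(1000));  // 4 * 997 = 3988
    }
}
```

Doubling n roughly doubles the count, the signature of O(n); a true O(n^2) loop would quadruple it.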
Big O and Other Algorithms • For the rest of this course, as we develop algorithms for: • Searching • Sorting • Recursion • Etc. we will analyze the algorithm in terms of its “Big O” notation. • We’ll find out what’s good and what’s bad!
Sorting – Overview • Sorting algorithms can be compared on the basis of: • The number of comparisons for a dataset of size n, • The number of data movements (“swaps”) necessary, • How these measures depend on the state of the data (random, mostly in order, reverse order), and • How these measures change with n (“Big O” notation).
Simple Sorting Algorithms – Insertion Sort • Probably the most “instinctive” kind of sort: find the location for an element, move all others up one, and insert the element. • Pseudocode: • Loop through the array from i=1 to array.length-1, saving the element at position i in temp. • Locate the position for temp (position j, where j <= i), and move all elements above j up one location. • Put temp at position j.
Simple Sorting Algorithms – Insertion Sort – Cont.

public static void insertionSort (int[] A) {
    int temp;
    int i, j;
    for (i = 1; i < A.length; i++) {
        temp = A[i];
        for (j = i; j > 0 && temp < A[j-1]; j--)
            A[j] = A[j-1];
        A[j] = temp;
    } // end for
} // end insertionSort
Complexity of Insertion Sort • “A.length” is the same as “n”. • The outer loop is going to go n times, regardless of the state of the data. • How about the inner loop? • Best case (data in order): the inner loop exits immediately, giving O(n). • Worst and average case is O(n^2).
Simple Sorting Algorithms – Selection Sort • This one works by selecting the smallest element and then putting it in its proper location. • Pseudocode: • Loop through the array from i=0 to one element short of the end of the array. • Select the smallest element in the array range from i to the end of the array. • Swap this value with the value at position i.
Simple Sorting Algorithms – Selection Sort • First, a “swap” method that will be used by this and other sorts:

public static void swap(int[] A, int pos1, int pos2) {
    int temp = A[pos1];
    A[pos1] = A[pos2];
    A[pos2] = temp;
} // end swap
Simple Sorting Algorithms – Selection Sort

public static void selectionSort(int[] A) {
    int i, j, least;
    for (i = 0; i < A.length-1; i++) {
        least = i;
        for (j = i+1; j < A.length; j++)
            if (A[j] < A[least])
                least = j;
        if (i != least)
            swap(A, least, i);
    } // end for
} // end selectionSort
Complexity of Selection Sort • The outer loop goes n times, just like Insertion sort. • Does the number of executions of the inner loop depend on the state of the data? • No - using the summation formulae above, the complexity for comparisons is O(n^2), regardless of the state of the data. • The complexity of swaps is O(n), since the swap only takes place in the outer loop.
Simple Sorting Algorithms – Bubble Sort • Best envisioned as a vertical column of numbers, each a bubble: the larger bubbles gradually work their way to the top of the column, with the smaller ones pushed down to the bottom. • Pseudocode: • Loop through the array from i=0 to the length of the array. • Loop down from the last element in the array to i. • Swap adjacent elements if they are in the wrong order.
Simple Sorting Algorithms – Bubble Sort – Cont.

public static void bubbleSort(int[] A) {
    int i, j;
    for (i = 0; i < A.length; i++)
        for (j = A.length-1; j > i; j--)
            if (A[j] < A[j-1])
                swap(A, j, j-1);
} // end bubbleSort
Complexity of Bubble Sort • The outer loop goes n times, as usual. • Both the comparison and the swap are in the inner loop (yuk!) • Best case (data already in order): comparisons are still O(n^2), but swaps will be O(n). • Bubble sort can be improved (slightly) by adding a flag to the inner loop: if you loop through the array without swapping, then the data is in order, and the method can be halted.
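The flag improvement described above can be sketched as follows. This is one possible version, not necessarily how the course implements it; the swap helper is repeated so the class stands alone:

```java
// Bubble sort with an early-exit flag: if a full inner pass makes no swaps,
// the array is already sorted and the method halts.
public class FlaggedBubbleSort {

    public static void swap(int[] A, int pos1, int pos2) {
        int temp = A[pos1];
        A[pos1] = A[pos2];
        A[pos2] = temp;
    }

    public static void bubbleSort(int[] A) {
        for (int i = 0; i < A.length; i++) {
            boolean swapped = false;        // the flag
            for (int j = A.length - 1; j > i; j--)
                if (A[j] < A[j - 1]) {
                    swap(A, j, j - 1);
                    swapped = true;
                }
            if (!swapped) return;           // no swaps: data is in order, halt
        }
    }

    public static void main(String[] args) {
        int[] A = {5, 1, 4, 2, 8};
        bubbleSort(A);
        System.out.println(java.util.Arrays.toString(A)); // [1, 2, 4, 5, 8]
    }
}
```

On already-sorted data this version makes one O(n) pass of comparisons and stops, instead of the plain version's O(n^2) comparisons.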
Measurements of Execution Speed • Refer to the Excel spreadsheet “SortintTimes.xls”: • Sorted between 1000 and 10,000 random int values. • Used System.currentTimeMillis() to measure execution time. • Measured times for a completely random array (average case), an almost sorted array (best case) and a reverse-order array (worst case).
Choosing Between the Simple Sorts • (Assuming you have fewer than about 10,000 data items…) • First: what is the state of the data? • Second: which is slower, comparisons or swaps? • Third: never use bubble sort, anyways! • Since both selection and insertion are O(n^2), you will need to look at the coefficient (c), calculated using sample data, to choose between them.
Sorting Animations • For a collection of animation links see: http://www.hig.no/~algmet/animate.html • Here are a couple that I liked: http://www.cs.pitt.edu/~kirk/cs1501/animations/Sort3.html http://cs.smith.edu/~thiebaut/java/sort/demo.html
Sequential Search • Sequential search pseudocode: • Loop through the array starting at the first element until the value of target matches one of the array elements. • If a match is not found, return -1.
Sequential Search, Cont. • As code:

public int seqSearch(int[] A, int target) {
    for (int i = 0; i < A.length; i++)
        if (A[i] == target)
            return i;
    return -1;
}
Complexity of Sequential Search • Best case - first element is the match: O(1). • Worst case - element not found: O(n). • Average case - element in middle: O(n).
Binary Search • Binary search pseudocode. Only works on ordered sets: a) Locate the midpoint of the array to search. b) Determine if the target is in the lower half or the upper half of the array. c) If in the lower half, make this half the array to search. d) If in the upper half, make this half the array to search. e) Loop back to step a), unless the size of the array to search is one and this element does not match, in which case return -1.
Binary Search – Cont.

public int binSearch (int[] A, int key) {
    int lo = 0;
    int hi = A.length - 1;
    int mid = (lo + hi) / 2;
    while (lo <= hi) {
        if (key < A[mid])
            hi = mid - 1;
        else if (A[mid] < key)
            lo = mid + 1;
        else
            return mid;
        mid = (lo + hi) / 2;
    }
    return -1;
}
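A quick usage sketch of the method above. The method is made static here so it can run standalone, and the sample array is an arbitrary choice; remember the array must already be sorted:

```java
// Standalone demo of the binary search shown above (made static for the demo).
public class BinSearchDemo {

    public static int binSearch(int[] A, int key) {
        int lo = 0;
        int hi = A.length - 1;
        int mid = (lo + hi) / 2;
        while (lo <= hi) {
            if (key < A[mid])
                hi = mid - 1;           // key is in the lower half
            else if (A[mid] < key)
                lo = mid + 1;           // key is in the upper half
            else
                return mid;             // found it
            mid = (lo + hi) / 2;
        }
        return -1;                      // not found
    }

    public static void main(String[] args) {
        int[] A = {2, 5, 7, 11, 13, 17};    // must already be sorted!
        System.out.println(binSearch(A, 11)); // 3
        System.out.println(binSearch(A, 4));  // -1 (not found)
    }
}
```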
Complexity of Binary Search • For the best case, the element matches right at the middle of the array, and the loop only executes once. • For the worst case, key will not be found and the maximum number of loops will occur. • Note that the loop will execute until there is only one element left that does not match. • Each time through the loop the number of elements left is halved, giving the progression below for the number of elements:
Complexity of Binary Search, Cont. • Number of elements to be searched - progression: n, n/2, n/4, n/8, …, n/2^m • The last comparison is for n/2^m, when the number of elements is one (worst case). • So, n/2^m = 1, or n = 2^m. • Or, m = log2(n). • So, the algorithm is O(log(n)) for the worst case.
Binary Search - Cont. • Binary search at O(log(n)) is much better than sequential at O(n). • Major reason to sort datasets!
A Caution on the Use of “Big O” • Big O notation usually ignores the constant c. • The constant’s magnitude depends on external factors (hardware, OS, etc.) • For example, algorithm A has the number of operations = 10^8·n, and algorithm B has the form 10·n^2. • A is then O(n), and B is O(n^2). • Based on this alone, A would be chosen over B. • But, if c is considered, B would be faster for n up to 10^7 - which is a very large dataset! • So, in this case, B would be preferred over A, in spite of the Big O notation.
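The crossover point between A and B can be checked directly. A minimal sketch with the slide's two cost functions (the class and method names are illustrative):

```java
// Comparing the slide's two cost functions: A = 10^8 * n ops, B = 10 * n^2 ops.
public class Crossover {

    public static double costA(double n) { return 1e8 * n; }
    public static double costB(double n) { return 10 * n * n; }

    public static void main(String[] args) {
        // B is cheaper below the crossover, A above it
        System.out.println(costB(1e6) < costA(1e6));  // true: B wins at n = 10^6
        System.out.println(costA(1e8) < costB(1e8));  // true: A wins at n = 10^8
        // crossover where 10^8 * n = 10 * n^2, i.e. n = 10^7
        System.out.println(costA(1e7) == costB(1e7)); // true
    }
}
```

Setting 10^8·n = 10·n^2 and solving gives n = 10^7, the crossover quoted on the slide.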
A Caution on the Use of “Big O” – Cont. • For “mission critical” algorithm comparison, nothing beats experiments where actual times are measured. • When measuring times, keep everything else constant, just change the algorithm (same hardware, OS, type of data, etc.).