
91.102 - Computing II


Presentation Transcript


  1. 91.102 - Computing II Efficiency, Notation and Mathematical Ideas. How can we tell that “one way of doing something is better than another”? The answer has to be pretty much independent of when the question is asked and of what hardware we are using to execute the program on. Case in point (Fall 98): Apple Computer claims that its G3 processors are roughly twice as fast - in terms of throughput - as Intel Pentium II processors running at the same clock speed, and has the benchmarks to “prove it”. A PC magazine, using different benchmarks shows that Intel processors running at the same clock speed can be many times faster than the G3s doing the same tasks. Who’s right??

  2. 91.102 - Computing II Both and neither… How’s that possible? By comparing apples and oranges, and exploiting the fact that programs run in a complete environment, using all kinds of resources. Example: if you run a graphics 3D program, and one machine has a very good graphics accelerator, and the other a mediocre one, the performance difference can be a factor of 10 or more. Neither really measures CPU performance… On the other hand, showing that one CPU can perform integer operations three times as fast as the other may not lead to any advantage that an “average user” can see, unless the whole system is designed to exploit this capability: unfortunately, it is user programs that determine what kind of operations they perform most...

  3. 91.102 - Computing II Furthermore, compilers differ enormously in the quality of the code they generate, and NOT uniformly over all the code. The very different architectures of the Apple and Wintel platforms provide a wonderful environment for creating skewed comparisons and conflicting claims that cannot be resolved to everyone’s satisfaction… a bit like politics…

  4. 91.102 - Computing II Another case in point (Oct. 2001): AMD vs Intel. Athlon vs Pentium. AMD wants to push for a benchmark different from the "Gigahertz" one - for the same reason Apple pushed for one in 1998: GHz and throughput are NOT the same thing…

  5. 91.102 - Computing II So, what do we do? We must find ways that are “essentially independent” of hardware and language - where only the algorithm “really counts”. This means that we will get results that are “general” but maybe not so precise that we can “really decide” between two ways of performing a task that give us nearly equal predictions. This is already hard enough, as it turns out...

  6. 91.102 - Computing II A reasonable measure to use is the size of the input to the program: generally, the more items in the input, the more time and space it will take to generate the output. TIME and SPACE are the other measures: how long will it take the program to run, given an input of a certain size? How much space will the program require, given an input of a certain size? A reasonable criterion for setting up a time comparison is the dominant-operation one: which operation (assignment, comparison, function call, etc.) is the one that best characterizes the processing of the input data?

  7. 91.102 - Computing II One could pick some input data, run the program and measure. Why is that not enough? Unfortunately the size of the input may not be a reliable predictor, in the sense that different runs with different inputs of the same size could give very different results. The empirical results simply compare size of some specific inputs to time-to-completion or space used: it's up to the analyst to determine how the relation between input size and the results can be best understood as a general problem… this is what makes the whole thing quite hard, but might explain fluctuations in the input/resource equation.

  8. 91.102 - Computing II As usual, let’s pick some simple algorithm and try to find out what is going on. The text takes SelectionSort, and runs it with input sets of different sizes on two different machines. It gets a table:

     Array Size = n    Home Computer    Desktop Computer
           125               12.5                  2.8
           250               49.3                 11.0
           500              195.8                 43.4
          1000              780.3                172.4
          2000             3114.9                690.5

  9. 91.102 - Computing II Timeout: Selection Sorting in DESCENDING order.

     void SelectionSort(ItemType *inArray, int m, int n)
     {
         int maxPosition;
         int temp;

         if (m < n) {
             maxPosition = FindMax(inArray, m, n);
             temp = inArray[m];
             inArray[m] = inArray[maxPosition];
             inArray[maxPosition] = temp;
             SelectionSort(inArray, m + 1, n);
         }
     }

  10. 91.102 - Computing II

     index:  0  1  2  3  4  5  6  7  8  9
     A    =  7  3  0  1  9  6  5  2  8  4        FindMax(A, 0, 9) -> 4

     after swapping A[0] and A[4]:

     index:  0  1  2  3  4  5  6  7  8  9
     A    =  9  3  0  1  7  6  5  2  8  4

  11. 91.102 - Computing II Let’s now plot the points:

  12. 91.102 - Computing II Most spreadsheets provide a facility for “curve fitting”, i.e., they find a curve that “fits” a set of data points according to some criteria - usually sensible. What would we get in this case?

  13. 91.102 - Computing II We plot and compute the “trendlines”:

  14. 91.102 - Computing II You may have observed that the trendlines computed by Excel are not identical to the trendlines provided by the text - they probably used slightly different algorithms (software?) to get there… The important thing is that the two functions allow us to “extrapolate” the cost of running SelectionSort (on the two different kinds of machines) on Data Sets of sizes DIFFERENT from the ones of the empirical study. We also found that there seems to be a simple relationship (well approximated by a quadratic function) between run time and the size of the set to be sorted...

  15. 91.102 - Computing II The basic idea is that we introduce a notation: O (big-Oh) to tell us what kind of trend we can expect. In the two cases we just saw, we have

     F1(n) = 0.0008 n^2 + 0.0032 n + 0.0627   and
     F2(n) = 0.0002 n^2 + 0.0005 n + 0.0784

Since, intuitively, the square of a number grows faster than the number itself (or a constant), and the leading coefficients are positive, both functions will grow, for large n, not much worse than 0.0008 n^2 and 0.0002 n^2, respectively. It is fairly easy to show that there exist constants C1 > 0.0008 and C2 > 0.0002 so that F1(n) <= C1 n^2 and F2(n) <= C2 n^2 for all “large” n.

  16. 91.102 - Computing II We say that F1(n) = O(n^2), and also that F2(n) = O(n^2). The meaning of this notation (repeating ourselves) is that there exist constants C1 and C2 and positive integers N1 and N2 such that F1(n) <= C1 n^2 for all n >= N1, and F2(n) <= C2 n^2 for all n >= N2. So, in some way, our notation does not really distinguish between the two… they just both grow NO WORSE than “quadratically”, even though one grows faster than the other. The crucial thing turns out to be the n^2: otherwise, for large n, F1(n) is always about four times larger than F2(n).

  17. 91.102 - Computing II Some Growth Comparisons:

     n        log10(n)  n^(1/2)  n*log10(n)  n^2     n^3     2^n      n^n
     1        0         1        0           1       1       2        1
     10       1         10^0.5   10          10^2    10^3    1024     10^10
     10^2     2         10^1     2*10^2      10^4    10^6    ≈10^30   10^200
     10^3     3         10^1.5   3*10^3      10^6    10^9    ≈10^300
     10^4     4         10^2     4*10^4      10^8    10^12
     10^5     5         10^2.5   5*10^5      10^10   10^15
     10^6     6         10^3     6*10^6      10^12   10^18
     10^7     7         10^3.5   7*10^7      10^14   10^21
     10^8     8         10^4     8*10^8      10^16   10^24
     10^9     9         10^4.5   9*10^9      10^18   10^27
     10^10    10        10^5     10*10^10    10^20   10^30

  18. 91.102 - Computing II To keep things in perspective, there are, roughly, 3.15*10^13 microseconds in a year, and only 1000 times as many nanoseconds… Looking back at the table, there may well be problems where the algorithm chosen for solution - if it matches one of the faster growing functions - will never run to completion on more than trivially small sets of data… We need strategies to examine a proposed algorithm and determine some bound on the amount of time we expect it to run for a data set of given size.

  19. 91.102 - Computing II Problems: a) how do we come up with a formula? b) is there an “algebra” of formulae? The second question really means: if we examine two algorithms and get two formulae f1(n) and f2(n), how do we compare such formulae and what do we do if, for example, the two algorithms need to be run sequentially - or in some other relationship to each other - over the same data set? We will now give some answers to the second question, the first one being - unfortunately - much harder...

  20. 91.102 - Computing II 1) if f1(n) = a0 + a1*n + a2*n^2 + … + am*n^m, for large enough n we can get a decent approximation by just looking at the term of highest degree: f1(n) ≈ am*n^m. So we can conclude that f1(n) is in (or of type) O(n^m). This needs proof, but intuition should be adequate for now.

  21. 91.102 - Computing II 2) If two functions f1(n) and f2(n) are both in the same O-class, say O(n^17), and the coefficients of the highest degree terms are of the same sign (positive, for our purposes), then their sum is in the same O-class. If they belong to two different O-classes, their sum will belong to the “larger” of the two O-classes. Ex: f1(n) ∈ O(n^3) and f2(n) ∈ O(n^5); then f1(n) + f2(n) ∈ O(n^5). This corresponds to two successive function calls, two successive program fragments, etc., where you have been able to get estimates separately, and now you want to estimate the total effect: up to a multiplicative constant, it will be no worse than that of the single worst fragment.

  22. 91.102 - Computing II 3) If two functions f1(n) and f2(n) are in O(n^a) and O(n^b), respectively, their product f1(n) * f2(n) is of class O(n^(a+b)). This (usually) corresponds to a loop (or a recursion), where f1(n) is, for example, the number of times the loop executes (or a function is called), and f2(n) is the cost of one execution of the loop (function) body. 4) If two functions f1(n) and f2(n) are in O(n^a) and O(n^b), respectively, their composition f1(f2(n)) (if it makes sense) is of class O(n^(ab)). We are unlikely to see much of this, unless the output of f2 is of the same type as the input of f1. Space -> Time functions can’t be meaningfully composed.

  23. 91.102 - Computing II The solution to question a) is the really hard part: we will spend a fair amount of time examining all the algorithms we’ll study trying to acquire both the techniques and the intuition to derive some formulae. Let's look at SelectionSort as a first "practice" run… SelectionSort(A, 0, n); where A is the input array and n is the largest index that contains a value.

  24. 91.102 - Computing II Selection Sorting in DESCENDING order.

     void SelectionSort(ItemType *inArray, int m, int n)
     {
         int maxPosition;
         int temp;

         if (m < n) {
             maxPosition = FindMax(inArray, m, n);
             temp = inArray[m];
             inArray[m] = inArray[maxPosition];
             inArray[maxPosition] = temp;
             SelectionSort(inArray, m + 1, n);
         } // else do nothing and return...
     }

If we start with m = 0, we go through n + 1 recursive calls. Each call (except the last) will perform one call to FindMax, one "swap" and one recursive call to SelectionSort.

  25. 91.102 - Computing II

     int FindMax(ItemType *inArray, int m, int n)
     {
         int i = m;
         int j = m;

         do {
             i++;                              // creep up the array
             if (inArray[i] > inArray[j]) {    // so, numbers...
                 j = i;                        // if you found something bigger...
             }
         } while (i != n);                     // all the way to the end
         return j;                             // return index of largest
     }

The "do loop" contains one incrementation, two(?) comparisons and, possibly, one assignment. The loop itself is executed n - m times.

  26. 91.102 - Computing II So: n calls to FindMax, each call to FindMax has n - m array comparisons. Total number of comparisons:

     ∑_{m=0}^{n-1} (n - m) = ∑_{m=1}^{n} m = n(n + 1)/2 = (1/2)n^2 + (1/2)n ∈ O(n^2)

What about the recursive calls, swaps, incrementations, termination checks, etc.?? They can each be counted as equivalent to a fixed number of array comparisons - so all we change is the coefficient of n^2, and NOT the leading power...

  27. 91.102 - Computing II We can also use our "algebra of formulae": SelectionSort(A, 0, n) ~ f1(n) ∈ O(n), if the dominant operation is the execution of the body of the if statement - which controls the recursive call. Within the if statement (during one recursive call), the dominant operation is given by FindMax(A, m, n) ~ f2(n) ∈ O(n - m) = O(n), in terms of comparisons of array elements. The product: f1(n) * f2(n) ∈ O(n^2).

  28. 91.102 - Computing II Careful: These methods always lead to formulae and bounds that are useful “for large enough n” - i.e. for large enough data sets. Problem: when is the data set “large enough”? Problem: what happens with small data sets? Some of the algorithms that are good for large data sets are awfully complicated to code: when is the overhead introduced by the code complexity more than the gain in the asymptotic behavior?

  29. 91.102 - Computing II Careful: Sequential Search vs. Binary Search… Sequential Search is O(n) - every unsuccessful comparison reduces the set still to search by 1 item. Binary Search is O(log2(n)) - every unsuccessful comparison reduces the set still to search to 1/2 the previous size. The table says that for large n Binary Search is MUCH faster than Sequential Search. Is that true always? Even for “small” n? Why or Why Not? Interpolation Search is O(log2(log2(n))), which is even better than Binary Search. Why don’t we ALWAYS use it?

  30. 91.102 - Computing II Sequential Search: (array A need not be sorted, Key need not be ORDERABLE in any usual way)

     int SequentialSearch(Key K, SearchArray A)
     {
         int i;

         for (i = 0; i < n; ++i) {    // n = number of elements, assumed in scope
             if (K == A[i])
                 return i;
         }
         return -1;
     }

Try to analyze… Best Case: A[0]; Worst: A[n-1]; Average: A[n/2]...

  31. 91.102 - Computing II Binary Search: (array A is sorted in INCREASING order of Key)

     int BinarySearch(Key K, SearchArray A, int low, int high)
     {
         int mid;

         if (low > high)
             return -1;                // nothing there
         mid = (low + high) / 2;
         if (K == A[mid])
             return mid;               // found it.
         else if (K < A[mid])          // look in left half
             return BinarySearch(K, A, low, mid - 1);
         else                          // look in right half
             return BinarySearch(K, A, mid + 1, high);
     }
