190 likes | 314 Views
Analysis. CS 367 – Introduction to Data Structures. Overview. Why study data structures? biggest reason for a computer is to store, analyze, and retrieve data data structures are used to store data type of data structure chosen effects amount of data that can be stored retrieval speed
E N D
Analysis CS 367 – Introduction to Data Structures
Overview • Why study data structures? • biggest reason for a computer is to store, analyze, and retrieve data • data structures are used to store data • type of data structure chosen effects • amount of data that can be stored • retrieval speed • speed of analysis
Data Structures • Data structure selection involves tradeoffs • speed vs. capacity vs. flexibility vs. complexity • Consider an array versus a vector • which is faster? • which has a larger storage capacity? • which has more flexibility in data placement? • which is more complex? • which one is better? • Must always consider which characteristics are most important for the data being handled
Algorithms • Always do some kind of work on data • at least sort or search through it • Many different routes to same destination • consider 10 different people writing the same program • Picking the best route requires analysis • most concerned with how long an algorithm will take
Analysis • Computation Complexity • how much effort and cost is associated with a specific algorithm • How do we measure computational complexity? • run algorithm on a set of data and time it? • results depend on speed of computer used • a PC vs. a Cray supercomputer • results depend on language algorithm written in • C vs. Java vs. Basic
Analysis • Need a mathematical technique to analyze algorithms • needs to be free of any real-time units (ms) • should relate logical time unit to data size • t = cn t->logical time, c->constant, n->data size • notice that time does not need to given in seconds • Several different techniques have been suggested • we will only consider a few of them
Big-O Notation • Formal, mathematical definition f(n) is O(g(n)) if there exist positive numbers c and N such that f(n) ≤ cg(n) for all n ≥ N. • Wow! That probably makes no sense at all • Now let’s put it into English • we want to find some function (g(n)) that is always greater than or equal to the original function (f(n)) • for small values of f(n), g(n) may not be equal or greater • However, once n reaches a certain size, then g(n) will always be greater than or equal to f(n)
Big-O Notation • Still confused? Don’t feel bad, let’s see an example • consider the following function: f(n) = n2 + 2n + 35 • now consider this next function: • g(n) = n2 • by the previous definition, we need to find a c and an N that satisfy the following: • cg(n) ≥ f(n) for all n ≥ N
Big-O Notation • So let’s find a c and an N • if c=2 then for what values of n is cg(n) ≥ f(n)? • solve the following equation for n: • cn2 ≥ n2 + 2n + 35 • n ≥ 7 (this means N = 7 from previous definition) • it is clear that there are an infinite number of c and N possibilities • what’s important, is g(n) is the same for all of them • but there’s an infinite number of possible g(n), too • g(n) = n2, n3, n4, … • pick the smallest g(n) that works
Big-O Notation • The previous slide showed that g(n) = n2 • Hence, our Big-O notation is O(n2) for the function f(n) = n2 + 2n + 35 • Notice that all but the first term of f(n) were discarded to find g(n) • this works because for large n, the largest exponent dominates • consider n=1, n=5, n=100
Analysis • So how does all this math help algorithm analysis? • consider the following piece of code for(i=sum=0; i<n; i++) sum += a[i]; • before loop starts, 2 assignments (i and sum) • 2 assignments within the loop (sum and i) • equation relating time to n is: f(n) = 2 + 2n • computational complexity is: O(n)
Analysis • Consider another example int rowsum[n]; int firstcolsum; for(i=firstcolsum=0; i<n; i++) { firstcolsum += a2d[i][0]; for(j=rowsum[i]=0; j<n; j++) rowsum[i] += a2d[i][j]; } • before loop starts: 2 assignments • inside outer loop: 4 assignments • inside inner loop: 2 assignments • equation: f(n) = 2 + 4n + 2n*n = 2 + 4n + 2n2 • computational complexity: O(n2)
Analysis • Analysis can get much more sophisticated • Let’s compare 2 array search algorithms • brute force • check the first element, then second, third, etc. • binary search • requires a sorted array • start at the center and divide the array into two • if less than center, search lower half • if greater than center, search upper half • repeat until the number is found
Brute Force Analysis • Code int bruteforce(int [] array, int key) { for(int i=0; i<array.length; i++) { if(array[i] == key) return i; } return -1; • Analysis • best case: iterations = 1 • worst case: iterations = n • average case: iterations = n/2 • average computational complexity = O(n)
Binary Search Analysis • Code int binarySearch(int [] array, int key) { int lo = 0, hi = array.length – 1, mid; while(lo <= hi) { mid = (lo + hi) / 2; if(key < array[mid]) hi = mid – 1; if(key > array[mid]) lo = mid + 1; else return mid; } return -1;
Binary Search Analysis • Analysis • best case: iterations = 1 • worst case: iterations = log2 n • average case: give equation • don’t worry about how this was achieved • see the book if you want details • it should be clear that calculating complexity is not always easy or straightforward • average computational complexity: O(log2 n)
Big-Ω Notation • Formal, mathematical definition f(n) is Ω(g(n)) if there exist positive numbers c and N such that f(n) ≥ cg(n) for all n ≥ N. • only real difference between Big-Ω and Big-O is that Big-O is an upper bound on a function and Big-Ω is a lower bound • all the descriptions of Big-O apply – just reverse the “≤” sign in Big-O to a “≥” sign for Big- Ω
Big-Θ Notation • Formal, mathematical definition f(n) is Θ(g(n)) if there exist positive numbers c1, c2, and N such that c1g(n) ≤f(n) ≤ c2g(n) for all n ≥ N. • Again, we consider the following equation • f(n) = n2 + 2n + 35 • We have shown that this is O(n2) • It is easy to show that it is also Ω(n2) • if c=1 and g(n)=n2 then cg(n) ≤ f(n) • If c1=1, c2=2, and N=7, we satisfy the above definition and the computational complexity (using Big-Θ) becomes Θ(n2)
Words of Warning • There are many other, more sophisticated methods of computational complexity • Those shown here do not always paint a complete picture • However, these techniques can provide some good insight into an algorithm