Learn how to analyze algorithms through empirical measurements and formal proofs. Understand complexities with linear and binary search algorithms. Discover graphical analysis methods and practical tips for code analysis. Dive into recursive programs and iterative searches, and grasp binary search analysis.
CSE 326: Data Structures Program Analysis Lecture 3: Friday, Jan 8, 2003
Outline • Empirical analysis of algorithms • Formal analysis of algorithms • Reading assignment: sec. 2.4.3 (maximum subsequence)
Determining the Complexity of an Algorithm • Empirical measurements: • pro: discover if constant factors are significant • con: may be running on “wrong” inputs • Formal analysis (proofs): • pro: no interference from implementation/hardware details • con: hides constants; may be hard • “In theory, theory is the same as practice, but in practice it is not.”
Measuring Empirical Complexity: Linear vs. Binary Search • Find an item in a sorted array of length N • Binary search algorithm:
int bfind(int x, int a[], int left, int right) {
    /* searches a[left+1 .. right-1]; both bounds are exclusive */
    if (left + 1 == right) return -1;
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind(x, a, left, m);
    else return bfind(x, a, m, right);
}

int lfind(int x, int a[], int n) {
    if (n == 0) return -1;
    if (x == a[n-1]) return n - 1;
    return lfind(x, a, n - 1);
}

for (i = 0; i < n; i++) a[i] = i;
for (i = 0; i < n; i++) lfind(i, a, n);   /* or bfind(i, a, -1, n) */
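[Figure: log/log plot of running time against n. The linear search curve has slope 2; the binary search curve has slope 1.]
Recall: we search n times, so Linear = $O(n^2)$ and Binary = $O(n \log n)$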
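A measurement of this kind could be set up roughly as follows. This is only a sketch: `time_searches` is a hypothetical helper, not something from the lecture, and `lfind` is written here as a loop (an assumption, so that large n does not exhaust the call stack), while `bfind` follows the recursive slide version.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Recursive binary search; left and right are exclusive bounds. */
static int bfind(int x, const int a[], int left, int right) {
    if (left + 1 == right) return -1;
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind(x, a, left, m);
    return bfind(x, a, m, right);
}

/* Linear search, written iteratively (assumption: loop instead of recursion). */
static int lfind(int x, const int a[], int n) {
    for (int i = 0; i < n; i++)
        if (a[i] == x) return i;
    return -1;
}

/* Hypothetical helper: CPU seconds spent searching for every key 0..n-1.
   Returns -1.0 if the array cannot be allocated. */
double time_searches(int n, int use_binary) {
    assert(n >= 0);
    int *a = malloc((size_t)n * sizeof *a);
    if (a == NULL) return -1.0;
    for (int i = 0; i < n; i++) a[i] = i;

    volatile int sink = 0;   /* keeps the compiler from discarding the searches */
    clock_t start = clock();
    for (int i = 0; i < n; i++)
        sink += use_binary ? bfind(i, a, -1, n) : lfind(i, a, n);
    clock_t stop = clock();

    free(a);
    (void)sink;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `time_searches(n, 0)` and `time_searches(n, 1)` for a range of n (say, doubling each time) gives the points to plot; small n will mostly measure timer noise.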
Property of Log/Log Plots • On a linear plot, a linear function is a straight line • On a log/log plot, any polynomial function is a straight line! The slope $\Delta y / \Delta x$ equals the exponent.
Proof: suppose $y = c x^k$. Then
$\log(y) = \log(c x^k)$
$\log(y) = \log(c) + \log(x^k)$
$\log(y) = \log(c) + k \log(x)$
Here $\log(y)$ is the vertical axis, $\log(x)$ is the horizontal axis, and $k$ is the slope.
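As a small worked example of reading the exponent off two measurements (the numbers are invented for illustration): suppose doubling n from 1000 to 2000 multiplies the running time t by 4. The slope between the two points on a log/log plot is then:

```latex
\text{slope}
  = \frac{\log t_2 - \log t_1}{\log n_2 - \log n_1}
  = \frac{\log (t_2 / t_1)}{\log (n_2 / n_1)}
  = \frac{\log 4}{\log 2}
  = 2
```

A slope of 2 suggests quadratic growth, which matches the iterated linear search.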
Empirical Complexity • Large data sets may be required to gain an accurate empirical picture • When running time is expected to be polynomial, use log/log plots (slope = exponent) • When running time is expected to be exponential, use log on the y axis • When running time is expected to be logarithmic, use log on the x axis • Best: try all three, and see which one is linear
Analyzing Code • primitive operations • consecutive statements • function calls • conditionals • loops • recursive functions
Conditionals • Conditional: if C then S1 else S2 • Suppose you are doing an O( ) analysis (upper bound)? Time(C) + Max(Time(S1), Time(S2)), or Time(C) + Time(S1) + Time(S2) • Suppose you are doing an Ω( ) analysis (lower bound)? Time(C) + Min(Time(S1), Time(S2)), or Time(C)
Nested Loops
for i = 1 to n do
    for j = 1 to n do
        sum = sum + 1
Nested Dependent Loops
for i = 1 to n do
    for j = i to n do
        sum = sum + 1
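Nested Dependent Loops
for i = 1 to n do
    for j = i to n do
        sum = sum + 1
Compute it the hard way: $\sum_{i=1}^{n} \sum_{j=i}^{n} 1 = \sum_{i=1}^{n} (n - i + 1) = n^2 - \frac{n(n+1)}{2} + n = \frac{n(n+1)}{2}$
Compute it the smart way: substitute j for n - i + 1, so $\sum_{i=1}^{n} (n - i + 1) = \sum_{j=1}^{n} j = \frac{n(n+1)}{2}$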
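The two loop nests can also be checked by simply counting iterations. The sketch below is an illustration only; the function names are made up here, and the loops mirror the pseudocode with 1-based indices.

```c
#include <assert.h>

/* Iterations of the independent nest: j always runs from 1 to n. */
long count_independent(int n) {
    assert(n >= 0);
    long sum = 0;
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= n; j++)
            sum = sum + 1;
    return sum;
}

/* Iterations of the dependent nest: j starts at i, so row i runs n - i + 1 times. */
long count_dependent(int n) {
    assert(n >= 0);
    long sum = 0;
    for (int i = 1; i <= n; i++)
        for (int j = i; j <= n; j++)
            sum = sum + 1;
    return sum;
}
```

For n = 100 the first count is n^2 = 10000 and the second is n(n+1)/2 = 5050, so both nests are quadratic and differ only by a constant factor of about 2.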
Other Important Series • Sum of squares: • Sum of exponents: • Geometric series: • Novel series: • Reduce to known series, or prove inductively
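For reference, the standard closed forms usually meant by the first three names are sketched below. This assumes the slide refers to the textbook identities; the approximation for general k holds for large n and k ≠ -1.

```latex
% Sum of squares
\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6} \approx \frac{n^3}{3}

% Sum of exponents (general power k, k \ne -1)
\sum_{i=1}^{n} i^k \approx \frac{n^{k+1}}{|k+1|}

% Geometric series (A \ne 1)
\sum_{i=0}^{n} A^i = \frac{A^{n+1} - 1}{A - 1}
```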
Linear Search Analysis
int lfind(int x, int a[], int n) {
    for (int i = 0; i < n; i++)
        if (a[i] == x) return i;
    return -1;
}
• Best case, tight analysis: • Worst case, tight analysis:
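One way to see the two cases is to count comparisons directly. In this sketch, `lfind_counted` and its `comparisons` out-parameter are hypothetical additions for illustration; the search logic itself is the loop above.

```c
#include <assert.h>
#include <stddef.h>

/* Linear search that also reports how many elements were compared with x. */
int lfind_counted(int x, const int a[], int n, int *comparisons) {
    assert(n >= 0 && comparisons != NULL);
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == x) return i;
    }
    return -1;
}
```

When x sits in a[0] the count is 1 regardless of n; when x is absent (or in the last slot) the count is n. That is a constant-time best case and a linear worst case.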
Iterated Linear Search Analysis
for (i = 0; i < n; i++) a[i] = i;
for (i = 0; i < n; i++) lfind(i, a, n);
• Easy worst-case upper-bound: • Worst-case tight analysis:
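A sketch of how both bounds might be derived, assuming the iterative lfind that scans from index 0 (so finding key i costs i + 1 comparisons):

```latex
% Easy upper bound: n calls, each at most O(n)
T(n) \le n \cdot O(n) = O(n^2)

% Tight analysis: add up the actual cost of each call
T(n) = \sum_{i=0}^{n-1} (i + 1) = \frac{n(n+1)}{2} = \Theta(n^2)
```

Here the easy bound happens to be tight: the exact count is about half of n^2, which changes only the constant.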
Analyzing Recursive Programs • Express the running time T(n) as a recursive equation • Solve the recursive equation • For an upper-bound analysis, you can optionally simplify the equation to something larger • For a lower-bound analysis, you can optionally simplify the equation to something smaller
Binary Search
int bfind(int x, int a[], int left, int right) {
    if (left + 1 == right) return -1;
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind(x, a, left, m);
    else return bfind(x, a, m, right);
}
What is the worst-case upper bound?
Binary Search
int bfind(int x, int a[], int left, int right) {
    if (left + 1 == right) return -1;
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind(x, a, left, m);
    else return bfind(x, a, m, right);
}
Introduce some constants…
b = time needed for base case
c = time needed to get ready to do a recursive call
Size is n = right - left
Running time is thus:
Binary Search Analysis
One sub-problem, half as large.
Equation:
$T(1) \le b$
$T(n) \le T(n/2) + c$ for $n > 1$
Solution:
$T(n) \le T(n/2) + c$   (write equation)
$\le T(n/4) + c + c$   (expand)
$\le T(n/8) + c + c + c$
$\le T(n/2^k) + kc$   (inductive leap)
$\le T(1) + c \log n$ where $k = \log n$   (select value for k)
$\le b + c \log n = O(\log n)$   (simplify)
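The logarithmic bound can be checked empirically by counting recursive calls. This is a sketch only: the global `calls` counter and the wrapper `count_calls` are hypothetical additions; the search itself is the bfind from the slides.

```c
#include <assert.h>

static int calls;   /* hypothetical counter: number of bfind_counted invocations */

/* bfind from the slides, instrumented to count every call. */
int bfind_counted(int x, const int a[], int left, int right) {
    calls++;
    if (left + 1 == right) return -1;
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind_counted(x, a, left, m);
    return bfind_counted(x, a, m, right);
}

/* Number of calls made when searching sorted a[0..n-1] for x. */
int count_calls(int x, const int a[], int n) {
    assert(n >= 0);
    calls = 0;
    bfind_counted(x, a, -1, n);
    return calls;
}
```

With n = 1024 elements the initial range has size right - left = 1025, and each call at worst halves it (rounding up), so no search needs more than 12 calls, compared with up to 1024 comparisons for linear search.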
Solving Recursive Equations by Telescoping • Create a set of equations, take their sum
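Applied to the binary search recurrence $T(n) \le T(n/2) + c$, $T(1) \le b$, telescoping might look like this (a sketch, assuming n is a power of 2 so every halving is exact):

```latex
\begin{aligned}
T(n)   - T(n/2) &\le c \\
T(n/2) - T(n/4) &\le c \\
                &\;\;\vdots \\
T(2)   - T(1)   &\le c
\end{aligned}
% There are \log_2 n inequalities. Adding them, the intermediate terms cancel:
T(n) - T(1) \le c \log_2 n
\quad\Longrightarrow\quad
T(n) \le b + c \log_2 n
```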
Inductive Proof If you know the closed form solution, you can validate it by ordinary induction
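For the binary search recurrence, such a validation could run as follows (a sketch, again assuming n is a power of 2):

```latex
% Claim: T(n) \le b + c \log_2 n for n = 2^k, k \ge 0.

% Base case (n = 1):
T(1) \le b = b + c \log_2 1

% Inductive step: assume the claim holds for n/2. Then
T(n) \le T(n/2) + c
     \le \bigl(b + c \log_2 (n/2)\bigr) + c
     =   b + c(\log_2 n - 1) + c
     =   b + c \log_2 n
```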
Amortized Analysis: Stack • Stack operations • push • pop • is_empty • Stack property: if x is on the stack before y is pushed, then x will be popped after y is popped • What is the biggest problem with an array implementation?
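Stretchy Stack Implementation
int[] data;
int maxsize;
int top;   // number of elements currently on the stack

void push(int e) {
    if (top == maxsize) {
        // array is full: double its size and copy the old contents
        int[] temp = new int[2*maxsize];
        for (int i = 0; i < maxsize; i++) temp[i] = data[i];
        data = temp;
        maxsize = 2*maxsize;
    }
    data[top++] = e;   // store in the first free slot, then advance top
}

int pop() {
    return data[--top];
}
Best case Push = O( ) Worst case Push = O( )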
Stretchy Stack Amortized Analysis • Consider a sequence of n push/pop operations: push(e1) [time = T1], push(e2), pop(), push(e3), push(e4), pop(), . . . , push(ek) [time = Tn] • Amortized time = (T1 + T2 + . . . + Tn) / n • We compute this next
Stretchy Stack Amortized Analysis • The length of the array increases like this: 1, 2, 4, 8, . . . , $2^k$, . . . , n • For each $T_i$ we have one of the following • $T_i = O(1)$ for pop(), and for some push($e_i$) • $T_i = O(2^k)$ for some push($e_i$) • Hence $T_1 + T_2 + \dots + T_n \le n \cdot O(1) + O(1 + 2 + 4 + \dots + 2^k + \dots + n)$
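Stretchy Stack Amortized Analysis • Let’s compute this sum: $1 + 2 + 4 + \dots + 2^k + \dots + n = 2n - 1$ (taking n to be a power of 2) • And therefore: amortized time $= (n \cdot O(1) + O(2n - 1)) / n = O(1)$ • In an asymptotic sense, there is no overhead in using stretchy arrays rather than regular arrays!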
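The constant amortized cost can be observed by counting how many elements the doubling strategy copies in total. The following is a sketch under stated assumptions: the `Stack` struct, the `copies` field and the helper names are hypothetical, the array starts at capacity 1, and `top` counts stored elements.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int *data;
    int maxsize;   /* current capacity */
    int top;       /* number of elements stored */
    long copies;   /* elements copied by all resizes so far */
} Stack;

/* Start with capacity 1. Returns 0 if allocation fails. */
int stack_init(Stack *s) {
    s->data = malloc(sizeof *s->data);
    s->maxsize = 1;
    s->top = 0;
    s->copies = 0;
    return s->data != NULL;
}

/* Push with doubling. Returns 0 if allocation fails. */
int stack_push(Stack *s, int e) {
    if (s->top == s->maxsize) {
        int *temp = malloc(2 * (size_t)s->maxsize * sizeof *temp);
        if (temp == NULL) return 0;
        for (int i = 0; i < s->maxsize; i++) temp[i] = s->data[i];
        s->copies += s->maxsize;
        free(s->data);
        s->data = temp;
        s->maxsize *= 2;
    }
    s->data[s->top++] = e;
    return 1;
}

/* Total elements copied while pushing n items onto an empty stack (-1 on failure). */
long copies_for_pushes(int n) {
    assert(n >= 0);
    Stack s;
    if (!stack_init(&s)) return -1;
    for (int i = 0; i < n; i++)
        if (!stack_push(&s, i)) { free(s.data); return -1; }
    long total = s.copies;
    free(s.data);
    return total;
}
```

For 1000 pushes the resizes copy 1 + 2 + 4 + ... + 512 = 1023 elements in total, which stays below 2n, so the copying adds only a constant per push on average.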
Stretchy Stack Amortized Analysis • Careful! We must be clever to get good amortized performance! • Consider “smart pop”:
int pop() {
    int e = data[--top];
    if (top <= maxsize/2) {
        // shrink as soon as the array is half empty
        maxsize = maxsize/2;
        int[] temp = new int[maxsize];
        for (int i = 0; i < maxsize; i++) temp[i] = data[i];
        data = temp;
    }
    return e;
}
Stretchy Stack Amortized Analysis • Take the sequence of 3n push/pop operations:
push(e1) push(e2) ... push(en)   (n operations)
pop() push(en) pop() push(en) pop() ... push(en) pop()   (2n operations)
Suppose $n = 2^k + 1$. Hence amortized time is:
$T = (\Theta(1) + \dots + \Theta(1) + \Theta(n) + \dots + \Theta(n)) / 3n = (n \cdot \Theta(1) + 2n \cdot \Theta(n)) / 3n = \frac{2}{3} \Theta(n)$
Hence $T = \Theta(n)$ !!!
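The thrashing can be reproduced by counting copies for exactly this kind of sequence. This is a sketch, not the lecture's code: the globals and helper names are hypothetical, allocation failure is handled with `assert` for brevity, and one guard is added (assumption) so the capacity never shrinks to 0.

```c
#include <assert.h>
#include <stdlib.h>

static int *data;
static int maxsize, top;
static long copies;   /* elements copied by all resizes */

/* Replace data with an array of the given capacity, copying `keep` elements. */
static void resize(int capacity, int keep) {
    int *temp = malloc((size_t)capacity * sizeof *temp);
    assert(temp != NULL);
    for (int i = 0; i < keep; i++) temp[i] = data[i];
    copies += keep;
    free(data);
    data = temp;
    maxsize = capacity;
}

static void push(int e) {
    if (top == maxsize) resize(2 * maxsize, maxsize);
    data[top++] = e;
}

/* "Smart pop": halve the array as soon as it is half empty.
   The maxsize > 1 guard is an added assumption to avoid capacity 0. */
static int smart_pop(void) {
    int e = data[--top];
    if (maxsize > 1 && top <= maxsize / 2) resize(maxsize / 2, maxsize / 2);
    return e;
}

/* Copies made by n pop/push pairs that follow n initial pushes. */
long copies_for_thrash(int n) {
    assert(n >= 1);
    data = malloc(sizeof *data);
    assert(data != NULL);
    maxsize = 1; top = 0; copies = 0;
    for (int i = 0; i < n; i++) push(i);
    long before = copies;
    for (int i = 0; i < n; i++) { smart_pop(); push(i); }
    long during = copies - before;
    free(data);
    data = NULL;
    return during;
}
```

With n = 9 (that is, 2^3 + 1), every pop shrinks the array from 16 to 8 and every following push grows it back, copying 8 elements each time, for 144 copies over 18 operations. Shrinking only when the array is, say, a quarter full is the usual way to avoid this, though that variant is not covered on the slide.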