Recursion. Algorithm: Design & Analysis [3]
In the last class… • Asymptotic growth rate • The Sets O, Ω, and Θ • Complexity Classes • An Example: Searching an Ordered Array • Improved Sequential Search • Binary Search • Binary Search Is Optimal
Recursion • Recursive Procedures • Proving Correctness of Recursive Procedures • Deriving recurrence equations • Solution of the recurrence equations • Guessing and proving • Recursion tree • Master theorem • Divide-and-conquer
Thinking Recursively • Towers of Hanoi • How many moves are needed to move all the disks to the third peg, moving only one disk at a time and never placing a disk on top of a smaller one?
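The recursive insight: to move n disks from the source peg to the target peg, move the top n−1 disks to the spare peg, move the largest disk, then move the n−1 disks back on top. A minimal Python sketch (the slides use Java-like pseudocode; the peg names and the move list are illustrative choices of mine):

```python
def hanoi(n, src, dst, aux, moves):
    """Append to `moves` the steps that carry n disks from src to dst."""
    if n == 0:
        return
    hanoi(n - 1, src, aux, dst, moves)  # clear the n-1 smaller disks onto the spare peg
    moves.append((src, dst))            # move the largest disk directly
    hanoi(n - 1, aux, dst, src, moves)  # stack the n-1 smaller disks back on top

moves = []
hanoi(3, 'A', 'C', 'B', moves)
print(len(moves))  # 7, i.e. 2^3 - 1
```

The move count satisfies T(n) = 2T(n−1) + 1 with T(0) = 0, whose solution is T(n) = 2^n − 1.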
Recursion: an Implementation View • Activation Frame • Basic unit of storage for an individual procedure invocation at run time • Providing a frame of reference in which the procedure executes for this invocation only • Space is allocated in a region of storage called the frame stack • Activation trace
Recursion Tree

int fib(int n)
  int f, f1, f2;
  if (n<2)
    f=n;
  else
    f1=fib(n-1);
    f2=fib(n-2);
    f=f1+f2;
  return f;

[Figure: activation tree for fib(3). Each activation frame records n, f1, f2, and f; the root frame is (n:3, f1:1, f2:1, f:2), with children for fib(2) and fib(1), down to base-case frames for fib(1) and fib(0).]

• Activation frames are created in preorder of the activation tree.
• The cost for the simplicity is inefficiency: the number of recursive calls for computing Fib(31)? Answer: 2692537.
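The inefficiency can be observed directly by instrumenting the procedure. A small Python sketch (the counter argument is my own addition) that counts activation frames:

```python
def fib_counted(n, counter):
    """The slide's fib, with counter[0] counting activation frames."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_counted(n - 1, counter) + fib_counted(n - 2, counter)

c = [0]
print(fib_counted(10, c), c[0])  # 55 177: fib(10) already takes 177 calls
```

The frame count C(n) obeys C(n) = C(n−1) + C(n−2) + 1, so it grows as fast as the Fibonacci numbers themselves.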
Loop-free Complexity • In a computation without while or for loops, but possibly with recursive procedure calls: • The time that any particular activation frame is on the top of the frame stack is O(L), where L is the number of lines in the procedure that contain either a simple statement or a procedure call. (Note: L is a constant.) • The total computation time is Θ(C), where C is the total number of procedure calls that occur during the computation. (Note: for an algorithm, max{Li} is a constant.)
2-Tree • A 2-tree is a binary tree in which every node is one of: • an internal node: both its left and right children are nonempty, or • an external node: it has no child. • By contrast, in a common binary tree the children of a node may be of any type (empty or nonempty).
External Path Length (EPL) • The EPL of a 2-tree t is defined as follows: • [Base case] 0 for a single external node • [Recursion] if t is a non-leaf with sub-trees L and R, then the EPL of t is the sum of: • the external path length of L; • the number of external nodes of L; • the external path length of R; • the number of external nodes of R. • In fact, the external path length of t is the sum of the lengths of all the paths from the root of t to the external nodes of t.
Calculating the External Path Length

EplReturn calcEpl(TwoTree t)
  EplReturn ansL, ansR;
  EplReturn ans=new EplReturn();
1.  if (t is a leaf)
2.    ans.epl=0; ans.extNum=1;
3.  else
4.    ansL=calcEpl(leftSubtree(t));
5.    ansR=calcEpl(rightSubtree(t));
6.    ans.epl=ansL.epl+ansR.epl+ansL.extNum+ansR.extNum;
7.    ans.extNum=ansL.extNum+ansR.extNum;
8.  return ans;

TwoTree is an ADT defined for 2-trees; EplReturn is an organizer class with two fields, epl and extNum. (Note: on line 7 the external nodes of t are exactly those of L plus those of R; no extra node is added.)
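A runnable Python sketch of the same computation (the tree encoding is my own: the string 'leaf' stands for a single external node, and a pair (left, right) for an internal node):

```python
import math

def calc_epl(t):
    """Return (epl, ext_num) for a 2-tree t."""
    if t == 'leaf':
        return 0, 1                              # base case: one external node, EPL 0
    epl_l, m_l = calc_epl(t[0])
    epl_r, m_r = calc_epl(t[1])
    # every root-to-external path is one edge longer than in L or R,
    # so the EPL grows by the number of external nodes of each subtree
    return epl_l + epl_r + m_l + m_r, m_l + m_r

t = (('leaf', 'leaf'), 'leaf')
epl, m = calc_epl(t)
print(epl, m)                       # 5 3: paths of lengths 2, 2, 1
print(epl >= m * math.log2(m))      # True: the bound epl >= m*lg(m) holds
```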
Correctness of Procedure calcEpl • Let t be any 2-tree. Let epl and m be the values of the fields epl and extNum, respectively, as returned by calcEpl(t). Then: • 1. epl is the external path length of t. • 2. m is the number of external nodes in t. • 3. epl ≥ m·lg(m) (note: for a 2-tree with n internal nodes, m=n+1)
Proof on Procedure calcEpl • Induction on t, with the "sub-tree" partial order: • Base case: t is a leaf. (line 2) • Inductive hypothesis: the 3 statements hold for any proper subtree s of t. • Inductive case: by ind. hyp., eplL, eplR, mL, mR are the expected results for L and R (both proper subtrees of t), so: • Statement 1 is guaranteed by line 6. • Statement 2 is guaranteed by line 7 (any external node is in either L or R). • Statement 3: by ind. hyp., epl = eplL+eplR+m ≥ mL·lg(mL)+mR·lg(mR)+m. Note that f(x)+f(y) ≥ 2f((x+y)/2) if f is convex, and x·lg(x) is convex for x>0; so epl ≥ 2((mL+mR)/2)·lg((mL+mR)/2)+m = m(lg(m)−1)+m = m·lg(m).
Binary Search: Revisited

int binarySearch(int[] E, int first, int last, int K)
  int index;
1.  if (last<first)
2.    index=-1;
3.  else
4.    int mid=(first+last)/2;
5.    if (K==E[mid])
6.      index=mid;
7.    else if (K<E[mid])
8.      index=binarySearch(E, first, mid-1, K);
9.    else if (K>E[mid])
10.     index=binarySearch(E, mid+1, last, K);
11. return index;
Binary Search: Correctness • Correctness of binarySearch: • For all n ≥ 0, if binarySearch(E, first, last, K) is called, and if: • the problem size is (last−first+1)=n • E[first], …, E[last] are in non-decreasing order • then: • it returns −1 if K does not occur in E within the range first, …, last • it returns index such that K=E[index] otherwise
Binary Search: Proof of Correctness • Induction on n, the problem size. • Base case: n=0, line 2 is executed, returning −1. • For n>0 (that is, first ≤ last), assume that binarySearch(E,f,l,K) satisfies the correctness conditions on problems of size k (k=0,1,2,…,n−1), where k=l−f+1. • mid=(first+last)/2 must be within the search range, so: • if the test on line 5 is true, line 6 is executed. • otherwise, the induction hypothesis applies to both recursive calls, on lines 8 and 10. What we have to do is verify that the preconditions for the procedure hold, and that the return value of the recursive call satisfies the post-conditions of the problem.
Proving Binary Search (cont.) • Case that line 5 is false, i.e. K ≠ E[mid]: • Since first ≤ mid ≤ last, we have • (mid−1)−first+1 ≤ n−1 • last−(mid+1)+1 ≤ n−1 • So the inductive hypothesis applies to both recursive calls, on lines 8 and 10. • For both calls, only one actual parameter is changed, and the problem size decreases, so line 8 or line 10 returns the appropriate result.
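Putting the procedure and its specification together, a runnable Python rendering of binarySearch (a sketch; the sample array and keys are illustrative):

```python
def binary_search(E, first, last, K):
    """Return an index with E[index] == K within E[first..last], or -1."""
    if last < first:
        return -1                   # empty range: K does not occur
    mid = (first + last) // 2
    if K == E[mid]:
        return mid
    if K < E[mid]:
        return binary_search(E, first, mid - 1, K)  # problem size <= n-1
    return binary_search(E, mid + 1, last, K)       # problem size <= n-1

E = [1, 3, 4, 6, 7, 9]              # non-decreasing, as the precondition requires
print(binary_search(E, 0, len(E) - 1, 6))   # 3
print(binary_search(E, 0, len(E) - 1, 5))   # -1
```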
Recurrence Equation: Concept • A recurrence equation: • defines a function over the natural numbers n • in terms of its own value at one or more integers smaller than n • Example: Fibonacci numbers • Fn=Fn−1+Fn−2 for n ≥ 2 • F0=0, F1=1 • Recurrence equations are used to express the cost of recursive procedures.
Recurrence Equation for Sequential Search

seqSearchRec(E,m,num,K)
1.  if (m ≥ num)
2.    ans=-1
3.  else if (E[m]==K)
4.    ans=m
5.  else
6.    ans=seqSearchRec(E,m+1,num,K)
7.  return ans

• "Cost" is defined as the number of comparisons of array elements; among the simple statements, only line 3 costs 1.
• The result: T(n) = (0 + max(0, 1 + max(0, T(n−1)))) + 0 = T(n−1) + 1
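The cost recurrence can be checked by instrumenting the procedure; a Python sketch (the cost counter is my own addition):

```python
def seq_search_rec(E, m, num, K, cost):
    """The slide's recursive search; cost[0] counts array-element comparisons."""
    if m >= num:
        return -1
    cost[0] += 1                    # the single comparison E[m] == K
    if E[m] == K:
        return m
    return seq_search_rec(E, m + 1, num, K, cost)

cost = [0]
print(seq_search_rec([2, 4, 7, 1], 0, 4, 9, cost), cost[0])  # -1 4
```

In the worst case (K absent) the count is exactly n, matching T(n) = T(n−1) + 1 with T(0) = 0.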
Recurrence Equation for Binary Search

int binarySearch(int[] E, int first, int last, int K)
  if (last<first)
    index=-1;
  else
    int mid=(first+last)/2;
    if (K==E[mid])
      index=mid;
    else if (K<E[mid])
      index=binarySearch(E, first, mid-1, K);
    else if (K>E[mid])
      index=binarySearch(E, mid+1, last, K);
  return index;

• The comparisons against E[mid] are the only nonrecursive cost.
• The two alternative recursive calls have maximum input size ⌊n/2⌋ or (n−1)/2.
• The equation: T(n) = T(⌊n/2⌋) + 1
Guess the Solutions

• Example: T(n) = 2T(⌊n/2⌋) + n
• Guess T(n) ∈ O(n)? Then T(n) ≤ cn is to be proved for c large enough.
• Guess T(n) ∈ O(n²)? Then T(n) ≤ cn² is to be proved for c large enough.
• Or maybe T(n) ∈ O(n log n)? Then T(n) ≤ cn·lg(n) is to be proved for c large enough.

Try to prove T(n) ≤ cn:
  T(n) = 2T(⌊n/2⌋) + n ≤ 2c⌊n/2⌋ + n ≤ 2c(n/2) + n = (c+1)n. Fail! ((c+1)n ≤ cn holds for no c, however large.)

Try to prove T(n) ≤ cn·lg(n):
  T(n) = 2T(⌊n/2⌋) + n ≤ 2c⌊n/2⌋·lg(⌊n/2⌋) + n ≤ cn·lg(n/2) + n = cn·lg(n) − cn·lg(2) + n = cn·lg(n) − cn + n ≤ cn·lg(n) for c ≥ 1.
  Note: this proof is invalid for the base case T(1)=1, since cn·lg(n) = 0 at n=1; start the induction from base cases such as n=2 and n=3 instead.
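For powers of two with base case T(1) = 1, this recurrence solves exactly to n·lg(n) + n, so the O(n log n) guess can be sanity-checked numerically; a small Python sketch:

```python
def T(n):
    """T(n) = 2*T(n/2) + n with T(1) = 1, for n a power of two."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

for k in range(5):
    n = 2 ** k
    print(n, T(n), n * k + n)   # the last two columns agree: T(n) = n*lg(n) + n
```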
Decrease for Larger • T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1 • It's difficult to prove T(n) ≤ cn directly. However, we can prove T(n) ≤ cn − b for some b > 0: • T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1 ≤ (c⌊n/2⌋−b) + (c⌈n/2⌉−b) + 1 = cn − 2b + 1, which is no larger than cn − b for any b ≥ 1.
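This recurrence happens to solve exactly to T(n) = 2n − 1 (with base case T(1) = 1), which indeed has the guessed form cn − b with c = 2 and b = 1; a quick Python check:

```python
def T(n):
    """T(n) = T(floor(n/2)) + T(ceil(n/2)) + 1 with T(1) = 1."""
    if n == 1:
        return 1
    return T(n // 2) + T(n - n // 2) + 1  # floor half plus ceiling half plus 1

for n in (1, 2, 3, 10, 100):
    print(n, T(n))   # T(n) == 2*n - 1 in every case
```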
What's Wrong? • We have proved that T(n) = 2T(⌊n/2⌋) + n is bounded below by Ω(n·lg(n)). • But if we assume that T(⌊n/2⌋) ≤ c⌊n/2⌋, then • T(n) = 2T(⌊n/2⌋) + n ≤ cn + n, "so T(n) ∈ O(n)"? • What's wrong? The chain ends with cn + n, not cn, so it never establishes the exact inductive hypothesis T(n) ≤ cn; concluding O(n) from it is invalid.
Recursion Tree
[Figure: first expansions of the recursion tree for T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + n. Each node carries two fields, T(size) and its nonrecursive cost: the root (T(n), n) expands into children of size about n/2, which expand into nodes of size about n/4, and so on.]
Recursion Tree Rules • Construction of a recursion tree: • work copy: restate the recurrence using an auxiliary variable • root node: the original problem size and its nonrecursive cost • expansion of a node: • the recursive parts become its children • the nonrecursive part is recorded as the node's nonrecursive cost • a node with base-case size is not expanded
Recursion tree equation • For any subtree of the recursion tree: • size field of the root = Σ nonrecursive costs of the expanded nodes + Σ size fields of the incomplete nodes • Example: divide-and-conquer: T(n) = bT(n/c) + f(n) • After the kth expansion: T(n) = f(n) + b·f(n/c) + … + b^(k−1)·f(n/c^(k−1)) + b^k·T(n/c^k)
Evaluation of a Recursion Tree • Compute the sum of the nonrecursive costs of all nodes, • proceeding level by level down the tree. • This requires knowing the maximum depth of the recursion tree, that is, the depth at which the size parameter reduces to a base case.
Recursion Tree
Work copy: T(k) = T(⌊k/2⌋) + T(⌈k/2⌉) + k
[Figure: recursion tree for T(n). The root has nonrecursive cost n; its children have costs about n/2; the depth-2 nodes have costs about n/4; and so on, down to depth d where n/2^d reaches size 1.]
At every level the nonrecursive costs sum to n; e.g. after two expansions, T(n) = n + 2(n/2) + 4T(n/4) = 2n + 4T(n/4). With about lg(n) levels, T(n) ≈ n·lg(n).
Recursion Tree for T(n) = 3T(n/4) + Θ(n²)
[Figure: the root costs cn²; each of its 3 children costs c(n/4)²; each of the 9 depth-2 nodes costs c(n/16)²; the tree has depth log₄(n), ending in T(1) leaves.]
Note: the row-sums cn², (3/16)cn², (3/16)²cn², … form a decreasing geometric series.
Total: Θ(n²)
Verifying the “Guess” by the Recursion Tree • Inductive hypothesis
Recursion Tree for T(n) = bT(n/c) + f(n)
[Figure: the root costs f(n); each of its b children costs f(n/c); each of the b² depth-2 nodes costs f(n/c²); the tree has depth log_c(n), ending in T(1) leaves.]
Note: the row-sums are f(n), b·f(n/c), b²·f(n/c²), …
Total?
Solving the Divide-and-Conquer • The recurrence equation for divide-and-conquer, the general case: T(n) = bT(n/c) + f(n) • Observations: • Let base cases occur at depth D (the leaves); then n/c^D = 1, that is, D = lg(n)/lg(c). • Let the number of leaves of the tree be L; then L = b^D, that is, L = b^(lg(n)/lg(c)). • By a little algebra: L = n^E, where E = lg(b)/lg(c) is called the critical exponent.
Divide-and-Conquer: the Solution • The recursion tree has depth D = lg(n)/lg(c), so there are about that many row-sums. • The 0th row-sum is f(n), the nonrecursive cost of the root. • The Dth row-sum is n^E, assuming base cases cost 1, or Θ(n^E) in any event. • The solution of the divide-and-conquer equation is the sum of the nonrecursive costs of all nodes in the tree, which is the sum of the row-sums.
Solution by Row-sums • [Little Master Theorem] The behavior of the row-sums decides the solution of the divide-and-conquer equation: • Increasing geometric series: T(n) ∈ Θ(n^E) • Constant: T(n) ∈ Θ(f(n) log n) • Decreasing geometric series: T(n) ∈ Θ(f(n)) • This can be generalized to a result that does not mention row-sums explicitly.
Master Theorem • Loosening the restrictions on f(n): • Case 1: f(n) ∈ O(n^(E−ε)) for some ε > 0; then T(n) ∈ Θ(n^E). • Case 2: f(n) ∈ Θ(n^E); the nodes at every depth contribute about equally, so T(n) ∈ Θ(f(n)·log(n)). • Case 3: f(n) ∈ Ω(n^(E+ε)) for some ε > 0; then T(n) ∈ Θ(f(n)). • The positive ε is critical; it creates gaps between the cases.
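For the common special case f(n) = Θ(n^d), the three cases reduce to comparing d with the critical exponent E. A hypothetical helper (the function name and output strings are my own) sketching the classification:

```python
import math

def master_case(b, c, d):
    """Classify T(n) = b*T(n/c) + Theta(n^d) by the master theorem.
    Assumes f(n) is a pure power n^d, so no gap case can occur."""
    E = math.log(b, c)                  # critical exponent E = lg(b)/lg(c)
    if d < E:
        return f"Theta(n^{E:.3g})"      # case 1: the leaves dominate
    if d == E:
        return f"Theta(n^{d:g} log n)"  # case 2: every level contributes equally
    return f"Theta(n^{d:g})"            # case 3: the root dominates

print(master_case(2, 2, 1))   # mergesort-like: Theta(n^1 log n)
print(master_case(3, 4, 2))   # T(n) = 3T(n/4) + Theta(n^2): Theta(n^2)
print(master_case(4, 2, 1))   # leaves dominate: Theta(n^2)
```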
Looking at the Gap • T(n) = 2T(n/2) + n·lg(n) • b=2, c=2, E=1, f(n) = n·lg(n) • We have f(n) ∈ ω(n^E), but no ε > 0 satisfies f(n) ∈ Ω(n^(E+ε)), since lg(n) grows more slowly than n^ε for any small positive ε. • So case 3 doesn't apply. • However, case 2 doesn't apply either, since f(n) ∉ Θ(n^E). (A recursion-tree argument shows T(n) ∈ Θ(n·lg²(n)).)
Proof of the Master Theorem: Case 3 • Note: in asymptotic analysis, f(n) ∈ Ω(n^(E+ε)) allows us to treat f(n) as about n^(E+ε), ignoring coefficients. • The row-sum at depth d is then b^d·f(n/c^d) ≈ b^d·(n/c^d)^(E+ε) = f(n)·(b/c^(E+ε))^d, and since b = c^E, the ratio is c^(−ε) < 1. • So the row-sums form a decreasing geometric series, and the total is Θ(f(n)), dominated by the root.
Chip and Conquer
• The equation: T(n) = bT(n−c) + f(n)
• As an example: Towers of Hanoi, with b=2, c=1.
[Figure: the root, with value T(n) and nonrecursive cost f(n), has b children, each with value T(n−c) and cost f(n−c); each of those has b children with value T(n−2c) and cost f(n−2c); and so on, totalling about n/c levels.]
Solution of Chip and Conquer • The tree has about n/c levels with b branches per node, so the b^(n/c) leaf term dominates: the solution grows exponentially in n. • Example: Towers of Hanoi, T(n) = 2T(n−1) + 1, whose solution is T(n) = 2^n − 1.
Home Assignment • pp.143- • 3.3-3.4 • 3.6 • 3.9-3.10