
CS 235102 Data Structures (資料結構)


Presentation Transcript


  1. CS 235102 Data Structures (資料結構) Chapter 1: Basic Concepts Spring 2012

  2. What is an algorithm? • An “algorithm” is a set of instructions that solves a well-defined computational problem. • Algorithms must satisfy the following criteria: • Input. Zero or more quantities are externally supplied. • Output. At least one quantity is produced. • Definiteness. Each instruction is clear and unambiguous. • Finiteness. The algorithm terminates after a finite number of steps. • Effectiveness. Each instruction is basic enough to be carried out feasibly.

  3. Describing an algorithm • There are many ways to describe an algorithm: • Use English sentences • Use graphic representations (called flowcharts) • These work well only if the algorithm is small and simple • Use the C language (mixed with English sentences) • This is the practice we follow in this course

  4. Binary Search • Assume we have n ≥ 1 distinct integers sorted in array A[0 … n-1]. Determine whether an integer x is in the array: if x = A[j], return index j; otherwise return -1.
A:  A[0]=1  A[1]=3  A[2]=5  A[3]=8  A[4]=9  A[5]=17  A[6]=32  A[7]=50
• Eg. For x = 9, return index 4; for x = 10, return -1.

  5. Binary Search • Step 1. Find the element A[mid] at the middle position. • Step 2. If x = A[mid], return mid; if x < A[mid], search the left half; if x > A[mid], search the right half.

  6. Algorithm of Binary Search
int binsearch(int A[], int n, int x) // search x in A[0 … n-1]
{
  int left = 0, right = n-1, mid;
  while (left <= right) { // more integers to check
    // let A[mid] be the middle element
    mid = (left+right)/2;
    if (x == A[mid]) return mid;
    if (x < A[mid]) right = mid-1;
    if (x > A[mid]) left = mid+1;
  }
  return -1;
}
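
As a quick check, here is a minimal test driver (not part of the slides; the array contents and size n = 8 are taken from the example on Slide 4) that exercises binsearch on the sample values:

#include <stdio.h>

int binsearch(int A[], int n, int x);   /* Slide 6 */

int main(void)
{
  int A[] = {1, 3, 5, 8, 9, 17, 32, 50};
  int n = 8;
  printf("%d\n", binsearch(A, n, 9));   /* expected output: 4  */
  printf("%d\n", binsearch(A, n, 10));  /* expected output: -1 */
  return 0;
}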

  7. Recursive Algorithm • Direct recursion • A procedure calls itself directly. Eg. procA → procA • Indirect recursion • A procedure calls other procedures that in turn invoke the calling procedure. Eg. procA → procB → procA • Some problems are themselves defined recursively.

  8. Binomial Coefficients • The binomial coefficient C(n, m) = n! / (m! (n-m)!) can be computed by the recursive formula: C(n, m) = C(n-1, m) + C(n-1, m-1), where C(n, 0) = C(n, n) = 1.

  9. Computing Binomial Coefficients
// Compute binomial coefficient C(n,m) recursively.
int bin_coeff(int n, int m)
{
  // termination conditions
  if (m == n) return 1;
  else if (m == 0) return 1;
  // recursive step
  else return bin_coeff(n-1, m) + bin_coeff(n-1, m-1);
}
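
For illustration only (this driver is not in the textbook), a few known values can be used to check the recursion, e.g. C(4, 2) = 6:

#include <stdio.h>

int bin_coeff(int n, int m);   /* Slide 9 */

int main(void)
{
  printf("%d\n", bin_coeff(4, 2));  /* expected: 6 */
  printf("%d\n", bin_coeff(5, 0));  /* expected: 1 */
  printf("%d\n", bin_coeff(5, 5));  /* expected: 1 */
  return 0;
}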

  10. Hints for Recursive Algorithms • To write a recursive algorithm, make sure it has • termination conditions; • parameter values that decrease, so that each call brings us one step closer to a solution. (A toy example follows.)
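
As a toy illustration of these two hints (an assumed example, not from the slides), a recursive factorial has the test n <= 1 as its termination condition, while the call fact(n-1) decreases the parameter toward that condition:

#include <stdio.h>

/* fact(n) = n!, assuming n >= 0 and the result fits in an int. */
int fact(int n)
{
  if (n <= 1) return 1;      /* termination condition          */
  return n * fact(n - 1);    /* parameter decreases toward it  */
}

int main(void)
{
  printf("%d\n", fact(5));   /* prints 120 */
  return 0;
}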

  11. Recursive Binary Search
int binsearch(int A[], int x, int left, int right)
{
  int mid;
  if (left <= right) { // more integers to check
    // let A[mid] be the middle element
    mid = (left+right)/2;
    if (x == A[mid]) return mid;
    if (x < A[mid]) return binsearch(A, x, left, mid-1);
    if (x > A[mid]) return binsearch(A, x, mid+1, right);
  }
  return -1;
}

  12. A Running Example • Search for x = 9 in array A[0 … 7]:
A:  A[0]=1  A[1]=3  A[2]=5  A[3]=8  A[4]=9  A[5]=17  A[6]=32  A[7]=50
• 1st call: binsearch(A, 9, 0, 7) examines A[3]
  2nd call: binsearch(A, 9, 4, 7) examines A[5]
  3rd call: binsearch(A, 9, 4, 4) examines A[4] and returns index 4.

  13. Criteria for Good Programs • There are many criteria by which to judge a program: • Does it meet the original task specification? • Does it work correctly? • Is there documentation on how to use it? • Modularity • Code readability • Although the above criteria are vitally important, they are difficult to achieve; doing so takes a great deal of real experience and practice.

  14. Performance Analysis • Two criteria for performance analysis/evaluation: • How much memory space is needed? • How much running time is needed? • Performance analysis vs. performance measurement • Performance analysis: • machine independent • an a priori estimate • the heart of “complexity theory” • Performance measurement: • machine dependent • a posteriori testing

  15. Performance Analysis • Performance analysis has two parts: space complexity and time complexity. • Space complexity also has two parts: • 1st part -- a fixed part: • independent of the number and size of inputs and outputs. • includes instruction space, and space for simple variables, fixed-size structured variables, and constants. • 2nd part -- a variable part: (see next slide …)

  16. Performance Analysis • Space complexity also has two parts: • 2nd part -- a variable part: • depends on the particular problem instance I being solved. • includes recursion stack space and structured variables whose size depends on the particular instance I. • Thus space complexity S(P) = 1st part + 2nd part = C + SP(I), where C is a constant; we concentrate on evaluating the SP(I) term.

  17. Instance Characteristic (I) • Commonly used characteristics (I) include the number, size, and values of the inputs and outputs. • Eg. sorting(A[], n): I = number of integers = n. • Eg. Summing 1 to n, i.e., 1+2+3+…+n: I = value of n = n.

  18. Space Complexity • Eg. (See Program 1.10 in the textbook.)
float abc(float a, float b, float c)
{
  return a+b+b*c+(a+b-c)/(a+b)+4.00;
}
• C = space for the program + space for the variables a, b, c, abc = constant • SP(I) = 0 • Thus S(P) = C + SP(I) = constant.

  19. Iterative Summing (Program 1.11)
// Compute the sum of n numbers in A[] iteratively.
float sum(float A[], int n)
{
  float tempsum = 0;
  int i;
  for (i = 0; i < n; i++)
    tempsum += A[i];
  return tempsum;
}
• In C: arrays are passed by reference only (a pointer to the first element); other parameters are passed by value (see the sketch below). • In PASCAL: all parameters can be passed by reference or by value.
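
A small sketch (an assumed example, not from the textbook) of why an array parameter costs no extra space in C: the called function receives only the array's address, so no copy of the n elements is made, and changes made through the parameter are visible to the caller.

#include <stdio.h>

void set_first(float A[])    /* A is really a pointer: float *A */
{
  A[0] = 99.0f;              /* modifies the caller's array     */
}

int main(void)
{
  float data[3] = {1.0f, 2.0f, 3.0f};
  set_first(data);                /* only an address is passed           */
  printf("%f\n", data[0]);        /* prints 99.000000: no copy was made  */
  return 0;
}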

  20. Iterative Summing (Program 1.11) • Instance characteristic (I) = n (= size of array A[]) • If the array is passed by reference (as in C): Ssum(I) = Ssum(n) = 0 • If it were passed by value: Ssum(I) = Ssum(n) = n

  21. Recursive Summing (Program 1.12)
// Compute the sum of n numbers in A[] recursively.
float rsum(float A[], int n)
{
  if (n) return rsum(A, n-1) + A[n-1];
  return 0;
}
• Instance characteristic (I) = n • Each call requires 4 ∙ (1 + 1 + 1) = 12 bytes (the parameter A, the parameter n, and the return address, at 4 bytes each) • How many calls (recursion depth): rsum(A, n) → rsum(A, n-1) → … → rsum(A, 0) ==> n+1 calls • Srsum(I) = Srsum(n) = 12 ∙ (n+1)

  22. Time Complexity • Time taken by program P: T(P) = compile time + running time = TC + TP(I), where TC is a constant • How do we evaluate TP(I)? • Add, subtract, multiply, … take different running times • Use the “program step” to estimate running time • A “program step” is a program segment whose execution time is independent of the instance characteristic I. • Eg. abc = a+b+b*c; -- one program step; a = 2; -- one program step

  23. Iterative Summing (Program 1.11)
// Compute the sum of n numbers in A[] iteratively.
float sum(float A[], int n)
{
  float tempsum = 0;            // 1 step
  int i;
  for (i = 0; i < n; i++)       // n+1 steps
    tempsum += A[i];            // n steps
  return tempsum;               // 1 step
}
• Instance characteristic (I) = n (= size of array A[]) • Tsum(I) = Tsum(n) = 1 + (n+1) + n + 1 = 2n + 3 (an instrumented version that counts these steps is sketched below)
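
One way to make the step count concrete (a sketch in the spirit of the textbook's counting programs, but not copied from them; the names count and sum_count are assumptions) is to increment a global counter wherever a step is charged:

#include <stdio.h>

int count = 0;   /* global program-step counter (illustrative) */

float sum_count(float A[], int n)
{
  float tempsum = 0;  count++;      /* assignment: 1 step        */
  int i;
  for (i = 0; i < n; i++) {
    count++;                        /* loop test: n steps        */
    tempsum += A[i];  count++;      /* addition: n steps         */
  }
  count++;                          /* final failing loop test   */
  count++;                          /* return: 1 step            */
  return tempsum;
}

int main(void)
{
  float A[5] = {1, 2, 3, 4, 5};
  sum_count(A, 5);
  printf("steps = %d\n", count);    /* prints 13 = 2*5 + 3 */
  return 0;
}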

  24. Recursive Summing (Program 1.12)
// Compute the sum of n numbers in A[] recursively.
float rsum(float A[], int n)
{
  if (n)                            // 1 step
    return rsum(A, n-1) + A[n-1];   // 1 step
  return 0;                         // 1 step
}
• Instance characteristic (I) = n • Recurrence relation for Trsum(n): • Trsum(0) = 2 • Trsum(n) = 2 + Trsum(n-1) = 2 + (2 + Trsum(n-2)) = … = 2n + Trsum(0) = 2n + 2

  25. Matrix Addition (Program 1.16)
// Add two rows × cols matrices a and b; store the result in c.
void add(int a[][MAX_SIZE], int b[][MAX_SIZE], int c[][MAX_SIZE],
         int rows, int cols)
{
  int i, j;
  for (i = 0; i < rows; i++) {      // rows+1 steps
    for (j = 0; j < cols; j++)      // rows∙(cols+1) steps
      c[i][j] = a[i][j] + b[i][j];  // rows∙cols steps
  }
}
• Instance characteristics (I) = rows, cols • TP(rows, cols) = (rows+1) + rows∙(cols+1) + rows∙cols = 2∙rows∙cols + 2∙rows + 1
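
A small usage sketch (not from the textbook; the value of MAX_SIZE is an assumption, since the slide does not define it) adding two 2 × 2 matrices:

#include <stdio.h>
#define MAX_SIZE 10   /* assumed bound */

void add(int a[][MAX_SIZE], int b[][MAX_SIZE], int c[][MAX_SIZE],
         int rows, int cols);   /* Program 1.16 above */

int main(void)
{
  int a[MAX_SIZE][MAX_SIZE] = {{1, 2}, {3, 4}};
  int b[MAX_SIZE][MAX_SIZE] = {{5, 6}, {7, 8}};
  int c[MAX_SIZE][MAX_SIZE];
  add(a, b, c, 2, 2);
  printf("%d %d\n%d %d\n", c[0][0], c[0][1], c[1][0], c[1][1]);
  /* prints:  6 8
              10 12 */
  return 0;
}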

  26. Observation on Step Counts • In the previous examples: • Tsum(n) = 2n + 3 steps • Trsum(n) = 2n + 2 steps • Can we say that rsum is faster than sum? • No, since different steps take different amounts of execution time. • The step count is useful for answering the question: “How does the running time change with changes in the instance characteristic?”

  27. Growth Rate of Time Complexity • Eg. For the sum program, Tsum(n) = 2n + 3 “means”: • when n increases 10-fold, Tsum(n) increases roughly 10-fold • the sum program runs in linear time. • We only want to know the growth rate (called the asymptotic time) • Eg. Tsum(n) = 2n + 3 vs. Trsum(n) = 2n + 2: clearly Tsum(n) and Trsum(n) have the same growth rate.

  28. Asymptotic Notation (Big-O) • Used to compare the time complexity of two programs that compute the same function. • Used to predict the growth in run time as the instance characteristic increases. • Eg. Two programs with time complexity: P1: c₁n² + c₂n; P2: c₃n • Let c₁ = 1, c₂ = 2, and c₃ = 100. Then • P1 is faster: n² + 2n ≤ 100n for n ≤ 98 • P2 is faster: n² + 2n > 100n for n > 98

  29. Asymptotic Notation (Big-O) • Eg. Two programs with time complexity: P1: c₁n² + c₂n; P2: c₃n • Let c₁ = 1, c₂ = 2, and c₃ = 1000. Then • P1 is faster: n² + 2n ≤ 1000n for n ≤ 998 • P2 is faster: n² + 2n > 1000n for n > 998 • No matter what values c₁, c₂, and c₃ take, there is an n beyond which c₁n² + c₂n > c₃n

  30. Definition of Big-O • Definition of O-notation: f(n) = O(g(n)) iff there exist c, n₀ > 0 such that f(n) ≤ c∙g(n) for all n ≥ n₀. • Eg. 3n + 2 = O(n) since 3n + 2 ≤ 4n for all n ≥ 2 • Eg. 100n + 6 = O(n) since 100n + 6 ≤ 101n for all n ≥ 10 • Eg. 10n² + 4n + 2 = O(n²) since 10n² + 4n + 2 ≤ 11n² for all n ≥ 5

  31. Observations on Big-O • Leading constants and lower-order terms do not matter. • We can always find a constant large enough that the higher-order term swamps the other terms. • Thm 1.2: If f(n) = aₘnᵐ + … + a₁n + a₀, then f(n) = O(nᵐ). • Eg. 10n² + 4n + 2 = O(n²)

  32. Property of Big-O • Thm 1.2: If f(n) = aₘnᵐ + … + a₁n + a₀, then f(n) = O(nᵐ). • Pf. For n ≥ 1, f(n) ≤ |aₘ|nᵐ + … + |a₁|n + |a₀| ≤ nᵐ(|aₘ| + … + |a₁| + |a₀|) ≤ c∙nᵐ, where c = |aₘ| + … + |a₁| + |a₀|. Hence f(n) = O(nᵐ).

  33. More Examples • 0.1n² - 10n - 6 = O(n²) • n + log n = O(n) • n + n log n = O(n log n) • n² + log n = O(n²) • 2ⁿ + n¹⁰⁰⁰⁰ = O(2ⁿ) • n⁴ + 1000n³ + n² = O(n⁴) • n⁴ + 1000n³ + n² = O(n⁵) (also true, but a looser bound) • n⁴ + 1000n³ + n² ≠ O(n³) • Note: the constant in front of the big-O term is always 1; we don’t write O(2n²).

  34. Naming Common Functions • O(1) -- constant time • O(log n) -- logarithmic time • O(n) -- linear time • O(n²) -- quadratic time • O(n³) -- cubic time • O(n¹⁰⁰) -- polynomial time • O(2ⁿ) -- exponential time • When n is large enough, the later functions take more time than the earlier ones.

  35. More Notes on Big-O • f(n) = O(g(n)) “means” g(n) is an upper bound on f(n) • n = O(n), n = O(n²), n = O(n³); but we want g(n) as small as possible! • Big-O is usually used to describe the worst-case running time of a program. • f(n) = O(g(n)) is correct notation, but O(g(n)) = f(n) is wrong.

  36. Compute Running Time in Big-O • How do we compute the time complexity of a program in big-O? • Either compute the total step count and then take its big-O, • or take the big-O of each step and then sum up the big-Os of all steps.

  37. Rule of Sum • Thm: If f₁(n) = O(g₁(n)) and f₂(n) = O(g₂(n)), then f₁(n) + f₂(n) = O(max(g₁(n), g₂(n))). • Eg. f₁(n) = O(n), f₂(n) = O(n²); then f₁(n) + f₂(n) = O(n²). • Eg. f₁(n) = O(n), f₂(n) = O(n); then f₁(n) + f₂(n) = O(n). • Used to compute the cost of program segments P1 and P2 when P1 is followed by P2 (a small sketch follows).
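
An illustrative sketch (not from the slides) of how the rule is applied: segment P1 is a single loop, O(n); segment P2 is a nested loop, O(n²); running one after the other costs O(max(n, n²)) = O(n²).

/* Illustrative only: an O(n) segment followed by an O(n^2) segment;
   by the rule of sum the whole function is O(max(n, n^2)) = O(n^2). */
long rule_of_sum_demo(int n)
{
  long total = 0;
  int i, j;

  for (i = 0; i < n; i++)          /* P1: O(n)   */
    total += i;

  for (i = 0; i < n; i++)          /* P2: O(n^2) */
    for (j = 0; j < n; j++)
      total += 1;

  return total;
}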

  38. Rule of Product • Thm: If f₁(n) = O(g₁(n)) and f₂(n) = O(g₂(n)), then f₁(n) ∙ f₂(n) = O(g₁(n) ∙ g₂(n)). • Eg. f₁(n) = O(n), f₂(n) = O(n); then f₁(n) ∙ f₂(n) = O(n²). • Used in the time analysis of nested loops (see next slide …)

  39. Rule of Product
for (i = 0; i < n; i++) {       // O(n)
  for (j = 0; j < n; j++)       // O(n)
    sum = sum + 1;              // O(1)
}
• By the rule of product, the running time of this fragment is f(n) = O(n ∙ n ∙ 1) = O(n²).

  40. Complexity of Binary Search
int binsearch(int A[], int n, int x) // search x in A[0 … n-1]
{
  int left = 0, right = n-1, mid;
  while (left <= right) {           // the loop runs O(log₂ n) times
    // let A[mid] be the middle element
    mid = (left+right)/2;           // O(1)
    if (x == A[mid]) return mid;    // O(1)
    if (x < A[mid]) right = mid-1;  // O(1)
    if (x > A[mid]) left = mid+1;   // O(1)
  }
  return -1;                        // O(1)
}

  41. Complexity of Binary Search • Analysis of the while loop: • Iteration 1: n values to be searched • Iteration 2: n/2 left to search • Iteration 3: n/4 left to search • … • Iteration k+1: n/2ᵏ left to search • When n/2ᵏ = 1, the search must finish. That is, n = 2ᵏ ==> k = log₂ n • Hence, the worst-case running time of binary search is O(log₂ n).
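
To see this behaviour empirically (an illustrative sketch, not from the slides; binsearch_count is an assumed instrumented copy of the binsearch on Slide 40), one can count the loop iterations for arrays of doubling size; the count should grow by about one each time n doubles:

#include <stdio.h>
#include <stdlib.h>

/* Instrumented copy of binsearch that also reports the number of
   while-loop iterations. */
int binsearch_count(int A[], int n, int x, int *iters)
{
  int left = 0, right = n - 1, mid;
  *iters = 0;
  while (left <= right) {
    (*iters)++;
    mid = (left + right) / 2;
    if (x == A[mid]) return mid;
    if (x < A[mid]) right = mid - 1; else left = mid + 1;
  }
  return -1;
}

int main(void)
{
  int n, i, iters;
  for (n = 16; n <= 1024; n *= 2) {
    int *A = malloc(n * sizeof *A);
    for (i = 0; i < n; i++) A[i] = 2 * i;        /* sorted, distinct      */
    binsearch_count(A, n, 2 * n, &iters);        /* x absent: worst case  */
    printf("n = %4d  iterations = %d\n", n, iters);
    free(A);
  }
  return 0;
}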

  42. Definition of Big-Ω • Definition of Ω-notation: f(n) = Ω(g(n)) iff there exist c, n₀ > 0 such that f(n) ≥ c∙g(n) for all n ≥ n₀. • Eg. 3n + 2 = Ω(n) since 3n + 2 ≥ 3n for all n ≥ 1 • Eg. 100n + 6 = Ω(n) since 100n + 6 ≥ 100n for all n ≥ 1 • Eg. 10n² + 4n + 2 = Ω(n²) since 10n² + 4n + 2 ≥ n² for all n ≥ 1

  43. Definition of Big-Θ • Definition of Θ-notation: f(n) = Θ(g(n)) iff f(n) = O(g(n)) and f(n) = Ω(g(n)). • Eg. 3n + 2 = Θ(n) • Eg. 100n + 6 = Θ(n) • Eg. 10n² + 4n + 2 = Θ(n²)

  44. Plot of Common Function Values

  45. Running Times On Computer

  46. Performance Measurement • Obtain the actual space and time requirements of a running program. • How do we measure time in C? • Method 1: use clock(), measured in clock ticks • (See the next slides for details …) • Method 2: use time(), measured in seconds • (See the next slides for details …) • To time a short event, it is necessary to repeat it many times and take the average.

  47. Performance Measurement • Method 1: Use clock(), measured in clock ticks
#include <time.h>
int main(void)
{
  clock_t start = clock();
  // main body of program comes here!
  clock_t stop = clock();
  double duration = ((double) (stop - start)) / CLOCKS_PER_SEC;
  return 0;
}

  48. Performance Measurement • Method 2: Use time(), measured in seconds
#include <time.h>
int main(void)
{
  time_t start = time(NULL);
  // main body of program comes here!
  time_t stop = time(NULL);
  double duration = difftime(stop, start);
  return 0;
}
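
Putting the advice from Slide 46 into code (a sketch with assumed names: the short event timed here is the iterative sum of Program 1.11, repeated REPEATS times and then averaged):

#include <stdio.h>
#include <time.h>

#define N       1000
#define REPEATS 100000

float sum(float A[], int n);   /* Program 1.11 (Slide 19) */

int main(void)
{
  static float A[N];
  volatile float sink = 0;   /* keeps the calls from being optimized away */
  clock_t start, stop;
  int i;

  for (i = 0; i < N; i++) A[i] = 1.0f;

  start = clock();
  for (i = 0; i < REPEATS; i++)      /* repeat the short event many times */
    sink += sum(A, N);
  stop = clock();

  printf("average time per call: %e seconds\n",
         ((double)(stop - start) / CLOCKS_PER_SEC) / REPEATS);
  return 0;
}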
