CS2006 - Data Structures I Chapter 10 Algorithm Efficiency & Sorting I
Topics • Algorithm analysis & efficiency • Big-O • Searching • Linear search • Binary search
Introduction • Important questions we ask about an algorithm • How do we measure the cost of a program? • How fast does an algorithm run? • Is this algorithm faster than that one?
Introduction • Cost of computer programs: • It is not always possible to minimize both time & space requirements • How to measure the efficiency? • What will affect the time used by the program?
Introduction • Two approaches to measure the efficiency • Analysis • Usually done with algorithm • Count the number of elementary operations • Assume all these operations take the same amount of time • Benchmarking • How?
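One common way to benchmark is simply to timestamp the work. The sketch below is an assumption about how one might do this on the JVM, not a prescribed method from the slides; a serious benchmark would repeat the measurement and discard warm-up runs to account for JIT compilation.

```java
public class BenchmarkSketch {
    // The workload being timed; linear search is used purely as an example.
    static boolean linearSearch(int key, int[] data) {
        for (int d : data)
            if (d == key) return true;
        return false;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i;

        long start = System.nanoTime();           // wall-clock timestamp
        boolean found = linearSearch(-1, data);   // worst case: key absent
        long elapsed = System.nanoTime() - start;

        System.out.println("found = " + found);
        System.out.println("elapsed ns = " + elapsed);  // varies by machine
    }
}
```

The elapsed time depends on the machine, compiler, and input, which is exactly why the slides favor analysis over benchmarking for comparing algorithms.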
Introduction • Algorithm analysis: • Focuses on significant differences in algorithm efficiency that are likely to dominate the overall cost of a solution • Should be independent of specific implementations, computers, & data • Since none of the algorithms covered in this course has significant space requirements, we will focus on time efficiency
Introduction • Big-O: • Stands for “Order of Magnitude” • A theoretical measure of execution of algorithms • Only an approximation • Smaller factors are usually ignored • Given: • Problem size N (usually measured by the number of items processed) • Required: • Amount of resources consumed (usually time and/or memory)
Introduction • Big-O Examples: • Sequential searching of a list: • Average time depends on the number of elements in the list.
Linear Search
boolean linearSearch(int key, int data[], int n) {
    for (int i = 0; i < n; i++)
        if (data[i] == key)
            return true;
    return false;
}
• Involves analysis of the number of loop iterations • Operations per iteration: • Compare i to n • Increment i • Compare array value to key • Total: 3
Linear Search • Number of operations: • Worst case: n*3 • Best case: 3 • Average case: (n*3)/2
Big-O Notation • Formal definition of Big-O notation: • Let f and g be functions on the non-negative integers • f(n) is of order at most g(n), denoted f(n) = O(g(n)), if there exist a positive constant C and an integer k such that |f(n)| <= C * |g(n)| for all n > k (i.e., for all sufficiently large n)
Efficiency • Example: Linked List Traversal
Node cur = head;                  // 1 assignment (a)
while (cur != null) {             // n+1 comparisons (c)
    System.out.print(cur.item);   // n writes (w)
    cur = cur.next;               // n assignments (a)
} // end while
• The traversal requires the following time units: a + (n+1)*c + n*w + n*a = (a+c) + (a+c+w)*n, which is proportional to n • Time required to write n nodes is proportional to n • We only need an approximate count of the instructions executed
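The fragment above can be made runnable. This is a minimal sketch, assuming a hand-rolled `Node` class (the class and field names here are illustrative, not the textbook's exact definitions); it counts the loop-condition comparisons to confirm the n+1 figure.

```java
public class TraversalDemo {
    // Assumed minimal singly-linked node, mirroring the slide's Node.
    static class Node {
        int item;
        Node next;
        Node(int item, Node next) { this.item = item; this.next = next; }
    }

    // Traverse the list and count "cur != null" tests.
    static int countComparisons(Node head) {
        int comparisons = 0;
        Node cur = head;
        while (true) {
            comparisons++;          // one loop-condition test per check
            if (cur == null) break;
            cur = cur.next;         // the item would be printed here
        }
        return comparisons;
    }

    public static void main(String[] args) {
        // Build a list of 5 nodes.
        Node head = null;
        for (int i = 0; i < 5; i++) head = new Node(i, head);
        // A list of n nodes needs n + 1 comparisons, as the slide states.
        System.out.println(countComparisons(head));  // prints 6
    }
}
```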
Efficiency • Example: Find average of n numbers
double sum = 0.0;      // 1 assignment
int i = 0;             // 1 assignment
while (i < n) {        // n+1 comparisons
    sum += a[i];       // n assignments
    i += 1;            // n assignments
}
double ave = sum / n;  // 1 assignment
• Assumption: an assignment and a comparison take the same amount of time • Efficiency ~ 3n + 4 ~ O(n) for large n
Efficiency • Big O Notation (Order of magnitude) • If an algorithm A requires time proportional to g(n), we say that the algorithm is of order g(n) , or O(g(n)) • g(n) is called the algorithm's growth-rate function • Usually only large values of n >= n0 are considered
Efficiency • Big-O Notation • Time requirement (in seconds) as a function of problem size n • Example (figure): Algorithm A requires n^2/5 seconds, Algorithm B requires 5*n seconds; the two curves cross at n = 25 • For n < 25, A is better • For n > 25, B is better
Efficiency • Big-O Notation • Example (figure): Algorithm A requires 3*n^2 seconds, Algorithm B requires n^2 + 3*n + 10 seconds • For n <= 2, algorithm A is better • For n >= 3, algorithm B is better
Efficiency • Big-O Notation • Properties of growth-rate functions • Ignore low-order terms: O(n^3 + 4*n^2 + 3*n) = O(n^3) • Ignore the multiplicative constant in the high-order term: O(5*n^3) = O(n^3) • Combine growth-rate functions: O(f(n)) + O(g(n)) = O(f(n) + g(n)); e.g., O(n^3) + O(n) = O(n^3 + n) = O(n^3), and O(n log n + n) = O(n (log n + 1)) = O(n log n)
Efficiency • Common Growth Rate Functions • Constant: 1 • Constant time requirements • Time independent of problem's size • Logarithmic: log2 n • Time increases slowly with problem's size • Time doubles when size is squared • The base of the log doesn't affect the growth rate • Example: binary search
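The two logarithm claims above can be checked numerically. A quick sketch (the class and helper names are illustrative):

```java
public class LogGrowthDemo {
    // log base 2 via change of base.
    static double log2(double n) { return Math.log(n) / Math.log(2); }

    public static void main(String[] args) {
        int n = 100;
        // Squaring the problem size only doubles a logarithmic cost:
        System.out.println(log2(n));      // ~6.64
        System.out.println(log2(n * n));  // ~13.29, i.e. twice log2(n)
        // Changing the base only multiplies by a constant factor,
        // so it does not change the growth rate:
        System.out.println(Math.log10(n) / log2(n));  // ~0.301 for any n > 1
    }
}
```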
Efficiency • Common Growth Rate Functions • Linear: n • Time increases directly with problem's size • Time squares when size is squared • n log n: n * log2 n • Time increases more rapidly than linear • Typical algorithms divide the problem into smaller problems that are solved separately • Example: Merge sort • Quadratic: n^2 • Time increases rapidly with problem's size • Algorithms use two nested loops • Practical for small problems only • Example: Selection sort
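Selection sort, named above as the quadratic example, can be sketched as follows. This is an illustrative listing, not the textbook's; the two nested loops over n items are what make it O(n^2).

```java
import java.util.Arrays;

public class SelectionSortDemo {
    // Selection sort: two nested loops over n items, hence O(n^2).
    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {       // runs n-1 times
            int min = i;
            for (int j = i + 1; j < a.length; j++)     // scans remaining items
                if (a[j] < a[min]) min = j;
            int tmp = a[i]; a[i] = a[min]; a[min] = tmp;  // one swap per pass
        }
    }

    public static void main(String[] args) {
        int[] data = {29, 10, 14, 37, 13};
        selectionSort(data);
        System.out.println(Arrays.toString(data));  // [10, 13, 14, 29, 37]
    }
}
```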
Efficiency • Common Growth Rate Functions • Cubic: n3 • Time increases more rapidly with problem’s size • Practical for small problems only • Exponential: 2n • Time increases too rapidly with problem’s size • Impractical
Efficiency • Order of growth of some common functions: O(1) < O(log2 n) < O(n) < O(n log2 n) < O(n^2) < O(n^3) < O(2^n)

Function    n=10    n=100    n=1,000    n=10,000    n=100,000    n=1,000,000
1           1       1        1          1           1            1
log2 n      3       6        9          13          16           19
n           10      10^2     10^3       10^4        10^5         10^6
...
2^n         10^3    10^30    10^301     10^3010     10^30103     10^301030
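The log2 n row of the table above (values truncated to integers) can be reproduced with a quick computation; the class and method names here are illustrative:

```java
public class GrowthRates {
    // Integer part of log base 2, computed via change of base.
    static int log2floor(double n) {
        return (int) (Math.log(n) / Math.log(2));
    }

    public static void main(String[] args) {
        long[] sizes = {10, 100, 1_000, 10_000, 100_000, 1_000_000};
        for (long n : sizes)
            System.out.println("n = " + n + "  log2(n) ~ " + log2floor(n));
        // Prints 3, 6, 9, 13, 16, 19: the log2 n row of the table.
    }
}
```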
Efficiency Analysis • An algorithm's efficiency can be analyzed for the • Best case • Shortest possible time to find an item • Not very informative • Average case • More difficult to perform than worst-case analysis • Worst case • The maximum amount of time that an algorithm can require to solve a problem of size n • An algorithm can require different times to solve different problems of the same size
Efficiency Analysis • Example: ListRetrieve • Array-based implementation: • nth item can be accessed directly O(1) • LL-based implementation: • nth item needs n steps to be retrieved O(n)
Efficiency Analysis • When choosing ADT's implementation consider • Frequency • Consider how frequently particular ADT operations occur in a given application. • Example: • Word-processor's spelling checker frequently retrieves, but rarely inserts or deletes items. • Critical operations • Some rarely-used but critical operations must be efficient.
Searching & Sorting • Searching: • Looking for a specific item among a collection of data • Sorting: • Organizing a collection of data into either ascending or descending order according to a specific key • Sorting is often performed as an initialization step for certain searching algorithms
Searching & Sorting • Searching algorithms • Linear (Sequential) search • Binary search • Hashing (Chapter 13) • Sorting algorithms • Selection sort • Bubble sort • Insertion sort • Merge sort • Quick sort • Radix (Bucket) sort • Hashing (Chapter 13)
Sequential (Linear) Search • Analysis • Efficiency = 3n + 2 • Best case: O(1) (when the key is the first element) • Worst case: O(n) (when the key is the last element or absent) • Average case: O(n/2) = O(n)
boolean linearSearch(int key, int data[], int n) {
    for (int i = 0; i < n; i++)
        if (data[i] == key)
            return true;
    return false;
}
Binary Search • Elements should be sorted. • Much better than sequential search for large arrays. • Idea: • Divide the array in half. • Search only one of the halves. • Continue until: • Success: • Find the desired item. • Failure: • Reach an array of one data element that isn't the desired value.
Binary Search
public static int binarySearch(Comparable[] anArray, Comparable target) {
    int index;
    int left = 0, right = anArray.length - 1;
    while (left <= right) {
        index = (left + right) / 2;
        if (target.compareTo(anArray[index]) == 0)
            return index;         // target found
        if (target.compareTo(anArray[index]) > 0)
            left = index + 1;     // search the upper half
        else
            right = index - 1;    // search the lower half
    }
    return -1;  // target not found
}
Binary Search Analysis • Worst case: the target is not found
while (left <= right) {
    index = (left + right) / 2;
• On each loop iteration the array is (ideally) divided in half • To determine f(n), consider the number of items left in the array on each pass (starting from n = 32):

Pass     1    2    3    4    2    1    0 becomes: Pass 1: 32, Pass 2: 16, Pass 3: 8, Pass 4: 4, Pass 5: 2, Pass 6: 1, Pass 7: 0
Binary Search Analysis • Relating n to the number of passes:

n                  32   16   8    4    2    1
log2 n             5    4    3    2    1    0
Number of passes   7    6    5    4    3    2
Binary Search Analysis • The relation between n and number of passes is: f(n) = log2n + 2 • And the time complexity is: O(log2n)
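The logarithmic relation can be checked empirically. The sketch below counts while-loop iterations for a missing target; note this is one less than the slide's pass count, since the final pass with zero items never enters the loop (iterations = log2 n + 1 for n a power of two).

```java
public class BinarySearchCount {
    // Count while-loop iterations of binary search for a missing target.
    // The array is implicitly {0, 1, ..., n-1}: the element at position
    // index is index itself, so no array needs to be allocated.
    static int countIterations(int n) {
        int left = 0, right = n - 1, iterations = 0;
        int target = n;  // larger than every element
        while (left <= right) {
            iterations++;
            int index = (left + right) / 2;
            if (target > index) left = index + 1;
            else right = index - 1;
        }
        return iterations;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 4, 8, 16, 32})
            System.out.println("n = " + n + ": " + countIterations(n) + " iterations");
        // Iterations grow by 1 each time n doubles: logarithmic growth.
    }
}
```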
Review • Algorithm analysis should be independent of all of the following EXCEPT ______. • the programming style used in the implementation of the algorithm • the computer used to run a program which implements an algorithm • the number of significant operations in an algorithm • the test data used to test a program which implements an algorithm
Review • Assuming a linked list of n nodes, the code fragment:
Node curr = head;
while (curr != null) {
    System.out.println(curr.getItem());
    curr = curr.getNext();
} // end while
requires ______ comparisons. • n • n – 1 • n + 1 • 1
Review • Which of the following can be used to compare two algorithms? • growth rates of the two algorithms • implementations of the two algorithms • test data used to test programs which implement the two algorithms • computers on which programs which implement the two algorithms are run
Review • Given the statement:Algorithm A requires time proportional to f(n) Algorithm A is said to be ______. • in class f(n) • of degree f(n) • order f(n) • equivalent to f(n)
Review • If a problem of size n requires time that is directly proportional to n, the problem is ______. • O(1) • O(n) • O(n2) • O(2n)
Review • The value of which of the following growth-rate function grows the fastest? • O(n) • O(n2) • O(1) • O(log2n)