241-423 Advanced Data Structures and Algorithms. Objective: introduce algorithm design using basic searching and sorting, and remind students about T() and Big-Oh running time. Semester 2, 2013-2014. 1. Intro. to Algorithms
Contents 1. Selection Sort 2. An Array Sublist 3. Sequential Search 4. Binary Search 5. System/Memory Efficiency 6. Running Time Analysis 7. Big-Oh Analysis 8. The Timing Class
1. Selection Sort • Go through an array position by position, starting at index 0. • At the current position, select the smallest element from the rest of the array. • Swap it with the value in the current position.
Number of passes = size of array - 1 • e.g. 4 passes are made over a 5-element arr[] • sorting stops once arr[3] and arr[4] have been compared
selectionSort()
public static void selectionSort(int[] arr)
{
    int smallIndex;   // index of smallest elem in sublist
    int idx;
    int n = arr.length;

    // idx has range 0 to n-2
    for (idx = 0; idx < n-1; idx++) {
        // scan sublist starting at idx
        smallIndex = idx;

        /* j goes through sublist from arr[idx+1] to arr[n-1] */
        for (int j = idx+1; j < n; j++)
            /* if smaller element found, assign smallIndex to that posn */
            if (arr[j] < arr[smallIndex])
                smallIndex = j;

        // swap next smallest elem into arr[idx]
        int temp = arr[idx];
        arr[idx] = arr[smallIndex];
        arr[smallIndex] = temp;
    }
}  // end of selectionSort()
Usage Example
// an integer array
int[] arr = {66, 20, 33, 55, 53, 57, 69, 11, 67, 70};

// use selectionSort() to order array
selectionSort(arr);

System.out.print("Sorted: ");
for (int i = 0; i < arr.length; i++)
    System.out.print(arr[i] + " ");

Output:
Sorted: 11 20 33 53 55 57 66 67 69 70
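To see the "size of array - 1" pass count in action, the sort can be instrumented to print the array after every pass. This is only a sketch: the class name SelectionSortTrace, the method sortWithTrace() and the 5-element sample data are assumptions made for illustration, not part of the slides.

```java
import java.util.Arrays;

class SelectionSortTrace {
    // Selection sort that prints the array after each pass and
    // returns the number of passes made.
    static int sortWithTrace(int[] arr) {
        int passes = 0;
        for (int idx = 0; idx < arr.length - 1; idx++) {
            int smallIndex = idx;
            // find the smallest element in arr[idx] .. arr[n-1]
            for (int j = idx + 1; j < arr.length; j++)
                if (arr[j] < arr[smallIndex])
                    smallIndex = j;
            // swap it into position idx
            int temp = arr[idx];
            arr[idx] = arr[smallIndex];
            arr[smallIndex] = temp;
            passes++;
            System.out.println("Pass " + passes + ": " + Arrays.toString(arr));
        }
        return passes;
    }

    public static void main(String[] args) {
        int[] arr = {50, 20, 40, 75, 35};   // assumed sample data
        int passes = sortWithTrace(arr);
        System.out.println("Passes: " + passes);   // 4 for a 5-element array
    }
}
```

The last element needs no pass of its own: once the first n-1 positions hold the n-1 smallest values, the remaining value must be the largest.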
2. An Array Sublist • An array sublist is a sequence of elements whose indices begin at index first and go up to, but not including, last. • Uses the notation: [first, last).
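The half-open range [first, last) maps directly onto the usual Java loop shape. The helpers below (sublistSum() and sublistSize(), in a class called SublistDemo) are hypothetical names used only to illustrate the notation.

```java
class SublistDemo {
    // Hypothetical helper: sums the elements in the half-open range [first, last).
    static int sublistSum(int[] arr, int first, int last) {
        int sum = 0;
        for (int i = first; i < last; i++)   // index last itself is excluded
            sum += arr[i];
        return sum;
    }

    // Number of elements in [first, last); zero when first == last.
    static int sublistSize(int first, int last) {
        return last - first;
    }

    public static void main(String[] args) {
        int[] arr = {66, 20, 33, 55, 53};
        System.out.println(sublistSum(arr, 1, 4));            // arr[1] + arr[2] + arr[3]
        System.out.println(sublistSize(1, 4));                // 3 elements
        System.out.println(sublistSum(arr, 0, arr.length));   // [0, n) covers the whole array
    }
}
```

One convenience of this convention is that the whole array is [0, arr.length) and the size of any sublist is simply last - first.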
3. Sequential Search • Begin with a target value and index range [first, last). • Go through sublist item by item, looking for target. • Return the index position of the match or -1 if target is not in sublist.
seqSearch()
public static int seqSearch(int[] arr, int first, int last, int target)
{
    /* scan first <= i < last; return index
       for position if a match occurs */
    for (int i = first; i < last; i++)
        if (arr[i] == target)
            return i;

    return -1;   // target not found
}
4. Binary Search • Binary search requires an ordered list; the ordering is what allows large sections of the list to be skipped during the search. • Calculate midpoint of the current sublist [first,last). • If target matches midpoint value, then the search is finished. • If target is less than midpoint value, look in the lower sublist; otherwise, look in the upper sublist. • Continue until target is found or sublist size is 0.
Binary Search Case 1 • target == midpoint value. • The search is complete • mid is the index of the midpoint value
Binary Search Case 2 • target < midValue • so search in lower sublist • Index range becomes [first, mid). • Set index last to be end of lower sublist (last = mid).
Binary Search Case 3 • target > midValue • so search upper sublist • Index range becomes [mid+1,last), because the upper sublist starts to the right of mid • Set index first to be front of the upper sublist (first = mid+1).
Binary Search Finish • The binary search stops when a match is found, or when the sublist is 'empty'. • an empty sublist for [first,last) is when first >= last.
A Successful Search (target = 23) • Steps 1 to 3: [figures not reproduced] the sublist is roughly halved at each step.
Failure Example (target = 4) • Steps 1 to 3: [figures not reproduced] the sublist is roughly halved at each step. • Step 4: index range is [2,2). first ≥ last, so the search fails. The return value is -1.
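The two searches can be replayed in code by printing the index range at every step. The array contents below are an assumption (the slide figures are not available here); they were chosen so that target 23 is found on the third step and target 4 ends with the empty range [2,2). BinSearchTrace and binSearchTrace() are illustrative names.

```java
class BinSearchTrace {
    // Binary search over [first, last) that prints each step.
    static int binSearchTrace(int[] arr, int first, int last, int target) {
        while (first < last) {
            int mid = (first + last) / 2;
            System.out.println("[" + first + "," + last + ")  mid=" + mid
                               + "  midValue=" + arr[mid]);
            if (target == arr[mid])
                return mid;            // case 1: match
            else if (target < arr[mid])
                last = mid;            // case 2: lower sublist [first, mid)
            else
                first = mid + 1;       // case 3: upper sublist [mid+1, last)
        }
        System.out.println("[" + first + "," + last + ")  empty, search fails");
        return -1;
    }

    public static void main(String[] args) {
        int[] arr = {-7, 3, 5, 8, 12, 16, 23, 33, 55};   // assumed sorted data
        System.out.println("result: " + binSearchTrace(arr, 0, arr.length, 23));
        System.out.println("result: " + binSearchTrace(arr, 0, arr.length, 4));
    }
}
```

With this data the search for 23 visits the ranges [0,9), [5,9) and [5,7), and the search for 4 visits [0,9), [0,4) and [0,2) before reaching [2,2).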
binSearch()
public static int binSearch(int arr[], int first, int last, int target)
{
    int mid;        // index of midpoint
    int midValue;   // value from arr[mid]

    // test for nonempty sublist
    while (first < last) {
        mid = (first+last)/2;
        midValue = arr[mid];
        if (target == midValue)
            return mid;         // have a match
        // determine which sublist to search
        else if (target < midValue)
            last = mid;         // search lower sublist; set last
        else
            first = mid+1;      // search upper sublist; set first
    }
    return -1;   // target not found
}  // end of binSearch()
5. System/Memory Efficiency • System efficiency is how fast an algorithm runs on a particular machine. • Memory efficiency is the amount of memory an algorithm uses • if an algorithm uses too much memory, it can be too slow, or may not execute at all, on a particular system.
6. Running Time Analysis • Machine-independent algorithm efficiency is measured in terms of the number of operations used in the code. • The complexity of the algorithm usually depends on some size measure • usually the size of the input data
min()
// input data is the array, arr[]
public static int min(int[] arr)
// return the smallest elem. in arr[]
{
    int n = arr.length;
    if (n == 0) {
        System.out.println("Array has 0 size");
        return 0;
    }
    else {
        int min = arr[0];
        for (int i = 1; i < n; i++)
            if (arr[i] < min)
                min = arr[i];
        return min;
    }
}
Running Time: min() • T(n) is the number of comparison operations required to find the smallest element in an n-element array: T(n) = n-1 • Could count all operations, but T() would still be linear in n. • T() was explained in 241-303 "Discrete Maths", part 4
Running Time: Selection Sort • Count the number of comparison operations used to sort an array of size n • there are n-1 passes altogether • in the first pass there are n-1 comparisons • in the 2nd pass, n-2 comparisons, ... T(n) = (n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = n^2/2 - n/2
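The closed form can be checked empirically by counting comparisons while sorting. The sketch below assumes a hypothetical class SelectionSortCount whose countComparisons() method is selection sort with a counter added.

```java
class SelectionSortCount {
    // Runs selection sort on arr and returns the number of
    // element comparisons (arr[j] < arr[smallIndex]) performed.
    static long countComparisons(int[] arr) {
        long comparisons = 0;
        int n = arr.length;
        for (int idx = 0; idx < n - 1; idx++) {
            int smallIndex = idx;
            for (int j = idx + 1; j < n; j++) {
                comparisons++;
                if (arr[j] < arr[smallIndex])
                    smallIndex = j;
            }
            int temp = arr[idx];
            arr[idx] = arr[smallIndex];
            arr[smallIndex] = temp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        for (int n : new int[]{5, 10, 100}) {
            int[] arr = new int[n];
            for (int i = 0; i < n; i++)
                arr[i] = n - i;                    // descending input
            long expected = (long) n * (n - 1) / 2;
            System.out.println("n=" + n + "  counted=" + countComparisons(arr)
                               + "  n(n-1)/2=" + expected);
        }
    }
}
```

The count does not depend on the input order: selection sort always scans the whole remaining sublist, so best, worst and average cases all make n(n-1)/2 comparisons.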
Running Time: seqSearch() • Best case: target is found at index 0. T(n) = 1 • Worst case: target is found at index n-1, or is not in the list. T(n) = n • Average case: average number of comparisons needed to find a target that is equally likely to be at any position. T(n) = (1 + 2 + 3 + ... + n)/n = (n(n+1)/2) * (1/n) = (n+1)/2
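The three cases can be reproduced by counting comparisons directly. In this sketch the class SeqSearchCount and its methods are hypothetical names, and the average is taken over one successful search for each element of an array of distinct values.

```java
class SeqSearchCount {
    // Number of comparisons a sequential search over the whole
    // array makes when looking for target.
    static int comparisons(int[] arr, int target) {
        int count = 0;
        for (int i = 0; i < arr.length; i++) {
            count++;
            if (arr[i] == target)
                break;
        }
        return count;
    }

    // Average over searching once for every element of an
    // n-element array holding the distinct values 0 .. n-1.
    static double averageComparisons(int n) {
        int[] arr = new int[n];
        for (int i = 0; i < n; i++)
            arr[i] = i;
        long total = 0;
        for (int i = 0; i < n; i++)
            total += comparisons(arr, arr[i]);
        return (double) total / n;
    }

    public static void main(String[] args) {
        int[] arr = {10, 20, 30, 40, 50};
        System.out.println("best:    " + comparisons(arr, 10));     // found at index 0
        System.out.println("worst:   " + comparisons(arr, 99));     // not in the list
        System.out.println("average: " + averageComparisons(5));    // (n+1)/2 for n = 5
    }
}
```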
Running Time: binSearch() • Best case: target is found at the first midpoint. T(n) = 1 • Worst case: the sublist length halves at each iteration until the sublist is empty. T(n) = (int)(log2 n) + 1 • Average case: a more detailed analysis shows: T(n) = (int)(log2 n)
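The worst-case figure can be checked by counting loop iterations, treating each iteration as one comparison of target against the midpoint value. The sketch assumes an array holding 0 .. n-1 and searches for -1, a target smaller than every element, which always takes the lower sublist; BinSearchCount, iterations(), floorLog2() and worstCase() are hypothetical names.

```java
class BinSearchCount {
    // Number of loop iterations binary search makes over [0, n).
    static int iterations(int[] arr, int target) {
        int first = 0, last = arr.length, count = 0;
        while (first < last) {
            count++;
            int mid = (first + last) / 2;
            if (target == arr[mid])
                break;
            else if (target < arr[mid])
                last = mid;
            else
                first = mid + 1;
        }
        return count;
    }

    // (int) log2 n for n >= 1, computed without floating point.
    static int floorLog2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    // Iterations for an unsuccessful search that always goes left.
    static int worstCase(int n) {
        int[] arr = new int[n];
        for (int i = 0; i < n; i++)
            arr[i] = i;
        return iterations(arr, -1);
    }

    public static void main(String[] args) {
        for (int n : new int[]{1, 8, 9, 100000})
            System.out.println("n=" + n + "  iterations=" + worstCase(n)
                               + "  (int)log2(n)+1=" + (floorLog2(n) + 1));
    }
}
```

For n = 100000 both columns give 17, compared with up to 100000 comparisons for a sequential search of the same list.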
7. Big-Oh Notation • Big-Oh, O(), is a simpler version of T(n) that keeps only the 'biggest' term of the T(n) equation, without constants. • e.g. if T(n) = 8n^3 + 5n^2 - 11n + 1, then T(n) is O(n^3). • Selection sort is O(n^2). • The average case for seqSearch() is O(n). • The worst case for binSearch() is O(log2 n). • O() was explained in 241-303 "Discrete Maths", part 4
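As a worked check of the cubic example, each term can be bounded by a multiple of n^3 once n ≥ 1. This bounding argument is a standard one added here for illustration; it is not taken from the slides.

```latex
\begin{aligned}
T(n) &= 8n^3 + 5n^2 - 11n + 1 \\
     &\le 8n^3 + 5n^3 + 0 + n^3 \qquad (n \ge 1) \\
     &= 14n^3
\end{aligned}
```

So T(n) never exceeds a constant multiple of n^3, which is what the statement "T(n) is O(n^3)" expresses.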
Common Big-Oh's • Constant time: T(n) is O(1) when its running time is independent of n. • e.g. find the smallest value in an ordered n-element array
Linear: T(n) is O(n) when running time is proportional to n, the size of the data. If n doubles, T() doubles. • e.g. find the smallest value in an unordered n-element array, as in min()
Quadratic: T(n) is O(n^2). If n doubles, T() increases by a factor of 4 • e.g. selection sort • Cubic: T(n) is O(n^3). Doubling n increases T() by a factor of 8 • e.g. multiplication of two n*n matrices
Logarithmic: T() is O(log2 n) or O(n log2 n). • occurs when the algorithm repeatedly subdivides the data into sublists whose sizes are 1/2, 1/4, 1/8, ... of the original size n • e.g. binary search is O(log2 n) • e.g. quicksort is O(n log2 n) on average
Exponential: T(n) is O(a^n), for a constant a > 1. • These algorithms deal with problems that require searching through a large number of potential solutions before finding an answer. • e.g. the traveling salesman problem • mentioned in the Discrete Maths subject
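The "if n doubles" rules of thumb can be checked numerically by evaluating T(2n)/T(n) for each growth rate. This is a sketch only; GrowthDemo and doublingRatio() are hypothetical names, and the running-time functions are idealised (a single term, no constants).

```java
import java.util.function.DoubleUnaryOperator;

class GrowthDemo {
    // Ratio T(2n) / T(n) for a running-time function t.
    static double doublingRatio(DoubleUnaryOperator t, double n) {
        return t.applyAsDouble(2 * n) / t.applyAsDouble(n);
    }

    public static void main(String[] args) {
        double n = 1000;
        System.out.println("linear:      " + doublingRatio(x -> x, n));
        System.out.println("quadratic:   " + doublingRatio(x -> x * x, n));
        System.out.println("cubic:       " + doublingRatio(x -> x * x * x, n));
        System.out.println("logarithmic: "
                + doublingRatio(x -> Math.log(x) / Math.log(2), n));
    }
}
```

The linear, quadratic and cubic ratios come out as 2, 4 and 8, while the logarithmic ratio is only slightly above 1: doubling n adds just one more halving step.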
8. The Timing Class • A class in the Ford & Topp ds.time package.
Timing Class Example
Timing sortTimer = new Timing();

sortTimer.start();                    // start timing
selectionSort(arr);                   // sort
double timeInSec = sortTimer.stop();  // get sorting time in secs
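The source of the Ford & Topp Timing class is not shown in these slides. For readers without that package, a stand-in with the same start()/stop() usage can be sketched with System.nanoTime() from the standard library. SimpleTimer is a hypothetical name, and its internals are an assumption about how such a class might work, not the actual ds.time.Timing implementation.

```java
class SimpleTimer {
    private long startNanos;   // time recorded by the last start() call

    // Record the current time as the start of the measured interval.
    public void start() {
        startNanos = System.nanoTime();
    }

    // Return the seconds elapsed since the last call to start().
    public double stop() {
        return (System.nanoTime() - startNanos) / 1e9;
    }

    public static void main(String[] args) {
        SimpleTimer timer = new SimpleTimer();
        timer.start();
        long sum = 0;
        for (int i = 0; i < 10_000_000; i++)   // some work to time
            sum += i;
        double secs = timer.stop();
        System.out.println("sum = " + sum + ", took " + secs + " seconds");
    }
}
```

Measured times vary from run to run and from machine to machine, which is why the slides treat timing as system efficiency and use T() and Big-Oh for machine-independent analysis.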
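Search Time Comparison • Compare sequential and binary search on the following problem: search for each of the 50,000 values in targetList (indices 0 to 49,999) inside a 100,000-element list (indices 0 to 99,999). • listSeq holds the unsorted data and is searched sequentially; listBin holds the same data after sorting and is searched with binary search. [figure of the two arrays not reproduced]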
SearchTimes
import java.util.Random;
import java.text.DecimalFormat;
import ds.util.Arrays;   // Ford & Topp packages
import ds.time.Timing;

public class SearchTimes
{
    public static void main(String[] args)
    {
        int ARRAY_SIZE = 100000;
        int TARGET_SIZE = 50000;

        // arrays for searches
        int[] listSeq = new int[ARRAY_SIZE],
              listBin = new int[ARRAY_SIZE],
              targetList = new int[TARGET_SIZE];

        // use Timing object t to compute times
        Timing t = new Timing();

        // random number object
        Random rnd = new Random();

        // format real numbers with 3 dps
        DecimalFormat fmt = new DecimalFormat("#.000");

        // initialize arrays with random numbers
        for (int i = 0; i < ARRAY_SIZE; i++)
            listSeq[i] = listBin[i] = rnd.nextInt(1000000);

        // initialize targetList with random numbers
        for (int i = 0; i < TARGET_SIZE; i++)
            targetList[i] = rnd.nextInt(1000000);

        // time seq. search for targets in listSeq
        t.start();
        for (int i = 0; i < TARGET_SIZE; i++)
            Arrays.seqSearch(listSeq, 0, ARRAY_SIZE, targetList[i]);
        double seqTime = t.stop();
        System.out.println("Sequential Search takes " +
                           fmt.format(seqTime) + " seconds.");

        // sort listBin
        Arrays.selectionSort(listBin);

        // time binary search for targets in listBin
        t.start();
        for (int i = 0; i < TARGET_SIZE; i++)
            Arrays.binSearch(listBin, 0, ARRAY_SIZE, targetList[i]);
        double binTime = t.stop();
        System.out.println("Binary Search takes " +
                           fmt.format(binTime) + " seconds.");

        System.out.println("Ratio of sequential to binary search time is " +
                           fmt.format(seqTime/binTime));
    }  // end of main()
}  // end of SearchTimes class
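SearchTimes depends on the Ford & Topp ds.util and ds.time packages. A version that runs on a plain JDK can be sketched as below, under these assumptions: the class name SearchCompare and the method countFound() are hypothetical, the array sizes are reduced, java.util.Arrays.sort() replaces the selection sort step, both searches run over the same sorted array, and System.nanoTime() replaces the Timing class.

```java
import java.util.Arrays;
import java.util.Random;

class SearchCompare {
    static int seqSearch(int[] arr, int first, int last, int target) {
        for (int i = first; i < last; i++)
            if (arr[i] == target)
                return i;
        return -1;
    }

    static int binSearch(int[] arr, int first, int last, int target) {
        while (first < last) {
            int mid = (first + last) / 2;
            if (target == arr[mid])
                return mid;
            else if (target < arr[mid])
                last = mid;
            else
                first = mid + 1;
        }
        return -1;
    }

    // Counts the targets present in sorted[]; checks that both
    // searches agree on found / not found for every target.
    static int countFound(int[] sorted, int[] targets) {
        int found = 0;
        for (int t : targets) {
            boolean bySeq = seqSearch(sorted, 0, sorted.length, t) != -1;
            boolean byBin = binSearch(sorted, 0, sorted.length, t) != -1;
            if (bySeq != byBin)
                throw new IllegalStateException("searches disagree on " + t);
            if (bySeq)
                found++;
        }
        return found;
    }

    public static void main(String[] args) {
        int arraySize = 20000, targetSize = 10000;   // reduced sizes (assumption)
        Random rnd = new Random(42);                 // fixed seed for repeatable data
        int[] list = new int[arraySize];
        int[] targets = new int[targetSize];
        for (int i = 0; i < arraySize; i++) list[i] = rnd.nextInt(1000000);
        for (int i = 0; i < targetSize; i++) targets[i] = rnd.nextInt(1000000);
        Arrays.sort(list);

        long start = System.nanoTime();
        int hits = 0;
        for (int t : targets)
            if (seqSearch(list, 0, arraySize, t) != -1) hits++;
        double seqTime = (System.nanoTime() - start) / 1e9;

        start = System.nanoTime();
        for (int t : targets)
            if (binSearch(list, 0, arraySize, t) != -1) hits++;
        double binTime = (System.nanoTime() - start) / 1e9;

        System.out.println("hits (counted twice): " + hits);
        System.out.println("Sequential: " + seqTime + " s, binary: " + binTime + " s");
    }
}
```

The hit counter is there so that the search results are actually used, which makes it less likely that the JIT compiler optimises the timed loops away.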