CSE 30331 Lecture 3 – Algorithms I
• Algorithms & Pseudocode
• Orders of Magnitude
• Sequential Searching
  • Finding Minimum Value
  • Linear Search
• Sorting
  • Insertion Sort
  • Selection Sort
• Binary Search
Algorithms & Pseudocode
• Algorithm: a well-defined computational process (a sequence of steps) that takes some input(s) and produces some output(s)
• Why discuss algorithms?
  • They facilitate the design of software modules
  • They let us analyze the costs (time and space) of a software module in a platform-independent way
• Pseudocode: a method of writing algorithms that is independent of any particular programming language
Constant Time Algorithms An algorithm is Θ(1) when its running time is independent of the number of data items; the algorithm runs in constant time. For example, storing a value at a given array index involves a single assignment statement and thus has efficiency Θ(1).
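For instance, storing or reading an array element at a known index takes the same amount of work no matter how large the array is. A minimal C++ sketch (the helper names storeAt and readAt are illustrative, not from the lecture):

// Θ(1): the cost of these operations does not depend on how many elements the array holds
void storeAt(int arr[], int i, int value) {
    arr[i] = value;        // a single assignment, constant time
}

int readAt(const int arr[], int i) {
    return arr[i];         // a single indexed read, constant time
}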
Linear Time Algorithms An algorithm is Θ(n) when its running time is proportional to the size of the list. When the number of elements doubles, the number of operations doubles.
Less Efficient Algorithms
• Quadratic algorithms, Θ(n^2): practical only for relatively small values of n; when n doubles, the running time quadruples
• Cubic algorithms, Θ(n^3): efficiency is generally poor; doubling n increases the running time eight-fold
• Exponential algorithms, Θ(2^n): awful, but some problems have no better algorithmic solution
Exponential Growth Rates (from http://www.ics.uci.edu/~eppstein/265/exponential.html)
Modern computers perform roughly 2^30 ops/sec. So ... how big can n be for a problem to still be solvable ... if the algorithm takes Θ(f(n)) time?

Ops available:    2^30    2^36    2^42    2^48    2^54
Time:             1 sec   1 min   1 hr    3 days  >6 months

Largest solvable n for each f(n):
  f(n) = 2^n      30      36      42      48      54
  f(n) = 3^n      19      23      26      30      34
  f(n) = n!       12      14      15      17      18
  f(n) = n^n      9       10      11      13      14
  f(n) = 2^(n^2)  5       6       6       7       7
Logarithmic Time Algorithms The logarithm of n, base 2, is commonly used when analyzing computer algorithms. Ex. lg(2) = log2(2) = 1 and lg(1024) = log2(1024) = 10. When compared to the functions n and n^2, the function lg(n) grows very slowly.
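As a rough illustration (a hypothetical helper, not part of the lecture code), counting how many times n can be halved computes floor(lg n) and shows how slowly the logarithm grows:

// Counts how many times n can be halved before reaching 1, i.e. floor(lg n).
// For n = 1024 this returns 10; for n = 1,000,000 it returns only 19.
int floorLg(int n) {
    int count = 0;
    while (n > 1) {
        n = n / 2;
        ++count;
    }
    return count;
}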
A Simple Linear Algorithm
• Example: find the minimum element in an array A
  • Assume A contains n elements (n >= 1)
  • Assume elements are not ordered
  • Basic technique is a sequential scan

min(A,n)                              cost    times
  smallest = A[0]                     c1      1
  for i = 1 to (n-1)                  c2      n
    if A[i] < smallest                c3      n-1
      smallest = A[i]                 c4      0..n-1
  return smallest                     c5      1
Analysis of min(A,n)

min(A,n)                              cost    times
  smallest = A[0]                     c1      1
  for i = 1 to (n-1)                  c2      n
    if A[i] < smallest                c3      n-1
      smallest = A[i]                 c4      0..n-1
  return smallest                     c5      1

• Best: A[0] is the minimum
  • T(n) = c1 + c2(n) + c3(n-1) + 0 + c5
  • T(n) = (c2+c3)(n) + (c1+c5-c3) = an + b, linear or Θ(n)
• Worst: A[n-1] is the minimum and A[i] > A[i+1] for all i in [0, n-1)
  • T(n) = c1 + c2(n) + c3(n-1) + c4(n-1) + c5
  • T(n) = (c2+c3+c4)(n) + (c1+c5-c3-c4) = cn + d, linear or Θ(n)
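For reference, a direct C++ translation of the min(A,n) pseudocode might look like the sketch below (the name minValue and the int element type are assumptions for illustration):

// Returns the smallest value in A[0..n-1]; assumes n >= 1
int minValue(const int A[], int n) {
    int smallest = A[0];                // c1: executed once
    for (int i = 1; i <= n - 1; ++i)    // c2: loop test evaluated n times
        if (A[i] < smallest)            // c3: n-1 comparisons
            smallest = A[i];            // c4: 0 .. n-1 assignments
    return smallest;                    // c5: executed once
}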
Linear Search
• Example: find the target element t in an array A
  • Assume A contains n elements (n >= 1)
  • Assume elements are not ordered
  • Basic technique is linear search
  • Return value is the index of the target element, in the range 0..n, where n indicates "not found"

linearSearch(A,n,t)                   cost    times
  i = 0                               c1      1
  while (i < n) and (A[i] != t)       c2      1..n+1
    i = i + 1                         c3      0..n
  return i                            c4      1
Linear Search

linearSearch(A,n,t)                   cost    times
  i = 0                               c1      1
  while (i < n) and (A[i] != t)       c2      1..n+1
    i = i + 1                         c3      0..n
  return i                            c4      1

• Worst: t is not in A
  • T(n) = c1 + c2(n+1) + c3(n) + c4
  • T(n) = (c2+c3)(n) + (c1+c2+c4) = an + b, linear or Θ(n)
• Average: A[k] is t, where k = n/2
  • T(n) = c1 + c2(k+1) + c3(k) + c4 = (c2+c3)(k) + (c1+c2+c4)
  • Substituting k = n/2: T(n) = ((c2+c3)/2)(n) + (c1+c2+c4) = cn + d, linear or Θ(n)
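A matching C++ sketch of linearSearch, returning n when the target is absent (the int element type is assumed for illustration):

// Returns the index of the first occurrence of t in A[0..n-1],
// or n if t is not present; assumes n >= 1
int linearSearch(const int A[], int n, int t) {
    int i = 0;                          // c1
    while (i < n && A[i] != t)          // c2: 1 .. n+1 tests
        i = i + 1;                      // c3: 0 .. n increments
    return i;                           // c4
}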
Insertion Sort • On each pass the next element is inserted into the portion of the array previously sorted • The sort is “in place” • The array is logically separated into two parts [..sorted..] [..unsorted..] • The index j marks the location of the key and separates sorted from unsorted portions of the array
Example of Insertion Sort
[ ] [ 6 2 5 4 3 ]        original array
[ 6 ] [ 2 5 4 3 ]        prior to 1st iteration
[ 2 6 ] [ 5 4 3 ]        after 1st iteration
[ 2 5 6 ] [ 4 3 ]        after 2nd iteration
[ 2 4 5 6 ] [ 3 ]        after 3rd iteration
[ 2 3 4 5 6 ] [ ]        after last iteration
Quick aside ... Loop Invariants • We use loop invariants to prove that an algorithm is correct • Three things must be shown ... • Initialization: the invariant is true before the first iteration • Maintenance: if the invariant is true before an iteration, it remains true before the next iteration • Termination: when the loop terminates, the invariant tells us something useful for showing the algorithm’s correctness
Insertion Sort • Invariant: • A[0..j-1] contains the same elements that originally occupied positions 0..j-1, but they are now sorted • On each iteration • A[j] is the element that is inserted into A[0..j-1], producing a sorted A[0..j] • A[j+1..n-1] remains unsorted
Insertion Sort Algorithm
• Sort n elements of A into non-decreasing order

insertionSort(A,n)                                cost    times
  for j = 1 to (n-1)                              c1      n
    key = A[j]                                    c2      n-1
    // insert A[j] into sorted sequence A[0..j-1]
    i = j - 1                                     c3      n-1
    while (i >= 0) and (A[i] > key)               c4      f
      A[i+1] = A[i]                               c5      g
      i = i - 1                                   c6      g
    A[i+1] = key                                  c7      n-1

where tj is the number of times the while-loop test executes for a given j, f = Σ(j=1..n-1) tj, and g = Σ(j=1..n-1) (tj - 1), since the loop body runs one time fewer than the test
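For reference, a C++ rendering of this pseudocode might look like the sketch below (int elements assumed for illustration):

// Sorts A[0..n-1] into non-decreasing order, in place
void insertionSort(int A[], int n) {
    for (int j = 1; j <= n - 1; ++j) {
        int key = A[j];                 // element to insert into sorted A[0..j-1]
        int i = j - 1;
        while (i >= 0 && A[i] > key) {  // shift larger elements one slot right
            A[i + 1] = A[i];
            i = i - 1;
        }
        A[i + 1] = key;                 // drop key into its final position
    }
}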
Insertion Sort Analysis
• Best: already presorted
  • tj is the number of times the while loop checks limits and pairs of elements during shifting of elements in A[0..j-1]
  • The loop terminates on its first check, since the sorted prefix guarantees A[j-1] <= key = A[j]
  • T(n) = c1(n) + (c2+c3+c7)(n-1) + c4(f) + (c5+c6)(g)
  • tj = 1, so f = Σ(j=1..n-1) 1 = n-1 and g = Σ(j=1..n-1) 0 = 0
  • T(n) = (c1+c2+c3+c4+c7)n - (c2+c3+c4+c7)
  • Linear Θ(n)
Insertion Sort Analysis
• Worst: presorted in reverse order
  • The shifting loop has to compare and shift every element in A[0..j-1], because the initial order ensures A[k] > A[j] for all k < j
  • T(n) = c1(n) + (c2+c3+c7)(n-1) + c4(f) + (c5+c6)(g)
  • tj = j+1 (j successful comparisons plus the final test at i = -1), so f = Σ(j=1..n-1) (j+1) = (n-1)(n+2)/2 and g = Σ(j=1..n-1) j = n(n-1)/2
  • T(n) = c1(n) + (c2+c3+c7)(n-1) + c4((n-1)(n+2)/2) + (c5+c6)(n(n-1)/2)
  • T(n) = an^2 + bn + c
  • Quadratic Θ(n^2)
Selection Sort • On each pass the smallest element in the unsorted sub-array is selected for appending to the sorted sub-array • The sort is “in place” • The array is logically separated into two parts [..sorted..] [..unsorted..] • The index k separates sorted from unsorted portions of the array
Example of Selection Sort
[ ] [ 6 2 5 4 3 ]        original array
[ 2 ] [ 6 5 4 3 ]        after 1st iteration
[ 2 3 ] [ 5 4 6 ]        after 2nd iteration
[ 2 3 4 ] [ 5 6 ]        after 3rd iteration
[ 2 3 4 5 ] [ 6 ]        after last iteration
Selection Sort • Invariant: • A[0..k-1] is sorted and all elements in A[0..k-1] are less than or equal to all elements in A[k..n-1] • On each iteration • A[smallIndex] is the element that is appended to A[0..k-1] producing a sorted A[0..k] • A[k+1..n-1] remains unsorted
Selection Sort Algorithm
• Sort n elements of A into non-decreasing order

selectionSort(A,n)                                cost    times
  for k = 0 to n-2                                c1      n
    // scan unsorted sublist to find smallest value
    smallIndex = k                                c2      n-1
    for j = k+1 to n-1                            c3      f
      if A[j] < A[smallIndex]                     c4      g
        smallIndex = j                            c5      h
    // swap smallest value with leftmost
    if smallIndex != k                            c6      n-1
      swap(A[k], A[smallIndex])                   c7      0..n-1

where f = g = Σ(k=0..n-2) (n-1-k) = n(n-1)/2 (the inner loop makes one comparison per pass) and h is the number of times smallIndex is updated (data dependent, with 0 <= h <= g)
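A C++ sketch of the same algorithm, with std::swap standing in for the swap step (int elements assumed for illustration):

#include <algorithm>    // std::swap

// Sorts A[0..n-1] into non-decreasing order, in place
void selectionSort(int A[], int n) {
    for (int k = 0; k <= n - 2; ++k) {
        int smallIndex = k;                      // index of smallest value seen so far
        for (int j = k + 1; j <= n - 1; ++j)
            if (A[j] < A[smallIndex])
                smallIndex = j;
        if (smallIndex != k)                     // move smallest into position k
            std::swap(A[k], A[smallIndex]);
    }
}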
Selection Sort Analysis
• Best: already presorted
  • smallIndex is never updated, since the leftmost element of the unsorted sub-array is always the smallest, so h = 0
  • No swaps are performed, since the array is already in order
  • T(n) = c1(n) + (c2+c6)(n-1) + c3(f) + c4(g) + c5(h)
  • f = g = Σ(k=0..n-2) (n-1-k) = n(n-1)/2 and h = 0
  • T(n) = c1(n) + (c2+c6)(n-1) + (c3+c4)(n(n-1)/2)
  • T(n) = an^2 + bn + c
  • Quadratic Θ(n^2), even on presorted input
Selection Sort Analysis
• Worst: presorted in reverse order
  • The scan for the smallest value may update smallIndex on nearly every comparison, so h approaches g = n(n-1)/2
  • A swap occurs at most n-1 times
  • T(n) = c1(n) + (c2+c6+c7)(n-1) + c3(f) + c4(g) + c5(h)
  • f = g = n(n-1)/2 and h <= n(n-1)/2
  • T(n) <= c1(n) + (c2+c6+c7)(n-1) + (c3+c4+c5)(n(n-1)/2)
  • T(n) = an^2 + bn + c
  • Quadratic Θ(n^2)
Binary Search Algorithm
Case 1: A match occurs. The search is complete and mid is the index that locates the target.

if (midvalue == target)    // found a match
    return mid;
Binary Search Algorithm
Case 2: The value of target is less than midvalue and the search must continue in the lower sublist.

if (target < midvalue)     // search lower sublist
    <reposition last to mid>
    <search sublist arr[first]…arr[mid-1]>
Binary Search Algorithm
Case 3: The value of target is greater than midvalue and the search must continue in the upper sublist.

if (target > midvalue)     // search upper sublist
    <reposition first to mid+1>
    <search sublist arr[mid+1]…arr[last-1]>
Successful Binary Search Search for target = 23. Step 1: Indices first = 0, last = 9, mid = (0+9)/2 = 4. Since target = 23 > midvalue = 12, step 2 searches the upper sublist with first = 5 and last = 9.
Successful Binary Search Step 2: Indices first = 5, last = 9, mid = (5+9)/2 = 7. Since target = 23 < midvalue = 33, step 3 searches the lower sublist with first = 5 and last = 7.
Successful Binary Search Step 3: Indices first = 5, last = 7, mid = (5+7)/2 = 6. Since target = midvalue = 23, a match is found at index mid = 6.
Unsuccessful Binary Search Search for target = 4. Step 1: Indices first = 0, last = 9, mid = (0+9)/2 = 4. Since target = 4 < midvalue = 12, step 2 searches the lower sublist with first = 0 and last = 4.
Unsuccessful Binary Search Step 2: Indices first = 0, last = 4, mid = (0+4)/2 = 2. Since target = 4 < midvalue = 5, step 3 searches the lower sublist with first = 0 and last = 2.
Unsuccessful Binary Search Step 3: Indices first = 0, last = 2, mid = (0+2)/2 = 1. Since target = 4 > midvalue = 3, step 4 should search the upper sublist with first = 2 and last = 2. However, since first >= last, the sublist is empty: the target is not in the list, and the search returns the original value of last, which is 9.
Binary Search Implementation

int binSearch(const int arr[], int first, int last, int target)
{
    int origLast = last;                // remember the original value of last
    while (first < last)
    {
        int mid = (first + last) / 2;
        if (target == arr[mid])
            return mid;                 // a match, so return mid
        else if (target < arr[mid])
            last = mid;                 // search lower sublist arr[first..mid-1]
        else
            first = mid + 1;            // search upper sublist arr[mid+1..last-1]
    }
    return origLast;                    // target not found
}
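A small usage sketch, assuming the binSearch function above is in the same file; the array here is illustrative and is not the list used in the search examples above:

#include <iostream>

int main() {
    // 9 sorted values at indices 0..8; pass last = 9 (one past the end)
    int arr[] = {3, 5, 8, 12, 16, 23, 33, 55, 64};

    std::cout << binSearch(arr, 0, 9, 23) << '\n';  // prints 5, the index of 23
    std::cout << binSearch(arr, 0, 9, 4)  << '\n';  // prints 9 (origLast): 4 is not in the list
    return 0;
}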
Binary Search Analysis • Each comparison halves the problem size • Assume for simplicity that n = 2^k • So, in the worst case ... how many times do we divide n in half before there are no more values to check? • T(n) = 1 + k = 1 + lg n • Logarithmic Θ(lg n)