Introduction to Algorithms: Verification, Complexity, and Searching Andy Wang Data Structures, Algorithms, and Generic Programming
Lecture Overview
• Components of an algorithm
• Sequential search algorithms
• Binary search algorithms
• Proving algorithm correctness
• Computational complexity
• TVector Retrospective
Algorithm Components
• Required components
  • Assumptions
  • Asserted outcomes
  • Body
  • Proof
• Optional components
  • Time complexity
  • Space complexity
Sequential Search
• Goal
  • Find a specified value in a collection of values
• Idea
  • Walk through the collection and test each value
  • A simple, commonly used algorithm
Sequential Search Requirements
• A way to differentiate things in the collection
• A starting position
• A way to move on to the next thing in the collection
• A way to stop
Sequential Search Algorithm
• Assumptions
  • Collection L of data of type T
  • Can iterate through L with begin(), next(), end()
  • Element t of type T
• Outcomes
  • Decide whether t is in L
  • Return boolean (true/false)
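Sequential Search Algorithm (2)
• Body (in pseudocode)
    T item;  // declared outside the loop so it is still visible afterwards
    for (item = begin(L); item != end(L); item = next(L)) {
      if (t == item) return true;
    }
    // here end(L) denotes the final item, which the loop did not test
    if (t == item) return true;
    return false;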
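The pseudocode above can be sketched in C++ roughly as follows. This is a minimal illustration, not the lecture's own code: it assumes the standard C++ convention that the end iterator is one past the final element (so no extra test is needed after the loop), and the name `sequential_search` and its iterator parameters are chosen here for illustration.

```cpp
#include <cassert>
#include <list>
#include <string>

// Sequential search over any iterator range [first, last).
// Assumption: `last` is one-past-the-end (standard C++ convention), so every
// element is tested inside the loop and no check is needed afterwards.
template <typename Iterator, typename T>
bool sequential_search(Iterator first, Iterator last, const T& t) {
    for (Iterator it = first; it != last; ++it) {
        if (*it == t) {
            return true;   // found a match, stop early
        }
    }
    return false;          // walked the whole range without a match
}
```

Because it only needs to start, step, compare, and stop, the same function works on a linked list, a vector, or a plain array.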
Binary Search
• Goal
  • Find a value in a collection of values
• Idea
  • Divide and conquer
Binary Search (2)
• Requirements
  • Collection must be “array”-like
    • Can use an index to jump to any array element
  • Collection must be sorted
• Efficiency
  • Very fast: O(log n) comparisons for n elements
  • No extra space required
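To put a number on "very fast", the sketch below counts how many times a search range of n elements can be halved before it is empty, which bounds the number of loop iterations of a halving search. The helper name `max_probes` is hypothetical and not part of the lecture.

```cpp
#include <cassert>

// Upper bound on loop iterations for a halving search over n elements:
// the number of times n can be halved (rounding down) before reaching 0.
// For n >= 1 this equals floor(log2(n)) + 1.
unsigned max_probes(unsigned n) {
    unsigned probes = 0;
    while (n > 0) {
        n /= 2;      // each iteration discards at least half of the range
        ++probes;
    }
    return probes;
}
```

For the 8-item menu this gives 4, and for a million elements only 20, whereas a sequential search may have to test every element.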
Binary Search—the idea You are heading down to an exotic restaurant… with something exotic on your mind… Chocolate……….Garlic……….Pasta … do not try this at home …
Binary Search—the idea 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream There are 8 items on the menu… sorted alphabetically…
Binary Search—the idea 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream The list is quite long…… it’s time to do a binary search…
Binary Search—the idea Search range: 0 - 7 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream Compare the middle element Chocolate garlic pasta < Chocolate martini
Binary Search—the idea Search range: 0 - 3 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream Compare the middle element Chocolate garlic pasta > Blood pudding
Binary Search—the idea Search range: 2 - 3 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream Compare the middle element Chocolate garlic pasta == Chocolate garlic pasta
Binary Search—the idea Yum……………………………………
Binary Search Algorithm
• Three versions
  • Binary_search
  • Lower_bound
  • Upper_bound
Binary Search Algorithm (2)
• Assumptions
  • Collection L of data of type T with size sz
  • L is sorted
  • Element t of type T
• Outcomes
  • Binary_search: true if t is in L; false otherwise
  • Lower_bound: smallest j where t <= L[j]
  • Upper_bound: smallest j where t < L[j]
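The three outcomes can be tried out with the C++ standard library, which provides `std::binary_search`, `std::lower_bound`, and `std::upper_bound` with the same meanings. The sketch below is only an illustration using the menu from the slides; the wrapper names `menu`, `lower_index`, `upper_index`, and `contains` are assumptions introduced here, and the standard functions return the size of the collection when no qualifying index j exists.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// The eight-item menu from the slides, already sorted alphabetically.
std::vector<std::string> menu() {
    return {"Baby beer", "Blood pudding", "Chocolate garlic pasta",
            "Chocolate martini", "Death by Chocolate", "Garlic ice cream",
            "Popcorn-flavored jelly beans", "Saffron ice cream"};
}

// Smallest j where t <= L[j] (L.size() if there is no such j).
std::size_t lower_index(const std::vector<std::string>& L, const std::string& t) {
    return static_cast<std::size_t>(std::lower_bound(L.begin(), L.end(), t) - L.begin());
}

// Smallest j where t < L[j] (L.size() if there is no such j).
std::size_t upper_index(const std::vector<std::string>& L, const std::string& t) {
    return static_cast<std::size_t>(std::upper_bound(L.begin(), L.end(), t) - L.begin());
}

// True if t is in L, false otherwise.
bool contains(const std::vector<std::string>& L, const std::string& t) {
    return std::binary_search(L.begin(), L.end(), t);
}
```

For "Chocolate garlic pasta" the lower bound is index 2 and the upper bound is index 3, so the two bounds bracket the matching entry.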
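Lower_bound Code
unsigned int lower_bound(T* L, unsigned max, T t) {
  unsigned int low = 0, mid, high = max;  // max is the last valid index
  while (low < high) {
    mid = (low + high) / 2;
    if (L[mid] < t) {
      low = mid + 1;   // L[mid] is too small: the answer lies right of mid
    } else {
      high = mid;      // t <= L[mid]: the answer is mid or left of it
    }
  }
  return low;
}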
Lower_bound Code t = “Chocolate Garlic Pasta” low = 0; high = 7; while (0 < 7) { mid = (0 + 7) / 2 = 3; if (L[3] < t) { low = mid + 1; } else { high = mid; } } return low; 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream
Lower_bound Code t = “Chocolate Garlic Pasta” low = 0; high = 3; while (0 < 3) { mid = (0 + 3) / 2 = 1; if (L[1] < t) { low = mid + 1; } else { high = mid; } } return low; 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream
Lower_bound Code t = “Chocolate Garlic Pasta” low = 2; high = 3; while (2 < 3) { mid = (2 + 3) / 2 = 2; if (L[2] < t) { low = mid + 1; } else { high = mid; } } return low; 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream
Lower_bound Code t = “Chocolate Garlic Pasta” low = 2; high = 2; while (2 < 2) { … } return low = 2; 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream
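The trace on the preceding slides can be reproduced mechanically. The sketch below runs the same loop as the slides' lower_bound (with high starting at the last index) and records low, high, and mid on every iteration; the `Step` struct and the name `traced_lower_bound` are hypothetical helpers added for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One loop iteration of the trace: the values of low, high, and mid.
struct Step {
    unsigned low, high, mid;
};

// Same loop as the slides' lower_bound (high starts at the last index, max),
// but every iteration is appended to `trace` before low or high changes.
unsigned traced_lower_bound(const std::vector<std::string>& L, unsigned max,
                            const std::string& t, std::vector<Step>& trace) {
    unsigned low = 0, high = max;
    while (low < high) {
        unsigned mid = (low + high) / 2;
        trace.push_back({low, high, mid});
        if (L[mid] < t) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    return low;
}
```

Searching the menu for "Chocolate garlic pasta" with max = 7 records three iterations with mid = 3, 1, 2 and returns 2, matching the slides.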
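Upper_bound Code
unsigned int upper_bound(T* L, unsigned max, T t) {
  unsigned int low = 0, mid, high = max;  // max is the last valid index
  while (low < high) {
    mid = (low + high) / 2;
    if (t < L[mid]) {
      high = mid;      // L[mid] is already greater than t: answer is mid or left of it
    } else {
      low = mid + 1;   // L[mid] <= t: the answer lies right of mid
    }
  }
  return low;
}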
Upper_bound Code t = “Chocolate Garlic Pasta” low = 0; high = 7; while (0 < 7) { mid = (0 + 7) / 2 = 3; if (t < L[3]) { high = mid; } else { low = mid + 1; } } return low; 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream
Upper_bound Code t = “Chocolate Garlic Pasta” low = 0; high = 3; while (0 < 3) { mid = (0 + 3) / 2 = 1; if (t < L[1]) { high = mid; } else { low = mid + 1; } } return low; 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream
Upper_bound Code t = “Chocolate Garlic Pasta” low = 2; high = 3; while (2 < 3) { mid = (2 + 3) / 2 = 2; if (t < L[2]) { high = mid; } else { low = mid + 1; } } return low; 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream
Upper_bound Code t = “Chocolate Garlic Pasta” low = 3; high = 3; while (3 < 3) { … } return low = 3; 0. Baby beer 1. Blood pudding 2. Chocolate garlic pasta 3. Chocolate martini 4. Death by Chocolate 5. Garlic ice cream 6. Popcorn-flavored jelly beans 7. Saffron ice cream
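Binary_search Code
bool binary_search(T* L, unsigned sz, T t) {
  if (sz == 0) {
    return false;  // empty collection; also keeps sz - 1 from wrapping around
  }
  unsigned int lb = lower_bound(L, sz - 1, t);  // last valid index is sz - 1
  if (lb < sz) {
    if (t == L[lb]) {
      return true;
    }
  }
  return false;
}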
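An alternative convention, not the one used on the slides, is to search the half-open range [0, sz) so that the function returns sz when every element is smaller than t. The sketch below shows that variant under this assumption; the names `lower_bound_half_open` and `binary_search_half_open` are hypothetical. With this convention the `lb < sz` test does real work, and an empty collection needs no special case.

```cpp
#include <cassert>
#include <cstddef>

// Half-open variant: searches [0, sz) and returns sz when every element is < t.
// Assumption: L points to sz elements sorted in ascending order.
template <typename T>
std::size_t lower_bound_half_open(const T* L, std::size_t sz, const T& t) {
    std::size_t low = 0, high = sz;
    while (low < high) {
        std::size_t mid = low + (high - low) / 2;  // same midpoint, avoids overflow in low + high
        if (L[mid] < t) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    return low;
}

// True if t occurs in L; the lb < sz test rejects the "past the end" result.
template <typename T>
bool binary_search_half_open(const T* L, std::size_t sz, const T& t) {
    std::size_t lb = lower_bound_half_open(L, sz, t);
    return lb < sz && L[lb] == t;
}
```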
If there are duplicate entries… (t = Chocolate garlic pasta)
0. Baby beer
1. Blood pudding
2. Chocolate garlic pasta   <- Lower bound
3. Chocolate garlic pasta
4. Death by Chocolate       <- Upper bound
5. Garlic ice cream
6. Popcorn-flavored jelly beans
7. Saffron ice cream
If t is not in L… (t = Chocolate garlic pasta)
0. Baby beer
1. Blood pudding
2. Chocolate banana crepe
3. Chocolate martini        <- Lower bound and upper bound
4. Death by Chocolate
5. Garlic ice cream
6. Popcorn-flavored jelly beans
7. Saffron ice cream
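The two cases above suggest a simple use of the bounds: the number of entries equal to t is the distance between the lower bound and the upper bound, which is zero when t is absent. A minimal sketch using the standard `std::equal_range`, which returns both bounds at once, follows; the function name `count_sorted` is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Number of elements equal to t in the sorted vector L:
// the distance between the lower bound and the upper bound.
std::size_t count_sorted(const std::vector<std::string>& L, const std::string& t) {
    auto range = std::equal_range(L.begin(), L.end(), t);  // {lower bound, upper bound}
    return static_cast<std::size_t>(range.second - range.first);
}
```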
Issues of Proof
• Correctness
  • Termination
  • Correct outcome
• Performance
  • Time complexity
  • Space complexity
Correctness and Loop Invariants
• Correctness
  • Loop termination
  • State when entering the loop
  • State when exiting the loop
• Loop invariants
  • Conditions that remain true for each iteration
  • Mathematical induction
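What can go wrong?
    for (j = 0; j < n; ++j) {
      compute(j);
    }
• compute changes the loop counter: void compute(unsigned int &j) { --j; }
  • Each ++j is undone, so the loop never makes progress
• n < 0 at the beginning
  • The body never runs if j is signed; if j is unsigned, n is converted to a very large unsigned value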
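The first failure can be observed safely by capping the number of iterations. The sketch below runs the slide's loop with the misbehaving `compute` and reports whether the loop finished on its own; the helper `loop_finishes` and its `cap` parameter are hypothetical additions for the demonstration.

```cpp
#include <cassert>

// Loop body that (wrongly) modifies the loop counter through a reference.
void compute(unsigned int& j) { --j; }

// Runs the slide's loop but gives up after `cap` iterations.
// Returns true if the loop finished on its own, false if the cap was hit.
bool loop_finishes(unsigned int n, unsigned int cap) {
    unsigned int steps = 0;
    for (unsigned int j = 0; j < n; ++j) {
        if (steps++ == cap) {
            return false;  // no progress: j is back at 0 on every iteration
        }
        compute(j);        // --j wraps 0 around to the largest value; ++j brings it back to 0
    }
    return true;
}
```

With any n > 0 the counter never advances, so only the cap stops the loop; with n = 0 the body never runs and the loop finishes immediately.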
Invariants: Sequential Search
boolean sequential_search {
  T item;  // declared outside the loop so it is still visible afterwards
  for (item = begin(L); item != end(L); item = next(L)) {
    // item is not the final item in the collection
    // current item has not been examined (progress)
    // L is finite
    if (t == item) {
      return true;
    }
    // t does not match the current item
  }
  if (t == item) {
    return true;
  }
  return false;
}
Invariants: Binary Search
unsigned int lower_bound(T* L, unsigned max, T t) {
  unsigned int low = 0, mid, high = max;
  while (low < high) {
    // (1) low < high
    // (2) L[low - 1] < t <= L[high] (if index is valid)
    mid = (low + high) / 2;
    if (L[mid] < t) {
      low = mid + 1;
    } else {
      high = mid;
    }
    // (3) low <= high
    // (4) high - low has decreased
    // (5) L[low - 1] < t <= L[high] (if index is valid)
  }
  return low;
}
Invariants: Binary Search
Note on invariants (2) and (5): t does not have to be in L
Invariants: Binary Search
Proof of (3) when low = mid + 1 (high is unchanged; uses (1), old_low < high):
  low = mid + 1
      =  floor((old_low + high)/2) + 1
      <= (old_low + high)/2 + 1
      <  (high + high)/2 + 1
      =  high + 1
  low < high + 1, so low <= high
Invariants: Binary Search
Proof of (3) when high = mid (low is unchanged; uses (1), low < old_high):
  high = mid
       =  floor((low + old_high)/2)
       >  (low + old_high)/2 - 1
       >  (low + low)/2 - 1
       =  low - 1
  high > low - 1, so high >= low
Invariants: Binary Search
Proof of (4) when low = mid + 1 (high is unchanged, so high = old_high):
  high - low = high - (mid + 1)
             =  high - (floor((old_low + high)/2) + 1)
             <  high - (old_low + high)/2          since floor(x) + 1 > x
             =  (high - old_low)/2
             =  (old_high - old_low)/2
  high - low < (old_high - old_low)/2
Invariants: Binary Search
Proof of (4) when high = mid (low is unchanged, so low = old_low):
  high - low = mid - low
             =  floor((low + old_high)/2) - low
             <= (low + old_high)/2 - low
             =  (old_high - low)/2
             =  (old_high - old_low)/2
  high - low <= (old_high - old_low)/2 < old_high - old_low, by (1)
Invariants: Binary Search
Proof of (5) when low = mid + 1:
  L[low - 1] = L[mid] < t
  Since high is not changed, t <= L[high]
Invariants: Binary Search
Proof of (5) when high = mid:
  Since low is unchanged, L[low - 1] < t
  L[high] = L[mid] >= t
Invariants: Binary Search
Termination:
  (3) shows that the loop can terminate
  (4) shows progress
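The invariants can also be checked at run time while the loop executes. The following sketch copies the slides' lower_bound for int data and turns invariants (3), (4), and (5) into assertions; the name `checked_lower_bound` is hypothetical, and the guards `low == 0` and `high == max` are this sketch's reading of the slides' "(if index is valid)" qualifier, covering the cases where a bound has not moved yet.

```cpp
#include <cassert>

// lower_bound from the slides with invariants (3), (4), and (5) checked by assert.
// Assumption: L has at least max + 1 elements and is sorted in ascending order.
unsigned checked_lower_bound(const int* L, unsigned max, int t) {
    unsigned low = 0, high = max;
    while (low < high) {                       // (1) low < high holds inside the loop
        const unsigned old_gap = high - low;   // positive because of (1)
        const unsigned mid = (low + high) / 2;
        if (L[mid] < t) {
            low = mid + 1;
        } else {
            high = mid;
        }
        assert(low <= high);                   // (3)
        assert(high - low < old_gap);          // (4) the gap strictly decreased
        assert(low == 0 || L[low - 1] < t);    // (5), left part, once low has moved
        assert(high == max || t <= L[high]);   // (5), right part, once high has moved
    }
    return low;
}
```

If any assertion failed, it would pinpoint the iteration at which the correctness argument breaks; on sorted input none of them fires.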