1 / 42

Compsci 201 Recursion+Sorting

Compsci 201 Recursion+Sorting. Owen Astrachan Jeff Forbes October 13, 2017. M is for …. Machine Learning Combining math, stats, compsci to make software "learn" at scale Markov Forgetting the past to predict the future Method A function by any other name Meme

lilika
Download Presentation

Compsci 201 Recursion+Sorting

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Compsci 201Recursion+Sorting Owen Astrachan Jeff Forbes October 13, 2017 Compsci 201, Fall 2017, Recursion+Sorting

  2. M is for … • Machine Learning • Combining math, stats, compsci to make software "learn" at scale • Markov • Forgetting the past to predict the future • Method • A function by any other name • Meme • I don't always use M-words, Compsci 201, Fall 2017, Recursion+Sorting

  3. Plan for the Day • Review key parts of the Percolation Assignment • Monte Carlo simulations • PercolationDFS and PercolationUF • Recursive depth-first search AND union-find • Sorting: how is recursion used and useful? • Solvingreal problems • When should recursion be avoided Compsci 201, Fall 2017, Recursion+Sorting

  4. Review WOTO for Blobs • http://bit.ly/201fall17-oct11-1 Compsci 201, Fall 2017, Recursion+Sorting

  5. How to search 8 neighbors? • We need to make 8 recursive calls • W, NW, N, NE, E, SE, S, SW • See coding “trick” below int[] rd = {0,-1,-1,-1,0,1,1,1}; int[] cd = {-1,-1,0,1,1,1,0,-1}; for(int d = 0; d < rd.length; d+= 1){ intnr = row + rd[d]; intnc = col + cd[d]; size += blobFill(nr,nc, …) } Compsci 201, Fall 2017, Recursion+Sorting

  6. Krysta Svore • Manages Microsoft Quantum Architectures and Computation Group (QuArC) “We think a quantum computer could possibly solve these [hard] types of problems in a time frame that’s more reasonable than the life of the universe, maybe a couple of years, or a couple of days, or a couple of seconds,” Svoresaid. “Exponentially faster.” Compsci 201, Fall 2017, Recursion+Sorting

  7. Percolation • Monte Carlo simulations – when and how? • Cornerstones of analyzing problems for which mathematical analysis may be intractable • Interfaces and Inheritance • Cornerstones of good object-oriented design • Building and analyzing tradeoffs • Cornerstones of Compsci 201 and algorithm and software design in general Compsci 201, Fall 2017, Recursion+Sorting

  8. Tracking Connected Cells • Use recursive DFS -- Depth First Search • Similar to Blob Count. Visit Neighbors • When a cell is open? Check fullness, recurse • N2 grid cells, could visit N2 cells • Union Find aka Disjoint Sets • Initially N2 disjoint sets numbered 0,1,… (N2-1) • Open a cell? UF.connect(i,j) • From cell i to each neighboring cell j Compsci 201, Fall 2017, Recursion+Sorting

  9. PercolationDFSFast (0,4) • Open (2,4) • Open (3,3) • Open (0,4) • Open (1,4) (0,4) (1,4) (1,4) (2,4) (2,4) (3,3) • What does “open” mean? • What does “full” mean? • How is state maintained and changed when a new grid cell is open? Compsci 201, Fall 2017, Recursion+Sorting

  10. PercolationUF VTOP • Open (2,4) • Open (3,3) • Open (0,4) • Connect(4,VTOP) • Open (1,4) • Connect(9,4) • Connect(9,14) • Open(4,3) • Connect(23,VBOTTOM) • Connect(23,14) (0,4) (1,4) (2,4) (3,3) (4,3) VBOTTOM Compsci 201, Fall 2017, Recursion+Sorting

  11. Sorting in Theory and in Practice Compsci 201, Fall 2017, Recursion+Sorting

  12. Sorting: From Theory to Practice • Will you use sorting in code you write? • Yes, Maybe, No, It Depends? • You will in Compsci 201 for sure!! • Why do we study sorting? • Elegant, practical, powerful, simple, complex • Everyone is doing it! • Example of algorithm analysis in a simple, useful setting Compsci 201, Fall 2017, Recursion+Sorting

  13. Sorting: From Theory to Practice • Why do we study more than one algorithm? • Some are good, some are bad, who cares? • Paradigms of trade-offs and algorithmic design • Which sorting algorithm is best? • Which sort should you call when writing code? • http://www.sorting-algorithms.com/ Not Yogi Compsci 201, Fall 2017, Recursion+Sorting

  14. Simple, O(n2) sorts • Selection sort --- n2 comparisons, n swaps • Find min, swap to front, increment front, repeat • Insertion sort --- n2 comparisons, no swap, shift • stable, fast on sorted data, slide into place • Bubble sort --- n2 everything, slow* • Catchy name, but slow and ugly* *this isn't everyone's opinion, but it should be • Shell sort: quasi-insertion, fast in practice • Not quadratic with some tweaks Compsci 201, Fall 2017, Recursion+Sorting

  15. Case Study: SelectionSort • https://coursework.cs.duke.edu/201fall17/sortall publicvoidsort(List<T> list) { intmin; for(intj=0; j < list.size()-1; j++) { min= j; for(intk=j+1; k < list.size(); k++) { if(list.get(k).compareTo(list.get(min)) < 0){ min= k; } } swap(list,min,j); } } Compsci 201, Fall 2017, Recursion+Sorting

  16. Case Study: SelectionSort • Invariant: on jth pass, [0,j) is in final sorted order • Code inside loop re-establishes invariant Final order unexamined j publicvoidsort(List<T> list) { intmin; for(intj=0; j < list.size()-1; j++) { min= j; for(intk=j+1; k < list.size(); k++) { if(list.get(k).compareTo(list.get(min)) < 0){ min= k; } } swap(list,min,j); } } Compsci 201, Fall 2017, Recursion+Sorting

  17. Aside: Loop Invariant • Statement that is (always) true each time the loop begins to execute • During loop execution it may become false • The loop re-establishes the invariant • Typically stated in terms of loop index • Helps to reason formally and informally about the code you’re writing • Can I explain the invariant to someone? Compsci 201, Fall 2017, Recursion+Sorting

  18. Recursive SelectionSort • Put inner loop into private, helper method to call privateintminIndex(List<T> list, intstart) { intmin = start; for(intk=start+1; k < list.size(); k++) { if(list.get(k).compareTo(list.get(min)) < 0) { min= k; } } returnmin; } privatevoiddoSort(List<T> list, intstart) { if(start >= list.size()) return; intdex = minIndex(list,start); swap(list,dex,start); doSort(list,start+1); } Compsci 201, Fall 2017, Recursion+Sorting

  19. Recursive SelectionSort • Recursive methods with arrays: add index, call helper method: first, last, -- Toward base case privatevoiddoSort(List<T> list, intstart) { if(start >= list.size()) return; intdex = minIndex(list,start); swap(list,dex,start); doSort(list,start+1); } public sort(List<T> list) { doSort(list,0); } Compsci 201, Fall 2017, Recursion+Sorting

  20. Recursive Terminology • Recursive methods must have a base case • Simple to do, don’t need “help” • Recursive calls make progress toward base case • Some measure gets smaller, toward base case • When do we stop sorting? • When sub-array or range has size 1 or 0. • Recursive calls bring range-size toward 1 or 0. Compsci 201, Fall 2017, Recursion+Percolation

  21. More efficient O(n log n) sorts • Divide and conquer sorts: O(n log n) for n elements • Quick sort: fast in practice, O(n2) worst case • Merge sort: good worst case, great for linked lists, uses extra storage for vectors/arrays • Timsort: http://en.wikipedia.org/wiki/Timsort • Other sorts: • Heap sort, basically priority queue sorting • Radix sort: uses digits/characters (no compare) • Cannot do better than O(n log n) • when comparing! Radix is O(n) ?? Compsci 201, Fall 2017, Recursion+Sorting

  22. Stable, Stability • What does search query 'stable sort' show us? • First sort by shape, then by color: Stable! • Why are numeric examples so popular? Compsci 201, Fall 2017, Recursion+Sorting

  23. Analyzing Sort Performance https://coursework.cs.duke.edu/201fall17/sortall • Let's look at overall structure, and algorithms • Try to see the trees, not the forest • Try to see the algorithm, not the syntax of Java • Appreciate power of Java interfaces and generics • Syntax is not pretty, but language is powerful • YABE, yet another benchmarking example Compsci 201, Fall 2017, Recursion+Sorting

  24. <= X X > X pivot index Quicksort: fast in practice • Invented in 1962 by Tony Hoare, didn't understand recursion • Worst case is O(n2), but avoidable in nearly all cases, shuffle data, smart pivot, etc. voiddoQuick(List<T> list, intfirst, intlast) { if(first >= last) return; intpiv = pivot(list,first,last); doQuick(list,first,piv-1); doQuick(list,piv+1,last); } Compsci 201, Fall 2017, Recursion+Sorting

  25. <= X X > X pivot index Pivot is O(n) • Invariant: [first,p) <= list.get(first) • Elements with index > p? could go in [first,p) privateintpivot(List<T> list,intfirst, intlast){ T piv = list.get(first); intp = first; for(intk=first+1; k <= last; k++){ if(list.get(k).compareTo(piv) <= 0){ p++; swap(list,k,p); } } swap(list,p,first); returnp; } Compsci 201, Fall 2017, Recursion+Sorting

  26. Mergesort: stable, fast, aux storage • Part of computing history vernacular, basis for Timsort • Uses extra storage (not for linked lists, next week!) publicvoidsort(List<T> list) { if(list.size() <= 1) return; inthalf = list.size()/2; List<T> a1 = newArrayList<>(list.subList(0, half)); List<T> a2 = newArrayList<>(list.subList(half, list.size())); sort(a1); sort(a2); merge(list,a1,a2); } Compsci 201, Fall 2017, Recursion+Sorting

  27. https://en.wikipedia.org/wiki/Timsort • Stable, O(n log n) in average and worst, O(n) best! • In practice lots of data is "close" to sorted • Invented by Tim Peters for Python, now in Java • Replaced merge sort which is also stable • Engineered to be correct, fast, useful in practice • Theory and explanation not so simple https://www.youtube.com/watch?v=NVIjHj-lrT4 Compsci 201, Fall 2017, Recursion+Sorting

  28. What's in Java 8? http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/util/Arrays.java?av=f Compsci 201, Fall 2017, Recursion+Sorting

  29. Summary of O(n log n) sorts • Quicksort straight-forward to code, very fast • Worst case is very unlikely, but possible, therefore … • Can have bad performance without care when duplicate values • One million integers from range 0 to 10,000 • Change partition to handle this? Java 8! Used for int[] • Merge sort is stable, it's fast, good for linked lists, harder to code? • Worst case performance is O(n log n), compare quicksort • Extra storage for array/vector Compsci 201, Fall 2017, Recursion+Sorting

  30. Summary of O(n log n) sorts • Timsort: hybrid of merge and insertion? • Fast in wild/real world: Python, Java 7+, Android • What’s the best O(n log n) sort to call? • The one in the library you have access to • In Java that’s java.util.Arrays.sort or java.util.Collections.sort • Changing how you sort: .compareTo() or .compare() Compsci 201, Fall 2017, Recursion+Sorting

  31. sortingwoto or ginooorsttw • http://bit.ly/201fall17-sort1 Compsci 201, Fall 2017, Recursion+Sorting

  32. Bubble Sort, A Personal Odyssey

  33. Steve and Rachel, Duke 1997

  34. 11/08/77

  35. 17 Nov 75 Not needed Can be tightened considerably

  36. Jim Gray (Turing 1998) • Bubble sort is a good argument for analyzing algorithm performance. It is a perfectly correct algorithm. But it's performance is among the worst imaginable. So, it crisply shows the difference between correct algorithms and good algorithms. (italics ola’s)

  37. Brian Reid (Hopper Award 1982) Feah. I love bubble sort, and I grow weary of people who have nothing better to do than to preach about it. Universities are good places to keep such people, so that they don't scare the general public. (continued)

  38. Brian Reid (Hopper 1982) I am quite capable of squaring N with or without a calculator, and I know how long my sorts will bubble. I can type every form of bubble sort into a text editor from memory. If I am writing some quick code and I need a sort quick, as opposed to a quick sort, I just type in the bubble sort as if it were a statement. I'm done with it before I could look up the data type of the third argument to the quicksort library. I have a dual-processor 1.2 GHz Powermac and it sneers at your N squared for most interesting values of N. And my source code is smaller than yours. Brian Reid who keeps all of his bubbles sorted anyhow.

  39. Niklaus Wirth (Turing award 1984) I have read your article and share your view that Bubble Sort has hardly any merits. I think that it is so often mentioned, because it illustrates quite well the principle of sorting by exchanging. I think BS is popular, because it fits well into a systematic development of sorting algorithms. But it plays no role in actual applications. Quite in contrast to C, also without merit (and its derivative Java), among programming codes.

  40. Guy L. Steele, Jr. (Hopper ’88) (Thank you for your fascinating paper and inquiry. Here are some off-the-cuff thoughts on the subject. ) I think that one reason for the popularity of Bubble Sort is that it is easy to see why it works, and the idea is simple enough that one can carry it around in one's head … continued

  41. Guy L. Steele, Jr. As for its status today, it may be an example of that phenomenon whereby the first widely popular version of something becomes frozen as a common term or cultural icon. Even in the 1990s, a comic-strip bathtub very likely sits off the floor on claw feet. … it is the first thing that leaps to mind, the thing that is easy to recognize, the thing that is easy to doodle on a napkin, when one thinks generically or popularly about sort routines.

  42. Sorting Conundrums • You want to sort a million 32-bit integers • You're an advisor to Obama

More Related