CS 2110 Final Review
Topics • Will be covered • Important Programming Concepts • Types, Recursion/Induction, Asymptotic Complexity • The Toolbox (Data Structures) • Arrays, Linked Lists, Trees (BSTs, Heaps), Hashtables • Practical Things • Searching, Sorting, Abstract Data Types • Graphs • Threads and Concurrency
Topics • Sort of Important • Don't have a panic attack over these topics! • GUIs • Just do the problems on old prelims/finals • Software engineering • Don't write horrible code on coding questions • But they are both important in the real world!
Topics • Will Not Be Covered • Lectures 24 and above • Java Virtual Machine • Distributed Systems and Cloud Computing • Balancing Trees (e.g., AVL trees) • But do know what balanced and unbalanced trees are • Recurrences • Network Flow
Typing • Primitive Types • boolean, int, double, etc… • Test equality with == and != • Compare with <, <=, >, and >=
Pass by Value

    void f(int x) { x--; }

    int x = 10;
    f(x);   // x == 10: f decremented its own copy (10 -> 9); main's x is unchanged
Typing • Reference types • Actual object is stored elsewhere • Variable contains a reference to the object • == tests equality for the reference • equals() tests equality for the object • x == y implies that x.equals(y) • x.equals(y) does not imply x == y • How do we compare objects of type T? • Implement the Comparable<T> interface
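As a sketch of these rules, here is a hypothetical Point class (not from the course materials) that implements Comparable<Point> and overrides equals():

    // Hypothetical example: a Point ordered by x, then by y
    public class Point implements Comparable<Point> {
        private final int x, y;

        public Point(int x, int y) { this.x = x; this.y = y; }

        // Lets us compare objects of type Point
        @Override
        public int compareTo(Point other) {
            if (x != other.x) return Integer.compare(x, other.x);
            return Integer.compare(y, other.y);
        }

        // equals() tests equality for the object, not the reference
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;        // same reference
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;       // same contents
        }

        // Must stay consistent with equals() (see the hashtables slides)
        @Override
        public int hashCode() { return 31 * x + y; }
    }

With this class, new Point(1, 2) == new Point(1, 2) is false (different references), but new Point(1, 2).equals(new Point(1, 2)) is true.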
Pass by Reference

    void f(ArrayList<Integer> l) {
        l.add(2);                        // mutates the caller's list
        l = new ArrayList<Integer>();    // rebinds only f's copy of the reference
    }

    ArrayList<Integer> l = new ArrayList<Integer>();
    l.add(1);
    f(l);   // l contains 1, 2 -- the add was visible, the reassignment was not
Typing • We know that type B can implement/extend A • B is a subtype of A; A is a supertype of B • The real type of the object is its dynamic type • This type is known only at run-time • An object can act like a supertype of its dynamic type • It cannot act like a subtype of its dynamic type • Variables/arguments of type A accept any subtype of A • A is a supertype of the static type • The static type is a supertype of the dynamic type
Typing • Static type is compile-time, determined by code • Dynamic type might be a subtype of the static type • Casting temporarily changes the static type • Upcasts are always safe • Always cast to a supertype of the dynamic type • Downcasts may not be safe • Can downcast to a supertype of the dynamic type • Cannot downcast to a subtype of the dynamic type
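A minimal sketch of these casting rules, using hypothetical classes Animal, Dog, and Cat (Dog and Cat extend Animal):

    class Animal { }
    class Dog extends Animal { }
    class Cat extends Animal { }

    public class CastDemo {
        public static void main(String[] args) {
            Animal a = new Dog();    // upcast: always safe (dynamic type is Dog)
            Object o = a;            // upcast again: still safe
            Animal a2 = (Animal) o;  // downcast to a supertype of the dynamic type: OK
            Dog d = (Dog) a;         // downcast to the dynamic type itself: OK
            Cat c = (Cat) a;         // compiles (Cat is a subtype of the static
                                     // type), but throws ClassCastException at
                                     // run time: Cat is not a supertype of Dog
        }
    }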
Typing • If B extends A, and B and A both have function foo() • Which foo() gets called? • The answer depends on the dynamic type • If the dynamic type is B, B's foo() will be called • Even if foo() is invoked inside a function of A • Exception: static functions • Static functions are not associated with any object • Thus, they are resolved using the static type, not dynamic dispatch
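A small sketch of dynamic dispatch, using hypothetical classes A and B:

    class A {
        void foo() { System.out.println("A.foo"); }
        void bar() { foo(); }   // even here, dispatch uses the dynamic type
    }

    class B extends A {
        @Override void foo() { System.out.println("B.foo"); }
    }

    public class DispatchDemo {
        public static void main(String[] args) {
            A a = new B();   // static type A, dynamic type B
            a.foo();         // prints "B.foo"
            a.bar();         // also prints "B.foo": B's foo() runs
                             // even though bar() is defined in A
        }
    }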
Induction and Recursion • Recursion • Basic examples • Factorial: n! = n(n-1)! • Combinations (Pascal's triangle) • Recursive structure • Tree (a tree is a root with left/right subtrees) • Depth-first search • Don't forget the base case (in proofs and in your code)
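A minimal recursive factorial in Java, with the base case called out:

    // Recursive factorial: n! = n * (n-1)!
    static long factorial(int n) {
        if (n <= 1) return 1;           // base case: 0! = 1! = 1
        return n * factorial(n - 1);    // recursive case
    }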
Induction and Recursion • Induction • Can do induction on the previous recursive problems • Algorithm correctness proofs (e.g., DFS) • Proofs of mathematical equations
Induction and Recursion • Step 1 • Base case • Step 2 • Suppose n is the variable you're doing induction on; assume the equation holds when n = k • Strong induction: assume it holds for all n ≤ k • Step 3 • Prove that the equation still holds when n = k + 1, using the assumptions from Step 2
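A quick worked example of these three steps (not from the slides), proving 1 + 2 + ⋯ + n = n(n+1)/2:

    Step 1 (base case, n = 1): 1 = 1⋅2/2. ✓
    Step 2: suppose 1 + 2 + ⋯ + k = k(k+1)/2.
    Step 3 (n = k+1): 1 + 2 + ⋯ + k + (k+1) = k(k+1)/2 + (k+1) = (k+1)(k+2)/2. ✓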
Asymptotic Complexity • f(n) is O(g(n)) if: • ∃ (c, n0) such that ∀ n ≥ n0, f(n) ≤ c⋅g(n) • ∃ means "there exists"; ∀ means "for all" • (c, n0) is called the witness pair • f(n) is O(g(n)) roughly means f(n) ≤ g(n) • Big-O notation is a model for running time • Don't need to know the computer's specs • The model usually, but not always, holds up in real life
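A quick worked check (not from the slides): f(n) = 2n^2 + 3n is O(n^2) with witness pair (c, n0) = (5, 1), since for all n ≥ 1, 2n^2 + 3n ≤ 2n^2 + 3n^2 = 5n^2.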
Asymptotic Complexity • Meaning of n0 • We can compare two integers • How can we compare two functions? • Answer is which function grows faster • Fast vs. slow car • 60-mph car with no headstart • 40-mph car with a headstart • n0 is when the fast car overtakes the slow one • Functions have a dominant term • 2n^3 + 4n^2 + n + 2: 2n^3 is the dominant term
Asymptotic Complexity • Meaning of c • Cannot get a precise measurement • Famous election recounts (Bush/Gore, Coleman/Franken) • Algorithm's speed on a 2 GHz versus a 1 GHz processor • Hard to measure the constant for the dominant term • c is the fudge factor • Change the speed by a constant factor • A 2 GHz processor is at most twice as fast as a 1 GHz one (c = 2) • 2n^3 is O(n^3) • The fast and slow cars have asymptotically equal speed
Asymptotic Complexity • (Assume c, d are constants) • Logarithmic vs. Logarithmic: log(n^c), log(n^d) "equal" • Difference is a constant factor (log(n^c) = c⋅log(n)) • Logarithm's base also does not matter • Logarithmic vs. Polynomial: log n is O(n^c) • Corollary: O(n log n) better than O(n^2) • Polynomial vs. Polynomial: n^c is O(n^d), c ≤ d • Polynomial vs. Exponential: n^c is O(d^n), d > 1 • Exponential running time almost always too slow!
Arrays • Arrays have a fixed capacity • If more space is needed… • Allocate a larger array • Copy the smaller array into the larger one • The entire operation takes O(n) time • Arrays have random access • Can read any element in O(1) time
(Doubly) Linked Lists • Each node has three parts • Value stored in node • Next node • Previous node • Also have access to head, tail of linked list • Very easy to grow linked list in both directions • Downside: sequential access • O(n) time to access something in the middle
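A minimal sketch of a doubly linked list node (names are illustrative, not from the course code):

    class Node<T> {
        T value;       // value stored in the node
        Node<T> next;  // next node (null at the tail)
        Node<T> prev;  // previous node (null at the head)
    }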
Trees • Recursive data structure • A tree is… • A single node • Zero or more subtrees below it • Every node (except the root) has one parent • Properties of trees must hold for all nodes in the tree • Each node is the root of some tree • Makes sense for recursive algorithms
Binary Trees • Each node can have at most two children • We usually distinguish between left, right child • Trees we study in CS 2110 are binary trees
Binary Search Trees • Keeps data in sorted order inside a tree • For every node with value x: • Every node in the left subtree has a value < x • Every node in the right subtree has a value > x • A binary search tree is not guaranteed to be balanced • If it is balanced, we can find a node in O(log n) time
Binary Search Trees Completely unbalanced, but still a BST
Not a Binary Search Tree 8 is in the left subtree of 5
Not a Binary Search Tree 3 is the left child of 2
Adding to a Binary Search Tree • Adding 4 to the BST • Start at root (5) • 4 < 5 • Go left to 2 • 4 > 2 • Go right to 3 • 4 > 3 • Add 4 as right child of 3
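A minimal sketch of BST insertion, using a hypothetical TreeNode class (field names are illustrative). It walks left/right exactly as in the add-4 example above:

    class TreeNode {
        int value;
        TreeNode left, right;
        TreeNode(int value) { this.value = value; }
    }

    static TreeNode insert(TreeNode root, int x) {
        if (root == null) return new TreeNode(x);   // empty spot: add here
        if (x < root.value) root.left = insert(root.left, x);
        else if (x > root.value) root.right = insert(root.right, x);
        // x == root.value: already in the tree, do nothing
        return root;
    }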
Tree Traversals • Converts the tree into a list • Works on any binary tree • Do not need a binary search tree • Traverse the node and its left and right subtrees • Subtrees are traversed recursively • Preorder: node, left subtree, right subtree • Inorder: left subtree, node, right subtree • Produces a sorted list for binary search trees • Postorder: left subtree, right subtree, node
Tree Traversals • Inorder Traversal • In(5) • In(2), 5, In(7) • 1, 2, In(3), 5, In(7) • 1, 2, 3, 4, 5, In(7) • 1, 2, 3, 4, 5, 6, 7, 9
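An inorder traversal sketch in Java, reusing the hypothetical TreeNode class from the BST insertion example:

    static void inorder(TreeNode root, java.util.List<Integer> out) {
        if (root == null) return;    // base case: empty subtree
        inorder(root.left, out);     // left subtree
        out.add(root.value);         // node
        inorder(root.right, out);    // right subtree (sorted output for a BST)
    }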
Binary Heaps • Weaker condition on each node • A node is smaller than its immediate children • Don’t care about entire subtree • Don’t care if right child is smaller than left child • Smallest node guaranteed to be at the root • No guarantees beyond that • Guarantee also holds for each subtree • Heaps grow top-to-bottom, left-to-right • Shrink in opposite direction
Binary Heaps • Adding to a heap • Find the lowest unfilled level of the tree • Find the leftmost empty spot; add the node there • The new node could be smaller than its parent • Swap it up if it's smaller • If swapped, it could still be smaller than its new parent • Keep swapping up until the parent is smaller
Binary Heaps • Removing from a heap • Take the element out from the top of the heap • Find the rightmost element on the lowest level • Make it the new root • The new root could be larger than one or both children • Swap it with the smaller child • Keep swapping down…
Binary Heaps • Can represent a heap as an array • Root element is at index 0 • For an element at index i • Parent is at index (i – 1)/2 • Children are at indices 2i + 1, 2i + 2 • n-element heap takes up first n spots in the array • New elements grow heap by 1 • Removing an element shrinks heap by 1
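A sketch of the array representation and the bubble-up step from the "Adding to a heap" slide, assuming a hypothetical min-heap backed by an ArrayList:

    import java.util.ArrayList;
    import java.util.Collections;

    class MinHeap {
        private final ArrayList<Integer> a = new ArrayList<>();

        void add(int x) {
            a.add(x);                      // grow the heap by 1, leftmost open spot
            int i = a.size() - 1;
            while (i > 0) {
                int parent = (i - 1) / 2;  // parent index formula from the slide
                if (a.get(i) >= a.get(parent)) break;   // parent smaller: done
                Collections.swap(a, i, parent);         // swap up
                i = parent;
            }
        }
    }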
Hashtables • Motivation • Sort n numbers between 0 and 2n – 1 • Sorting integers within a certain range • More specific than comparable objects with unlimited range • General lower bound of O(n log n) may not apply • Can be done in O(n) time with counting sort • Create an array of size 2n • The ith entry counts all the numbers equal to i • For each number, increment the correct entry • Can also find a given number in O(1) time
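A minimal counting-sort sketch for n numbers between 0 and 2n – 1, as described above:

    static int[] countingSort(int[] input) {
        int n = input.length;
        int[] count = new int[2 * n];        // array of size 2n
        for (int x : input) count[x]++;      // increment the correct entry
        int[] sorted = new int[n];
        int j = 0;
        for (int i = 0; i < count.length; i++)
            for (int k = 0; k < count[i]; k++)
                sorted[j++] = i;             // write out i, count[i] times
        return sorted;
    }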
Hashtables • Cannot do this with arbitrary data types • Integers alone have over 4 billion possible values • For a hashtable, create an array of size m • A hash function maps each object to an array index between 0 and m – 1 (in O(1) time) • The hash function makes sorting impossible • The quality of a hash function is based on how many elements map to the same index in the hashtable • Need to expect O(1) collisions
Hashtables • Dealing with collisions • In counting sort, one array entry contains only elements of the same value • The hash function can map different objects to the same index of the hashtable • Chaining • Each entry of the hashtable is a linked list • Linear probing • If h(x) is taken, try h(x) + 1, h(x) + 2, h(x) + 3, … • Quadratic probing: h(x) + 1, h(x) + 4, h(x) + 9, …
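A sketch of linear-probing lookup, assuming a hypothetical table of keys where null marks an empty slot (and assuming the table is never completely full; the load factor on the next slide guarantees this):

    static int findSlot(Object[] table, Object x, int hash) {
        int m = table.length;
        int i = ((hash % m) + m) % m;    // start at h(x), kept in [0, m)
        while (table[i] != null && !table[i].equals(x)) {
            i = (i + 1) % m;             // h(x)+1, h(x)+2, ... with wraparound
        }
        return i;  // either x's slot, or the empty slot where x belongs
    }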
Hashtables • Table Size • If too large, we waste space • If too small, everything collides with everything else • Probing falls apart when the number of items (n) is almost the size of the hashtable (m) • Typically have a load factor 0 < λ ≤ 1 • Resize the table when n/m exceeds λ • Resizing changes m • Have to reinsert everything into the new hashtable
Hashtables • Table doubling • Double the size every time we exceed our load factor • Worst case is when we just doubled the hashtable • Consider all prior times we doubled the table • n + n/2 + n/4 + n/8 + … < 2n • Insert n items in O(n) time • Average O(1) time to insert one item • Some operations take O(n) time • This also works for growing an ArrayList
Hashtables • Java, hashCode(), and equals() • hashCode() assigns an object an integer value • Java maps this integer to a number between 0 and m – 1 • If x.equals(y), x and y should have the same hashCode() • Insert an object with one hashCode() • Won't find it if you look it up with a different hashCode() • If you override equals(), you must also override hashCode() • Different objects can have the same hashCode() • If this happens too often, we have too many collisions • Only equals() can determine if they are equal
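A sketch of the failure mode, using a hypothetical BadPoint class that overrides equals() but not hashCode():

    class BadPoint {
        final int x, y;
        BadPoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadPoint
                && x == ((BadPoint) o).x && y == ((BadPoint) o).y;
        }
        // hashCode() is inherited from Object: equal points get different codes
    }

    public class HashBugDemo {
        public static void main(String[] args) {
            java.util.HashSet<BadPoint> set = new java.util.HashSet<>();
            set.add(new BadPoint(1, 2));
            // Almost certainly prints false: the lookup probes the wrong bucket
            System.out.println(set.contains(new BadPoint(1, 2)));
        }
    }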
Searching • Unsorted lists • The element could be anywhere: O(n) search time • Sorted lists: binary search • Try the middle element • Search the left half if it is too large, the right half if too small • Each step cuts the search space in half: O(log n) steps • Binary search requires random access • Searching a sorted linked list is still O(n)
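A standard binary search sketch over a sorted int array, matching the steps above:

    static int binarySearch(int[] a, int target) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;       // middle element, avoids overflow
            if (a[mid] == target) return mid;
            if (a[mid] < target) lo = mid + 1;  // too small: search the right half
            else hi = mid - 1;                  // too large: search the left half
        }
        return -1;  // not found
    }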
Sorting • Many ways to sort • Same high-level goal, same result • What's the difference? • The algorithm and data structures used dictate the running time • They also dictate space usage • Each algorithm has its own flavor • Once again, assume random access to the list
Sorting • Swap operation: swap(x, i, j)

    static void swap(int[] x, int i, int j) {
        int temp = x[i];
        x[i] = x[j];
        x[j] = temp;
    }

• Many sorts are a fancy sequence of swap instructions • Swapping modifies the array in place, which is very space-efficient • Copying a large array, by contrast, is not space-efficient