This review covers topics such as typing, induction and recursion, asymptotic complexity, data structures, abstract data types, searching and sorting, and graphs.
Topics • Today, we will cover • Typing • Induction and Recursion • Asymptotic Complexity • Data Structures • Abstract Data Types and Implementing ADTs • Searching and Sorting • Graphs • For GUIs, you are fine if you can do the practice problems (just do it!)
Topics • Do not worry about • Threads and concurrency • Recurrences • Java virtual machine • How to balance trees (AVL trees) • But do know the difference between a balanced and unbalanced tree • Software engineering (sort of) • Don’t break every known rule of software engineering when asked to write code • We may use a design pattern on the final, but you won’t have to memorize them
Typing • Primitive Types • boolean, int, double, etc… • Test equality with == and != • Compare with <, <=, >, and >=
Pass by Value • void f(int x) { x--; } • int x = 10; f(x); // x == 10 • Stack diagram: f receives a copy of x (10) and decrements its own copy to 9; x in main stays 10
Typing • Reference types • Actual object is stored elsewhere • Variable contains a reference to the object • == tests equality for the reference • equals() tests equality for the object • Two different references (!=) may exist to two objects with the same value (equals()) • Can compare objects of type T with compareTo() if the Comparable<T> interface is implemented
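The difference between == and equals() can be sketched as follows. The class and method names (ReferenceEqualityDemo, sameReference, sameValue) are made up for illustration; only String, equals() and compareTo() come from the Java standard library.

```java
class ReferenceEqualityDemo {
    // True only when a and b refer to the very same object.
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    // True when a and b hold the same value, as defined by equals().
    static boolean sameValue(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String s1 = new String("cs");
        String s2 = new String("cs"); // a second, distinct object with the same contents
        System.out.println(sameReference(s1, s2)); // different references
        System.out.println(sameValue(s1, s2));     // same value
        // String implements Comparable<String>, so compareTo() is available.
        System.out.println("apple".compareTo("banana") < 0);
    }
}
```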
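Pass by Reference • void f(ArrayList<Integer> l) { l.add(2); l = new ArrayList<Integer>(); } • ArrayList<Integer> l = new ArrayList<Integer>(); l.add(1); f(l); // l contains 1, 2 • Java passes the reference itself by value: f gets a copy of the reference to the same list • l.add(2) changes the shared list from {1} to {1, 2} • Reassigning l inside f only points f's copy at a new empty list {}; l in main still refers to {1, 2}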
Typing • We know that type B can implement/extend A • B is a subtype of A; A is a supertype of B • The real type of the object is its dynamic type • This type is known only at run-time • Any object can act like the supertype of its dynamic type • But it cannot act like a subtype of its dynamic type • Variables and function arguments of type A can also accept any subtype of A • Type A is a supertype of the dynamic type
Typing • The static type is the declared type of a variable or expression, known when the code is compiled • Objects do not have a static type; expressions do • The dynamic type is the static type or one of its subtypes • A cast produces an expression with a different static type; it never changes the dynamic type of the object • Upcasts are always safe • They always cast to a supertype of the dynamic type • Downcasts may not be safe • Can downcast to a supertype of the dynamic type • Can downcast to the dynamic type itself • Cannot downcast to a subtype of the dynamic type
Typing • If B extends A, and B and A both have function foo, which foo gets called? • Answer depends on the dynamic type • If the dynamic type is B, B’s foo will be called even if foo is invoked inside a function of A • Exception: static functions • Static functions are not associated with any object • Thus, there is no dynamic type to consult; the call is resolved at compile time
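A small sketch of these dispatch rules, reusing the slide's names A, B and foo. The methods callFoo and bar and the class DispatchDemo are hypothetical, added only for illustration.

```java
class A {
    String foo() { return "A.foo"; }

    // Calls foo() on this object, so the dynamic type decides which foo runs.
    String callFoo() { return foo(); }

    // Static functions are resolved at compile time, not by dynamic type.
    static String bar() { return "A.bar"; }
}

class B extends A {
    @Override
    String foo() { return "B.foo"; }

    static String bar() { return "B.bar"; }
}

class DispatchDemo {
    // A downcast (B) a is safe only if the dynamic type of a is B or a subtype of B.
    static boolean canDowncastToB(A a) {
        return a instanceof B;
    }

    public static void main(String[] args) {
        A x = new B();                   // static type A, dynamic type B
        System.out.println(x.foo());     // B's foo
        System.out.println(x.callFoo()); // still B's foo, although callFoo is defined in A
        System.out.println(A.bar());     // A's static bar
        System.out.println(canDowncastToB(new A())); // the downcast would throw ClassCastException
    }
}
```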
Induction and Recursion • Recursion • Basic examples • Factorial : n! = n(n-1)! • Combinations Pascal’s triangle • Recursive structure • Tree (tree t = root with right/left subtree) • Depth first search • Don’t forget base case (in proof, in your code)
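The two basic examples can be sketched recursively as below. The class name RecursionDemo is an assumption, and the Pascal's triangle recurrence C(n, k) = C(n-1, k-1) + C(n-1, k) is the standard one, not quoted from the slides.

```java
class RecursionDemo {
    // n! = n * (n-1)!, with base case 0! = 1.
    static long factorial(int n) {
        if (n < 0) throw new IllegalArgumentException("n must be non-negative");
        if (n == 0) return 1;         // base case
        return n * factorial(n - 1);  // recursive case
    }

    // Entry k of row n of Pascal's triangle.
    static long choose(int n, int k) {
        if (k < 0 || k > n) return 0;   // outside the triangle
        if (k == 0 || k == n) return 1; // base cases: the edges of the triangle
        return choose(n - 1, k - 1) + choose(n - 1, k);
    }
}
```

Without the base cases, both functions would recurse forever, which is the point of the last bullet.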
Induction and Recursion • Induction • Can do induction on previous recursive problems • Algorithm correctness proof (DFS) • Math equation proof • Prelim 2 questions
Induction and Recursion • Step 1 • Base case • Step 2 • suppose n is the variable you’re going to do induction on. Suppose the equation holds when n=k • Strong induction: suppose it holds for all n<=k • Step 3 • prove that when n = k+1, equation still holds, by making use of the assumptions in Step 2.
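As a worked instance of the three steps, here is a standard textbook proof. It was chosen for illustration and is not taken from the course materials.

```latex
\textbf{Claim.} For all $n \ge 1$: $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$.

\textbf{Step 1 (base case).} For $n = 1$: $\sum_{i=1}^{1} i = 1 = \frac{1 \cdot 2}{2}$.

\textbf{Step 2 (inductive hypothesis).} Suppose the equation holds when $n = k$:
$\sum_{i=1}^{k} i = \frac{k(k+1)}{2}$.

\textbf{Step 3 (inductive step).} For $n = k + 1$, using the hypothesis from Step 2:
\[
\sum_{i=1}^{k+1} i = \frac{k(k+1)}{2} + (k+1) = \frac{(k+1)(k+2)}{2}.
\]
```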
Asymptotic Complexity • f(n) is O(g(n)) if ∃ (c, n0) such that ∀n≥ n0, f(n)≤c⋅g(n) • ∃ - there exists; ∀ - for all • (c, n0) is called the witness pair • Once you have a correct witness pair, you can probably use induction to prove it is correct • f(n) is O(g(n)) means that the function f(n) is roughly less than or equal to g(n) • Big-O notation is a model for running time • Models usually but do not always work in real life
Asymptotic Complexity • Meaning of n0 • We can compare one integer to another • How can we tell if one function is less than or equal to another? • Answer is which function grows faster • One function could also start ahead of the other and grow at the same rate, staying ahead • A 60-mph car with no head start will eventually overtake a 40-mph car with a head start • At what time does the faster car/function take over? • n0
Asymptotic Complexity • Meaning of c • Suppose we cannot get a precise integer value • 897 is less than or equal to 899, but maybe due to some errors the real numbers were 892 and 884 • E.g.: ballot counts in Minnesota recount • Idea: Compare order of magnitude • Compare numbers by the number of digits • 897 and 899 have the same number of digits • Difference between 42 and 482 is far bigger • Gives us some room for error
Asymptotic Complexity • Meaning of c • What is the difference between n³ + 1 and n³? • What about n³, n³ + 2n², and 2n³? • We can be off by a constant factor, c • If f(n) is only twice as large as g(n), setting c to 2 or greater makes c⋅g(n) at least as large as f(n) • Constant factor cannot account for difference between n and n², log n and n, 2ⁿ and 3ⁿ • There are three common types of growth • Logarithmic, polynomial, and exponential growth
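A worked witness pair for the cubic example on this slide, and a short argument for why no constant rescues n² against n. The particular pair (3, 1) is one valid choice among many.

```latex
\textbf{Claim.} $n^3 + 2n^2$ is $O(n^3)$ with witness pair $(c, n_0) = (3, 1)$.

For all $n \ge 1$ we have $n^2 \le n^3$, so
\[
n^3 + 2n^2 \le n^3 + 2n^3 = 3n^3 = c \cdot n^3 .
\]

\textbf{Claim.} $n^2$ is not $O(n)$.

Suppose $n^2 \le c \cdot n$ for all $n \ge n_0$. Dividing by $n$ gives $n \le c$
for all $n \ge n_0$, which fails as soon as $n > c$. So no witness pair exists.
```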
Data Structures • Linked Lists • Singly-linked/doubly-linked • Sorted/unsorted • Add, delete elements • Arrays • Sorted/unsorted • Add, delete elements • Search Tree • Balanced and unbalanced • Search for an element in array/list/tree • sorted arrays and balanced search trees O(log n) • linked lists (sorted/unsorted) O(n) • other unsorted/unbalanced structures O(n)
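The O(log n) search in a sorted array can be sketched as a binary search. The class name BinarySearchDemo is an assumption, and the standard library's Arrays.binarySearch could be used instead.

```java
class BinarySearchDemo {
    // Returns an index of target in the sorted array a, or -1 if it is absent.
    // Each iteration halves the remaining range, giving O(log n) steps.
    static int binarySearch(int[] a, int target) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // avoids overflow of (lo + hi)
            if (a[mid] == target) return mid;
            if (a[mid] < target) lo = mid + 1; // target can only be in the right half
            else hi = mid - 1;                 // target can only be in the left half
        }
        return -1;
    }
}
```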
Data Structures • Trees • Traversal • Search • Similar to binary search in an array, O(log n) in a balanced search tree • Heap • Min/max heap : heap order invariant • Every node smaller/larger than its immediate children • Add an element (see lecture notes) O(log n) • Delete an element (see lecture notes) O(log n) • Implemented with either a binary tree or an array
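A minimal sketch of an array-backed min-heap, assuming the usual sift-up and sift-down approach. The lecture notes may present add and delete differently, and the class name MinHeapSketch is made up.

```java
import java.util.ArrayList;
import java.util.NoSuchElementException;

class MinHeapSketch {
    // The children of index i live at 2i + 1 and 2i + 2; the parent at (i - 1) / 2.
    private final ArrayList<Integer> a = new ArrayList<>();

    int size() { return a.size(); }

    // Add at the end, then swap upward until the parent is no larger: O(log n).
    void add(int x) {
        a.add(x);
        int i = a.size() - 1;
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (a.get(parent) <= a.get(i)) break; // heap order holds
            swap(i, parent);
            i = parent;
        }
    }

    // Remove the root, move the last element to the root, then swap downward: O(log n).
    int extractMin() {
        if (a.isEmpty()) throw new NoSuchElementException("heap is empty");
        int min = a.get(0);
        int last = a.remove(a.size() - 1);
        if (!a.isEmpty()) {
            a.set(0, last);
            int i = 0;
            while (true) {
                int left = 2 * i + 1, right = left + 1, smallest = i;
                if (left < a.size() && a.get(left) < a.get(smallest)) smallest = left;
                if (right < a.size() && a.get(right) < a.get(smallest)) smallest = right;
                if (smallest == i) break; // both children are no smaller
                swap(i, smallest);
                i = smallest;
            }
        }
        return min;
    }

    private void swap(int i, int j) {
        int t = a.get(i);
        a.set(i, a.get(j));
        a.set(j, t);
    }
}
```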
Hashtables • Motivation • Sort n numbers between 0 and 2n – 1 • Instead of sorting abstract comparable objects, we are sorting integers within a certain range • General lower bound of Ω(n log n) for comparison-based sorting may not apply • Can be done in O(n) time with counting sort • Create an array of size 2n • The ith entry counts all the numbers equal to i • For each number, increment the correct entry • Can also find a given number in O(1) time
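The counting sort described above can be sketched as follows. The class name CountingSortDemo and the range check are assumptions added for illustration.

```java
class CountingSortDemo {
    // Sorts n values that are known to lie between 0 and 2n - 1, in O(n) time.
    static int[] countingSort(int[] a) {
        int n = a.length;
        int[] count = new int[2 * n]; // entry i counts the numbers equal to i
        for (int x : a) {
            if (x < 0 || x >= 2 * n) {
                throw new IllegalArgumentException("value out of range: " + x);
            }
            count[x]++;
        }
        // Read the counts back in increasing order of value.
        int[] out = new int[n];
        int k = 0;
        for (int v = 0; v < count.length; v++) {
            for (int c = 0; c < count[v]; c++) out[k++] = v;
        }
        return out;
    }
}
```

Checking whether a value v is present is then a single array read, count[v] > 0, which is the O(1) lookup the slide mentions.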
Hashtables • Cannot do this with arbitrary data types • The integer type alone can have over 4 billion possible values; no array should be that big • For a hashtable, create an array of size m • Hash function maps each object to an array index between 0 and m – 1 (in O(1) time) • Hash function makes sorting impossible, but we can still look up an element in O(1) time • Quality of hash function is based on how many elements map to the same index in the hashtable • Want an expected O(1) elements per index
Hashtables • Dealing with collisions • In counting sort, one array entry contains only elements of the same value • The hash function can map different objects to the same index of the hashtable • Chaining • Each entry of the hashtable is a linked list • Linear Probing • If h(x) is taken, try h(x) + 1, h(x) + 2, h(x) + 3, … • Quadratic probing: h(x) + 1, h(x) + 4, h(x) + 9, …
Hashtables • Table Size • If too large, we waste space • If too small, everything collides with each other • Probing falls apart if number of items (n) is almost the size of the hashtable (m) • Typically have a load factor 0 < λ≤ 1 • Resize table when n/m exceeds λ • Resizing changes m; we have to reinsert everything with a new hash function
Hashtables • Table Size • What if we double the size every time we exceed our load factor? • Must double the number of items to exceed the load factor again • Worst case is when we just doubled the hashtable • Consider all prior times we doubled the table • n + n/2 + n/4 + n/8 + … < 2n • With table doubling, we can insert n items in O(n) total time, i.e. O(1) amortized time per insert • Some individual inserts (the ones that trigger a resize) still take O(n) time • This also works for growing an ArrayList
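A minimal sketch of a hashtable that uses chaining and table doubling. The starting size of 4, the load factor of 0.75 and all names here are assumptions, and this is not how java.util.HashMap is implemented internally.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class ChainedHashSetSketch {
    private static final double MAX_LOAD = 0.75;    // assumed load factor
    private List<List<Object>> table = newTable(4); // assumed initial m = 4
    private int n = 0;                              // number of stored items

    private static List<List<Object>> newTable(int m) {
        List<List<Object>> t = new ArrayList<>(m);
        for (int i = 0; i < m; i++) t.add(new LinkedList<>()); // one chain per entry
        return t;
    }

    // Maps an object to an index between 0 and m - 1.
    private static int index(Object x, int m) {
        return Math.floorMod(x.hashCode(), m);
    }

    boolean contains(Object x) {
        return table.get(index(x, table.size())).contains(x); // the chain is searched with equals()
    }

    boolean add(Object x) {
        if (contains(x)) return false;
        // Double the table when inserting would push n/m past the load factor.
        if ((double) (n + 1) / table.size() > MAX_LOAD) resize(2 * table.size());
        table.get(index(x, table.size())).add(x);
        n++;
        return true;
    }

    // Changing m changes every index, so every item has to be reinserted.
    private void resize(int m) {
        List<List<Object>> old = table;
        table = newTable(m);
        for (List<Object> chain : old) {
            for (Object x : chain) table.get(index(x, m)).add(x);
        }
    }

    int size() { return n; }
    int buckets() { return table.size(); }
}
```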
Hashtables • Java, hashCode() and equals() • Java uses hashCode() in its hash function • hashCode() assigns each item an integer value • Java has a special formula to map this integer to some number between 0 and m – 1 • If one object equals() another, they should have the same hashCode() • Cannot insert an object with one hashCode() and then look the same object up with a different hashCode() • If you override equals(), you must also override hashCode() to preserve this property
Hashtables • Java, hashCode() and equals() • Different objects can have the same hashCode() • If this happens too often, we have too many collisions • Only equals() can determine if they are equal
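A sketch of overriding equals() and hashCode() together. GridPoint is a hypothetical class; Objects.hash is a standard library helper that is convenient here, though any formula that gives equal objects equal hash codes would do.

```java
import java.util.Objects;

final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;                  // same reference
        if (!(o instanceof GridPoint)) return false; // also handles null
        GridPoint p = (GridPoint) o;
        return x == p.x && y == p.y;
    }

    // Built from the same fields as equals(), so equal points get equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}
```

With both methods overridden, a HashSet<GridPoint> that holds new GridPoint(1, 2) reports that it contains a second new GridPoint(1, 2). With only equals() overridden, that lookup would usually miss.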
Abstract Data Type • Lists • Stacks • LIFO • Queues • FIFO • Sets • Dictionaries (Maps) • Priority Queues • Java API • E.g.: ArrayList is an ADT list backed by an array
Abstract Data Type • Priority Queue • Implement as List (sorted/unsorted) : O(n) • Implement as heap • PeekMin look at heap root : O(1) • ExtractMin heap “delete” op : O(log n) • Insert heap “add” op : O(log n)
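In the Java API, java.util.PriorityQueue is a heap-backed priority queue whose peek(), poll() and add() correspond to PeekMin, ExtractMin and Insert. The helper below is a made-up example that drains a queue in ascending order.

```java
import java.util.PriorityQueue;

class PriorityQueueDemo {
    // Inserts every element, then repeatedly extracts the minimum.
    static int[] drainSorted(int[] input) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(); // min-first by natural ordering
        for (int x : input) pq.add(x);                     // Insert: O(log n) each
        int[] out = new int[input.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = pq.poll();                            // ExtractMin: O(log n) each
        }
        return out;
    }

    // PeekMin: looks at the smallest element without removing it, O(1).
    static int peekMin(PriorityQueue<Integer> pq) {
        if (pq.isEmpty()) throw new IllegalStateException("queue is empty");
        return pq.peek();
    }
}
```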
Sorting (see lecture notes) • Insertion Sort • Selection Sort • Merge Sort • Quick Sort • Heap Sort • Best/worst case • Average case for quicksort • Asymptotic complexity
Miscellaneous Concepts • Inheritance/Interfaces • Abstract classes • Meaning of static
What is a graph? • A graph has vertices • A graph has edges between two vertices • n – number of vertices; m – number of edges • Directed vs. undirected graph • Directed edges can only be traversed one way • Undirected edges can be traversed both ways • Weighted vs. unweighted graph • Edges could have weights/costs assigned to them
What is a graph? • What makes a graph special? • Cycles!!! • What is a graph without a cycle? • Undirected graphs • Trees • Directed graphs • Directed acyclic graph (DAG)
Topological Sort • Topological sort is for directed graphs • Indegree: number of edges entering a vertex • Outdegree: number of edges leaving a vertex • Topological sort algorithm • Delete a vertex with an indegree of 0 • Delete its outgoing edges, too • Repeat until no vertices have an indegree of 0
Topological Sort • [Figure: example directed graph on vertices A, B, C, D, E]
Topological Sort • What is the only thing a topological sort cannot delete? • Cycles!!! • If a graph is a DAG, a topological sort will delete the entire graph • If a topological sort deletes the entire graph, the graph is a DAG
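The delete-a-vertex-with-indegree-0 procedure can be sketched without actually deleting anything, by decrementing indegree counts instead. The adjacency-list representation with vertices numbered 0 to n - 1 and all names below are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class TopologicalSortDemo {
    // adj.get(u) lists every vertex v with a directed edge u -> v.
    // Returns the order in which vertices are "deleted".
    static List<Integer> topologicalOrder(List<List<Integer>> adj) {
        int n = adj.size();
        int[] indegree = new int[n];
        for (List<Integer> out : adj) {
            for (int v : out) indegree[v]++;
        }
        Deque<Integer> ready = new ArrayDeque<>(); // vertices with an indegree of 0
        for (int v = 0; v < n; v++) {
            if (indegree[v] == 0) ready.add(v);
        }
        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int u = ready.remove();
            order.add(u); // "delete" u
            for (int v : adj.get(u)) {
                // "Delete" the outgoing edge u -> v.
                if (--indegree[v] == 0) ready.add(v);
            }
        }
        return order;
    }

    // The whole graph gets deleted exactly when the graph is a DAG.
    static boolean isDag(List<List<Integer>> adj) {
        return topologicalOrder(adj).size() == adj.size();
    }
}
```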
Graph Searching • Works on directed and undirected graphs • You have a start vertex which you visit first • You want to visit all vertices reachable from the start vertex • For directed graphs, depending on your start vertex, some vertices may not be reachable • You can traverse an edge from an already visited vertex to another vertex
Graph Searching • Why is choosing any path on a graph risky? • Cycles!!! • Could traverse a cycle forever • Need to keep track of vertices already visited • No cycles if you do not visit a vertex twice • Might also help to keep track of all unvisited vertices you can visit from a visited vertex
Graph Searching • Add the start vertex to the collection of vertices to visit • Pick a vertex from the collection to visit • If you have already visited it, do nothing • If you have not visited it: • Visit that vertex • Follow its edges to neighboring vertices • Add unvisited neighboring vertices to the set to visit • (You may add the same unvisited vertex twice) • Repeat until there are no more vertices to visit
Graph Searching • Running time analysis • Visit each vertex only once • When you visit a vertex, you traverse its edges • You traverse all edges once on a directed graph • Twice on an undirected graph • At worst, you add a new vertex to the collection to visit for each edge (collection has size of O(m)) • Running time is at least proportional to n + m • Actual running time depends on the cost to add/delete vertices to/from the collection of vertices to visit
Graph Searching • Depth-first search and breadth-first search are two graph searching algorithms • DFS pushes vertices to visit onto a stack • Examines a vertex by popping it off the stack • BFS uses a queue instead • Both have O(n + m) running time • Push/enqueue and pop/dequeue have O(1) time
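The generic search loop can be sketched once, with a flag that chooses between stack and queue behaviour. The adjacency-list representation and the names below are assumptions, and a recursive DFS is an equally common formulation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class GraphSearchDemo {
    // Returns the vertices reachable from start, in the order they are visited.
    // useStack = true gives DFS (last in, first out); false gives BFS (first in, first out).
    static List<Integer> search(List<List<Integer>> adj, int start, boolean useStack) {
        boolean[] visited = new boolean[adj.size()];
        Deque<Integer> toVisit = new ArrayDeque<>();
        List<Integer> order = new ArrayList<>();
        toVisit.addLast(start);
        while (!toVisit.isEmpty()) {
            int u = useStack ? toVisit.removeLast() : toVisit.removeFirst();
            if (visited[u]) continue; // the same vertex may have been added twice
            visited[u] = true;
            order.add(u);
            for (int v : adj.get(u)) {
                if (!visited[v]) toVisit.addLast(v);
            }
        }
        return order;
    }
}
```

ArrayDeque gives O(1) push/enqueue and pop/dequeue, so both variants run in O(n + m) time.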
Graph Searching: DFS • [Figure: DFS on an example graph with vertices A, B, C, D, E; stack entries shown as B-E, E-D, ∅-A, A-B, B-C]
Graph Searching: BFS • [Figure: BFS on an example graph with vertices A, B, C, D, E; queue entries shown as E-D, C-D, B-E, ∅-A, B-C, A-B]
Minimum Spanning Tree • MSTs apply to undirected graphs • Take only some of the edges in the graph • Spanning – all vertices connected together • Tree – connected, with no cycles • For all spanning trees, m = n – 1 • In an unweighted graph, every spanning tree is an MST • Need an algorithm to find the MST of a weighted graph
Minimum Spanning Tree • A connected component has a path between every pair of vertices in that component • Idea: find two components that are not connected to each other; connect them • Pick the smallest edge between two such components • This is a greedy strategy, but it works
Minimum Spanning Trees • Start with a graph with no edges • n connected components, n trees • Add edges between unconnected components • Forms a bigger tree • What if you add an edge between two vertices in the same component? • Cycles!!!
Minimum Spanning Trees • Kruskal’s algorithm • Process edges from least to greatest • Either an edge connects two different components or it connects a component to itself • Add an edge only in the former case • Picks smallest edge between two components • O(m log m) time to sort the edges • Also need the union-find structure to keep track of components, but it does not change the running time
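A sketch of Kruskal's algorithm that returns only the total weight of the tree. The edge encoding as {u, v, weight} triples, the simple union-find with path halving and all names are assumptions made for this example.

```java
import java.util.Arrays;

class KruskalDemo {
    // Each edge is {u, v, weight}; vertices are numbered 0..n-1.
    // Returns the total weight of the edges that Kruskal's algorithm adds.
    static int mstWeight(int n, int[][] edges) {
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[2], b[2])); // least to greatest
        int[] parent = new int[n]; // union-find: every vertex starts as its own component
        for (int i = 0; i < n; i++) parent[i] = i;
        int total = 0;
        for (int[] e : sorted) {
            int ru = find(parent, e[0]);
            int rv = find(parent, e[1]);
            if (ru != rv) {      // the edge connects two different components
                parent[ru] = rv; // merge them
                total += e[2];
            }                    // otherwise the edge would create a cycle, so skip it
        }
        return total;
    }

    // Finds the representative of x's component, shortening the path as it goes.
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }
}
```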
Minimum Spanning Trees • [Figure: weighted undirected graph on vertices A, B, C, D, E, F, G with edge weights 1, 2, 3, 4, 5, 7, 8, 9, 10, 12]
Minimum Spanning Trees • Prim’s algorithm • Graph search algorithm, builds up a spanning tree from one root vertex • Like BFS, but it uses a priority queue • Priority is the weight of the edge to the vertex • Also need to keep track of which edge we used • Always picks smallest edge to an unvisited vertex • Size of heap is O(m); running time is O(m log m)
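A sketch of Prim's algorithm in the same style as the graph search loop, with a priority queue keyed on edge weight. For brevity it returns only the total weight and does not record which edge was used, which a full version would need. The adjacency encoding and the helper undirected are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class PrimDemo {
    // adj.get(u) holds a {v, weight} pair for each undirected edge u-v.
    // Returns the total weight of the spanning tree grown from root.
    static int mstWeight(List<List<int[]>> adj, int root) {
        boolean[] visited = new boolean[adj.size()];
        // Queue entries are {vertex, weight of the edge used to reach it}.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        pq.add(new int[] {root, 0});
        int total = 0;
        while (!pq.isEmpty()) {
            int[] top = pq.poll(); // smallest edge leading out of the tree so far
            int u = top[0];
            if (visited[u]) continue; // already in the tree
            visited[u] = true;
            total += top[1];
            for (int[] e : adj.get(u)) {
                if (!visited[e[0]]) pq.add(new int[] {e[0], e[1]});
            }
        }
        return total;
    }

    // Hypothetical helper: builds adjacency lists from {u, v, weight} triples.
    static List<List<int[]>> undirected(int n, int[][] edges) {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(new int[] {e[1], e[2]});
            adj.get(e[1]).add(new int[] {e[0], e[2]});
        }
        return adj;
    }
}
```

Each edge can put at most two entries in the queue, which is where the O(m) heap size and O(m log m) running time come from.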
Minimum Spanning Trees • [Figure: weighted undirected graph on vertices A, B, C, D, E, F, G] • Edges and weights shown: C-D 3, A-C 2, ∅-A 0, C-B 4, A-B 5, B-E 7, D-E 8, E-G 9, G-F 1, D-F 10, D-G 12