Symbol Tables and Hashing: Review and Implementation Details

Learn about symbol tables, hash table implementations, hashing functions, separate chaining vs. linear probing, and more in data structures and algorithms.

balvarado

Presentation Transcript


  1. Data Structures and Algorithms I Day 19, 11/8/11 Review Chapters 3 and 4 CMP 338

  2. Symbol Table • A symbol table is a mapping from Keys to Values • Conventions: • (At most) one Value per Key • null is not a legal Key • null is not a legal Value • Keys should be immutable • Shorthand implementations: • delete(Key k) { put(k, null); } • contains(Key k) { return get(k) != null; } • isEmpty() { return size() == 0; }

  3. Symbol Table API • public class SymbolTable<Key, Value> • SymbolTable() • void put(Key k, Value v) • Value get(Key k) • void delete(Key k) • boolean contains(Key k) • boolean isEmpty() • int size() • Iterable<Key> keys()
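The conventions and shorthand implementations on slides 2 and 3 could be sketched as a Java interface with default methods. The names `SymbolTableApi` and `MapBackedST` are assumptions made for this illustration (the slides declare a class `SymbolTable`), and the `HashMap`-backed class exists only to exercise the defaults.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface mirroring the slide's API.
interface SymbolTableApi<Key, Value> {
    void put(Key k, Value v);   // by convention, put(k, null) removes k
    Value get(Key k);           // returns null when k is absent
    int size();
    Iterable<Key> keys();

    // Shorthand implementations from slide 2.
    default void delete(Key k) { put(k, null); }
    default boolean contains(Key k) { return get(k) != null; }
    default boolean isEmpty() { return size() == 0; }
}

// Throwaway implementation backed by java.util.HashMap (an assumption, not from the slides).
class MapBackedST<Key, Value> implements SymbolTableApi<Key, Value> {
    private final Map<Key, Value> map = new HashMap<>();

    public void put(Key k, Value v) {
        if (k == null) throw new IllegalArgumentException("null is not a legal Key");
        if (v == null) map.remove(k);   // null Value means "delete"
        else map.put(k, v);
    }

    public Value get(Key k) { return map.get(k); }
    public int size() { return map.size(); }
    public Iterable<Key> keys() { return map.keySet(); }
}
```

Because null is never stored as a Value, `get(k) != null` is a safe test for membership, which is what makes the `contains` shorthand work.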

  4. Sequential Search Implementation • public Value get(Key k) • for (Node n=first; n!=null; n=n.next) • if (k.equals(n.key)) • return n.val; • return null; • public void put(Key k, Value v) • for (Node n=first; n!=null; n=n.next) • if (k.equals(n.key)) • { n.val = v; return; } • first = new Node(k, v, first);
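A compilable version of the sequential search table might look like the following. The class name `LinkedListST` and the `size()` counter are assumptions added for the sketch; deletion via `put(k, null)` is not handled here, just as on the slide.

```java
// Unordered linked-list symbol table following the slide's get/put loops.
class LinkedListST<Key, Value> {
    private Node first;   // head of the list
    private int n;        // number of key-value pairs

    private class Node {
        Key key; Value val; Node next;
        Node(Key key, Value val, Node next) {
            this.key = key; this.val = val; this.next = next;
        }
    }

    public Value get(Key k) {
        for (Node x = first; x != null; x = x.next)
            if (k.equals(x.key)) return x.val;   // search hit
        return null;                             // search miss
    }

    public void put(Key k, Value v) {
        for (Node x = first; x != null; x = x.next)
            if (k.equals(x.key)) { x.val = v; return; }   // overwrite existing key
        first = new Node(k, v, first);                    // new key goes at the front
        n++;
    }

    public int size() { return n; }
}
```

Both operations walk the whole list in the worst case, so each costs time proportional to the number of keys.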

  5. Hash Table Implementations • Hash function: hash() • Maps Keys to small ints • Java hashCode() maps Objects to 32-bit ints • Collision resolution: • Strategy for handling two Keys mapped to the same int • Closed-addressing (e.g., separate chaining) • Array entries point to secondary symbol table • Open-addressing (e.g., linear probing) • All Key-Value pairs stored in the same array

  6. Hash Functions • Uniform hashing assumption: • hash: Key → 0..M-1 is uniform and independent • Implementing hashCode() for user-defined types • Combine hashCodes of each field (array entry) • Start with a small prime (e.g., 17) • Multiply accumulating hash by small prime (e.g., 31) • Add hashCode() of next field (or array entry) • Box primitive values (e.g., ((Integer) 14).hashCode()) • Requirement: x.equals(y) => x.hashCode()==y.hashCode() • Hash function: hash(Key k) • return (k.hashCode() & 0x7FFFFFFF) % M;
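As an illustration of the 17/31 recipe, here is a minimal sketch for a made-up value type. `DatedTransaction` and its fields are hypothetical; `Integer.hashCode(int)` and `Double.hashCode(double)` (Java 8+) are used in place of explicit boxing.

```java
// Hypothetical immutable value type used only to illustrate the recipe.
final class DatedTransaction {
    private final String who;
    private final int day;
    private final double amount;

    DatedTransaction(String who, int day, double amount) {
        this.who = who; this.day = day; this.amount = amount;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof DatedTransaction)) return false;
        DatedTransaction t = (DatedTransaction) o;
        return day == t.day
            && Double.compare(amount, t.amount) == 0
            && who.equals(t.who);
    }

    @Override public int hashCode() {
        int h = 17;                             // start with a small prime
        h = 31 * h + who.hashCode();            // multiply, then add next field
        h = 31 * h + Integer.hashCode(day);
        h = 31 * h + Double.hashCode(amount);
        return h;
    }

    // Clear the sign bit first, then reduce to 0..m-1.
    // The parentheses matter: % binds more tightly than &.
    static int hash(Object key, int m) {
        return (key.hashCode() & 0x7FFFFFFF) % m;
    }
}
```

Masking with `0x7FFFFFFF` rather than calling `Math.abs` avoids a negative index when `hashCode()` returns `Integer.MIN_VALUE`.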

  7. Separate Chaining Hash Table • SeparateChainingHashTable(int size) • M = size; • for (int i=0; i<M; i++) • st[i] = new SequentialSearchST(); • public Value get(Key k) • return (Value) st[hash(k)].get(k); • public void put(Key k, Value v) • st[hash(k)].put(k, v); • private int hash(Key k) • return (k.hashCode() & 0x7FFFFFFF) % M;
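A self-contained sketch of separate chaining follows. Instead of an array of `SequentialSearchST` objects as on the slide, it inlines each chain as a linked list of nodes so the block compiles on its own; the class name `ChainingHashST` is an assumption, and the constructor argument must be positive.

```java
// Separate chaining with the chains inlined as linked lists of nodes.
class ChainingHashST<Key, Value> {
    // Non-generic node so that an array of nodes can be created directly.
    private static class Node {
        final Object key; Object val; Node next;
        Node(Object key, Object val, Node next) {
            this.key = key; this.val = val; this.next = next;
        }
    }

    private final int m;          // number of chains
    private final Node[] chains;  // chains[i] is the head of chain i
    private int n;                // number of key-value pairs

    ChainingHashST(int m) {
        this.m = m;
        this.chains = new Node[m];
    }

    private int hash(Key k) { return (k.hashCode() & 0x7FFFFFFF) % m; }

    @SuppressWarnings("unchecked")
    public Value get(Key k) {
        for (Node x = chains[hash(k)]; x != null; x = x.next)
            if (k.equals(x.key)) return (Value) x.val;
        return null;
    }

    public void put(Key k, Value v) {
        int i = hash(k);
        for (Node x = chains[i]; x != null; x = x.next)
            if (k.equals(x.key)) { x.val = v; return; }
        chains[i] = new Node(k, v, chains[i]);   // prepend to the chain
        n++;
    }

    public int size() { return n; }
}
```

With a single chain (`m = 1`) this degenerates into sequential search, which is one way to see why performance degrades gracefully as chains grow.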

  8. Linear Probing Hash Table • public Value get(Key k) • for (int i=hash(k); keys[i]!=null; i=(i+1) % M) • if (keys[i].equals(k)) return vals[i]; • return null; • public void put(Key k, Value v) • int i; • for (i=hash(k); keys[i]!=null; i=(i+1) % M) • if (keys[i].equals(k)) • { vals[i] = v; return; } • keys[i]=k; vals[i]=v;
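The probing loops above only terminate if the array always has an empty slot, so a working table needs resizing. The following sketch adds a doubling policy (resize when half full); that policy, the class name `ProbingHashST`, and the `size()` counter are assumptions not stated on the slide.

```java
// Linear probing with array doubling; keys and values live in parallel arrays.
class ProbingHashST<Key, Value> {
    private int m;          // table capacity
    private int n;          // number of key-value pairs
    private Key[] keys;
    private Value[] vals;

    @SuppressWarnings("unchecked")
    ProbingHashST(int capacity) {
        m = Math.max(2, capacity);
        keys = (Key[]) new Object[m];
        vals = (Value[]) new Object[m];
    }

    private int hash(Key k) { return (k.hashCode() & 0x7FFFFFFF) % m; }

    public Value get(Key k) {
        for (int i = hash(k); keys[i] != null; i = (i + 1) % m)
            if (keys[i].equals(k)) return vals[i];
        return null;
    }

    public void put(Key k, Value v) {
        if (n >= m / 2) resize(2 * m);   // keep the table at most half full
        int i;
        for (i = hash(k); keys[i] != null; i = (i + 1) % m)
            if (keys[i].equals(k)) { vals[i] = v; return; }
        keys[i] = k;
        vals[i] = v;
        n++;
    }

    // Rehash every pair into a larger table, then adopt its arrays.
    private void resize(int capacity) {
        ProbingHashST<Key, Value> t = new ProbingHashST<>(capacity);
        for (int i = 0; i < m; i++)
            if (keys[i] != null) t.put(keys[i], vals[i]);
        keys = t.keys;
        vals = t.vals;
        m = t.m;
    }

    public int size() { return n; }
}
```

Note the comparison uses `equals` rather than `==`, and the wrap-around is written `(i + 1) % m`; without the parentheses the index would never wrap.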

  9. Separate Chaining vs. Linear Probing • Separate Chaining • Easier to implement • Performance degrades gracefully • Clustering less sensitive to poor hash() • Linear probing • Wastes less space • However, need to implement array resizing • Better cache performance

  10. Hashing vs. Balanced Search Trees • Hashing • Simpler to code • No effective alternative for unordered keys • Faster (assuming efficient hash function) • Better system support for Java Strings • Balanced search trees • Stronger performance guarantees • Support for ordered operations • Easier to implement compareTo() correctly than equals() and hashCode()

  11. Java Symbol Tables • Map<K, V> Interface • TreeMap<K, V> implements SortedMap<K, V> • O(lg N) order operations (worst-case) • HashMap<K, V> implements Map<K, V> • O(1) put() and get() operations (average-case) • Set<K> Interface • TreeSet<K> implements SortedSet<K> • HashSet<K> implements Set<K>
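A short usage sketch of the library classes listed above: a `TreeMap` for counting words with ordered access, and a `HashSet` for counting distinct words. The class and method names (`JavaMapsDemo`, `wordCounts`, `distinctCount`) are made up for the example; the `java.util` calls are standard.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.TreeMap;

class JavaMapsDemo {
    // TreeMap keeps keys sorted, so ordered operations are available afterwards.
    static TreeMap<String, Integer> wordCounts(String[] words) {
        TreeMap<String, Integer> counts = new TreeMap<>();
        for (String w : words)
            counts.merge(w, 1, Integer::sum);   // insert 1 or add 1 to the existing count
        return counts;
    }

    // HashSet gives average-case constant-time membership, with no ordering.
    static int distinctCount(String[] words) {
        return new HashSet<>(Arrays.asList(words)).size();
    }
}
```

Ordered queries such as `firstKey()` and `ceilingKey(k)` exist on `TreeMap` but not on `HashMap`, which is the practical difference between the two families.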

  12. Graphs (Mathematics) • (Directed) Graph <V, E> • V is a set of vertices • E is a set of edges • E ⊆ V × V • DAG • Directed Acyclic Graph • Undirected Graph • E is symmetric • Edge-Weighted Graph • weight: E → R

  13. Graph Vocabulary • A path is a sequence of edges connecting vertices • simple path: no vertex appears twice • A cycle is a path from a vertex to itself • simple cycle: removing final edge leaves a simple path • A connected component (undirected graph): • A maximal set of connected vertices • A strongly connected component (directed graph): • A maximal set of vertices such that there is a directed path from any vertex to any other vertex

  14. Depth-First Search • Each node visited exactly once • Visit each neighbor during visit to a node • void visit (Node n) • if (visited(n)) return; • mark n visited • do stuff • for each neighbor m of n • visit(m) • maybe do more stuff • Trace: (1(2(3)(4(5)(6))(7))(8(9)(A(B)(C))))(D(E)(F)) • Example: ConnectedComponent.java
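The slide names ConnectedComponent.java as its example but does not show it. The following is an independent sketch of the same idea, not the course's file: vertices are assumed to be the integers 0..V-1, the graph is an adjacency list, and the class name `ComponentLabeler` is made up.

```java
import java.util.List;

// Labels each vertex of an undirected graph with its connected component.
class ComponentLabeler {
    private final boolean[] visited;
    private final int[] id;   // id[v] = component number of v
    private int count;        // number of components found so far

    ComponentLabeler(List<List<Integer>> adj) {
        visited = new boolean[adj.size()];
        id = new int[adj.size()];
        for (int v = 0; v < adj.size(); v++) {
            if (!visited[v]) {
                visit(adj, v);   // one DFS per component
                count++;
            }
        }
    }

    private void visit(List<List<Integer>> adj, int v) {
        if (visited[v]) return;
        visited[v] = true;       // mark v visited
        id[v] = count;           // "do stuff": record the component
        for (int w : adj.get(v))
            visit(adj, w);
    }

    int count() { return count; }
    boolean connected(int v, int w) { return id[v] == id[w]; }
}
```

The recursion depth can reach the number of vertices, so very large graphs may need an explicit stack instead.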

  15. Breadth-First Search • Each node visited exactly once • Schedule visit to each neighbor during visit to a node • void visit (Node n) • if (visited(n)) return; • mark n visited • do stuff • for each neighbor m of n • put m on queue of nodes to visit • maybe do more stuff • Trace: (1)(2)(3)(6)(7)(8)(4)(5)(9)(A)(C)(B)(D)(E)(F) • Example: ShortestPath.java
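As a sketch of how breadth-first search yields shortest paths in an unweighted graph (this is not the course's ShortestPath.java), the code below computes hop counts from a source. It assumes integer vertices 0..V-1 and marks a vertex when it is enqueued rather than when it is dequeued, a common variant of the slide's pseudocode that keeps duplicates out of the queue.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class BfsDistances {
    // Returns the number of edges on a shortest path from source to each vertex,
    // or -1 for vertices that cannot be reached.
    static int[] distancesFrom(List<List<Integer>> adj, int source) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);               // -1 doubles as "not yet visited"
        Queue<Integer> queue = new ArrayDeque<>();
        dist[source] = 0;
        queue.add(source);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : adj.get(v)) {
                if (dist[w] == -1) {         // first time w is seen
                    dist[w] = dist[v] + 1;
                    queue.add(w);
                }
            }
        }
        return dist;
    }
}
```

Because vertices leave the queue in order of increasing distance, the first time a vertex is reached is along a shortest path.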

  16. Spanning Tree (Undirected Graph) • A tree in an undirected graph: • A set of connected edges not containing a cycle • A spanning tree of an undirected graph: • A tree that connects each vertex of the graph • A spanning forest of an undirected graph: • Set of spanning trees of the connected components • A minimum spanning tree (MST) of a weighted graph: • The spanning tree with minimum total weight

  17. Prim's MST Algorithm • mark any node • while there exists an edge from marked to unmarked • pick the shortest such edge • add the edge to the MST • mark the unmarked vertex • Use priority queue to keep track of the edges • Optimization: only 1 edge per unmarked node • Need to be able to reduce a key in a priority queue • Running-time: ||E|| + ||V|| lg ||V||
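A minimal sketch of the unoptimized ("lazy") form of Prim's algorithm is shown below. It keeps every candidate edge in a `java.util.PriorityQueue` and skips stale entries, so it runs in time proportional to ||E|| lg ||E|| rather than the bound on the slide, which depends on a priority queue that supports reducing a key. Integer vertices, integer weights, and the `{neighbor, weight}` array encoding are assumptions of the example.

```java
import java.util.List;
import java.util.PriorityQueue;

class LazyPrimSketch {
    // adj.get(v) holds {neighbor, weight} pairs, with each undirected edge listed
    // in both directions. Returns the total weight of the MST of the component
    // that contains vertex 0.
    static long mstWeight(List<List<int[]>> adj) {
        if (adj.isEmpty()) return 0;
        boolean[] marked = new boolean[adj.size()];
        // Entries are {vertex, weight of the edge that reaches it}, ordered by weight.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        pq.add(new int[] {0, 0});            // "mark any node": start from vertex 0
        long total = 0;
        while (!pq.isEmpty()) {
            int[] e = pq.remove();           // shortest edge leaving the marked set
            int v = e[0];
            if (marked[v]) continue;         // both endpoints already marked: stale
            marked[v] = true;
            total += e[1];                   // add the edge to the MST
            for (int[] out : adj.get(v))
                if (!marked[out[0]]) pq.add(out);
        }
        return total;
    }
}
```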

  18. Kruskal's MST Algorithm • while there exists an unconsidered edge • consider the shortest unconsidered edge • if it would not create a cycle • add the edge to MST • Use priority queue to keep track of the edges • Cycle detection: Disjoint Union / Find algorithm • Edge will create a cycle iff both end-points are in the same set • Adding an edge to MST requires union of two sets • Running-time: ||E|| lg ||E||
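Kruskal's algorithm with union-find cycle detection could be sketched as follows. For brevity the edges are sorted once instead of being drawn from a priority queue, and the union-find uses path halving without union by rank; both are simplifications chosen for the example. Integer vertices and the `{u, v, weight}` encoding are assumptions.

```java
import java.util.Arrays;

class KruskalSketch {
    // Find the representative of v's set, halving the path along the way.
    private static int find(int[] parent, int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];
            v = parent[v];
        }
        return v;
    }

    // edges[i] = {u, v, weight}. Returns the total weight of a minimum
    // spanning forest of the graph on vertices 0..n-1.
    static long mstWeight(int n, int[][] edges) {
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[2], b[2]));
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;   // every vertex starts in its own set
        long total = 0;
        for (int[] e : sorted) {                     // shortest unconsidered edge first
            int ru = find(parent, e[0]);
            int rv = find(parent, e[1]);
            if (ru == rv) continue;                  // same set: edge would create a cycle
            parent[ru] = rv;                         // union the two sets
            total += e[2];
        }
        return total;
    }
}
```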

  19. Shortest Path Algorithms • Shortest paths in edge-weighted directed graphs • Problem ill-formed if any negative cycle is reachable • If graph is a DAG • Relax nodes in topological order • O(||E|| + ||V||) • If all edge weights are non-negative (Dijkstra) • Mark and relax nearest unmarked node • O(||E|| + ||V|| lg ||V||) • General edge-weighted directed graphs (Bellman-Ford) • Repeat up to ||V|| times: • Relax nodes changed in previous iteration • O(||E|| ||V||)
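For the general case, a simple form of Bellman-Ford is sketched below. It relaxes every edge on every pass instead of only the nodes changed in the previous iteration, which keeps the code short while preserving the O(||E|| ||V||) bound. Integer vertices, integer weights, and the `null` return for a reachable negative cycle are choices made for this example.

```java
import java.util.Arrays;

class BellmanFordSketch {
    // edges[i] = {from, to, weight}. Returns shortest distances from source
    // (Long.MAX_VALUE for unreachable vertices), or null if a negative cycle
    // is reachable from source.
    static long[] distances(int n, int[][] edges, int source) {
        final long INF = Long.MAX_VALUE;
        long[] dist = new long[n];
        Arrays.fill(dist, INF);
        dist[source] = 0;
        for (int pass = 0; pass < n; pass++) {       // up to ||V|| passes
            boolean changed = false;
            for (int[] e : edges) {
                // Relax edge e[0] -> e[1], skipping unreachable tails to avoid overflow.
                if (dist[e[0]] != INF && dist[e[0]] + e[2] < dist[e[1]]) {
                    dist[e[1]] = dist[e[0]] + e[2];
                    changed = true;
                }
            }
            if (!changed) return dist;               // distances have settled
        }
        return null;   // still changing on the ||V||-th pass: negative cycle
    }
}
```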

  20. Relaxation • void relax (Node n) • for Node m in edgeFrom(n) • relax(n, m) • void relax (Node n, Node m) • if (dist(s, n) + w(n, m) < dist(s, m)) • dist(s, m) = dist(s, n) + w(n, m) • parent(m) = n • dist is a Map<Node, Double> • parent is a Map<Node, Node>
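Edge relaxation with the two maps named on the slide might look like this. Nodes are represented as `String` labels and the edge weight is passed in directly; both are assumptions made to keep the sketch self-contained, as is the class name `Relaxer`.

```java
import java.util.HashMap;
import java.util.Map;

class Relaxer {
    final Map<String, Double> dist = new HashMap<>();     // best known distance from the source
    final Map<String, String> parent = new HashMap<>();   // previous node on that best path

    Relaxer(String source) {
        dist.put(source, 0.0);
    }

    // Relaxes the edge n -> m of weight w. Returns true if dist(m) improved.
    boolean relax(String n, String m, double w) {
        // Nodes with no entry yet are treated as infinitely far away.
        double dn = dist.getOrDefault(n, Double.POSITIVE_INFINITY);
        double dm = dist.getOrDefault(m, Double.POSITIVE_INFINITY);
        if (dn + w < dm) {
            dist.put(m, dn + w);
            parent.put(m, n);
            return true;
        }
        return false;
    }
}
```

Following `parent` links backwards from a node recovers a shortest path to it once the distances have settled.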

  21. Dijkstra's Shortest Path Algorithm • mark the source node • while there exists an edge from marked to unmarked • pick closest unmarked node n to source • pick shortest edge from marked to n • add edge to Shortest-Path tree • mark and relax n • Use priority queue to order unmarked nodes • Running-time: ||E|| + ||V|| lg ||V||
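A minimal Dijkstra sketch using `java.util.PriorityQueue` follows. Since that class has no operation for reducing a key, the code inserts a fresh entry whenever a distance improves and ignores stale ones, giving roughly ||E|| lg ||E|| time instead of the bound on the slide. Integer vertices, non-negative integer weights, and the `{neighbor, weight}` encoding are assumptions of the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class DijkstraSketch {
    // adj.get(v) holds {neighbor, weight} pairs for the directed edges leaving v.
    // Returns shortest distances from source; Long.MAX_VALUE means unreachable.
    static long[] distances(List<List<int[]>> adj, int source) {
        long[] dist = new long[adj.size()];
        Arrays.fill(dist, Long.MAX_VALUE);
        boolean[] marked = new boolean[adj.size()];
        // Entries are {vertex, distance at insertion time}, ordered by distance.
        PriorityQueue<long[]> pq = new PriorityQueue<>((a, b) -> Long.compare(a[1], b[1]));
        dist[source] = 0;
        pq.add(new long[] {source, 0});
        while (!pq.isEmpty()) {
            int v = (int) pq.remove()[0];    // closest unmarked node
            if (marked[v]) continue;         // stale entry for an already marked node
            marked[v] = true;
            for (int[] e : adj.get(v)) {     // relax v
                int w = e[0];
                if (dist[v] + e[1] < dist[w]) {
                    dist[w] = dist[v] + e[1];
                    pq.add(new long[] {w, dist[w]});
                }
            }
        }
        return dist;
    }
}
```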
