COSC 1030 Lecture 10 Hash Table
Topics • Table • Hash Concept • Hash Function • Resolve collision • Complexity Analysis
Table • Table • A collection of entries • Entry :<key, info> • Insert, search and delete • Update, and retrieve • Array representation • Indexed • Maps key to index
Hash Table • Hash Table • A table • Key range >> table size • Many-to-one mapping (hashing) • Indexed – hash code as index • Tabbed Address Book • Map names to A:Z • Multiple names start with same letter • Same tab, sequential slots
Hash Table ADT
    interface Hashtable {
        void insert(Item anItem);
        Item search(Key aKey);
        boolean remove(Key aKey);
        boolean isFull();
        boolean isEmpty();
    }
Hash Function • Maps keys to indices evenly • For any n in N, hash(n) = n mod M, where M is the size of the hash table • hash(k*M + n) = n, where 0 ≤ n < M and k is an integer • Map the key to an integer first if it is not an integer • A:Z → 0:25 • String s (first character least significant) → h(s[0]) + h(s[1])*26 + … + h(s[n-1])*26^(n-1) • String s (first character most significant) → h(s[0])*26^(n-1) + … + h(s[n-1])
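Hash Function
String s → h(s[0])*26^(n-1) + … + h(s[n-1])
    // Maps a letter to 0..25 (A:Z → 0:25).
    int toInt(char ch) {
        return Character.toUpperCase(ch) - 'A';
    }

    // Horner's rule; note that c can overflow int for long strings.
    int toInt(String s) {
        assert(s != null);
        int c = 0;
        for (int i = 0; i < s.length(); i++) {
            c = c * 26 + toInt(s.charAt(i));
        }
        return c;
    }

    int hash(String s) {
        return hash(toInt(s)); // hash(int n) = n mod M
    }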
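The string hash above can be tried out with a minimal sketch like the following. The class name `StringHash`, the helper `letterToInt`, and the table size `M = 7` are assumptions made for illustration, not part of the lecture code. This variant reduces modulo M at every step, which gives the same result as reducing once at the end but keeps the running value from overflowing.

```java
// Sketch only: StringHash, letterToInt and M = 7 are illustrative assumptions.
class StringHash {
    static final int M = 7; // table size, assumed prime

    // Maps 'A'..'Z' (case-insensitive) to 0..25; other characters are rejected.
    static int letterToInt(char ch) {
        char up = Character.toUpperCase(ch);
        if (up < 'A' || up > 'Z') {
            throw new IllegalArgumentException("not a letter: " + ch);
        }
        return up - 'A';
    }

    // Horner's rule for h(s[0])*26^(n-1) + ... + h(s[n-1]),
    // reduced mod M after every step so the value stays below M.
    static int hash(String s) {
        int c = 0;
        for (int i = 0; i < s.length(); i++) {
            c = (c * 26 + letterToInt(s.charAt(i))) % M;
        }
        return c;
    }
}
```

For instance, "BA" maps to 1*26 + 0 = 26, and 26 mod 7 = 5.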
Example • Table[7] – HASHTABLE_SIZE = 7 • Insert ‘B2’, ‘H7’, ‘M12’, ‘D4’, ‘Z26’ into the table → hash codes 2, 0, 5, 4, 5 • Collision • The slot indexed by the hash code is already occupied • A simple solution • Sequentially decrease the index until an empty slot is found or the table is full
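The example can be replayed with a short sketch. It assumes that the numeric part of each entry (2, 7, 12, 4, 26) is the key, and it uses 0 to mark an empty slot, so it only works for positive keys. The class and method names are hypothetical.

```java
// Sketch only: assumes positive integer keys; 0 marks an empty slot.
class LinearProbeExample {
    // Inserts each key at key % size, stepping the index down by one
    // (with wrap-around) while the slot is occupied.
    static int[] fill(int size, int[] keys) {
        int[] table = new int[size];
        for (int key : keys) {
            int index = key % size;
            int probes = 0;
            while (table[index] != 0 && probes < size) {
                index = (index - 1 + size) % size;
                probes++;
            }
            if (table[index] == 0) {
                table[index] = key; // the key is dropped if the table is full
            }
        }
        return table;
    }
}
```

With a table of 7 slots, key 26 hashes to 5 (taken by 12), steps down to 4 (taken by 4), and lands in slot 3.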
Collision Probability • How often may a collision occur? • Insert 100 random numbers into a table of 200 slots • P(at least one collision) = $1 - \prod_{i=0}^{99} \frac{200 - i}{200} = 1 - 6.66 \times 10^{-14} > 0.99999999999993$ • Load factor → probability of at least one collision • 100/200 = 0.5 = 50% → 0.99999999999993 • 20/200 = 0.1 = 10% → 0.63 • 10/200 = 0.05 = 5% → 0.2 • The default load factor of Java's Hashtable is 75%
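The probabilities on this slide can be reproduced with a few lines of Java. This is a sketch that assumes keys are hashed uniformly and independently; the class and method names are made up for illustration.

```java
// Sketch only: assumes each key lands in a uniformly random slot.
class CollisionOdds {
    // Probability that at least one collision occurs when n keys are hashed
    // into m slots: 1 - product over i = 0..n-1 of (m - i) / m.
    static double atLeastOneCollision(int n, int m) {
        double noCollision = 1.0;
        for (int i = 0; i < n; i++) {
            noCollision *= (double) (m - i) / m;
        }
        return 1.0 - noCollision;
    }
}
```

Calling `atLeastOneCollision(20, 200)` gives roughly 0.63 and `atLeastOneCollision(10, 200)` roughly 0.2, in line with the load factor figures above.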
Primary Cluster • The biggest solid block of occupied slots in the hash table • Clusters join as they grow • The bigger the primary cluster is, the more likely it is to grow • Distribute keys evenly to avoid a primary cluster
Probe Method • What can we do when a collision occurs? • Use a consistent way of searching for an empty slot • Probe • Linear probe – decrease the index by 1, wrapping around to M – 1 after 0 • Double hash – use the quotient to calculate the decrement • Max(1, (Key / M) % M) • Separate chaining – a linked list stores the colliding items • Hash tree – link to another hash table (A4)
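Linear and double-hash probing are coded on later slides, but separate chaining is not, so here is a minimal sketch of it. It assumes integer keys with no attached info, and the class `ChainedTable` and its methods are hypothetical names rather than the course's API.

```java
// Sketch only: integer keys, no attached info; names are illustrative.
class ChainedTable {
    // One node of a singly linked chain of colliding keys.
    private static final class Node {
        final int key;
        final Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    private final Node[] heads; // heads[i] is the chain for hash code i

    ChainedTable(int size) { heads = new Node[size]; }

    private int hash(int key) { return Math.floorMod(key, heads.length); }

    // Prepends the key to its chain unless it is already stored.
    void insert(int key) {
        if (contains(key)) return;
        int h = hash(key);
        heads[h] = new Node(key, heads[h]);
    }

    // Walks the chain for the key's hash code.
    boolean contains(int key) {
        for (Node n = heads[hash(key)]; n != null; n = n.next) {
            if (n.key == key) return true;
        }
        return false;
    }

    // Number of keys stored under one hash code.
    int chainLength(int slot) {
        int length = 0;
        for (Node n = heads[slot]; n != null; n = n.next) length++;
        return length;
    }
}
```

Colliding keys never occupy another key's slot here, so chains grow instead of clusters, and the table is never "full" in the open-addressing sense.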
Probe sequence coverage • Ensure the probe sequence covers the whole table • Utilizes the whole table • Even distribution • M and the probe decrement must be relatively prime • No common factor except 1 • Make M a prime number • M and any decrement (< M) are then relatively prime
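The coverage claim can be checked empirically. The sketch below counts how many slots a fixed-decrement probe sequence reaches before it returns to its start; the class and method names are assumptions, and the decrement is assumed to be at least 1.

```java
// Sketch only: names are illustrative; decrement is assumed to be >= 1.
class ProbeCoverage {
    // Counts the distinct slots visited by repeatedly subtracting the
    // decrement (with wrap-around) until the sequence returns to start.
    static int slotsVisited(int tableSize, int decrement, int start) {
        int index = start;
        int visited = 0;
        do {
            visited++;
            index = (index - decrement % tableSize + tableSize) % tableSize;
        } while (index != start);
        return visited;
    }
}
```

With M = 7, a decrement of 3 reaches all 7 slots. With M = 8, a decrement of 2 reaches only 4 slots, because 8 and 2 share the factor 2. In general the sequence visits M / gcd(M, decrement) slots.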
Probe Method
    void insert(Item item) {
        if (!isFull()) {
            int index = probe(item.key);
            assert(index >= 0 && index < M);
            table[index] = item;
            count++;
        }
    }
Linear Probe Method
    int probe(int key) {
        int hashcode = key % HASHTABLE_SIZE;
        if (table[hashcode] == null) {
            return hashcode;
        } else {
            int index = hashcode;
            do {
                index--;
                if (index < 0) index += HASHTABLE_SIZE; // wrap around
            } while (index != hashcode && table[index] != null);
            if (index == hashcode) return -1; // table is full
            else return index;
        }
    }
Double Hash Probe Method
    int probe(int key) {
        int hashcode = key % HASHTABLE_SIZE;
        if (table[hashcode] == null) {
            return hashcode;
        } else {
            int index = hashcode;
            // The decrement comes from the quotient and must be at least 1.
            int dec = (key / HASHTABLE_SIZE) % HASHTABLE_SIZE;
            dec = Math.max(1, dec);
            do {
                index -= dec;
                if (index < 0) index += HASHTABLE_SIZE; // wrap around
            } while (index != hashcode && table[index] != null);
            if (index == hashcode) return -1; // no empty slot found
            else return index;
        }
    }
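Search Method
    Item search(int key) {
        int hashcode = key % HASHTABLE_SIZE;
        int dec = Math.max(1, (key / HASHTABLE_SIZE) % HASHTABLE_SIZE);
        int probes = 0;
        // Stop at an empty slot, at a match, or after probing every slot.
        while (table[hashcode] != null && probes < HASHTABLE_SIZE) {
            if (table[hashcode].key == key) return table[hashcode];
            hashcode -= dec;
            if (hashcode < 0) hashcode += HASHTABLE_SIZE; // wrap around
            probes++;
        }
        return null; // not found
    }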
Delete Method • Deleting is difficult with open addressing • Emptying a slot breaks the probe chain of the keys stored beyond it • Solution • Set a deleted flag • Search treats a flagged slot as occupied • Insert treats a flagged slot as empty • Flagged slots still contribute to the primary cluster • Separate chaining • Move one item up from the chained structure
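The deleted-flag solution might look like the sketch below. It assumes integer keys, linear probing with a decrement of 1, and a three-valued slot state. The class `TombstoneTable` and its members are hypothetical names, not the lecture's implementation.

```java
// Sketch only: integer keys, linear probing, names are illustrative.
class TombstoneTable {
    private static final int EMPTY = 0, OCCUPIED = 1, DELETED = 2;
    private final int[] keys;
    private final int[] state; // EMPTY, OCCUPIED or DELETED per slot
    private final int size;

    TombstoneTable(int size) {
        this.size = size;
        this.keys = new int[size];
        this.state = new int[size];
    }

    // Returns the slot holding the key, or -1 if it is absent.
    // A DELETED slot is stepped over; only an EMPTY slot ends the search.
    int find(int key) {
        int index = Math.floorMod(key, size);
        for (int probes = 0; probes < size && state[index] != EMPTY; probes++) {
            if (state[index] == OCCUPIED && keys[index] == key) return index;
            index = (index - 1 + size) % size;
        }
        return -1;
    }

    // Stores the key in the first EMPTY or DELETED slot on its probe sequence.
    boolean insert(int key) {
        if (find(key) >= 0) return true; // already present
        int index = Math.floorMod(key, size);
        for (int probes = 0; probes < size; probes++) {
            if (state[index] != OCCUPIED) {
                keys[index] = key;
                state[index] = OCCUPIED;
                return true;
            }
            index = (index - 1 + size) % size;
        }
        return false; // every slot is occupied
    }

    // Flags the slot instead of emptying it, so probe chains stay intact.
    boolean remove(int key) {
        int index = find(key);
        if (index < 0) return false;
        state[index] = DELETED;
        return true;
    }
}
```

In a table of 7 slots, inserting 12, 26 and 5 fills slots 5, 4 and 3. After `remove(26)`, `find(5)` still reaches slot 3 because slot 4 is flagged rather than emptied, and a later `insert(19)` reuses slot 4.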
Efficiency • Successful search • Best case – first hit, one comparison • Average • Half of average length of probe sequence • Load factor dependent • O(1) if load factor < 0.5 • Worst case – longest probe sequence • Load factor dependent • Unsuccessful search • Average - average length of probe sequence • Worst case - longest probe sequence
Advanced Topics • Choosing Hash Functions • Generate hash codes randomly and uniformly • Use all bits of the key • Assume K = b0b1b2b3 • Division • h(k) = k % M; p(k) = max(1, (k / M) % M) • Folding • h(k) = (b1 XOR b3) % M; p(k) = (b0 XOR b2) % M • Middle squaring • h(k) = (b1b2)² • Truncating • h(k) = b3
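The four choices can be compared side by side in a short sketch. It rests on several assumptions the slide does not state: K is a 32-bit int, b0..b3 are its four bytes with b0 the most significant, and the middle-squaring and truncating results are reduced mod M to keep them inside the table. All names are hypothetical.

```java
// Sketch only: assumes b0..b3 are the bytes of a 32-bit key, b0 most significant.
class HashChoices {
    // Extracts byte i (0..3) of the key.
    static int b(int key, int i) {
        return (key >>> (8 * (3 - i))) & 0xFF;
    }

    // Division: h(k) = k % M.
    static int division(int key, int m) {
        return Math.floorMod(key, m);
    }

    // Folding: h(k) = (b1 XOR b3) % M.
    static int folding(int key, int m) {
        return (b(key, 1) ^ b(key, 3)) % m;
    }

    // Middle squaring: square the middle two bytes b1b2 (reduced mod M here).
    static int middleSquare(int key, int m) {
        long mid = (b(key, 1) << 8) | b(key, 2);
        return (int) ((mid * mid) % m);
    }

    // Truncating: keep only the last byte b3 (reduced mod M here).
    static int truncating(int key, int m) {
        return b(key, 3) % m;
    }
}
```

Truncating and folding ignore some bytes of the key, so keys that differ only in those bytes always collide; this is why the slide recommends using all bits of the key.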
Advanced Topics • Hash Tree • Separate chained collision resolution • Recursively hashing the key • [Figure: hash tables linked to further hash tables, forming a tree]
Hash Tree
    void insert(int key, Item item) {
        int h = h(key);
        int k = g(key); // one-to-one mapping Key → Key
        if (table[h] == null) {
            table[h] = item;
        } else {
            if (table[h].link == null)
                table[h].link = new HashTree();
            table[h].link.insert(k, item);
        }
    }
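A runnable variant of the hash tree is sketched below, with a matching search. The choices h(key) = key % M and g(key) = key / M are assumptions, as are the `Entry` class, the String payload, and the restriction to non-negative keys. Each entry stores the reduced key g(key), which, together with the path of slots taken, identifies the original key.

```java
// Sketch only: h = key % M and g = key / M are assumed; keys must be >= 0.
class HashTree {
    static final int M = 7;

    // One slot: the reduced key, its value, and a link to a child table.
    private static final class Entry {
        final int reducedKey;
        String value;
        HashTree link;
        Entry(int reducedKey, String value) {
            this.reducedKey = reducedKey;
            this.value = value;
        }
    }

    private final Entry[] table = new Entry[M];

    void insert(int key, String value) {
        int h = key % M; // slot at this level
        int k = key / M; // key passed down to the next level
        if (table[h] == null) {
            table[h] = new Entry(k, value);
        } else if (table[h].reducedKey == k) {
            table[h].value = value; // same key: update in place
        } else {
            if (table[h].link == null) table[h].link = new HashTree();
            table[h].link.insert(k, value);
        }
    }

    // Follows the same path as insert; returns null if the key is absent.
    String search(int key) {
        int h = key % M;
        int k = key / M;
        Entry e = table[h];
        if (e == null) return null;
        if (e.reducedKey == k) return e.value;
        return e.link == null ? null : e.link.search(k);
    }
}
```

Keys 5, 12 and 19 all hash to slot 5 at the top level; 5 stays there, while 12 and 19 are pushed into the child table under reduced keys 1 and 2.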