The Hash Table Data Structure
Mugurel Ionuț Andreica
Spring 2012
Operations
• put(key, value)
  • Inserts the pair (key, value) into the hash table
  • If a pair (key, value') with the same key already exists, then value' is replaced by value
  • We say that value is associated with key
• get(key)
  • Returns the value associated with key
  • If no value is associated with key, an error occurs
• hasKey(key)
  • Returns 1 if key exists in the hash table, and 0 otherwise
Example
• put(3, 7.9)
• put(2, 8.3)
• get(3) -> returns 7.9
• put(3, 10.2)
• get(3) -> returns 10.2
• get(2) -> returns 8.3
• hasKey(5) -> returns 0
• hasKey(2) -> returns 1
• get(5) -> generates an error
Possible implementation
• Maintain an array H[HMAX] of linked lists
  • The info field of each list element is a struct containing a key and a value
• Each key is mapped to an index hkey = hash(key), such that 0 ≤ hkey ≤ HMAX-1
  • hash is called the hash function
• put(k, v)
  • Searches for the key k in the list H[hkey], where hkey = hash(k)
  • If the key is found, its value is replaced by v
  • If the key is not found, the pair (k, v) is inserted into H[hkey]
• get(k)
  • Searches for the key k in H[hkey], where hkey = hash(k)
  • If the key is found, returns its associated value; otherwise, an error occurs
• hasKey(k)
  • Searches for the key k in H[hkey], where hkey = hash(k)
  • If the key is found, returns 1; otherwise, returns 0
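To make the mapping from keys to lists concrete, the following sketch computes hkey for a few integer keys. It borrows the formula (P * key) % VMAX, with P = 13 and VMAX = 17, from the usage example later in these slides; the helper names bucket_of and print_buckets and the extra key 20 are assumptions added for illustration only.

```cpp
#include <cassert>
#include <cstdio>

// Constants taken from the usage example later in the slides.
const int VMAX = 17;  // number of lists (plays the role of HMAX)
const int P = 13;     // multiplier used by the hash function

// Hypothetical helper: index of the list that a key is stored in.
// Assumption: key >= 0 (a negative key would give a negative index).
int bucket_of(int key) {
    assert(key >= 0);
    return (P * key) % VMAX;
}

// Prints the list index of each sample key.
void print_buckets() {
    int keys[] = {3, 2, 5, 20};
    for (int i = 0; i < 4; i++)
        printf("key %d -> H[%d]\n", keys[i], bucket_of(keys[i]));
}
```

With these constants, keys 3 and 20 both map to H[5]. Such collisions are the reason each array entry holds a list of pairs rather than a single pair.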
Possible implementation (cont.)
• Class Hashtable
  • HMAX and hash are passed as arguments to the constructor
  • hash is passed as a pointer to a function
  • hash must be defined differently depending on the data type of the keys (examples for int and char* keys are given later)
  • The array H is allocated dynamically in the constructor and deallocated in the destructor
Hash Table – Implementation (hash_table.h)

#include <stdio.h>   /* fprintf, NULL */
#include "linked_list.h"

template<typename Tkey, typename Tvalue>
struct elem_info {
    Tkey key;
    Tvalue value;
};

template<typename Tkey, typename Tvalue>
class Hashtable {
private:
    LinkedList<struct elem_info<Tkey, Tvalue> > *H;  /* array of HMAX linked lists */
    int HMAX;
    int (*hash) (Tkey);                              /* pointer to the hash function */

public:
    Hashtable(int hmax, int (*h) (Tkey)) {
        HMAX = hmax;
        hash = h;
        H = new LinkedList<struct elem_info<Tkey, Tvalue> > [HMAX];
    }

    ~Hashtable() {
        for (int i = 0; i < HMAX; i++) {
            while (!H[i].isEmpty())
                H[i].removeFirst();
        }
        delete[] H;  /* H was allocated with new[], so it must be released with delete[] */
    }

    void put(Tkey key, Tvalue value) {
        struct list_elem<struct elem_info<Tkey, Tvalue> > *p;
        struct elem_info<Tkey, Tvalue> info;

        int hkey = hash(key);
        p = H[hkey].pfirst;

        while (p != NULL) {
            /* the == operator must be meaningful when comparing values of the
               type Tkey; otherwise, an equality testing function should be
               passed as an argument to the constructor */
            if (p->info.key == key)
                break;
            p = p->next;
        }

        if (p != NULL)
            p->info.value = value;     /* key found: replace its value */
        else {
            info.key = key;            /* key not found: append a new pair */
            info.value = value;
            H[hkey].addLast(info);
        }
    }
Hash Table – Implementation (hash_table.h) (cont.)

    Tvalue get(Tkey key) {
        struct list_elem<struct elem_info<Tkey, Tvalue> > *p;

        int hkey = hash(key);
        p = H[hkey].pfirst;

        while (p != NULL) {
            if (p->info.key == key)
                break;
            p = p->next;
        }

        if (p != NULL)
            return p->info.value;
        else {
            fprintf(stderr, "Error 101 - The key does not exist in the hashtable\n");
            /* return a value-initialized Tvalue instead of an uninitialized one */
            return Tvalue();
        }
    }

    int hasKey(Tkey key) {
        struct list_elem<struct elem_info<Tkey, Tvalue> > *p;

        int hkey = hash(key);
        p = H[hkey].pfirst;

        while (p != NULL) {
            if (p->info.key == key)
                break;
            p = p->next;
        }

        if (p != NULL)
            return 1;
        else
            return 0;
    }
};
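The class above depends on linked_list.h, which is not reproduced in these slides. For readers who want to experiment without that header, here is a minimal self-contained sketch of the same put/get/hasKey logic in which std::list stands in for LinkedList. The names ChainedTable, find and small_hash are assumptions made for this sketch, and get() asserts on a missing key instead of printing error 101.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <utility>
#include <vector>

// Sketch only: an array of lists of (key, value) pairs, as in the slides,
// built from standard containers instead of the course's LinkedList class.
template<typename Tkey, typename Tvalue>
class ChainedTable {
private:
    std::vector<std::list<std::pair<Tkey, Tvalue> > > H;  // one list per index
    int (*hash)(Tkey);                                    // user-supplied hash function

    // Returns a pointer to the pair holding key, or NULL if key is absent.
    std::pair<Tkey, Tvalue> *find(Tkey key) {
        std::list<std::pair<Tkey, Tvalue> > &bucket = H[hash(key)];
        typename std::list<std::pair<Tkey, Tvalue> >::iterator it;
        for (it = bucket.begin(); it != bucket.end(); ++it)
            if (it->first == key)
                return &(*it);
        return NULL;
    }

public:
    ChainedTable(int hmax, int (*h)(Tkey)) : H(hmax), hash(h) {}

    void put(Tkey key, Tvalue value) {
        std::pair<Tkey, Tvalue> *p = find(key);
        if (p != NULL)
            p->second = value;                                 // key exists: replace value
        else
            H[hash(key)].push_back(std::make_pair(key, value)); // new key: append to its list
    }

    int hasKey(Tkey key) { return find(key) != NULL ? 1 : 0; }

    // Precondition (asserted): the key must be present in the table.
    Tvalue get(Tkey key) {
        std::pair<Tkey, Tvalue> *p = find(key);
        assert(p != NULL);
        return p->second;
    }
};

// Hypothetical hash function for non-negative int keys.
int small_hash(int key) { return (13 * key) % 17; }
```

A table created as ChainedTable<int, double> t(17, small_hash) can then replay the put/get/hasKey sequence from the Example slide.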
Using the Hash Table - example

#include <stdio.h>
#include <string.h>
#include "hash_table.h"

#define VMAX 17
#define P 13

/* hash function for int keys (assumes key >= 0) */
int hfunc(int key) {
    return (P * key) % VMAX;
}

Hashtable<int, double> hid(VMAX, hfunc);

/* hash function for char* keys: combines the characters of the string */
int hfunc2(char* key) {
    int hkey = 0;
    int len = (int) strlen(key);
    for (int i = 0; i < len; i++)
        hkey = (hkey * P + key[i]) % VMAX;
    return hkey;
}

Hashtable<char*, int> hci(VMAX, hfunc2);

/* char arrays are used because a string literal cannot initialize
   a non-const char* in C++11 and later */
char k1[] = "abc";
char k2[] = "xyze";
char k3[] = "Abc";
char k4[] = "abcD";

int main() {
    hid.put(3, 7.9);
    hid.put(2, 8.3);
    printf("%.3lf\n", hid.get(3));    /* 7.900 */
    hid.put(3, 10.2);
    printf("%.3lf\n", hid.get(3));    /* 10.200 */
    printf("%.3lf\n", hid.get(2));    /* 8.300 */
    printf("%d\n", hid.hasKey(5));    /* 0 */
    printf("%d\n", hid.hasKey(2));    /* 1 */
    printf("%.3lf\n", hid.get(5));    /* generates an error */

    hci.put(k1, 10);
    hci.put(k2, 20);
    printf("%d\n", hci.get(k1));      /* 10 */
    hci.put(k1, 30);
    printf("%d\n", hci.get(k1));      /* 30 */
    printf("%d\n", hci.get(k2));      /* 20 */
    printf("%d\n", hci.hasKey(k3));   /* 0 */
    printf("%d\n", hci.hasKey(k2));   /* 1 */
    printf("%d\n", hci.get(k4));      /* generates an error */

    char *k5 = new char[4];
    k5[0] = 'a'; k5[1] = 'b'; k5[2] = 'c'; k5[3] = 0;
    printf("%d\n", hci.get(k5));      // what happens?
    delete[] k5;

    return 0;
}
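The last lookup in main() is worth experimenting with. For char* keys, the == operator used inside put, get and hasKey compares addresses, not characters. The sketch below contrasts the two kinds of comparison. The helper names same_address and same_text are hypothetical; same_text is the sort of equality testing function that the comment in put() suggests passing to the constructor.

```cpp
#include <cassert>
#include <cstring>

// Compares the pointers themselves: this is what == does for char* keys.
bool same_address(char *a, char *b) {
    return a == b;
}

// Compares the characters that the two pointers refer to.
bool same_text(char *a, char *b) {
    return strcmp(a, b) == 0;
}
```

Two separate buffers that both hold "abc" are equal according to same_text but not according to same_address. Since hfunc2 hashes the characters, k5 is sent to the same list as k1, but the address comparison does not match, so get(k5) ends in the error branch.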
Final remarks
• The hash table is a fundamental data structure, used in many situations
  • Packet routing in the Internet
  • Session management in web servers
  • Web search (e.g. Google)
  • etc.
• Many other implementations exist besides an array of linked lists
  • For example: linear probing, cuckoo hashing, etc.
  • Many of them are more efficient, but more difficult to understand
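As a taste of one alternative mentioned above, here is a rough sketch of linear probing: every pair lives directly in the array, and a collision is resolved by stepping to the next slot. It is a simplified illustration, not part of the original course code. The class name ProbingTable and the helper slot_for are assumptions, and the sketch assumes non-negative int keys, no removals, and a table that never becomes completely full.

```cpp
#include <cassert>
#include <vector>

// Sketch of linear probing for non-negative int keys and int values.
// Assumptions: keys are never removed and at least one slot stays free.
class ProbingTable {
private:
    std::vector<int> keys;
    std::vector<int> values;
    std::vector<bool> used;   // used[i] is true if slot i holds a pair
    int hmax;                 // number of slots

    // Returns the slot holding key, or the first free slot on its probe path.
    int slot_for(int key) const {
        int i = key % hmax;
        while (used[i] && keys[i] != key)
            i = (i + 1) % hmax;   // collision: try the next slot, wrapping around
        return i;
    }

public:
    ProbingTable(int n) : keys(n), values(n), used(n, false), hmax(n) {}

    void put(int key, int value) {
        int i = slot_for(key);
        used[i] = true;
        keys[i] = key;
        values[i] = value;        // overwrites the old value if key was present
    }

    int hasKey(int key) const { return used[slot_for(key)] ? 1 : 0; }

    // Precondition (asserted): the key must be present in the table.
    int get(int key) const {
        int i = slot_for(key);
        assert(used[i]);
        return values[i];
    }
};
```

Compared with the array of linked lists, this layout needs no per-element allocation, but supporting removals and a nearly full table takes extra care, which is part of why such variants are harder to understand.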