Hash Tables: A basic O(1)verview. Niteesh Prasad, CS-265.
Hash Tables in C
• A data structure used for mapping and searching.
• Lookup takes $O(1)$ (constant) time on average; in the worst case, when many keys collide, it degrades toward $O(n)$.
• Disadvantage: memory overhead, since the table usually reserves more slots than it holds items.
• Used in database indexing and associative arrays. Common applications such as web browsers use hash tables.
• Provides a table with operations such as insert and retrieve (lookup).
Basic Idea
• Store records of data in the hash table, which is either an array (when main memory is used) or a file (when disk space is used).
• Each record has a key field and an associated data field. The record is stored in a location that is based on its key.
• The function that produces this location for a given key is called a hash function.
More Background
• Each element of the array is a list that chains together the items that share a hash value.
• The basic element type would look like this:
    typedef struct Nameval Nameval;
    struct Nameval {
        char    *name;
        int     value;
        Nameval *next;  /* in chain */
    };
• The choice of hash function is important in deciding the efficiency of the search.
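To make the chaining idea concrete, here is a minimal sketch of linking items into one chain and walking it. It assumes the Nameval type from the slide; the helper names push_front and chain_length are hypothetical and do not come from the slides.

```c
#include <stddef.h>

typedef struct Nameval Nameval;
struct Nameval {
    char    *name;
    int      value;
    Nameval *next;   /* in chain */
};

/* push_front (hypothetical helper): link item at the head of a chain
   and return the new head of that chain. */
Nameval *push_front(Nameval *head, Nameval *item)
{
    item->next = head;
    return item;
}

/* chain_length (hypothetical helper): count the items in one chain. */
int chain_length(const Nameval *head)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

Inserting at the head keeps insertion constant-time regardless of how long the chain already is; only a search has to walk the list.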
It’s not a “perfect” world out there
• Finding a perfect hash function (one that maps each input to a different hash value) is difficult.
• Consequence: conflicts between keys (hash collisions).
• Collision resolution:
  (1) Resolution by overflow (creating a separate table)
  (2) Double hashing (two independent hash functions)
  (3) Rehashing (rebuilding the entire table)
• The multiplication method is a widely used way to compute the indices:
  (1) Multiply the key by a constant A, 0 < A < 1
  (2) Extract the fractional part of the product
  (3) Multiply this value by the table size m and take the integer part
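The three steps of the multiplication method can be sketched in C as follows. This is only an illustration under stated assumptions: the function name mult_hash and the macro HASH_A are hypothetical, the value chosen for A is the golden-ratio fraction often suggested for this method (the slides do not fix a value), and m is assumed to be greater than zero.

```c
/* Assumed constant A = (sqrt(5) - 1) / 2; any 0 < A < 1 fits the method. */
#define HASH_A 0.6180339887498949

/* mult_hash (hypothetical name): map key to a slot in 0 .. m-1, m > 0. */
unsigned int mult_hash(unsigned int key, unsigned int m)
{
    /* Step 1: multiply the key by A. */
    double product = (double) key * HASH_A;

    /* Step 2: keep only the fractional part of the product. */
    double frac = product - (double) (unsigned long long) product;

    /* Step 3: scale by the table size and truncate to an integer. */
    unsigned int index = (unsigned int) (frac * m);

    if (index >= m)      /* defensive clamp against floating-point rounding */
        index = m - 1;
    return index;
}
```

For instance, with m = 16384 the key 123456 gives a product of about 76300.0041, a fractional part of about 0.0041, and therefore slot 67. Floating point is used here for readability; fixed-point integer arithmetic is a common alternative.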
Hash Function
    /* hash: compute hash value for array of NPREF strings */
    /* NPREF, MULTIPLIER and NHASH are constants defined elsewhere */
    unsigned int hash(char *s[NPREF])
    {
        unsigned int h;
        unsigned char *p;
        int i;

        h = 0;
        for (i = 0; i < NPREF; i++)     /* traverse the array of strings */
            for (p = (unsigned char *) s[i]; *p != '\0'; p++)
                h = MULTIPLIER * h + *p;
        return h % NHASH;
    }
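Insert and Look up
    /* lookup: find name in symtab, with optional create */
    /* Note: hash() here must accept a single string, unlike the
       NPREF-array version; emalloc is a malloc wrapper that exits on failure. */
    Nameval* lookup(char *name, int create, int value)
    {
        int h;
        Nameval *sym;

        h = hash(name);
        for (sym = symtab[h]; sym != NULL; sym = sym->next)
            if (strcmp(name, sym->name) == 0)
                return sym;             /* found: avoids a duplicate entry */
        if (create) {
            sym = (Nameval *) emalloc(sizeof(Nameval));
            sym->name = name;           /* assumed allocated elsewhere */
            sym->value = value;
            sym->next = symtab[h];      /* insert at the head of the chain */
            symtab[h] = sym;
        }
        return sym;                     /* NULL if not found and not created */
    }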
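Putting the pieces together, the following is a compilable sketch of the symbol table under a few assumptions that are not in the slides: the values of NHASH and MULTIPLIER are illustrative choices, hash_str is a hypothetical single-string variant of the hash function, and plain malloc stands in for emalloc (so lookup returns NULL if allocation fails).

```c
#include <stdlib.h>
#include <string.h>

enum { NHASH = 4093, MULTIPLIER = 31 };   /* assumed values */

typedef struct Nameval Nameval;
struct Nameval {
    char    *name;
    int      value;
    Nameval *next;   /* in chain */
};

Nameval *symtab[NHASH];   /* file-scope array: every chain starts out NULL */

/* hash_str (hypothetical name): hash a single string to 0 .. NHASH-1 */
unsigned int hash_str(const char *str)
{
    unsigned int h = 0;
    const unsigned char *p;

    for (p = (const unsigned char *) str; *p != '\0'; p++)
        h = MULTIPLIER * h + *p;
    return h % NHASH;
}

/* lookup: find name in symtab, with optional create */
Nameval *lookup(char *name, int create, int value)
{
    unsigned int h = hash_str(name);
    Nameval *sym;

    for (sym = symtab[h]; sym != NULL; sym = sym->next)
        if (strcmp(name, sym->name) == 0)
            return sym;                /* existing entry is left unchanged */
    if (create) {
        sym = malloc(sizeof *sym);     /* malloc stands in for emalloc */
        if (sym == NULL)
            return NULL;
        sym->name = name;              /* caller keeps the string alive */
        sym->value = value;
        sym->next = symtab[h];         /* insert at the head of the chain */
        symtab[h] = sym;
    }
    return sym;
}
```

A call such as lookup("alpha", 1, 10) inserts the pair on first use; a later lookup("alpha", 0, 0) returns the same entry, and a name that was never inserted yields NULL. Because an existing entry is returned as-is, calling lookup with create set on a known name does not overwrite its value.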
Sources
• http://eternallyconfuzzled.com/tuts/datastructures/jsw_tut_hashtable.aspx
• http://www.cs.drexel.edu/~knowak/cs265_fall_2010/week_6.pdf
• Brian Kernighan and Rob Pike, The Practice of Programming, Addison Wesley, 1999