140 likes | 159 Views
Learn about hash tables, what a hash is, how to calculate hashes, setup & use hash tables, common issues like collisions & memory management. Explore examples & additional resources.
E N D
Hash Tables in C Steven Hansen
Overview • What is a hash table? • What is a hash? • What are they useful for? • Are they easy to use?
What is a hash table • A hash table is an array that uses calculated “hash” values to perform lookups. • Every hash table is made up of “buckets” that hold the values you wish to store.
What is a hash? • A hash is a value that is calculated using the data or the “key” • The key is whatever data you are using to lookup other data in the table. • E.g. Using a name as a key for a phone number.
How do I calculate a hash? • You use an algorithm that uses the data in the key to create a hash value suitable for your needs. • There are many algorithms you can use, some better than others. • You can use a modulus to keep the hash within range.
Example Algorithms • for (h=0, i=0; i<len; ++i) h += key[i]; • for (h=0, i=0; i<len; ++i) h ^= key[i]; • for (h=0, i=0; i<len; ++i) h *= key[i]; • None of these are very good algorithms. • From http://burtleburtle.net/bob/hash/hashfaq.html
How easy is a hash table to setup and use? • Quite easy, just create an array and hash data into it! • You can treat it just like any other array when you are trying to find data. • To find data you re-calculate the hash and look in that bucket.
What are the problems? • Collisions – When two pieces of data want to occupy the same space. • This is solved trivially, you can create an array of linked lists and add data to the list instead of the bucket. • Or you could just move the data down into an open slot and search for it the same way. • If a hashing algorithm is slow or creates too many collisions the efficiency is degraded.
What makes C different? • Memory management. • To create a dynamically allocated object you must use malloc which returns a void pointer to the object. You then have to cast it as the correct pointer type • e.g. class *ex = (* class)malloc(sizeof(class)) • This creates the object ex of type class. • To clear the memory you would use free(ex)
Example of a basic hash • Data: [1, 2, 3, … , 40] • Algorithm: hash = i % 10; • Hash Table = int table[10];
Perfect Hash • A perfect hash is an algorithm that has no collisions. • These are usually found when a hashing algorithm is created for known data.
Cryptographic Hashing • MD4, MD5, SHA, Snefru, RIPE-MD • Secure hashes with many uses • Password verification • Used to uniquely identify files
Resources • http://burtleburtle.net/bob/hash/hashfaq.html • Excellent resource for hashing algorithms • Section 2.9 of the textbook • Explains the basics of hashing and some collision resolution • http://en.wikipedia.org/wiki/Hash_table • Good start for understanding hash tables