Hashes
• A “hash” is another fundamental data structure, like scalars and arrays.
• Hashes are sometimes called “associative arrays”.
• Basically, a hash associates a key with a value. A hash is composed of a set of key-value pairs.
• A key is a string: any sequence of characters, generally enclosed in quotes. Any scalar can be used as a key, but it is converted to a string first.
• A value can be almost anything that fits in a scalar: the values are just scalars.
• One hash oddity: neither the keys nor the values are stored in a useful order. The order in which you enter hash items is unrelated to the order in which you retrieve them.
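To make the "keys are converted to strings" point concrete, here is a small sketch. The hash %h and its values are made up for illustration, and the code uses "use strict" and "my", which go beyond what these slides have introduced so far.

```perl
use strict;
use warnings;

my %h;
my $n = 1.0;          # a number; as a string it becomes "1"

$h{$n}    = 'a';      # stored under the key "1"
$h{"1"}   = 'b';      # same key "1", so this replaces 'a'
$h{"1.0"} = 'c';      # the string "1.0" is a different key

my @keys = sort keys %h;
print "@keys\n";      # prints "1 1.0"
print "$h{1}\n";      # prints "b"
```

The number 1.0 and the string "1" land on the same key, while the string "1.0" does not, because only the string form of the key matters.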
Why Use Hashes?
• The C language doesn’t have a built-in hash, and C can clearly do just about anything you need to do in programming.
• The point of Perl is to make your life easier by including useful tools, even if they are messy and clutter up the language.
• Examples of hash usage:
  -- Keeping count of the number of times a word is used in a text, or that a particular sub-sequence appears in DNA. Use the word as the key and the number of appearances as the value.
  -- Associating ID numbers with people’s names.
  -- Associating protein names with their properties.
• And lots more. Hashes get used very frequently once you understand them.
Hash Basics
• The punctuation mark used to denote a hash is % (percent sign).
• Note that hashes, arrays, and scalars are completely different variables. The variables $cat, @cat, and %cat are all different and independent variables. I don’t recommend using the same name for different variables, but it is legal.
• Hash elements are accessed by enclosing the key in curly braces. For example, the hash %stoplight is populated as follows:
    $stoplight{red} = "stop";
    $stoplight{yellow} = "caution";
    $stoplight{green} = "go";
  In this hash, “red”, “yellow”, and “green” are the keys, and “stop”, “caution”, and “go” are the values.
• Each key can refer to only a single value. You can’t have duplicate keys. If you assign to a key that already exists, the first value is lost and only the second remains:
    $stoplight{yellow} = "speed up";
    print "$stoplight{yellow}\n";   # prints "speed up"
• However, different keys can refer to the same value without any problem.
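The points above can be tried out in a short script. This is only a sketch: the extra key "amber" is invented for illustration, and "use strict" with "my" is an addition that the slides have not covered yet.

```perl
use strict;
use warnings;

my %stoplight;
$stoplight{red}    = "stop";
$stoplight{yellow} = "caution";
$stoplight{green}  = "go";

# Assigning to an existing key replaces the old value.
$stoplight{yellow} = "speed up";
print "$stoplight{yellow}\n";      # prints "speed up"

# Two different keys may hold the same value ("amber" is a made-up key).
$stoplight{amber} = "speed up";

# In scalar context, keys returns the number of keys.
my $count = keys %stoplight;
print "$count keys\n";             # prints "4 keys"
```

After the second assignment there is still only one "yellow" entry, so the hash holds four keys, not five.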
Alternative Way of Loading Hashes
• A hash can be loaded from a list of alternating keys and values. You could load a hash by simply writing the keys and values the same way as you would load an array:
    %stoplight = ("red", "stop", "yellow", "caution", "green", "go");
• This method is a bit annoying, because it is easy to lose track of which items are keys and which are values. A better way is to use the => operator (“big arrow”), which behaves like a comma, except that it also quotes a bare word on its left:
    %stoplight = (
        "red" => "stop",
        "yellow" => "caution",
        "green" => "go"
    );
  Note the use of newlines here: they make the code easier to read.
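As a quick check that the two loading styles build the same hash, here is a sketch that compares them. The names %from_list, %from_arrow, and $same are invented for this example, and "use strict" with "my" goes beyond what the slides have introduced.

```perl
use strict;
use warnings;

# Plain list of alternating keys and values.
my %from_list = ("red", "stop", "yellow", "caution", "green", "go");

# The same data with the big arrow; bare words on the left need no quotes.
my %from_arrow = (
    red    => "stop",
    yellow => "caution",
    green  => "go",
);

# Compare the two hashes key by key.
my $same = (keys(%from_list) == keys(%from_arrow)) ? 1 : 0;
for my $key (keys %from_list) {
    $same = 0 unless (exists $from_arrow{$key})
                  && ($from_arrow{$key} eq $from_list{$key});
}
print $same ? "identical\n" : "different\n";   # prints "identical"
```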
Hash Operators
• “keys” gives a list of all the keys used in the hash. Here’s a common use:
    foreach $key (keys %stoplight) {
        print "$key stands for $stoplight{$key}\n";
    }
• Note that the keys are not returned in a useful order. If you want them sorted alphabetically you could write:
    foreach (sort keys %stoplight) {
  or, if the keys are numbers and you want numerical order:
    foreach (sort {$a <=> $b} keys %stoplight) {
  The numeric form is not appropriate for string keys such as “red” and “green”.
• Similarly, “values” gives a list of all the values, again in no useful order.
• “each” is an operator that returns a 2-element list consisting of a key and its value. It is normally used in tandem with “while”, which calls it repeatedly until the pairs run out:
    while ( ($key, $value) = each %stoplight) {
        print "$key : $value\n";
    }
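The difference between the two sort forms shows up only with numeric keys, so here is a sketch using a second, hypothetical hash %id_to_name (names and ID numbers are made up). It also uses “each” to build a reversed lookup, %color_for, which is likewise an invented name. "use strict" and "my" are additions not yet covered in the slides.

```perl
use strict;
use warnings;

my %stoplight = (red => "stop", yellow => "caution", green => "go");

# String keys: the default sort is alphabetical.
my @colors = sort keys %stoplight;
print "@colors\n";                 # prints "green red yellow"

# Numeric keys: the default sort compares them as strings.
my %id_to_name = (10 => "Ann", 9 => "Bob", 100 => "Cy");
my @by_string = sort keys %id_to_name;
my @by_number = sort { $a <=> $b } keys %id_to_name;
print "@by_string\n";              # prints "10 100 9"
print "@by_number\n";              # prints "9 10 100"

# each hands back one key-value pair per call.
my %color_for;
while ( my ($key, $value) = each %stoplight ) {
    $color_for{$value} = $key;     # reverse lookup: value to key
}
print "$color_for{stop}\n";        # prints "red"
```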
More Hash Operators
• Removing elements in a hash is done with “delete”:
    delete $stoplight{"red"};   # both key and value are removed
• Testing for existence is done with “exists”:
    exists $stoplight{"red"}
  returns true if that key-value pair exists, and false if it doesn’t.
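A short sketch of both operators follows. The key "blue" and the variable names are invented for illustration, and "use strict" with "my" is an addition to what the slides have shown. The example also shows that “exists” checks for the key itself, not for a defined value.

```perl
use strict;
use warnings;

my %stoplight = (red => "stop", yellow => "caution", green => "go");

# delete removes the pair and returns the value that was stored.
my $removed = delete $stoplight{"red"};
print "removed: $removed\n";                  # prints "removed: stop"

my $red_exists = exists $stoplight{"red"} ? 1 : 0;
print "red exists? $red_exists\n";            # prints "red exists? 0"

# A key can exist even when its value is undefined ("blue" is made up).
$stoplight{"blue"} = undef;
my $blue_exists  = exists  $stoplight{"blue"} ? 1 : 0;
my $blue_defined = defined $stoplight{"blue"} ? 1 : 0;
print "blue: exists=$blue_exists defined=$blue_defined\n";
# prints "blue: exists=1 defined=0"
```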
Counting
• Here’s a simple use of hashes to get word counts. Input is a file of words, one per line.
    while ($word = <STDIN>) {
        chomp $word;
        $word_hash{$word}++;
    }
• Note that %word_hash was created implicitly, without ever being explicitly declared. This is a standard Perl feature, but I will shortly discourage its use.
• Also, a nice feature of using hashes for counting is that you don’t have to initialize each key-value pair. The first time a new key is encountered its value is undefined, which ++ treats as 0, so the count becomes 1 automatically.
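The same counting idea can be run without an input file. In this sketch the word list @words is a made-up stand-in for the lines read from STDIN, the hash is declared explicitly with "my", and the report is sorted by count (ties broken alphabetically), which is an addition to the slide's loop.

```perl
use strict;
use warnings;

# Stand-in for the words that would arrive one per line on STDIN.
my @words = qw(the cat saw the dog the end);

my %word_count;
for my $word (@words) {
    $word_count{$word}++;      # a new key starts undefined, so ++ gives 1
}

# Most frequent first; equal counts fall back to alphabetical order.
my @report = sort {
    $word_count{$b} <=> $word_count{$a} || $a cmp $b
} keys %word_count;

for my $word (@report) {
    print "$word: $word_count{$word}\n";
}
# prints:
# the: 3
# cat: 1
# dog: 1
# end: 1
# saw: 1
```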